I agree with Xuefu that handling functions inconsistently with all the other
objects is not a big problem.

Regarding option #3, the special "system.system" namespace may confuse
users.
Users need to know the set of built-in function names to know when to use
the "system.system" namespace.
What will happen if a user registers a non-built-in function name under the
"system.system" namespace?
Besides, I think it doesn't solve the "explode" problem I mentioned at the
beginning of this thread.
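
To make the option #3 concern concrete, here is a minimal sketch of how
1-part and 3-part names would be qualified, following the CREATE TEMPORARY
FUNCTION examples Dawid gave further down the thread. It is only an
illustrative Python model, not Flink code; the names `SYSTEM_NAMESPACE`,
`qualify`, and `targets_system_namespace` are hypothetical.

```python
# Hypothetical model of option #3 name qualification; not Flink code.
# "system.system" is the special catalog/db suggested in the thread.
SYSTEM_NAMESPACE = ("system", "system")


def qualify(name, current_catalog, current_database):
    """Expand a 1-part or 3-part function name to (catalog, database, function)."""
    parts = name.split(".")
    if len(parts) == 1:
        # Unqualified names fall back to the current catalog and database.
        return (current_catalog, current_database, parts[0])
    if len(parts) == 3:
        return (parts[0], parts[1], parts[2])
    raise ValueError(f"expected a 1-part or 3-part name, got: {name!r}")


def targets_system_namespace(name, current_catalog, current_database):
    """True only if the name is explicitly written under the special namespace."""
    catalog, database, _ = qualify(name, current_catalog, current_database)
    return (catalog, database) == SYSTEM_NAMESPACE
```

In this model a user can only shadow a built-in function by writing the
special namespace explicitly, which presupposes knowing that the name is
built-in. What should happen for a name that is not built-in is left open
here, as in the question above.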

So here is my vote:

+1 for #1
0 for #2
-1 for #3
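
For reference, the resolution order of option #1 as summarized by Xuefu
below (temp functions -> built-in functions -> catalog functions) can be
sketched roughly as follows. This is an illustrative Python model under the
assumption that each function source is a plain dictionary; `resolve` and
its parameters are hypothetical names, not Flink APIs.

```python
# Hypothetical model of the option #1 lookup for an unqualified function name.
def resolve(name, temp_functions, builtin_functions, catalog_functions):
    """Return (kind, function) for the first source that defines `name`."""
    lookup_order = (
        ("temporary", temp_functions),    # global temp functions win
        ("built-in", builtin_functions),  # then built-in functions
        ("catalog", catalog_functions),   # then the current cat/db
    )
    for kind, registry in lookup_order:
        if name in registry:
            return kind, registry[name]
    raise LookupError(f"function not found: {name}")
```

With this order a temp function named like a built-in shadows it, and a
built-in shadows a same-named function in the current cat/db.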

Best,
Jark


On Thu, 19 Sep 2019 at 08:38, Xuefu Z <usxu...@gmail.com> wrote:

> @Dawid, Re: we also don't need additional referencing the special catalog
> anywhere.
>
> True. But once we allow such a reference, users can use it in any place
> where a function name is expected, and we have to handle all of those places.
> That's a big difference, I think.
>
> Thanks,
> Xuefu
>
> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz <
> wysakowicz.da...@gmail.com>
> wrote:
>
> > @Bowen I am not suggesting introducing an additional catalog. I think we
> > need to get rid of the current built-in catalog.
> >
> > @Xuefu in option #3 we also don't need to reference the special
> > catalog anywhere else besides the CREATE statement. The resolution
> > behaviour is exactly the same in both options.
> >
> > On Thu, 19 Sep 2019, 08:17 Xuefu Z, <usxu...@gmail.com> wrote:
> >
> > > Hi Dawid,
> > >
> > > "GLOBAL" is a provisional keyword that was given to the approach. It can
> > > be changed to something better.
> > >
> > > The difference between this and the #3 approach is that we only need
> > > the keyword for this CREATE DDL. For other places (such as function
> > > referencing), no keyword or special namespace is needed.
> > >
> > > Thanks,
> > > Xuefu
> > >
> > > On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz <
> > > wysakowicz.da...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > > I think it makes sense to start voting at this point.
> > > >
> > > > Option 1: Only 1-part identifiers
> > > > PROS:
> > > > - allows shadowing built-in functions
> > > > CONS:
> > > > - inconsistent with all the other objects, both permanent & temporary
> > > > - does not allow shadowing catalog functions
> > > >
> > > > Option 2: Special keyword for built-in function
> > > > I think this is quite similar to the special catalog/db. The thing I am
> > > > strongly against in this proposal is the GLOBAL keyword. This keyword has
> > > > a meaning in relational database systems: it denotes a function that is
> > > > present for the lifetime of the session in which it was created, but is
> > > > available in all other sessions. Therefore I really don't want to use
> > > > this keyword in a different context.
> > > >
> > > > Option 3: Special catalog/db
> > > >
> > > > PROS:
> > > > - allows shadowing built-in functions
> > > > - allows shadowing catalog functions
> > > > - consistent with other objects
> > > > CONS:
> > > > - we introduce a special namespace for built-in functions
> > > >
> > > > I don't see a problem with introducing the special namespace. In the end
> > > > it is very similar to the keyword approach. In this case the catalog/db
> > > > combination would be the "keyword".
> > > >
> > > > Therefore my votes:
> > > > Option 1: -0
> > > > Option 2: -1 (I might change to +0 if we can come up with a better
> > > > keyword)
> > > > Option 3: +1
> > > >
> > > > Best,
> > > > Dawid
> > > >
> > > >
> > > > On Thu, 19 Sep 2019, 05:12 Xuefu Z, <usxu...@gmail.com> wrote:
> > > >
> > > > > Hi Aljoscha,
> > > > >
> > > > > Thanks for the summary; these are great questions to be answered. The
> > > > > answer to your first question is clear: there is general agreement to
> > > > > override built-in functions with temp functions.
> > > > >
> > > > > However, your second and third questions are related, as a function
> > > > > reference can be either just a function name (like "func") or in the
> > > > > form of "cat.db.func". When a reference is just a function name, it can
> > > > > mean either a built-in function or a function defined in the current
> > > > > cat/db. If we support overriding a built-in function with a temp
> > > > > function, such overriding can also cover a function in the current
> > > > > cat/db.
> > > > >
> > > > > I think what Timo referred to as "overriding a catalog function" means
> > > > > that a temp function defined as "cat.db.func" overrides a catalog
> > > > > function "func" in cat/db even if cat/db is not current. To support
> > > > > this, a temp function has to be tied to a cat/db. That's why I said
> > > > > above that the 2nd and 3rd questions are related. The problem with such
> > > > > support is the ambiguity when a user defines a function w/o namespace,
> > > > > "CREATE TEMPORARY FUNCTION func ...". Here "func" can mean a global temp
> > > > > function, or a temp function in the current cat/db. If we assume the
> > > > > former, this creates an inconsistency because "CREATE FUNCTION func"
> > > > > actually means a function in the current cat/db. If we assume the
> > > > > latter, then there is no way for a user to create a global temp
> > > > > function.
> > > > >
> > > > > Giving a special namespace for built-in functions may solve the
> > > > > ambiguity problem above, but it also introduces an artificial
> > > > > catalog/database that needs special treatment and pollutes the
> > > > > cleanness of the code. I would rather introduce a syntax in DDL to
> > > > > solve the problem, like "CREATE [GLOBAL] TEMPORARY FUNCTION func".
> > > > >
> > > > > Thus, I'd like to summarize a few candidate proposals for voting
> > > > > purposes:
> > > > >
> > > > > 1. Support only global, temporary functions without namespace. Such temp
> > > > > functions override built-in functions and catalog functions in the
> > > > > current cat/db. The resolution order is: temp functions -> built-in
> > > > > functions -> catalog functions. (Partially or fully qualified functions
> > > > > have no ambiguity!)
> > > > >
> > > > > 2. In addition to #1, support creating and referencing temporary
> > > > > functions associated with a cat/db, with a "GLOBAL" qualifier in DDL for
> > > > > global temp functions. The resolution order is: global temp functions ->
> > > > > built-in functions -> temp functions in current cat/db -> catalog
> > > > > functions. (Resolution for partially or fully qualified function
> > > > > references is: temp functions -> persistent functions.)
> > > > >
> > > > > 3. In addition to #1, support creating and referencing temporary
> > > > > functions associated with a cat/db, with a special namespace for
> > > > > built-in functions and global temp functions. The resolution is the same
> > > > > as #2, except that the special namespace might be prefixed to a
> > > > > reference to a built-in function or global temp function. (In the
> > > > > absence of the special namespace, the resolution order is the same as
> > > > > in #2.)
> > > > >
> > > > > My personal preference is #1, given the unknown use case and the
> > > > > complexity introduced by #2 and #3. However, #2 is an acceptable
> > > > > alternative. Thus, my votes are:
> > > > >
> > > > > +1 for #1
> > > > > +0 for #2
> > > > > -1 for #3
> > > > >
> > > > > Everyone, please cast your vote (in the above format please!), or let me
> > > > > know if you have more questions or other candidates.
> > > > >
> > > > > Thanks,
> > > > > Xuefu
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <aljos...@apache.org>
> > > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > > I think this discussion and the one for FLIP-64 are very connected. To
> > > > > > > resolve the differences, I think we have to think about the basic
> > > > > > > principles and find consensus there. The basic questions I see are:
> > > > > >
> > > > > >  - Do we want to support overriding builtin functions?
> > > > > >  - Do we want to support overriding catalog functions?
> > > > > >  - And then later: should temporary functions be tied to a
> > > > > > catalog/database?
> > > > > >
> > > > > > > I don’t have much to say about these, except that we should somewhat
> > > > > > > stick to what the industry does. But I also understand that the
> > > > > > > industry is already very divided on this.
> > > > > >
> > > > > > Best,
> > > > > > Aljoscha
> > > > > >
> > > > > > > On 18. Sep 2019, at 11:41, Jark Wu <imj...@gmail.com> wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > > +1 to strive for reaching consensus on the remaining topics. We are
> > > > > > > > close to the truth. It will waste a lot of time if we resume the
> > > > > > > > topic some time later.
> > > > > > >
> > > > > > > > +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun”
> > > > > > > > way to override a catalog function.
> > > > > > >
> > > > > > > > I’m not sure about “system.system.fun”: it introduces a nonexistent
> > > > > > > > cat & db? And we still need to do special treatment for the dedicated
> > > > > > > > system.system cat & db?
> > > > > > >
> > > > > > > Best,
> > > > > > > Jark
> > > > > > >
> > > > > > >
> > > > > > > >> On 18 Sep 2019, at 06:54, Timo Walther <twal...@apache.org> wrote:
> > > > > > >>
> > > > > > >> Hi everyone,
> > > > > > >>
> > > > > > > >> @Xuefu: I would like to avoid adding too many things incrementally.
> > > > > > > >> Users should be able to override all catalog objects consistently
> > > > > > > >> according to FLIP-64 (Support for Temporary Objects in Table module).
> > > > > > > >> If functions are treated completely differently, we need more code and
> > > > > > > >> special cases. From an implementation perspective, this topic only
> > > > > > > >> affects the lookup logic, which is rather low implementation effort,
> > > > > > > >> which is why I would like to clarify the remaining items. As you said,
> > > > > > > >> we have a slight consensus on overriding built-in functions; we should
> > > > > > > >> also strive for reaching consensus on the remaining topics.
> > > > > > >>
> > > > > > > >> @Dawid: I like your idea as it makes registering catalog objects
> > > > > > > >> consistent and the overriding of built-in functions more explicit.
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >> Timo
> > > > > > >>
> > > > > > >>
> > > > > > >> On 17.09.19 11:59, kai wang wrote:
> > > > > > >>> hi, everyone
> > > > > > > >>> I think this FLIP is very meaningful. It supports functions that can
> > > > > > > >>> be shared by different catalogs and dbs, reducing the duplication of
> > > > > > > >>> functions.
> > > > > > >>>
> > > > > > > >>> Based on Flink's SQL parser module, our group implemented a CREATE
> > > > > > > >>> FUNCTION feature that stores the parsed function metadata and schema
> > > > > > > >>> in MySQL. We also customized the catalog and the sql-client to support
> > > > > > > >>> custom schemas and functions. Functions are loaded, but they are
> > > > > > > >>> currently global and are not subdivided by catalog and db.
> > > > > > >>>
> > > > > > > >>> In addition, I very much hope to participate in the development of
> > > > > > > >>> this FLIP. I have been following the community, but have found it
> > > > > > > >>> rather difficult to join.
> > > > > > > >>> Thank you.
> > > > > > >>>
> > > > > > > >>> On Tue, 17 Sep 2019 at 11:19 AM, Xuefu Z <usxu...@gmail.com> wrote:
> > > > > > >>>
> > > > > > > >>>> Thanks to Timo and Dawid for sharing thoughts.
> > > > > > >>>>
> > > > > > > >>>> It seems to me that there is a general consensus on having temp
> > > > > > > >>>> functions that have no namespaces and overwrite built-in functions.
> > > > > > > >>>> (As a side note for comparability, the current user-defined functions
> > > > > > > >>>> are all temporary and have no namespaces.)
> > > > > > >>>>
> > > > > > > >>>> Nevertheless, I can also see the merit of having namespaced temp
> > > > > > > >>>> functions that can overwrite functions defined in a specific cat/db.
> > > > > > > >>>> However, this idea appears orthogonal to the former and can be added
> > > > > > > >>>> incrementally.
> > > > > > >>>>
> > > > > > > >>>> How about we first implement non-namespaced temp functions now and
> > > > > > > >>>> leave the door open for namespaced ones in later releases, as the
> > > > > > > >>>> requirement might become clearer? This also helps shorten the debate
> > > > > > > >>>> and allows us to make some progress in this direction.
> > > > > > >>>>
> > > > > > > >>>> As to Dawid's idea of having a dedicated cat/db to host the temporary
> > > > > > > >>>> functions that don't have namespaces, my only concern is the special
> > > > > > > >>>> treatment for a cat/db, which makes the code less clean, as is evident
> > > > > > > >>>> in the current treatment of the built-in catalog.
> > > > > > >>>>
> > > > > > >>>> Thanks,
> > > > > > > >>>> Xuefu
> > > > > > >>>>
> > > > > > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
> > > > > > >>>> wysakowicz.da...@gmail.com>
> > > > > > >>>> wrote:
> > > > > > >>>>
> > > > > > >>>>> Hi,
> > > > > > > >>>>> Another idea to consider on top of Timo's suggestion. How about we
> > > > > > > >>>>> have a special namespace (catalog + database) for built-in objects?
> > > > > > > >>>>> This catalog would be invisible for users as Xuefu was suggesting.
> > > > > > >>>>>
> > > > > > > >>>>> Then users could still override built-in functions, if they fully
> > > > > > > >>>>> qualify the object with the built-in namespace, but by default the
> > > > > > > >>>>> common logic of the current db & cat would be used.
> > > > > > >>>>>
> > > > > > >>>>> CREATE TEMPORARY FUNCTION func ...
> > > > > > > >>>>> registers a temporary function in the current cat & db
> > > > > > >>>>>
> > > > > > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
> > > > > > > >>>>> registers a temporary function in cat.db
> > > > > > >>>>>
> > > > > > >>>>> CREATE TEMPORARY FUNCTION system.system.func ...
> > > > > > >>>>> Overrides built-in function with temporary function
> > > > > > >>>>>
> > > > > > > >>>>> The built-in/system namespace would not be writable for permanent
> > > > > > > >>>>> objects.
> > > > > > >>>>> WDYT?
> > > > > > >>>>>
> > > > > > >>>>> This way I think we can have benefits of both solutions.
> > > > > > >>>>>
> > > > > > >>>>> Best,
> > > > > > >>>>> Dawid
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <twal...@apache.org> wrote:
> > > > > > >>>>>
> > > > > > >>>>>> Hi Bowen,
> > > > > > >>>>>>
> > > > > > > >>>>>> I understand the potential benefit of overriding certain built-in
> > > > > > > >>>>>> functions. I'm open to such a feature if many people agree. However,
> > > > > > > >>>>>> it would be great to still support overriding catalog functions with
> > > > > > > >>>>>> temporary functions in order to prototype a query even though a
> > > > > > > >>>>>> catalog/database might not be available currently or should not be
> > > > > > > >>>>>> modified yet. How about we support both cases?
> > > > > > >>>>>>
> > > > > > >>>>>> CREATE TEMPORARY FUNCTION abs
> > > > > > > >>>>>> -> creates/overrides a built-in function and never considers the
> > > > > > > >>>>>> current catalog and database; inconsistent with other DDL but
> > > > > > > >>>>>> acceptable for functions I guess.
> > > > > > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
> > > > > > >>>>>> -> creates/overrides a catalog function
> > > > > > >>>>>>
> > > > > > > >>>>>> Regarding "Flink doesn't have any other built-in objects (tables,
> > > > > > > >>>>>> views) except functions", this might change in the near future. Take
> > > > > > > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an example.
> > > > > > >>>>>>
> > > > > > >>>>>> Thanks,
> > > > > > >>>>>> Timo
> > > > > > >>>>>>
> > > > > > >>>>>> On 14.09.19 01:40, Bowen Li wrote:
> > > > > > >>>>>>> Hi Fabian,
> > > > > > >>>>>>>
> > > > > > > >>>>>>> Yes, I agree 1-part/no-override is the least favorable, thus I didn't
> > > > > > > >>>>>>> include that as a voting option, and the discussion is mainly between
> > > > > > > >>>>>>> 1-part/override builtin and 3-part/not override builtin.
> > > > > > >>>>>>>
> > > > > > > >>>>>>> Re > However, it means that temp functions are differently treated
> > > > > > > >>>>>>> than other db objects.
> > > > > > > >>>>>>> IMO, the treatment difference results from the fact that functions are
> > > > > > > >>>>>>> a bit different from other objects - Flink doesn't have any other
> > > > > > > >>>>>>> built-in objects (tables, views) except functions.
> > > > > > >>>>>>>
> > > > > > >>>>>>> Cheers,
> > > > > > >>>>>>> Bowen
> > > > > > >>>>>>>
> > > > > > >>>>>>
> > > > > > >>>>
> > > > > > >>>> --
> > > > > > >>>> Xuefu Zhang
> > > > > > >>>>
> > > > > > >>>> "In Honey We Trust!"
> > > > > > >>>>
> > > > > > >>
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > > --
> > > > > Xuefu Zhang
> > > > >
> > > > > "In Honey We Trust!"
> > > > >
> > > >
> > >
> > >
> > > --
> > > Xuefu Zhang
> > >
> > > "In Honey We Trust!"
> > >
> >
>
>
> --
> Xuefu Zhang
>
> "In Honey We Trust!"
>
