Hi Bowen,

I revised the doc according to our existing agreement. In the
implementation section, the TODO items are now split into two parts.
For now, we only want a basic implementation for the Flink 1.10
release. Please take another look.

Yes, the vote is only for the Flink 1.10 release section.


Best Regards
Peter Huang




On Fri, Nov 1, 2019 at 2:22 PM Bowen Li <bowenl...@gmail.com> wrote:

> Re 1) I'd prefer the [LANGUAGE JVM|PYTHON|...] syntax. It's also adopted by
> Postgres [1] and MySQL [2].
>
> "USING 'python .....' " seems to need extra parsing of the content in single
> quotes, which is not ideal.
>
> Re 2) I agree.
>
> Besides, the doc proposes that the new field be a string. I think it had
> better be an enum, say LanguageType. JAVA and PYTHON can be the only values
> available for now.
>
> Re 3) I think we can re-evaluate the situation when requirements come, and
> can remove properties for now. After all, the interface can evolve. Please
> update the doc to remove the properties field.
>
>
> W.r.t. voting, can we have a dedicated section for Flink 1.10 that includes
> all the outcomes we have reached consensus on so far? I think we are only
> going to vote on that section, rather than the full FLIP-79, right?
>
> [1] https://www.postgresql.org/docs/9.5/sql-createfunction.html
> [2] https://dev.mysql.com/doc/refman/8.0/en/create-procedure.html
>
>
> On Thu, Oct 31, 2019 at 10:30 PM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
>
> > Hi Terry,
> >
> > Thanks for the quick response. We are on the same page. For the
> > properties of function DDL, let's see whether there is such a need from
> > other people.
> > I will start voting on the design in 24 hours.
> >
> >
> > Best Regards
> > Peter Huang
> >
> >
> >
> >
> >
> >
> >
> > On Thu, Oct 31, 2019 at 3:18 AM Terry Wang <zjuwa...@gmail.com> wrote:
> >
> > > Hi Peter,
> > >
> > > I’d like to share some thoughts from my side:
> > > 1. What's the syntax to distinguish the function language?
> > >         +1 for using `[LANGUAGE JVM|PYTHON] USING JAR`
> > > 2. How to persist the function language in the backend catalog?
> > >         +1 for a separate field in CatalogFunction. But for a specific
> > > backend, we may persist it case by case. Special cases include how
> > > HiveCatalog stores the kind of CatalogFunction.
> > > 3. Do we really need to allow users to set a properties map for a udf?
> > >     There are certainly use cases that require passing external arguments
> > > to a udf, but the need can also be met by passing arguments to `eval` when
> > > calling the udf in SQL.
> > > IMO, there is not much need to support setting a properties map for a udf.
> > >
> > > 4. Should a catalog implementation be able to decide whether it can take a
> > > properties map, and which language of a udf it can persist?
> > > IMO, it’s necessary for a catalog implementation to provide such
> > > information. But for the Flink 1.10 MVP goal, we can just skip this part.
> > >
> > >
> > >
> > > Best,
> > > Terry Wang
> > >
> > >
> > >
> > > > On Oct 30, 2019, at 13:52, Peter Huang <huangzhenqiu0...@gmail.com> wrote:
> > > >
> > > > Hi Bowen,
> > > >
> > > > > I couldn't agree more that we should first agree on the DDL syntax and
> > > > > focus on the MVP in the current phase.
> > > >
> > > > 1) what's the syntax to distinguish function language
> > > > Currently, there are two opinions:
> > > >
> > > >   - USING 'python .....'
> > > >   - [LANGUAGE JVM|PYTHON] USING JAR '...'
> > > >
> > > > > As we need to support multiple resources as HQL does, we shouldn't repeat
> > > > > the language symbol as a suffix of each resource.
> > > > > I would prefer option two, but I am definitely open to more comments.
> > > >
> > > > > 2) How to persist function language in backend catalog? as a k-v pair in
> > > > > properties map, or a dedicated field?
> > > > > Even though language type is also a property, I think a separate field in
> > > > > CatalogFunction is a cleaner solution.
> > > >
> > > > > 3) do we really need to allow users to set a properties map for udf? what
> > > > > needs to be stored there? what are they used for?
> > > > >
> > > > > I am considering a type of use case that uses UDFs for realtime inference.
> > > > > The model is nested in the udf as a resource, but multiple parameters
> > > > > are customizable. In this case, users can use properties to define those
> > > > > parameters.
> > > >
> > > > > I only have answers to these questions. For questions about the catalog
> > > > > implementation, I hope we can collect more feedback from the community.
> > > >
> > > >
> > > > Best Regards
> > > > Peter Huang
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > Best Regards
> > > > Peter Huang
> > > >
> > > > > On Tue, Oct 29, 2019 at 11:31 AM Bowen Li <bowenl...@gmail.com> wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > > >> Besides all the good questions raised above, we all seem to agree to
> > > > >> have an MVP for Flink 1.10, "to support users to create and persist a java
> > > > >> class-based udf that's already in classpath (no extra resource loading),
> > > > >> and use it later in queries".
> > > >>
> > > > >> IIUC, to achieve that in 1.10, the following are currently the core
> > > > >> issues/blockers we should figure out and solve as our **highest
> > > > >> priority**:
> > > >>
> > > > >> - what's the syntax to distinguish function language (java, scala,
> > > > >> python, etc)? we only need to implement the java one in 1.10 but have to
> > > > >> settle on the long-term solution
> > > > >> - how to persist function language in backend catalog? as a k-v pair in
> > > > >> properties map, or a dedicated field?
> > > > >> - do we really need to allow users to set a properties map for udf? what
> > > > >> needs to be stored there? what are they used for?
> > > > >> - should a catalog impl be able to decide whether it can take a properties
> > > > >> map (if we decide to have one), and which language of a udf it can persist?
> > > > >>   - E.g. Hive metastore, which backs Flink's HiveCatalog, cannot take a
> > > > >> properties map and is only able to persist java udf [1], unless we do
> > > > >> something hacky to it
> > > >>
> > > > >> I feel these questions are essential to Flink functions in the long run,
> > > > >> but most importantly, are also the minimum scope for Flink 1.10. Aspects
> > > > >> like resource loading security or compatibility with Hive syntax are
> > > > >> important too, however if we focus on them now, we may not be able to get
> > > > >> the MVP out in time.
> > > >>
> > > >> [1]
> > > >> -
> > > >>
> > > >>
> > >
> >
> https://hive.apache.org/javadocs/r3.1.2/api/org/apache/hadoop/hive/metastore/api/Function.html
> > > >> -
> > > >>
> > > >>
> > >
> >
> https://hive.apache.org/javadocs/r3.1.2/api/org/apache/hadoop/hive/metastore/api/FunctionType.html
> > > >>
> > > >>
> > > >>
> > > > >> On Sun, Oct 27, 2019 at 8:22 PM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
> > > >>
> > > >>> Hi Timo,
> > > >>>
> > > > >>> Thanks for the feedback. I replied and adjusted the design accordingly.
> > > > >>> Regarding the class loading concern, I think we need to distinguish
> > > > >>> the function class loading for Temporary and Permanent functions.
> > > >>>
> > > > >>> 1) For Permanent functions, we can add them to the job graph so that we
> > > > >>> don't need to load them multiple times for the different sessions.
> > > > >>> 2) For Temporary functions, we can register each function with a session
> > > > >>> key, and use different class loaders in the RuntimeContext implementation.
> > > >>>
> > > >>> I added more description in the doc. Please review it again.
> > > >>>
> > > >>>
> > > >>> Best Regards
> > > >>> Peter Huang
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > > >>> On Thu, Oct 24, 2019 at 2:14 AM Timo Walther <twal...@apache.org> wrote:
> > > >>>
> > > >>>> Hi Peter,
> > > >>>>
> > > > >>>> thanks for your proposal. I left some comments in the FLIP document. I
> > > > >>>> agree with Terry that we can have an MVP in Flink 1.10 but should already
> > > > >>>> discuss the bigger picture as a DDL string cannot be changed easily once
> > > > >>>> released.
> > > >>>>
> > > > >>>> In particular we should discuss how resources for a function are loaded.
> > > > >>>> If they are simply added to the JobGraph they are available to all
> > > > >>>> functions and could potentially interfere with each other, right?
> > > >>>>
> > > >>>> Thanks,
> > > >>>> Timo
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> On 24.10.19 05:32, Terry Wang wrote:
> > > >>>>> Hi Peter,
> > > >>>>>
> > > > >>>>> Sorry for the late reply. Thanks for your efforts on this; I just
> > > > >>>>> looked through your design.
> > > > >>>>> I left some comments in the doc about the alter function section and
> > > > >>>>> the function catalog interface.
> > > > >>>>> IMO, the overall design is OK and we can discuss some details further.
> > > > >>>>> I also think it’s necessary to limit this awesome feature to the basic
> > > > >>>>> function (of course better to have all :) ) in the 1.10 release.
> > > >>>>>
> > > >>>>> Best,
> > > >>>>> Terry Wang
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > > >>>>>> On Oct 16, 2019, at 14:19, Peter Huang <huangzhenqiu0...@gmail.com> wrote:
> > > >>>>>>
> > > >>>>>> Hi Xuefu,
> > > >>>>>>
> > > > >>>>>> Thank you for the feedback. I think you are pointing out a concern
> > > > >>>>>> similar to Bowen's. Let me describe how the catalog function and
> > > > >>>>>> function factory will be changed in the implementation section.
> > > > >>>>>> Then, we can have a more detailed discussion.
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> Best Regards
> > > >>>>>> Peter Huang
> > > >>>>>>
> > > > >>>>>> On Tue, Oct 15, 2019 at 4:18 PM Xuefu Z <usxu...@gmail.com> wrote:
> > > >>>>>>
> > > >>>>>>> Thanks to Peter for the proposal!
> > > >>>>>>>
> > > > >>>>>>> I left some comments in the google doc. Besides what Bowen pointed
> > > > >>>>>>> out, I'm unclear about how things work end to end from the document.
> > > > >>>>>>> For instance, SQL DDL-like function definition is mentioned. I guess
> > > > >>>>>>> just having a DDL for it doesn't explain how it's supported
> > > > >>>>>>> functionally. I think it's better to have some clarification on what
> > > > >>>>>>> is expected work and what's for the future.
> > > >>>>>>>
> > > >>>>>>> Thanks,
> > > >>>>>>> Xuefu
> > > >>>>>>>
> > > >>>>>>>
> > > > >>>>>>> On Tue, Oct 15, 2019 at 11:05 AM Bowen Li <bowenl...@gmail.com> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi Zhenqiu,
> > > >>>>>>>>
> > > >>>>>>>> Thanks for taking on this effort!
> > > >>>>>>>>
> > > > >>>>>>>> A couple of questions:
> > > > >>>>>>>> - Though this FLIP is about function DDL, can we also think about
> > > > >>>>>>>> how the created functions can be mapped to CatalogFunction and see
> > > > >>>>>>>> if we need to modify the CatalogFunction interface? Syntax changes
> > > > >>>>>>>> need to be backed by the backend.
> > > > >>>>>>>> - Can we define a clearer, smaller scope targeting Flink 1.10 among
> > > > >>>>>>>> all the proposed changes? The current overall scope seems to be
> > > > >>>>>>>> quite wide, and it may be unrealistic to get everything in a single
> > > > >>>>>>>> release, or even a couple. However, I believe the most common user
> > > > >>>>>>>> story can be something as simple as "being able to create and
> > > > >>>>>>>> persist a java class-based udf and use it later in queries", which
> > > > >>>>>>>> will add great value for most Flink users and is achievable in 1.10.
> > > >>>>>>>>
> > > >>>>>>>> Bowen
> > > >>>>>>>>
> > > > >>>>>>>> On Sun, Oct 13, 2019 at 10:46 PM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Dear Community,
> > > >>>>>>>>>
> > > > >>>>>>>>> FLIP-79 Flink Function DDL Support
> > > > >>>>>>>>> <https://docs.google.com/document/d/16kkHlis80s61ifnIahCj-0IEdy5NJ1z-vGEJd_JuLog/edit#>
> > > >>>>>>>>>
> > > > >>>>>>>>> This proposal aims to support function DDL with the consideration
> > > > >>>>>>>>> of SQL syntax, language compliance, and advanced external UDF lib
> > > > >>>>>>>>> registration.
> > > > >>>>>>>>> The Flink DDL was initiated and discussed in the design
> > > > >>>>>>>>> <https://docs.google.com/document/d/1TTP-GCC8wSsibJaSUyFZ_5NBAHYEB1FVmPpP7RgDGBA/edit#heading=h.wpsqidkaaoil>
> > > > >>>>>>>>> [1] by Shuyi Chen and Timo. As the initial discussion mainly focused
> > > > >>>>>>>>> on tables, types, and views, FLIP-69 [2] extends it with a more
> > > > >>>>>>>>> detailed discussion of DDL for catalog, database, and function.
> > > > >>>>>>>>> Originally, the function DDL was under the scope of FLIP-69. After
> > > > >>>>>>>>> some discussion <https://issues.apache.org/jira/browse/FLINK-7151>
> > > > >>>>>>>>> with the community, we found that there are several ongoing efforts,
> > > > >>>>>>>>> such as FLIP-64 [3], FLIP-65 [4], and FLIP-78 [5]. As they will
> > > > >>>>>>>>> directly impact the SQL syntax of function DDL, the proposal wants
> > > > >>>>>>>>> to describe the problem clearly with the consideration of existing
> > > > >>>>>>>>> works and make sure the design aligns with efforts of API change of
> > > > >>>>>>>>> temporary objects and type inference for UDF defined by different
> > > > >>>>>>>>> languages.
> > > >>>>>>>>>
> > > > >>>>>>>>> The FLIP outlines the requirements from related works and proposes
> > > > >>>>>>>>> a SQL syntax to meet those requirements. The corresponding
> > > > >>>>>>>>> implementation is also discussed. Please kindly review and give
> > > > >>>>>>>>> feedback.
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> Best Regards
> > > >>>>>>>>> Peter Huang
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> --
> > > >>>>>>> Xuefu Zhang
> > > >>>>>>>
> > > >>>>>>> "In Honey We Trust!"
> > > >>>>>>>
> > > >>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> > >
> >
>
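
For readers following the thread, here is a rough sketch of two points
discussed above: keeping the function language as a dedicated enum field
rather than a free-form string, and reading it from a LANGUAGE clause so
that no quoted content needs extra parsing. Everything named here is
hypothetical and for illustration only: FunctionLanguage,
CatalogFunctionSketch, parse_create_function, and the simplified grammar
are not taken from the FLIP-79 doc or from Flink, and the keyword
spelling (JAVA vs. JVM) is an assumption.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class FunctionLanguage(Enum):
    """Hypothetical enum; the thread suggests JAVA and PYTHON as the only values for now."""
    JAVA = "JAVA"
    PYTHON = "PYTHON"


@dataclass(frozen=True)
class CatalogFunctionSketch:
    """Hypothetical stand-in for a catalog function with a dedicated language field."""
    class_name: str
    language: FunctionLanguage = FunctionLanguage.JAVA


# Assumed, simplified grammar: CREATE FUNCTION <name> AS '<class>' [LANGUAGE <lang>]
_DDL = re.compile(
    r"^\s*CREATE\s+FUNCTION\s+(?P<name>\w+)\s+AS\s+'(?P<cls>[^']+)'"
    r"(?:\s+LANGUAGE\s+(?P<lang>\w+))?\s*;?\s*$",
    re.IGNORECASE,
)


def parse_create_function(ddl: str) -> Tuple[str, CatalogFunctionSketch]:
    """Return (function name, catalog function) for the simplified grammar above."""
    match = _DDL.match(ddl)
    if match is None:
        raise ValueError(f"unsupported statement: {ddl!r}")
    lang = match.group("lang")
    if lang is None:
        # No LANGUAGE clause: default to JAVA (an assumption of this sketch).
        language = FunctionLanguage.JAVA
    else:
        try:
            language = FunctionLanguage[lang.upper()]
        except KeyError:
            raise ValueError(f"unsupported language: {lang}") from None
    return match.group("name"), CatalogFunctionSketch(match.group("cls"), language)
```

With an enum, an unknown language is rejected when the statement is
parsed, and a catalog backend can decide case by case how to persist the
value.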