Thanks for the clarification Timo, that's sounds good to me. Let's continue
to discuss other things.

Best,
Kurt


On Thu, Oct 10, 2019 at 4:14 PM Timo Walther <twal...@apache.org> wrote:

> The purpose of ConnectTableDescriptor was to have a programmatic way of
> expressing `CREATE TABLE` with JavaDocs and IDE support. But I agree
> that it needs some rework in the future, once we have finalized the DDL
> to ensure that both concepts are in sync.
>
> Regards,
> Timo
>
>
> On 10.10.19 16:08, Kurt Young wrote:
> > Regarding to ConnectTableDescriptor, if in the end it becomes a builder
> > of CatalogTable, I would be ok with that. But it doesn't look like a
> builder
> > now, it's more like another form of TableSource/TableSink.
> >
> > Best,
> > Kurt
> >
> >
> > On Thu, Oct 10, 2019 at 3:44 PM Timo Walther <twal...@apache.org> wrote:
> >
> >> Hi everyone,
> >>
> >> thanks for the great discussion with a good outcome.
> >>
> >> I have one last comment about:
> >>
> >> void createTemporaryFunction(String path, UserDefinedFunction function);
> >>
> >> We should encourage users to register a class instead of an instance. We
> >> should enforce a default constructor in the future. If we need metadata
> >> or type inference, the planner can simply instantiate the class and call
> >> the corresponding getters. If people would like to parameterize a
> >> function, they can use global job parameters instead - which are
> >> available in the open() method.
> >>
> >> I suggest:
> >>
> >> void createTemporaryFunction(String path, Class<? extends
> >> UserDefinedFunction> functionClass);
> >>
> >> This is also closer to the SQL DDL that is about to be implemented in:
> >> https://issues.apache.org/jira/browse/FLINK-7151
> >>
> >> Thanks,
> >> Timo
> >>
> >>
> >> On 09.10.19 17:07, Bowen Li wrote:
> >>> Hi Dawid,
> >>>
> >>> +1 for proposed changes
> >>>
> >>> On Wed, Oct 9, 2019 at 12:15 PM Dawid Wysakowicz <
> dwysakow...@apache.org
> >>>
> >>> wrote:
> >>>
> >>>> Sorry for a very delayed response.
> >>>>
> >>>> @Kurt Yes, this is the goal to have a function created like new
> >>>> Function(...) also be wrapped into CatalogFunction. This would have to
> >>>> be though a temporary function as we cannot represent that as a set of
> >>>> properties. Similar to the createTemporaryView(DataStream stream).
> >>>>
> >>>> As for the ConnectTableDescriptor I agree this is very similar to
> >>>> CatalogTable. I am not sure though if we should get rid of it. In the
> >>>> end I see it as a builder for a CatalogTable, which is a slightly more
> >>>> internal API, but we might revisit that some time in the future if we
> >>>> find that it makes more sense.
> >>>>
> >>>> @All I updated the FLIP page with some more details from the outcome
> of
> >>>> the discussions around FLIP-57. Please take a look. I would like to
> >>>> start a vote on this FLIP as soon as the vote on FLIP-57 goes through.
> >>>>
> >>>> Best,
> >>>>
> >>>> Dawid
> >>>>
> >>>>
> >>>> On 19/09/2019 09:24, Kurt Young wrote:
> >>>>> IIUC it's good to see that both serializable (tables description from
> >>>> DDL)
> >>>>> and unserializable (tables with DataStream underneath) tables are
> >> treated
> >>>>> unify with CatalogTable.
> >>>>>
> >>>>> Can I also assume functions that either come from a function class
> >> (from
> >>>>> DDL)
> >>>>> or function objects (newed by user) will also treated unify with
> >>>>> CatalogFunction?
> >>>>>
> >>>>> This will greatly simplify and unify current API level concepts and
> >>>> design.
> >>>>> And it seems only one thing left, how do we deal with
> >>>>> ConnectTableDescriptor?
> >>>>> It's actually very similar with serializable CatalogTable, both carry
> >>>> some
> >>>>> text
> >>>>> properties which even are the same. Is there any chance we can
> further
> >>>> unify
> >>>>> this to CatalogTable?
> >>>>>
> >>>>> object
> >>>>> Best,
> >>>>> Kurt
> >>>>>
> >>>>>
> >>>>> On Thu, Sep 19, 2019 at 3:13 PM Jark Wu <imj...@gmail.com> wrote:
> >>>>>
> >>>>>> Thanks Dawid for the design doc.
> >>>>>>
> >>>>>> In general, I’m +1 to the FLIP.
> >>>>>>
> >>>>>>
> >>>>>> +1 to the single-string and parse way to express object path.
> >>>>>>
> >>>>>> +1 to deprecate registerTableSink & registerTableSource.
> >>>>>> But I would suggest to provide an easy way to register a custom
> >>>>>> source/sink before we drop them (this is another story).
> >>>>>> Currently, it’s not easy to implement a custom connector descriptor.
> >>>>>>
> >>>>>> Best,
> >>>>>> Jark
> >>>>>>
> >>>>>>
> >>>>>>> 在 2019年9月19日,11:37,Dawid Wysakowicz <wysakowicz.da...@gmail.com>
> 写道:
> >>>>>>>
> >>>>>>> Hi JingsongLee,
> >>>>>>>   From my understanding they can. Underneath they will be
> >> CatalogTables.
> >>>>>> The
> >>>>>>> difference is the lifetime of the tables. Plus some of the user
> >> facing
> >>>>>>> interfaces cannot be persisted e.g. datastream. Therefore we must
> >> have
> >>>> a
> >>>>>>> separate methods for that. In the end the temporary tables are held
> >> in
> >>>>>>> memory as CatalogTables.
> >>>>>>> Best,
> >>>>>>> Dawid
> >>>>>>>
> >>>>>>> On Thu, 19 Sep 2019, 10:08 JingsongLee, <lzljs3620...@aliyun.com
> >>>>>> .invalid>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi dawid:
> >>>>>>>> Can temporary tables achieve the same capabilities as catalog
> table?
> >>>>>>>> like statistics: CatalogTableStatistics, CatalogColumnStatistics,
> >>>>>>>> PartitionStatistics
> >>>>>>>> like partition support: we have added some catalog equivalent
> >>>> interfaces
> >>>>>>>> on TableSource/TableSink: getPartitions, getPartitionFieldNames
> >>>>>>>> Maybe it's not a good idea to add these interfaces to
> >>>>>>>> TableSource/TableSink. What do you think?
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Jingsong Lee
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> ------------------------------------------------------------------
> >>>>>>>> From:Kurt Young <ykt...@gmail.com>
> >>>>>>>> Send Time:2019年9月18日(星期三) 17:54
> >>>>>>>> To:dev <dev@flink.apache.org>
> >>>>>>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in
> >> Table
> >>>>>>>> module
> >>>>>>>>
> >>>>>>>> Hi all,
> >>>>>>>>
> >>>>>>>> Sorry to join this party late. Big +1 to this flip, especially for
> >> the
> >>>>>>>> dropping
> >>>>>>>> "registerTableSink & registerTableSource" part. These are indeed
> >>>> legacy
> >>>>>>>> and we should try to unify them through CatalogTable after we
> >>>> introduce
> >>>>>>>> the concept of Catalog.
> >>>>>>>>
> >>>>>>>>   From my understanding, what we can registered should all be
> >> metadata,
> >>>>>>>> TableSource/TableSink should only be the one who is responsible to
> >> do
> >>>>>>>> the real work, i.e. reading and writing data according to the
> schema
> >>>> and
> >>>>>>>> other information like computed column, partition, .e.g.
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Kurt
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, Sep 18, 2019 at 5:14 PM JingsongLee <
> >> lzljs3620...@aliyun.com
> >>>>>>>> .invalid>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> After some development and thinking, I have a general
> >> understanding.
> >>>>>>>>> +1 to registering a source/sink does not fit into the SQL world.
> >>>>>>>>> I am OK to have a deprecated registerTemporarySource/Sink to
> >>>> compatible
> >>>>>>>>> with old ways.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Jingsong Lee
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> ------------------------------------------------------------------
> >>>>>>>>> From:Timo Walther <twal...@apache.org>
> >>>>>>>>> Send Time:2019年9月17日(星期二) 08:00
> >>>>>>>>> To:dev <dev@flink.apache.org>
> >>>>>>>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in
> >> Table
> >>>>>>>>> module
> >>>>>>>>>
> >>>>>>>>> Hi Dawid,
> >>>>>>>>>
> >>>>>>>>> thanks for the design document. It fixes big concept gaps due to
> >>>>>>>>> historical reasons with proper support for serializability and
> >>>> catalog
> >>>>>>>>> support in mind.
> >>>>>>>>>
> >>>>>>>>> I would not mind a registerTemporarySource/Sink, but the problem
> >>>> that I
> >>>>>>>>> see is that many people think that this is the recommended way of
> >>>>>>>>> registering a table source/sink which is not true. We should
> guide
> >>>>>> users
> >>>>>>>>> to either use connect() or DDL API which can be validated and
> >> stored
> >>>> in
> >>>>>>>>> catalog.
> >>>>>>>>>
> >>>>>>>>> Also from a concept perspective, registering a source/sink does
> not
> >>>> fit
> >>>>>>>>> into the SQL world. SQL does not know about source/sinks but only
> >>>> about
> >>>>>>>>> tables. If the responsibility of a TableSource/TableSink is just
> a
> >>>> pure
> >>>>>>>>> physical data consumer/producer that is not connected to the
> actual
> >>>>>>>>> logical table schema, we would need a possibility of defining
> time
> >>>>>>>>> attributes and interpreting/converting a changelog. This should
> be
> >>>> done
> >>>>>>>>> by the framework with information from the DDL/connect() and not
> be
> >>>>>>>>> defined in every table source.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Timo
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On 09.09.19 14:16, JingsongLee wrote:
> >>>>>>>>>> Hi dawid:
> >>>>>>>>>>
> >>>>>>>>>> It is difficult to describe specific examples.
> >>>>>>>>>> Sometimes users will generate some java converters through some
> >>>>>>>>>>    Java code, or generate some Java classes through third-party
> >>>>>>>>>>    libraries. Of course, these can be best done through
> properties.
> >>>>>>>>>> But this requires additional work from users.My suggestion is to
> >>>>>>>>>>    keep this Java instance class way that is user-friendly.
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Jingsong Lee
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> ------------------------------------------------------------------
> >>>>>>>>>> From:Dawid Wysakowicz <dwysakow...@apache.org>
> >>>>>>>>>> Send Time:2019年9月6日(星期五) 16:21
> >>>>>>>>>> To:dev <dev@flink.apache.org>
> >>>>>>>>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in
> >>>> Table
> >>>>>>>>> module
> >>>>>>>>>> Hi all,
> >>>>>>>>>> @Jingsong Could you elaborate a bit more what do you mean by
> >>>>>>>>>> "some Connectors are difficult to convert all states to
> >> properties"
> >>>>>>>>>> All the Flink provided connectors will definitely be expressible
> >>>> with
> >>>>>>>>> properties (In the end you should be able to use them from DDL).
> I
> >>>>>> think
> >>>>>>>> if
> >>>>>>>>> a TableSource is complex enough that it handles filter push down,
> >>>>>>>> partition
> >>>>>>>>> support etc. should rather be made available both from DDL &
> >>>> java/scala
> >>>>>>>>> code. I'm happy to reconsider adding
> registerTemporaryTable(String
> >>>>>> path,
> >>>>>>>>> TableSource source) if you have some concrete examples in mind.
> >>>>>>>>>> @Xuefu: We also considered the ObjectIdentifier (or actually
> >>>>>>>> introducing
> >>>>>>>>> a new identifier representation to differentiate between resolved
> >> and
> >>>>>>>>> unresolved identifiers) with the same concerns. We decided to
> >> suggest
> >>>>>> the
> >>>>>>>>> string & parsing logic because of usability.
> >>>>>>>>>>       tEnv.from("cat.db.table")
> >>>>>>>>>> is shorter and easier to write than
> >>>>>>>>>>       tEnv.from(Identifier.for("cat", "db", "name")
> >>>>>>>>>> And also implicitly solves the problem what happens if a user
> >> (e.g.
> >>>>>>>> used
> >>>>>>>>> to other systems) uses that API in a following manner:
> >>>>>>>>>>       tEnv.from(Identifier.for("db.name")
> >>>>>>>>>> I'm happy to revisit it if the general consensus is that it's
> >> better
> >>>>>> to
> >>>>>>>>> use the OO aproach.
> >>>>>>>>>> Best,
> >>>>>>>>>> Dawid
> >>>>>>>>>>
> >>>>>>>>>> On 06/09/2019 10:00, Xuefu Z wrote:
> >>>>>>>>>>
> >>>>>>>>>> Thanks to Dawid for starting the discussion and writeup. It
> looks
> >>>>>>>> pretty
> >>>>>>>>>> good to me except that I'm a little concerned about the object
> >>>>>>>> reference
> >>>>>>>>>> and string parsing in the code, which seems to an anti-pattern
> to
> >>>> OOP.
> >>>>>>>>> Have
> >>>>>>>>>> we considered using ObjectIdenitifier with optional catalog and
> db
> >>>>>>>> parts,
> >>>>>>>>>> esp. if we are worried about arguments of variable length or
> >> method
> >>>>>>>>>> overloading? It's quite likely that the result of string parsing
> >> is
> >>>> an
> >>>>>>>>>> ObjectIdentifier instance any way.
> >>>>>>>>>>
> >>>>>>>>>> Having string parsing logic in the code is a little dangerous as
> >> it
> >>>>>>>>>> duplicates part of the DDL/DML parsing, and they can easily get
> >> out
> >>>> of
> >>>>>>>>> sync.
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Xuefu
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Sep 6, 2019 at 1:57 PM JingsongLee <
> >> lzljs3620...@aliyun.com
> >>>>>>>>> .invalid>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Thanks dawid, +1 for this approach.
> >>>>>>>>>>
> >>>>>>>>>> One concern is the removal of registerTableSink &
> >>>> registerTableSource
> >>>>>>>>>>    in TableEnvironment. It has two alternatives:
> >>>>>>>>>> 1.the properties approach (DDL, descriptor).
> >>>>>>>>>> 2.from/toDataStream.
> >>>>>>>>>>
> >>>>>>>>>> #1 can only be properties, not java states, and some Connectors
> >>>>>>>>>>    are difficult to convert all states to properties.
> >>>>>>>>>> #2 can contain java state. But can't use TableSource-related
> >>>> features,
> >>>>>>>>>> like project & filter push down, partition support, etc..
> >>>>>>>>>>
> >>>>>>>>>> Any idea about this?
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Jingsong Lee
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> ------------------------------------------------------------------
> >>>>>>>>>> From:Dawid Wysakowicz <dwysakow...@apache.org>
> >>>>>>>>>> Send Time:2019年9月4日(星期三) 22:20
> >>>>>>>>>> To:dev <dev@flink.apache.org>
> >>>>>>>>>> Subject:[DISCUSS] FLIP-64: Support for Temporary Objects in
> Table
> >>>>>>>> module
> >>>>>>>>>> Hi all,
> >>>>>>>>>> As part of FLIP-30 a Catalog API was introduced that enables
> >> storing
> >>>>>>>>> table
> >>>>>>>>>> meta objects permanently. At the same time the majority of
> current
> >>>>>> APIs
> >>>>>>>>>> create temporary objects that cannot be serialized. We should
> >>>> clarify
> >>>>>>>> the
> >>>>>>>>>> creation of meta objects (tables, views, functions) in a unified
> >>>> way.
> >>>>>>>>>> Another current problem in the API is that all the temporary
> >> objects
> >>>>>>>> are
> >>>>>>>>>> stored in a special built-in catalog, which is not very
> intuitive
> >>>> for
> >>>>>>>>> many
> >>>>>>>>>> users, as they must be aware of that catalog to reference
> >> temporary
> >>>>>>>>> objects.
> >>>>>>>>>> Lastly, different APIs have different ways of providing object
> >>>> paths:
> >>>>>>>>>> String path…,
> >>>>>>>>>> String path, String pathContinued…
> >>>>>>>>>> String name
> >>>>>>>>>> We should choose one approach and unify it across all APIs.
> >>>>>>>>>> I suggest a FLIP to address the above issues.
> >>>>>>>>>> Looking forward to your opinions.
> >>>>>>>>>> FLIP link:
> >>>>>>>>>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module
> >>
>
>

Reply via email to