Thanks Dawid for the design doc. In general, I’m +1 to the FLIP.
+1 to the single-string-and-parse way of expressing object paths.
+1 to deprecating registerTableSink & registerTableSource. But I would suggest providing an easy way to register a custom source/sink before we drop them (this is another story). Currently, it's not easy to implement a custom connector descriptor.

Best,
Jark

> On Sep 19, 2019, at 11:37, Dawid Wysakowicz <wysakowicz.da...@gmail.com> wrote:
>
> Hi JingsongLee,
> From my understanding they can. Underneath they will be CatalogTables. The
> difference is the lifetime of the tables. Plus some of the user-facing
> interfaces cannot be persisted, e.g. datastream. Therefore we must have
> separate methods for that. In the end the temporary tables are held in
> memory as CatalogTables.
> Best,
> Dawid
>
> On Thu, 19 Sep 2019, 10:08 JingsongLee, <lzljs3620...@aliyun.com.invalid>
> wrote:
>
>> Hi dawid:
>> Can temporary tables achieve the same capabilities as catalog tables?
>> like statistics: CatalogTableStatistics, CatalogColumnStatistics,
>> PartitionStatistics
>> like partition support: we have added some catalog equivalent interfaces
>> on TableSource/TableSink: getPartitions, getPartitionFieldNames
>> Maybe it's not a good idea to add these interfaces to
>> TableSource/TableSink. What do you think?
>>
>> Best,
>> Jingsong Lee
>>
>>
>> ------------------------------------------------------------------
>> From:Kurt Young <ykt...@gmail.com>
>> Send Time: Sep 18, 2019 (Wednesday) 17:54
>> To:dev <dev@flink.apache.org>
>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
>> module
>>
>> Hi all,
>>
>> Sorry to join this party late. Big +1 to this flip, especially for the
>> dropping "registerTableSink & registerTableSource" part. These are indeed
>> legacy and we should try to unify them through CatalogTable after we
>> introduce the concept of Catalog.
>>
>> From my understanding, what we can register should all be metadata;
>> TableSource/TableSink should only be the one who is responsible to do
>> the real work, i.e. reading and writing data according to the schema and
>> other information like computed column, partition, etc.
>>
>> Best,
>> Kurt
>>
>>
>> On Wed, Sep 18, 2019 at 5:14 PM JingsongLee <lzljs3620...@aliyun.com.invalid>
>> wrote:
>>
>>> After some development and thinking, I have a general understanding.
>>> +1 to registering a source/sink does not fit into the SQL world.
>>> I am OK to have a deprecated registerTemporarySource/Sink to be compatible
>>> with old ways.
>>>
>>> Best,
>>> Jingsong Lee
>>>
>>>
>>> ------------------------------------------------------------------
>>> From:Timo Walther <twal...@apache.org>
>>> Send Time: Sep 17, 2019 (Tuesday) 08:00
>>> To:dev <dev@flink.apache.org>
>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
>>> module
>>>
>>> Hi Dawid,
>>>
>>> thanks for the design document. It fixes big concept gaps due to
>>> historical reasons with proper support for serializability and catalog
>>> support in mind.
>>>
>>> I would not mind a registerTemporarySource/Sink, but the problem that I
>>> see is that many people think that this is the recommended way of
>>> registering a table source/sink, which is not true. We should guide users
>>> to either use connect() or DDL API which can be validated and stored in
>>> catalog.
>>>
>>> Also from a concept perspective, registering a source/sink does not fit
>>> into the SQL world. SQL does not know about source/sinks but only about
>>> tables. If the responsibility of a TableSource/TableSink is just a pure
>>> physical data consumer/producer that is not connected to the actual
>>> logical table schema, we would need a possibility of defining time
>>> attributes and interpreting/converting a changelog.
>>> This should be done by the framework with information from the
>>> DDL/connect() and not be defined in every table source.
>>>
>>> Regards,
>>> Timo
>>>
>>>
>>> On 09.09.19 14:16, JingsongLee wrote:
>>>> Hi dawid:
>>>>
>>>> It is difficult to describe specific examples.
>>>> Sometimes users will generate some java converters through some
>>>> Java code, or generate some Java classes through third-party
>>>> libraries. Of course, these can be best done through properties.
>>>> But this requires additional work from users. My suggestion is to
>>>> keep this Java instance class way that is user-friendly.
>>>>
>>>> Best,
>>>> Jingsong Lee
>>>>
>>>>
>>>> ------------------------------------------------------------------
>>>> From:Dawid Wysakowicz <dwysakow...@apache.org>
>>>> Send Time: Sep 6, 2019 (Friday) 16:21
>>>> To:dev <dev@flink.apache.org>
>>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
>>>> module
>>>>
>>>> Hi all,
>>>> @Jingsong Could you elaborate a bit more on what you mean by
>>>> "some Connectors are difficult to convert all states to properties"?
>>>> All the Flink provided connectors will definitely be expressible with
>>>> properties (in the end you should be able to use them from DDL). I think
>>>> if a TableSource is complex enough that it handles filter push down,
>>>> partition support etc., it should rather be made available both from DDL
>>>> & java/scala code. I'm happy to reconsider adding
>>>> registerTemporaryTable(String path, TableSource source) if you have some
>>>> concrete examples in mind.
>>>>
>>>>
>>>> @Xuefu: We also considered the ObjectIdentifier (or actually introducing
>>>> a new identifier representation to differentiate between resolved and
>>>> unresolved identifiers) with the same concerns. We decided to suggest the
>>>> string & parsing logic because of usability.
>>>> tEnv.from("cat.db.table")
>>>> is shorter and easier to write than
>>>> tEnv.from(Identifier.for("cat", "db", "name"))
>>>> And it also implicitly solves the problem of what happens if a user (e.g.
>>>> used to other systems) uses that API in the following manner:
>>>> tEnv.from(Identifier.for("db.name"))
>>>> I'm happy to revisit it if the general consensus is that it's better to
>>>> use the OO approach.
>>>> Best,
>>>> Dawid
>>>>
>>>> On 06/09/2019 10:00, Xuefu Z wrote:
>>>>
>>>> Thanks to Dawid for starting the discussion and writeup. It looks pretty
>>>> good to me except that I'm a little concerned about the object reference
>>>> and string parsing in the code, which seems to be an anti-pattern to OOP.
>>>> Have we considered using ObjectIdentifier with optional catalog and db
>>>> parts, esp. if we are worried about arguments of variable length or method
>>>> overloading? It's quite likely that the result of string parsing is an
>>>> ObjectIdentifier instance anyway.
>>>>
>>>> Having string parsing logic in the code is a little dangerous as it
>>>> duplicates part of the DDL/DML parsing, and they can easily get out of
>>>> sync.
>>>>
>>>> Thanks,
>>>> Xuefu
>>>>
>>>> On Fri, Sep 6, 2019 at 1:57 PM JingsongLee <lzljs3620...@aliyun.com.invalid>
>>>> wrote:
>>>>
>>>>
>>>> Thanks dawid, +1 for this approach.
>>>>
>>>> One concern is the removal of registerTableSink & registerTableSource
>>>> in TableEnvironment. It has two alternatives:
>>>> 1. the properties approach (DDL, descriptor).
>>>> 2. from/toDataStream.
>>>>
>>>> #1 can only be properties, not java states, and some Connectors
>>>> are difficult to convert all states to properties.
>>>> #2 can contain java state. But it can't use TableSource-related features,
>>>> like project & filter push down, partition support, etc.
>>>>
>>>> Any idea about this?
>>>>
>>>> Best,
>>>> Jingsong Lee
>>>>
>>>>
>>>> ------------------------------------------------------------------
>>>> From:Dawid Wysakowicz <dwysakow...@apache.org>
>>>> Send Time: Sep 4, 2019 (Wednesday) 22:20
>>>> To:dev <dev@flink.apache.org>
>>>> Subject:[DISCUSS] FLIP-64: Support for Temporary Objects in Table module
>>>>
>>>> Hi all,
>>>> As part of FLIP-30 a Catalog API was introduced that enables storing table
>>>> meta objects permanently. At the same time the majority of current APIs
>>>> create temporary objects that cannot be serialized. We should clarify the
>>>> creation of meta objects (tables, views, functions) in a unified way.
>>>> Another current problem in the API is that all the temporary objects are
>>>> stored in a special built-in catalog, which is not very intuitive for many
>>>> users, as they must be aware of that catalog to reference temporary
>>>> objects.
>>>> Lastly, different APIs have different ways of providing object paths:
>>>>
>>>> String path…,
>>>> String path, String pathContinued…
>>>> String name
>>>> We should choose one approach and unify it across all APIs.
>>>> I suggest a FLIP to address the above issues.
>>>> Looking forward to your opinions.
>>>> FLIP link:
>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module
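
For readers following the single-string path discussion above (tEnv.from("cat.db.table") versus an explicit identifier object), here is a minimal sketch of how a dotted path might be resolved against a current catalog and database. It is only an illustration, not the FLIP's or Flink's implementation: the class ObjectPathSketch, the type ResolvedPath and the method resolve are hypothetical names, and the sketch assumes plain dot-separated parts with no quoting or escaping, which a real parser would have to handle.

```java
public final class ObjectPathSketch {

    /** Hypothetical resolved identifier; NOT Flink's ObjectIdentifier class. */
    public static final class ResolvedPath {
        public final String catalog;
        public final String database;
        public final String name;

        ResolvedPath(String catalog, String database, String name) {
            this.catalog = catalog;
            this.database = database;
            this.name = name;
        }

        @Override
        public String toString() {
            return catalog + "." + database + "." + name;
        }
    }

    /**
     * Resolves "name", "db.name" or "cat.db.name" into a full three-part path.
     * Missing leading parts are filled in from the current catalog/database.
     * Assumption: parts are separated by '.' and contain no quoting.
     */
    public static ResolvedPath resolve(String path, String currentCatalog, String currentDatabase) {
        // The -1 limit keeps trailing empty segments so "a.b." is rejected below.
        String[] parts = path.split("\\.", -1);
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Empty path segment in: " + path);
            }
        }
        switch (parts.length) {
            case 1:
                return new ResolvedPath(currentCatalog, currentDatabase, parts[0]);
            case 2:
                return new ResolvedPath(currentCatalog, parts[0], parts[1]);
            case 3:
                return new ResolvedPath(parts[0], parts[1], parts[2]);
            default:
                throw new IllegalArgumentException("Expected at most 3 path segments: " + path);
        }
    }

    public static void main(String[] args) {
        // prints "cat.db.table"
        System.out.println(resolve("cat.db.table", "default_catalog", "default_database"));
        // prints "default_catalog.db.table"
        System.out.println(resolve("db.table", "default_catalog", "default_database"));
        // prints "default_catalog.default_database.table"
        System.out.println(resolve("table", "default_catalog", "default_database"));
    }
}
```

With this shape, a call such as resolve("db.name", ...) is read as database plus object name rather than as a single object called "db.name", which is the ambiguity Dawid raises for Identifier.for("db.name"). Xuefu's concern still applies: this kind of parsing would have to stay in sync with whatever the DDL/DML parser accepts.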