Thanks Dawid for the design doc. 

In general, I’m +1 to the FLIP.


+1 to the single-string and parse way to express object path. 

+1 to deprecate registerTableSink & registerTableSource. 
But I would suggest to provide an easy way to register a custom source/sink 
before we drop them (this is another story). 
Currently, it’s not easy to implement a custom connector descriptor.

Best,
Jark


> 在 2019年9月19日,11:37,Dawid Wysakowicz <wysakowicz.da...@gmail.com> 写道:
> 
> Hi JingsongLee,
> From my understanding they can. Underneath they will be CatalogTables. The
> difference is the lifetime of the tables. Plus some of the user facing
> interfaces cannot be persisted e.g. datastream. Therefore we must have a
> separate methods for that. In the end the temporary tables are held in
> memory as CatalogTables.
> Best,
> Dawid
> 
> On Thu, 19 Sep 2019, 10:08 JingsongLee, <lzljs3620...@aliyun.com.invalid>
> wrote:
> 
>> Hi dawid:
>> Can temporary tables achieve the same capabilities as catalog table?
>> like statistics: CatalogTableStatistics, CatalogColumnStatistics,
>> PartitionStatistics
>> like partition support: we have added some catalog equivalent interfaces
>> on TableSource/TableSink: getPartitions, getPartitionFieldNames
>> Maybe it's not a good idea to add these interfaces to
>> TableSource/TableSink. What do you think?
>> 
>> Best,
>> Jingsong Lee
>> 
>> 
>> ------------------------------------------------------------------
>> From:Kurt Young <ykt...@gmail.com>
>> Send Time:2019年9月18日(星期三) 17:54
>> To:dev <dev@flink.apache.org>
>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
>> module
>> 
>> Hi all,
>> 
>> Sorry to join this party late. Big +1 to this flip, especially for the
>> dropping
>> "registerTableSink & registerTableSource" part. These are indeed legacy
>> and we should try to unify them through CatalogTable after we introduce
>> the concept of Catalog.
>> 
>> From my understanding, what we can registered should all be metadata,
>> TableSource/TableSink should only be the one who is responsible to do
>> the real work, i.e. reading and writing data according to the schema and
>> other information like computed column, partition, .e.g.
>> 
>> Best,
>> Kurt
>> 
>> 
>> On Wed, Sep 18, 2019 at 5:14 PM JingsongLee <lzljs3620...@aliyun.com
>> .invalid>
>> wrote:
>> 
>>> After some development and thinking, I have a general understanding.
>>> +1 to registering a source/sink does not fit into the SQL world.
>>> I am OK to have a deprecated registerTemporarySource/Sink to compatible
>>> with old ways.
>>> 
>>> Best,
>>> Jingsong Lee
>>> 
>>> 
>>> ------------------------------------------------------------------
>>> From:Timo Walther <twal...@apache.org>
>>> Send Time:2019年9月17日(星期二) 08:00
>>> To:dev <dev@flink.apache.org>
>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
>>> module
>>> 
>>> Hi Dawid,
>>> 
>>> thanks for the design document. It fixes big concept gaps due to
>>> historical reasons with proper support for serializability and catalog
>>> support in mind.
>>> 
>>> I would not mind a registerTemporarySource/Sink, but the problem that I
>>> see is that many people think that this is the recommended way of
>>> registering a table source/sink which is not true. We should guide users
>>> to either use connect() or DDL API which can be validated and stored in
>>> catalog.
>>> 
>>> Also from a concept perspective, registering a source/sink does not fit
>>> into the SQL world. SQL does not know about source/sinks but only about
>>> tables. If the responsibility of a TableSource/TableSink is just a pure
>>> physical data consumer/producer that is not connected to the actual
>>> logical table schema, we would need a possibility of defining time
>>> attributes and interpreting/converting a changelog. This should be done
>>> by the framework with information from the DDL/connect() and not be
>>> defined in every table source.
>>> 
>>> Regards,
>>> Timo
>>> 
>>> 
>>> On 09.09.19 14:16, JingsongLee wrote:
>>>> Hi dawid:
>>>> 
>>>> It is difficult to describe specific examples.
>>>> Sometimes users will generate some java converters through some
>>>>  Java code, or generate some Java classes through third-party
>>>>  libraries. Of course, these can be best done through properties.
>>>> But this requires additional work from users.My suggestion is to
>>>>  keep this Java instance class way that is user-friendly.
>>>> 
>>>> Best,
>>>> Jingsong Lee
>>>> 
>>>> 
>>>> ------------------------------------------------------------------
>>>> From:Dawid Wysakowicz <dwysakow...@apache.org>
>>>> Send Time:2019年9月6日(星期五) 16:21
>>>> To:dev <dev@flink.apache.org>
>>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
>>> module
>>>> 
>>>> Hi all,
>>>> @Jingsong Could you elaborate a bit more what do you mean by
>>>> "some Connectors are difficult to convert all states to properties"
>>>> All the Flink provided connectors will definitely be expressible with
>>> properties (In the end you should be able to use them from DDL). I think
>> if
>>> a TableSource is complex enough that it handles filter push down,
>> partition
>>> support etc. should rather be made available both from DDL & java/scala
>>> code. I'm happy to reconsider adding registerTemporaryTable(String path,
>>> TableSource source) if you have some concrete examples in mind.
>>>> 
>>>> 
>>>> @Xuefu: We also considered the ObjectIdentifier (or actually
>> introducing
>>> a new identifier representation to differentiate between resolved and
>>> unresolved identifiers) with the same concerns. We decided to suggest the
>>> string & parsing logic because of usability.
>>>>     tEnv.from("cat.db.table")
>>>> is shorter and easier to write than
>>>>     tEnv.from(Identifier.for("cat", "db", "name")
>>>> And also implicitly solves the problem what happens if a user (e.g.
>> used
>>> to other systems) uses that API in a following manner:
>>>>     tEnv.from(Identifier.for("db.name")
>>>> I'm happy to revisit it if the general consensus is that it's better to
>>> use the OO aproach.
>>>> Best,
>>>> Dawid
>>>> 
>>>> On 06/09/2019 10:00, Xuefu Z wrote:
>>>> 
>>>> Thanks to Dawid for starting the discussion and writeup. It looks
>> pretty
>>>> good to me except that I'm a little concerned about the object
>> reference
>>>> and string parsing in the code, which seems to an anti-pattern to OOP.
>>> Have
>>>> we considered using ObjectIdenitifier with optional catalog and db
>> parts,
>>>> esp. if we are worried about arguments of variable length or method
>>>> overloading? It's quite likely that the result of string parsing is an
>>>> ObjectIdentifier instance any way.
>>>> 
>>>> Having string parsing logic in the code is a little dangerous as it
>>>> duplicates part of the DDL/DML parsing, and they can easily get out of
>>> sync.
>>>> 
>>>> Thanks,
>>>> Xuefu
>>>> 
>>>> On Fri, Sep 6, 2019 at 1:57 PM JingsongLee <lzljs3620...@aliyun.com
>>> .invalid>
>>>> wrote:
>>>> 
>>>> 
>>>> Thanks dawid, +1 for this approach.
>>>> 
>>>> One concern is the removal of registerTableSink & registerTableSource
>>>>  in TableEnvironment. It has two alternatives:
>>>> 1.the properties approach (DDL, descriptor).
>>>> 2.from/toDataStream.
>>>> 
>>>> #1 can only be properties, not java states, and some Connectors
>>>>  are difficult to convert all states to properties.
>>>> #2 can contain java state. But can't use TableSource-related features,
>>>> like project & filter push down, partition support, etc..
>>>> 
>>>> Any idea about this?
>>>> 
>>>> Best,
>>>> Jingsong Lee
>>>> 
>>>> 
>>>> ------------------------------------------------------------------
>>>> From:Dawid Wysakowicz <dwysakow...@apache.org>
>>>> Send Time:2019年9月4日(星期三) 22:20
>>>> To:dev <dev@flink.apache.org>
>>>> Subject:[DISCUSS] FLIP-64: Support for Temporary Objects in Table
>> module
>>>> 
>>>> Hi all,
>>>> As part of FLIP-30 a Catalog API was introduced that enables storing
>>> table
>>>> meta objects permanently. At the same time the majority of current APIs
>>>> create temporary objects that cannot be serialized. We should clarify
>> the
>>>> creation of meta objects (tables, views, functions) in a unified way.
>>>> Another current problem in the API is that all the temporary objects
>> are
>>>> stored in a special built-in catalog, which is not very intuitive for
>>> many
>>>> users, as they must be aware of that catalog to reference temporary
>>> objects.
>>>> Lastly, different APIs have different ways of providing object paths:
>>>> 
>>>> String path…,
>>>> String path, String pathContinued…
>>>> String name
>>>> We should choose one approach and unify it across all APIs.
>>>> I suggest a FLIP to address the above issues.
>>>> Looking forward to your opinions.
>>>> FLIP link:
>>>> 
>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module
>>>> 
>> 
>> 

Reply via email to