Hi Dawid,

+1 for proposed changes

On Wed, Oct 9, 2019 at 12:15 PM Dawid Wysakowicz <dwysakow...@apache.org>
wrote:

> Sorry for a very delayed response.
>
> @Kurt Yes, this is the goal to have a function created like new
> Function(...) also be wrapped into CatalogFunction. This would have to
> be though a temporary function as we cannot represent that as a set of
> properties. Similar to the createTemporaryView(DataStream stream).
>
> As for the ConnectTableDescriptor I agree this is very similar to
> CatalogTable. I am not sure though if we should get rid of it. In the
> end I see it as a builder for a CatalogTable, which is a slightly more
> internal API, but we might revisit that some time in the future if we
> find that it makes more sense.
>
> @All I updated the FLIP page with some more details from the outcome of
> the discussions around FLIP-57. Please take a look. I would like to
> start a vote on this FLIP as soon as the vote on FLIP-57 goes through.
>
> Best,
>
> Dawid
>
>
> On 19/09/2019 09:24, Kurt Young wrote:
> > IIUC it's good to see that both serializable (tables description from
> DDL)
> > and unserializable (tables with DataStream underneath) tables are treated
> > unify with CatalogTable.
> >
> > Can I also assume functions that either come from a function class (from
> > DDL)
> > or function objects (newed by user) will also treated unify with
> > CatalogFunction?
> >
> > This will greatly simplify and unify current API level concepts and
> design.
> >
> > And it seems only one thing left, how do we deal with
> > ConnectTableDescriptor?
> > It's actually very similar with serializable CatalogTable, both carry
> some
> > text
> > properties which even are the same. Is there any chance we can further
> unify
> > this to CatalogTable?
> >
> > object
> > Best,
> > Kurt
> >
> >
> > On Thu, Sep 19, 2019 at 3:13 PM Jark Wu <imj...@gmail.com> wrote:
> >
> >> Thanks Dawid for the design doc.
> >>
> >> In general, I’m +1 to the FLIP.
> >>
> >>
> >> +1 to the single-string and parse way to express object path.
> >>
> >> +1 to deprecate registerTableSink & registerTableSource.
> >> But I would suggest to provide an easy way to register a custom
> >> source/sink before we drop them (this is another story).
> >> Currently, it’s not easy to implement a custom connector descriptor.
> >>
> >> Best,
> >> Jark
> >>
> >>
> >>> 在 2019年9月19日,11:37,Dawid Wysakowicz <wysakowicz.da...@gmail.com> 写道:
> >>>
> >>> Hi JingsongLee,
> >>> From my understanding they can. Underneath they will be CatalogTables.
> >> The
> >>> difference is the lifetime of the tables. Plus some of the user facing
> >>> interfaces cannot be persisted e.g. datastream. Therefore we must have
> a
> >>> separate methods for that. In the end the temporary tables are held in
> >>> memory as CatalogTables.
> >>> Best,
> >>> Dawid
> >>>
> >>> On Thu, 19 Sep 2019, 10:08 JingsongLee, <lzljs3620...@aliyun.com
> >> .invalid>
> >>> wrote:
> >>>
> >>>> Hi dawid:
> >>>> Can temporary tables achieve the same capabilities as catalog table?
> >>>> like statistics: CatalogTableStatistics, CatalogColumnStatistics,
> >>>> PartitionStatistics
> >>>> like partition support: we have added some catalog equivalent
> interfaces
> >>>> on TableSource/TableSink: getPartitions, getPartitionFieldNames
> >>>> Maybe it's not a good idea to add these interfaces to
> >>>> TableSource/TableSink. What do you think?
> >>>>
> >>>> Best,
> >>>> Jingsong Lee
> >>>>
> >>>>
> >>>> ------------------------------------------------------------------
> >>>> From:Kurt Young <ykt...@gmail.com>
> >>>> Send Time:2019年9月18日(星期三) 17:54
> >>>> To:dev <dev@flink.apache.org>
> >>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
> >>>> module
> >>>>
> >>>> Hi all,
> >>>>
> >>>> Sorry to join this party late. Big +1 to this flip, especially for the
> >>>> dropping
> >>>> "registerTableSink & registerTableSource" part. These are indeed
> legacy
> >>>> and we should try to unify them through CatalogTable after we
> introduce
> >>>> the concept of Catalog.
> >>>>
> >>>> From my understanding, what we can registered should all be metadata,
> >>>> TableSource/TableSink should only be the one who is responsible to do
> >>>> the real work, i.e. reading and writing data according to the schema
> and
> >>>> other information like computed column, partition, .e.g.
> >>>>
> >>>> Best,
> >>>> Kurt
> >>>>
> >>>>
> >>>> On Wed, Sep 18, 2019 at 5:14 PM JingsongLee <lzljs3620...@aliyun.com
> >>>> .invalid>
> >>>> wrote:
> >>>>
> >>>>> After some development and thinking, I have a general understanding.
> >>>>> +1 to registering a source/sink does not fit into the SQL world.
> >>>>> I am OK to have a deprecated registerTemporarySource/Sink to
> compatible
> >>>>> with old ways.
> >>>>>
> >>>>> Best,
> >>>>> Jingsong Lee
> >>>>>
> >>>>>
> >>>>> ------------------------------------------------------------------
> >>>>> From:Timo Walther <twal...@apache.org>
> >>>>> Send Time:2019年9月17日(星期二) 08:00
> >>>>> To:dev <dev@flink.apache.org>
> >>>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
> >>>>> module
> >>>>>
> >>>>> Hi Dawid,
> >>>>>
> >>>>> thanks for the design document. It fixes big concept gaps due to
> >>>>> historical reasons with proper support for serializability and
> catalog
> >>>>> support in mind.
> >>>>>
> >>>>> I would not mind a registerTemporarySource/Sink, but the problem
> that I
> >>>>> see is that many people think that this is the recommended way of
> >>>>> registering a table source/sink which is not true. We should guide
> >> users
> >>>>> to either use connect() or DDL API which can be validated and stored
> in
> >>>>> catalog.
> >>>>>
> >>>>> Also from a concept perspective, registering a source/sink does not
> fit
> >>>>> into the SQL world. SQL does not know about source/sinks but only
> about
> >>>>> tables. If the responsibility of a TableSource/TableSink is just a
> pure
> >>>>> physical data consumer/producer that is not connected to the actual
> >>>>> logical table schema, we would need a possibility of defining time
> >>>>> attributes and interpreting/converting a changelog. This should be
> done
> >>>>> by the framework with information from the DDL/connect() and not be
> >>>>> defined in every table source.
> >>>>>
> >>>>> Regards,
> >>>>> Timo
> >>>>>
> >>>>>
> >>>>> On 09.09.19 14:16, JingsongLee wrote:
> >>>>>> Hi dawid:
> >>>>>>
> >>>>>> It is difficult to describe specific examples.
> >>>>>> Sometimes users will generate some java converters through some
> >>>>>>  Java code, or generate some Java classes through third-party
> >>>>>>  libraries. Of course, these can be best done through properties.
> >>>>>> But this requires additional work from users.My suggestion is to
> >>>>>>  keep this Java instance class way that is user-friendly.
> >>>>>>
> >>>>>> Best,
> >>>>>> Jingsong Lee
> >>>>>>
> >>>>>>
> >>>>>> ------------------------------------------------------------------
> >>>>>> From:Dawid Wysakowicz <dwysakow...@apache.org>
> >>>>>> Send Time:2019年9月6日(星期五) 16:21
> >>>>>> To:dev <dev@flink.apache.org>
> >>>>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in
> Table
> >>>>> module
> >>>>>> Hi all,
> >>>>>> @Jingsong Could you elaborate a bit more what do you mean by
> >>>>>> "some Connectors are difficult to convert all states to properties"
> >>>>>> All the Flink provided connectors will definitely be expressible
> with
> >>>>> properties (In the end you should be able to use them from DDL). I
> >> think
> >>>> if
> >>>>> a TableSource is complex enough that it handles filter push down,
> >>>> partition
> >>>>> support etc. should rather be made available both from DDL &
> java/scala
> >>>>> code. I'm happy to reconsider adding registerTemporaryTable(String
> >> path,
> >>>>> TableSource source) if you have some concrete examples in mind.
> >>>>>>
> >>>>>> @Xuefu: We also considered the ObjectIdentifier (or actually
> >>>> introducing
> >>>>> a new identifier representation to differentiate between resolved and
> >>>>> unresolved identifiers) with the same concerns. We decided to suggest
> >> the
> >>>>> string & parsing logic because of usability.
> >>>>>>     tEnv.from("cat.db.table")
> >>>>>> is shorter and easier to write than
> >>>>>>     tEnv.from(Identifier.for("cat", "db", "name")
> >>>>>> And also implicitly solves the problem what happens if a user (e.g.
> >>>> used
> >>>>> to other systems) uses that API in a following manner:
> >>>>>>     tEnv.from(Identifier.for("db.name")
> >>>>>> I'm happy to revisit it if the general consensus is that it's better
> >> to
> >>>>> use the OO aproach.
> >>>>>> Best,
> >>>>>> Dawid
> >>>>>>
> >>>>>> On 06/09/2019 10:00, Xuefu Z wrote:
> >>>>>>
> >>>>>> Thanks to Dawid for starting the discussion and writeup. It looks
> >>>> pretty
> >>>>>> good to me except that I'm a little concerned about the object
> >>>> reference
> >>>>>> and string parsing in the code, which seems to an anti-pattern to
> OOP.
> >>>>> Have
> >>>>>> we considered using ObjectIdenitifier with optional catalog and db
> >>>> parts,
> >>>>>> esp. if we are worried about arguments of variable length or method
> >>>>>> overloading? It's quite likely that the result of string parsing is
> an
> >>>>>> ObjectIdentifier instance any way.
> >>>>>>
> >>>>>> Having string parsing logic in the code is a little dangerous as it
> >>>>>> duplicates part of the DDL/DML parsing, and they can easily get out
> of
> >>>>> sync.
> >>>>>> Thanks,
> >>>>>> Xuefu
> >>>>>>
> >>>>>> On Fri, Sep 6, 2019 at 1:57 PM JingsongLee <lzljs3620...@aliyun.com
> >>>>> .invalid>
> >>>>>> wrote:
> >>>>>>
> >>>>>>
> >>>>>> Thanks dawid, +1 for this approach.
> >>>>>>
> >>>>>> One concern is the removal of registerTableSink &
> registerTableSource
> >>>>>>  in TableEnvironment. It has two alternatives:
> >>>>>> 1.the properties approach (DDL, descriptor).
> >>>>>> 2.from/toDataStream.
> >>>>>>
> >>>>>> #1 can only be properties, not java states, and some Connectors
> >>>>>>  are difficult to convert all states to properties.
> >>>>>> #2 can contain java state. But can't use TableSource-related
> features,
> >>>>>> like project & filter push down, partition support, etc..
> >>>>>>
> >>>>>> Any idea about this?
> >>>>>>
> >>>>>> Best,
> >>>>>> Jingsong Lee
> >>>>>>
> >>>>>>
> >>>>>> ------------------------------------------------------------------
> >>>>>> From:Dawid Wysakowicz <dwysakow...@apache.org>
> >>>>>> Send Time:2019年9月4日(星期三) 22:20
> >>>>>> To:dev <dev@flink.apache.org>
> >>>>>> Subject:[DISCUSS] FLIP-64: Support for Temporary Objects in Table
> >>>> module
> >>>>>> Hi all,
> >>>>>> As part of FLIP-30 a Catalog API was introduced that enables storing
> >>>>> table
> >>>>>> meta objects permanently. At the same time the majority of current
> >> APIs
> >>>>>> create temporary objects that cannot be serialized. We should
> clarify
> >>>> the
> >>>>>> creation of meta objects (tables, views, functions) in a unified
> way.
> >>>>>> Another current problem in the API is that all the temporary objects
> >>>> are
> >>>>>> stored in a special built-in catalog, which is not very intuitive
> for
> >>>>> many
> >>>>>> users, as they must be aware of that catalog to reference temporary
> >>>>> objects.
> >>>>>> Lastly, different APIs have different ways of providing object
> paths:
> >>>>>>
> >>>>>> String path…,
> >>>>>> String path, String pathContinued…
> >>>>>> String name
> >>>>>> We should choose one approach and unify it across all APIs.
> >>>>>> I suggest a FLIP to address the above issues.
> >>>>>> Looking forward to your opinions.
> >>>>>> FLIP link:
> >>>>>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module
> >>>>
> >>
>
>

Reply via email to