Re: [DISCUSS] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-08-26 Thread Aljoscha Krettek
I added more changes to the FLIP to try and address comments. You can see the changes from the last version here: https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=158866741=31=27 If no-one objects anymore I would like to proceed to a VOTE soon. Best, Aljoscha On

Re: [DISCUSS] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-07-30 Thread Aljoscha Krettek
I see, we actually have some thoughts along that line as well. We have ideas about adding such functionality for `Transformation`, which is the graph structure that underlies both the DataStream API and the newer Table API Runner/Planner. There a very rough PoC for that available at [1]. It's

Re: [DISCUSS] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-07-30 Thread Flavio Pompermaier
We use runCustomOperation to group a set of operators and into a single functional unit, just to make the code more modular.. It's very comfortable indeed. On Thu, Jul 30, 2020 at 5:20 PM Aljoscha Krettek wrote: > That is good input! I was not aware that anyone was actually using >

Re: [DISCUSS] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-07-30 Thread Aljoscha Krettek
That is good input! I was not aware that anyone was actually using `runCustomOperation()`. Out of curiosity, what are you using that for? We have definitely thought about the first two points you mentioned, though. Especially processing-time will make it tricky to define unified execution

Re: [DISCUSS] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-07-30 Thread Flavio Pompermaier
I just wanted to be propositive about missing api.. :D On Thu, Jul 30, 2020 at 4:29 PM Seth Wiesman wrote: > +1 Its time to drop DataSet > > Flavio, those issues are expected. This FLIP isn't just to drop DataSet > but to also add the necessary enhancements to DataStream such that it works >

Re: [DISCUSS] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-07-30 Thread Seth Wiesman
+1 Its time to drop DataSet Flavio, those issues are expected. This FLIP isn't just to drop DataSet but to also add the necessary enhancements to DataStream such that it works well on bounded input. On Thu, Jul 30, 2020 at 8:49 AM Flavio Pompermaier wrote: > Just to contribute to the

Re: [DISCUSS] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-07-30 Thread Flavio Pompermaier
Just to contribute to the discussion, when we tried to do the migration we faced some problems that could make migration quite difficult. 1 - It's difficult to test because of https://issues.apache.org/jira/browse/FLINK-18647 2 - missing mapPartition 3 - missing DataSet

Re: [DISCUSS] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-07-30 Thread Till Rohrmann
+1 for this effort. Great to see that we are making progress towards our goal of a truly unified batch and stream processing engine. Cheers, Till On Thu, Jul 30, 2020 at 2:28 PM Kurt Young wrote: > +1, looking forward to the follow up FLIPs. > > Best, > Kurt > > > On Thu, Jul 30, 2020 at 6:40

Re: [DISCUSS] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-07-30 Thread Kurt Young
+1, looking forward to the follow up FLIPs. Best, Kurt On Thu, Jul 30, 2020 at 6:40 PM Arvid Heise wrote: > +1 of getting rid of the DataSet API. Is DataStream#iterate already > superseding DataSet iterations or would that also need to be accounted for? > > In general, all surviving APIs

Re: [DISCUSS] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-07-30 Thread Arvid Heise
+1 of getting rid of the DataSet API. Is DataStream#iterate already superseding DataSet iterations or would that also need to be accounted for? In general, all surviving APIs should also offer a smooth experience for switching back and forth. On Thu, Jul 30, 2020 at 9:39 AM Márton Balassi

Re: [DISCUSS] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-07-30 Thread Márton Balassi
Hi All, Thanks for the write up and starting the discussion. I am in favor of unifying the APIs the way described in the FLIP and deprecating the DataSet API. I am looking forward to the detailed discussion of the changes necessary. Best, Marton On Wed, Jul 29, 2020 at 12:46 PM Aljoscha Krettek

[DISCUSS] FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)

2020-07-29 Thread Aljoscha Krettek
Hi Everyone, my colleagues (in cc) and I would like to propose this FLIP for discussion. In short, we want to reduce the number of APIs that we have by deprecating the DataSet API. This is a big step for Flink, that's why I'm also cross-posting this to the User Mailing List. FLIP-131: