Thanks for the nice discussion, all. I was recently trying to implement a very simple polling source and would have loved a higher-level base to work from. I'm wondering whether, in addition to the data generator use cases, it would be good to support a simple non-parallel polling abstraction. That would make it easier to, for instance, start prototyping with data from existing APIs without adding Kafka or something similar in the middle.
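
To make the idea concrete, here is a minimal plain-Java sketch of what such a non-parallel polling loop could look like. It is only an illustration and assumes nothing about Flink: `PollingSketch`, `pollLoop`, and the `fakeApi` supplier are hypothetical names, and the supplier stands in for a call to an existing API.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical sketch of a single-threaded polling loop; not a Flink API. */
final class PollingSketch {

    /**
     * Calls fetchBatch once per round, forwards every returned item to emit,
     * and sleeps for the given interval between rounds.
     * Returns the number of items emitted.
     */
    static <T> int pollLoop(Supplier<List<T>> fetchBatch,
                            Consumer<T> emit,
                            Duration interval,
                            int maxRounds) {
        int emitted = 0;
        for (int round = 0; round < maxRounds; round++) {
            for (T item : fetchBatch.get()) {
                emit.accept(item);
                emitted++;
            }
            if (round < maxRounds - 1) {
                try {
                    Thread.sleep(interval.toMillis());
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop polling early.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        // Stand-in for a request to an existing API: one fake record per poll.
        int[] counter = {0};
        Supplier<List<String>> fakeApi = () -> List.of("record-" + counter[0]++);

        List<String> sink = new ArrayList<>();
        int n = pollLoop(fakeApi, sink::add, Duration.ofMillis(10), 3);
        System.out.println(n + " records: " + sink);
    }
}
```

A real building block would presumably also need to deal with checkpointing and cancellation, which this sketch leaves out on purpose.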

Best,
Austin

On Mon, Jun 6, 2022 at 10:02 AM tison <wander4...@gmail.com> wrote:

> Well. It's a bit off-topic. As for deprecating SourceFunction as the
> FLIP-27 series of work goes ahead, +1 from my side. It's significant work
> towards the unification of batch and streaming :)
>
> Best,
> tison.
>
>
> On Mon, Jun 6, 2022 at 21:54 tison <wander4...@gmail.com> wrote:
>
> > The starting point of the version bump and removal question is that
> > downstream projects may have a tough time adapting to new interfaces
> > while Flink stays in the 1.x version series, where users may expect that
> > to be an easy task. From my experience, it's really challenging to
> > maintain compatibility between multiple versions of Flink when
> > significant changes are made within the same 1.x version series - users
> > may not be aware that it's almost a major version bump.
> >
> > Best,
> > tison.
> >
> >
> > On Mon, Jun 6, 2022 at 21:51 tison <wander4...@gmail.com> wrote:
> >
> >> One question from my side:
> >>
> >> As SourceFunction is a @Public interface, we cannot remove it before
> >> doing a major version bump (Flink 2.0).
> >>
> >> Of course it's not a blocker to make such a deprecation and let the new
> >> interface step in. My question is whether we have a plan to finally
> >> remove the deprecated interfaces, or whether we postpone that until
> >> there is a clear plan for Flink 2.0?
> >>
> >> Best,
> >> tison.
> >>
> >>
> >> On Mon, Jun 6, 2022 at 21:35 David Anderson <dander...@apache.org> wrote:
> >>
> >>> >
> >>> > David, can you elaborate why you need watermark generation in the
> >>> > source for your data generators?
> >>>
> >>>
> >>> The training exercises should strive to provide examples of best
> >>> practices.
> >>> If the exercises and their solutions use
> >>>
> >>> env.fromSource(source, WatermarkStrategy.noWatermarks(),
> >>> "name-of-source")
> >>> .map(...)
> >>> .assignTimestampsAndWatermarks(...)
> >>>
> >>> this will help establish this anti-pattern as the normal way of doing
> >>> things.
> >>>
> >>> Most new Flink users are using a KafkaSource with a noWatermarks
> >>> strategy and a SimpleStringSchema, followed by a map that does the real
> >>> deserialization, followed by the real watermarking -- because they
> >>> aren't seeing examples that teach how these interfaces are meant to be
> >>> used.
> >>>
> >>> When we redo the sources used in training exercises, I want to avoid
> >>> these pitfalls.
> >>>
> >>> David
> >>>
> >>> On Mon, Jun 6, 2022 at 9:12 AM Konstantin Knauf <kna...@apache.org>
> >>> wrote:
> >>>
> >>> > Hi everyone,
> >>> >
> >>> > very interesting thread. The proposal for deprecation seems to have
> >>> > sparked a very important discussion. Do we know what users struggle
> >>> > with specifically?
> >>> >
> >>> > Speaking for myself, when I upgraded flink-faker to the new Source
> >>> > API, an unbounded version of the NumberSequenceSource would have been
> >>> > all I needed, but that's just the data generator use case. I think
> >>> > that one could be solved quite easily. David, can you elaborate why
> >>> > you need watermark generation in the source for your data generators?
> >>> >
> >>> > Cheers,
> >>> >
> >>> > Konstantin
> >>> >
> >>> >
> >>> > On Sun, Jun 5, 2022 at 17:48 Piotr Nowojski <pnowoj...@apache.org>
> >>> > wrote:
> >>> >
> >>> > > Also +1 to what David has written. But it doesn't mean we should be
> >>> > > waiting indefinitely to deprecate SourceFunction.
> >>> > >
> >>> > > Best,
> >>> > > Piotrek
> >>> > >
> >>> > > On Sun, Jun 5, 2022 at 16:46 Jark Wu <imj...@gmail.com> wrote:
> >>> > >
> >>> > > > +1 to David's point.
> >>> > > >
> >>> > > > Usually, when we deprecate some interfaces, we should point users
> >>> > > > to the recommended alternatives.
> >>> > > > However, implementing the new Source interface for some simple
> >>> > > > scenarios is too challenging and complex.
> >>> > > > We also found it isn't easy to push the internal connector to
> >>> > > > upgrade to the new Source because
> >>> > > > "FLIP-27 are hard to understand, while SourceFunction is easy".
> >>> > > >
> >>> > > > +1 to make implementing a simple Source easier before deprecating
> >>> > > > SourceFunction.
> >>> > > >
> >>> > > > Best,
> >>> > > > Jark
> >>> > > >
> >>> > > >
> >>> > > > On Sun, 5 Jun 2022 at 07:29, Jingsong Lee <lzljs3620...@apache.org>
> >>> > > > wrote:
> >>> > > >
> >>> > > > > +1 to David and Ingo.
> >>> > > > >
> >>> > > > > Before deprecating and removing SourceFunction, we should have
> >>> > > > > some easier APIs to wrap the new Source; the cost of writing a
> >>> > > > > new Source is too high now.
> >>> > > > >
> >>> > > > >
> >>> > > > >
> >>> > > > > On Sun, Jun 5, 2022 at 05:32 Ingo Bürk <airbla...@apache.org>
> >>> > > > > wrote:
> >>> > > > >
> >>> > > > > > I +1 everything David said. The new Source API raised the
> >>> > > > > > complexity significantly. It's great to have such a rich,
> >>> > > > > > powerful API that can do everything, but in the process we
> >>> > > > > > lost the ability to onboard people to the APIs.
> >>> > > > > >
> >>> > > > > >
> >>> > > > > > Best
> >>> > > > > > Ingo
> >>> > > > > >
> >>> > > > > > On 04.06.22 21:21, David Anderson wrote:
> >>> > > > > > > I'm in favor of this, but I think we need to make it easier
> >>> > > > > > > to implement data generators and test sources. As things
> >>> > > > > > > stand in 1.15, unless you can be satisfied with using a
> >>> > > > > > > NumberSequenceSource followed by a map, things get quite
> >>> > > > > > > complicated. I looked into reworking the data generators used
> >>> > > > > > > in the training exercises, and got discouraged by the amount
> >>> > > > > > > of work involved.
> >>> > > > > > > (The sources used in the training want to be unbounded, and
> >>> > > > > > > need watermarking in the sources, which means that using
> >>> > > > > > > NumberSequenceSource isn't an option.)
> >>> > > > > > >
> >>> > > > > > > I think the proposed deprecation will be better received if
> >>> > > > > > > it can be accompanied by something that makes implementing a
> >>> > > > > > > simple Source easier than it is now. People are continuing to
> >>> > > > > > > implement new SourceFunctions because the interfaces defined
> >>> > > > > > > by FLIP-27 are hard to understand, while SourceFunction is
> >>> > > > > > > easy. Alex, I believe you were looking into implementing an
> >>> > > > > > > easier-to-use building block that could be used in situations
> >>> > > > > > > like this. Can we get something like that in place first?
> >>> > > > > > >
> >>> > > > > > > David
> >>> > > > > > >
> >>> > > > > > > On Fri, Jun 3, 2022 at 4:52 PM Jing Ge <j...@ververica.com>
> >>> > > > > > > wrote:
> >>> > > > > > >
> >>> > > > > > >> Hi,
> >>> > > > > > >>
> >>> > > > > > >> Thanks Alex for driving this!
> >>> > > > > > >>
> >>> > > > > > >> +1. To give Flink developers, especially connector
> >>> > > > > > >> developers, a clear signal that the new Source API is
> >>> > > > > > >> recommended according to FLIP-27, we should mark them as
> >>> > > > > > >> deprecated.
> >>> > > > > > >>
> >>> > > > > > >> There are some open questions to discuss:
> >>> > > > > > >>
> >>> > > > > > >> 1. Do we need to mark all subinterfaces/subclasses as
> >>> > > > > > >> deprecated? e.g. FromElementsFunction, etc. There are many.
> >>> > > > > > >> What are the replacements?
> >>> > > > > > >> 2. Do we need to mark all subclasses that have a replacement
> >>> > > > > > >> as deprecated?
> >>> > > > > > >> e.g. ExternallyInducedSource, whose replacement class, if I
> >>> > > > > > >> am not mistaken, ExternallyInducedSourceReader, is
> >>> > > > > > >> @Experimental
> >>> > > > > > >> 3. Do we need to mark all related test utility classes as
> >>> > > > > > >> deprecated?
> >>> > > > > > >>
> >>> > > > > > >> I think it might make sense to create an umbrella ticket to
> >>> > > > > > >> cover all of these with the following process:
> >>> > > > > > >>
> >>> > > > > > >> 1. Mark SourceFunction as deprecated asap.
> >>> > > > > > >> 2. Mark subinterfaces and subclasses as deprecated, if there
> >>> > > > > > >> are graduated replacements. A good example is KafkaSource,
> >>> > > > > > >> which replaced KafkaConsumer, which has been marked as
> >>> > > > > > >> deprecated.
> >>> > > > > > >> 3. Do not mark subinterfaces and subclasses as deprecated, if
> >>> > > > > > >> replacement classes are still experimental; check if it is
> >>> > > > > > >> time to graduate them. After graduation, go to step 2. It
> >>> > > > > > >> might take a while for graduation.
> >>> > > > > > >> 4. Do not mark subinterfaces and subclasses as deprecated, if
> >>> > > > > > >> the replacement classes are experimental and are too young to
> >>> > > > > > >> graduate. We have to wait. But in this case we could create
> >>> > > > > > >> new tickets under the umbrella ticket.
> >>> > > > > > >> 5. Do not mark subinterfaces and subclasses as deprecated, if
> >>> > > > > > >> there is no replacement at all. We have to create new tickets
> >>> > > > > > >> and wait until the new implementation has been done and
> >>> > > > > > >> graduated. It will take a longer time, roughly 1.5 years.
> >>> > > > > > >> 6. For test classes, we could follow the same rule. But I
> >>> > > > > > >> think for some cases, we could consider doing the replacement
> >>> > > > > > >> directly without going through the deprecation phase.
> >>> > > > > > >>
> >>> > > > > > >> Looking at all of these together, we can see it is a big epic
> >>> > > > > > >> (even bigger than an epic). It needs someone to drive it and
> >>> > > > > > >> keep focus on it continuously, with support from the
> >>> > > > > > >> community, and push the development towards the new Source
> >>> > > > > > >> API of FLIP-27.
> >>> > > > > > >>
> >>> > > > > > >> If we could have consensus for this, Alex and I could create
> >>> > > > > > >> the umbrella ticket to kick it off.
> >>> > > > > > >>
> >>> > > > > > >> Best regards,
> >>> > > > > > >> Jing
> >>> > > > > > >>
> >>> > > > > > >>
> >>> > > > > > >> On Fri, Jun 3, 2022 at 3:54 PM Alexander Fedulov
> >>> > > > > > >> <alexan...@ververica.com> wrote:
> >>> > > > > > >>
> >>> > > > > > >>> Hi everyone,
> >>> > > > > > >>>
> >>> > > > > > >>> I would like to start the discussion about marking
> >>> > > > > > >>> SourceFunction-based interfaces as deprecated. With the
> >>> > > > > > >>> FLIP-27 APIs becoming the new standard, the old ones have to
> >>> > > > > > >>> be eventually phased out. Although this state is well known
> >>> > > > > > >>> within the community and no new connectors based on the old
> >>> > > > > > >>> interfaces can be accepted into the project, the footprint
> >>> > > > > > >>> of SourceFunction in the user code still keeps growing
> >>> > > > > > >>> (primarily for data generators and test utilities).
> >>> > > > > > >>> I believe it is best to mark SourceFunction as deprecated
> >>> > > > > > >>> as soon as possible. What do you think?
> >>> > > > > > >>>
> >>> > > > > > >>> Best,
> >>> > > > > > >>> Alexander Fedulov
> >>> > > > > > >>>
> >>> > > > > > >>
> >>> > > > > > >
> >>> > > > > >
> >>> > > > >
> >>> > > >
> >>> > >
> >>> >
> >>> >
> >>> > --
> >>> > https://twitter.com/snntrable
> >>> > https://github.com/knaufk
> >>> >
> >>>
> >>
> >