Hi everyone, thank you Jing for redirecting the discussion back to the topic at hand. I agree with all of your points.

+1 to deprecate SourceFunction.

Is there really no replacement for StreamExecutionEnvironment#readXXX? There is already a FLIP-27 based FileSource, right? What's missing to recommend using that as opposed to the readXXX methods?

Cheers,

Konstantin

On Thu, Jun 9, 2022 at 20:11, Alexander Fedulov <alexan...@ververica.com> wrote:

> Hi all,
>
> It seems that there is some understandable cautiousness with regard to deprecating methods and subclasses that do not have alternatives just yet.
>
> We should probably first agree if it is in general OK for Flink to use the @Deprecated annotation for parts of the code that do not have alternatives. In that case, we could add a comment along the lines of:
> "This implementation is based on a deprecated SourceFunction API that will gradually be phased out from Flink. No direct substitute exists at the moment. If you want to have a more future-proof solution, consider helping the project by contributing an implementation based on the new Source API."
>
> This should clearly communicate the message that usage of these methods/classes is discouraged and at the same time promote contributions for addressing the gap.
> What do you think?
>
> Best,
> Alexander Fedulov
>
> On Thu, Jun 9, 2022 at 6:27 PM Ingo Bürk <airbla...@apache.org> wrote:
>
>> Hi,
>>
>> these APIs don't expose the underlying source directly, so I don't think we need to worry about deprecating them as well. There's also nothing inherently wrong with using a deprecated API internally, though even just for the experience of using our own new APIs I would personally say that they should be migrated to the new Source API. It's hard to argue that users must migrate to a new API if we don't do it internally as well.
>>
>> Best
>> Ingo
>>
>> On 09.06.22 15:41, Lijie Wang wrote:
>>> Hi Martijn,
>>>
>>> I don't mean it's a blocker. It's just for information. And I'm also +1 for this.
>>>
>>> Put it another way: should we migrate `#readFile(...)` to the new API or provide a similar "readXXX" method based on the new Source API?
>>>
>>> And if we don't migrate it, does it mean that `#readFile(...)` should also be marked as deprecated?
>>>
>>> Best,
>>> Lijie
>>>
>>> On Thu, Jun 9, 2022 at 21:03, Martijn Visser <martijnvis...@apache.org> wrote:
>>>
>>>> Hi Lijie,
>>>>
>>>> I don't see any problem with deprecating those methods at this moment, as long as we don't remove them until the replacements are available. Besides that, are we sure there are no replacements already, especially with the new FileSource?
>>>>
>>>> Best regards,
>>>>
>>>> Martijn
>>>>
>>>> On Thu, Jun 9, 2022 at 14:23, Lijie Wang <wangdachui9...@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> FYI, currently, some commonly used methods in StreamExecutionEnvironment are still based on the old SourceFunction (and there is no alternative):
>>>>> `StreamExecutionEnvironment#readFile(...)`
>>>>> `StreamExecutionEnvironment#readTextFile(...)`
>>>>>
>>>>> I think these should be migrated to the new source API before deprecating the SourceFunction.
>>>>>
>>>>> Best,
>>>>> Lijie
>>>>>
>>>>> On Thu, Jun 9, 2022 at 16:05, Martijn Visser <martijnvis...@apache.org> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I think implicitly we've already considered the SourceFunction and SinkFunction as deprecated. They are even marked as such on the Flink roadmap [1]. That also shows that connectors that are using these interfaces are approaching end-of-life. The fact that we're actively migrating connectors from Source/SinkFunction to FLIP-27/FLIP-143 (plus add-on FLIPs) shows that we've already determined that target.
>>>>>>
>>>>>> With regard to the motivation of FLIP-27, I think reading up on the original discussion thread [2] is also worthwhile to see more context. FLIP-27 was also very important as it brought a unified connector which can support both streaming and batch (with batch being considered a special case of streaming in Flink's vision).
>>>>>>
>>>>>> So +1 to deprecate SourceFunction. I would also argue that we should already mark the SinkFunction as deprecated to avoid having this discussion again in a couple of months.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Martijn
>>>>>>
>>>>>> [1] https://flink.apache.org/roadmap.html
>>>>>> [2] https://lists.apache.org/thread/334co89dbhc8qpr9nvmz8t1gp4sz2c8y
>>>>>>
>>>>>> On Thu, Jun 9, 2022 at 09:48, Jing Ge <j...@ververica.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am very happy to see opinions from different perspectives. That will help us understand the problem better. Thanks all for the informative discussion.
>>>>>>>
>>>>>>> Let's see the big picture and check the following facts together:
>>>>>>>
>>>>>>> 1. FLIP-27 was intended to solve some technical issues that are very difficult to solve with SourceFunction [1]. When we say "SourceFunction is easy", well, it depends. If we take a look at the implementation of the Kafka connector, we will see how complicated it is to build a serious connector for production with the old SourceFunction. To every problem there is a solution and to every solution there is a problem. The fact is that there is no perfect solution, only a feasible one. If we try to solve complicated problems, we have to expose some complexity. Compared to connectors for POCs, demos, and training (no offense), I would give higher priority to solving issues for connectors like the Kafka connector that are widely used in production. I think that should be one reason why FLIP-27 was designed and why the new source API went public.
>>>>>>>
>>>>>>> 2. FLIP-27 and the implementation were introduced roughly at the end of 2019 and went public on 19.04.2021, which means Flink has provided two different public/graduated source solutions for more than one year. On the day that the new source API went public, there should have been a consensus in the community that we should start the migration. The old SourceFunction interface, in the ideal case, should have been deprecated on that day; otherwise we should not have graduated the new source API, to avoid confusing (connector) developers [2].
>>>>>>>
>>>>>>> 3. It is true that the new source API is hard to understand and even hard to implement for simple cases. Thanks for the feedback. That is something we need to improve. The current design and implementation could be considered the low-level API. The next step is to create the high-level API to reduce some unnecessary complexity for those simple cases. But, IMHO, this should not be a prerequisite that postpones the deprecation of the old SourceFunction APIs.
>>>>>>>
>>>>>>> 4. As long as the old SourceFunction is not marked as deprecated, developers will continue asking which one should be used. Let's take a concrete example. If a new connector is developed now and the developer asks on the ML for a suggestion on the choice between the old and new source API, which one should we suggest? I think it should be the new Source API. If a brand-new connector has been developed with the old SourceFunction API before asking for consensus in the community and the developer wants to merge it to master, should we allow it? If the answers to all these questions point to the new Source API, the old SourceFunction is de facto already deprecated; it just has not been marked as @deprecated, which confuses developers even more.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Jing
>>>>>>>
>>>>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
>>>>>>> [2] https://lists.apache.org/thread/7okp4y46n3o3rx5mn0t3qobrof8zxwqs
>>>>>>>
>>>>>>> On Wed, Jun 8, 2022 at 2:21 AM Alexander Fedulov <alexan...@ververica.com> wrote:
>>>>>>>
>>>>>>>> Hey Austin,
>>>>>>>>
>>>>>>>> Since we are getting deeper into the implementation details of the DataGeneratorSource and it is not the main topic of this thread, I propose to move our discussion to where it belongs: [DISCUSS] FLIP-238 [1]. Could you please briefly formulate your requirements to make it easier for the others to follow? I am happy to continue this conversation there.
>>>>>>>>
>>>>>>>> [1] https://lists.apache.org/thread/7gjxto1rmkpff4kl54j8nlg5db2rqhkt
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Alexander Fedulov
>>>>>>>>
>>>>>>>> On Tue, Jun 7, 2022 at 6:14 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>>> @Austin, in the FLIP I mentioned above [1], the user is expected to pass a MapFunction<Long, OUT> to the generator. I wonder if you could have your external client and polling logic wrapped in a custom MapFunction implementation class? Would that answer your needs or do you have some more sophisticated scenario in mind?
>>>>>>>>>
>>>>>>>>> At first glance, the FLIP looks good, but for this case, with regard to the map function, it leaves out 1) the ability to control polling intervals and 2) the ability to produce an unknown number of records, both per poll and in overall boundedness. Do you think something like this could be built from the same pieces?
>>>>>>>>> I'm also wondering what handles threading: is that on the user or is that part of the DataGeneratorSource?
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Austin
>>>>>>>>>
>>>>>>>>> On Tue, Jun 7, 2022 at 9:34 AM Alexander Fedulov <alexan...@ververica.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi everyone,
>>>>>>>>>>
>>>>>>>>>> Thanks for all the input and a lively discussion. It seems that there is a consensus that due to the inherent complexity of FLIP-27 sources we should provide more user-facing utilities to bridge the gap between the existing SourceFunction-based functionality and the new APIs.
>>>>>>>>>>
>>>>>>>>>> To start addressing this I picked the issue that David raised and many upvoted. Here is a proposal for the new DataGeneratorSource: FLIP-238 [1]. Please take a look, I am going to open a separate discussion thread on it shortly.
>>>>>>>>>>
>>>>>>>>>> Jing also raised some great points regarding the interfaces and subclasses. It seems to me that what might actually help is some sort of a "soft deprecation" concept and annotation. It could be used in places where we do not have an alternative implementation yet, but we clearly want to indicate that continuing to build on top of these interfaces is discouraged. The area of impact of deprecating all SourceFunction subclasses is rather big, and we can expect it to take a while. The hope would be that if in the meantime someone finds themselves using one of these old APIs, the "soft deprecation" annotation will be a clear indication and encouragement to work on introducing an alternative FLIP-27-based implementation instead.
>>>>>>>>>>
>>>>>>>>>> @Austin, in the FLIP I mentioned above [1], the user is expected to pass a MapFunction<Long, OUT> to the generator. I wonder if you could have your external client and polling logic wrapped in a custom MapFunction implementation class? Would that answer your needs or do you have some more sophisticated scenario in mind?
>>>>>>>>>>
>>>>>>>>>> [1] https://cwiki.apache.org/confluence/x/9Av1D
>>>>>>>>>> Best,
>>>>>>>>>> Alexander Fedulov
>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 6, 2022 at 7:08 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks for the nice discussion all.
>>>>>>>>>>>
>>>>>>>>>>> I was recently trying to implement a very simple polling source and would've loved a higher-level base to work from. I'm wondering if, in addition to the data generator use cases, it would be good to support a simple non-parallel polling abstraction to make it easier to, for instance, start prototyping with data in existing APIs without adding a Kafka or such in the middle.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Austin
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jun 6, 2022 at 10:02 AM tison <wander4...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Well, that's a bit off-topic. As for deprecating SourceFunction as the FLIP-27 series of work goes ahead, +1 from my side. It's a significant step in the effort towards the unification of batch and streaming :)
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> tison.
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jun 6, 2022 at 21:54, tison <wander4...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> The starting point of the version bump and removal question is that downstream projects may have a tough time adapting to new interfaces while Flink stays in the 1.x series, where users may expect it to be an easy task. From my experience, it's really challenging to maintain compatibility between multiple versions of Flink when significant changes are made within the same 1.x version series; users may not be aware that it's almost a major version bump.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> tison.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jun 6, 2022 at 21:51, tison <wander4...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> One question from my side:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As SourceFunction is a @Public interface, we cannot remove it before doing a major version bump (Flink 2.0).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Of course it's not a blocker to make such a deprecation and let the new interface step in. My question is whether we have a plan to finally remove the deprecated interfaces, or whether we postpone it until there is a clear plan for Flink 2.0?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> tison.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Jun 6, 2022 at 21:35, David Anderson <dander...@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> David, can you elaborate why you need watermark generation in the source for your data generators?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The training exercises should strive to provide examples of best practices. If the exercises and their solutions use
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     env.fromSource(source, WatermarkStrategy.noWatermarks(), "name-of-source")
>>>>>>>>>>>>>>>         .map(...)
>>>>>>>>>>>>>>>         .assignTimestampsAndWatermarks(...)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> this will help establish this anti-pattern as the normal way of doing things.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Most new Flink users are using a KafkaSource with a noWatermarks strategy and a SimpleStringSchema, followed by a map that does the real deserialization, followed by the real watermarking -- because they aren't seeing examples that teach how these interfaces are meant to be used.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When we redo the sources used in training exercises, I want to avoid these pitfalls.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Jun 6, 2022 at 9:12 AM Konstantin Knauf <kna...@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> very interesting thread. The proposal for deprecation seems to have sparked a very important discussion. Do we know what users struggle with specifically?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Speaking for myself, when I upgraded flink-faker to the new Source API, an unbounded version of the NumberSequenceSource would have been all I needed, but that's just the data generator use case. I think that one could be solved quite easily. David, can you elaborate why you need watermark generation in the source for your data generators?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Konstantin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Jun 5, 2022 at 17:48, Piotr Nowojski <pnowoj...@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Also +1 to what David has written. But it doesn't mean we should be waiting indefinitely to deprecate SourceFunction.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Jun 5, 2022 at 16:46, Jark Wu <imj...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> +1 to David's point.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Usually, when we deprecate some interfaces, we should point users to the recommended alternatives.
>>>>>>>>>>>>>>>>>> However, implementing the new Source interface for some simple scenarios is too challenging and complex.
>>>>>>>>>>>>>>>>>> We also found it isn't easy to push the internal connector to upgrade to the new Source because "FLIP-27 are hard to understand, while SourceFunction is easy".
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> +1 to make implementing a simple Source easier before deprecating SourceFunction.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>> Jark
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sun, 5 Jun 2022 at 07:29, Jingsong Lee <lzljs3620...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> +1 to David and Ingo.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Before deprecating and removing SourceFunction, we should have some easier APIs that wrap the new Source; the cost of writing a new Source is too high now.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sun, Jun 5, 2022 at 05:32, Ingo Bürk <airbla...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I +1 everything David said. The new Source API raised the complexity significantly. It's great to have such a rich, powerful API that can do everything, but in the process we lost the ability to onboard people to the APIs.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Best
>>>>>>>>>>>>>>>>>>>> Ingo
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 04.06.22 21:21, David Anderson wrote:
>>>>>>>>>>>>>>>>>>>>> I'm in favor of this, but I think we need to make it easier to implement data generators and test sources. As things stand in 1.15, unless you can be satisfied with using a NumberSequenceSource followed by a map, things get quite complicated. I looked into reworking the data generators used in the training exercises, and got discouraged by the amount of work involved. (The sources used in the training want to be unbounded, and need watermarking in the sources, which means that using NumberSequenceSource isn't an option.)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I think the proposed deprecation will be better received if it can be accompanied by something that makes implementing a simple Source easier than it is now. People are continuing to implement new SourceFunctions because the interfaces defined by FLIP-27 are hard to understand, while SourceFunction is easy. Alex, I believe you were looking into implementing an easier-to-use building block that could be used in situations like this. Can we get something like that in place first?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Fri, Jun 3, 2022 at 4:52 PM Jing Ge <j...@ververica.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks Alex for driving this!
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> +1. To give Flink developers, especially connector developers, the clear signal that the new Source API is recommended according to FLIP-27, we should mark them as deprecated.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> There are some open questions to discuss:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 1. Do we need to mark all subinterfaces/subclasses as deprecated? E.g. FromElementsFunction, etc.; there are many. What are the replacements?
>>>>>>>>>>>>>>>>>>>>>> 2. Do we need to mark all subclasses that have a replacement as deprecated? E.g. ExternallyInducedSource, whose replacement class, ExternallyInducedSourceReader, is @Experimental if I am not mistaken.
>>>>>>>>>>>>>>>>>>>>>> 3. Do we need to mark all related test utility classes as deprecated?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I think it might make sense to create an umbrella ticket to cover all of these with the following process:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 1. Mark SourceFunction as deprecated asap.
>>>>>>>>>>>>>>>>>>>>>> 2. Mark subinterfaces and subclasses as deprecated if there are graduated replacements. A good example is that KafkaSource replaced KafkaConsumer, which has been marked as deprecated.
>>>>>>>>>>>>>>>>>>>>>> 3. Do not mark subinterfaces and subclasses as deprecated if the replacement classes are still experimental; check if it is time to graduate them. After graduation, go to step 2. It might take a while for graduation.
>>>>>>>>>>>>>>>>>>>>>> 4. Do not mark subinterfaces and subclasses as deprecated if the replacement classes are experimental and are too young to graduate. We have to wait. But in this case we could create new tickets under the umbrella ticket.
>>>>>>>>>>>>>>>>>>>>>> 5. Do not mark subinterfaces and subclasses as deprecated if there is no replacement at all. We have to create new tickets and wait until the new implementation has been done and graduated. It will take a longer time, roughly 1.5 years.
>>>>>>>>>>>>>>>>>>>>>> 6. For test classes, we could follow the same rule. But I think for some cases, we could consider doing the replacement directly without going through the deprecation phase.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> When we look back on all of these, we can see that it is a big epic (even bigger than an epic). It needs someone to drive it and keep a continuous focus on it, with support from the community, and to push the development towards the new Source API of FLIP-27.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> If we could have consensus on this, Alex and I could create the umbrella ticket to kick it off.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>>>> Jing
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Fri, Jun 3, 2022 at 3:54 PM Alexander Fedulov <alexan...@ververica.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I would like to start the discussion about marking SourceFunction-based interfaces as deprecated. With the FLIP-27 APIs becoming the new standard, the old ones have to be eventually phased out. Although this state is well known within the community and no new connectors based on the old interfaces can be accepted into the project, the footprint of SourceFunction in the user code still keeps growing (primarily for data generators and test utilities). I believe it is best to mark SourceFunction as deprecated as soon as possible. What do you think?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>> Alexander Fedulov
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> https://twitter.com/snntrable
>>>>>>>>>>>>>>>> https://github.com/knaufk

--
https://twitter.com/snntrable
https://github.com/knaufk