Hi all,

Sorry for my mistake. The `StreamExecutionEnvironment#readFile(...)` methods can be easily replaced by `FileSource#forRecordStreamFormat/forBulkFileFormat`. I have no other concerns.
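For anyone following along, here is a minimal sketch of what that replacement could look like for the plain text-file case. It assumes Flink 1.15 with the flink-connector-files dependency on the classpath; the class name, the input path and the source name are placeholders made up for illustration, not something from this thread.

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.TextLineInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FileSourceSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder path (assumption, not taken from the thread).
        Path input = new Path("/tmp/input");

        // FLIP-27 based counterpart of env.readTextFile(...):
        // TextLineInputFormat emits one String record per line.
        FileSource<String> source =
                FileSource.forRecordStreamFormat(new TextLineInputFormat(), input)
                        .build();

        // No event-time processing in this sketch, so no watermarks are needed.
        DataStream<String> lines =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "file-source");

        lines.print();
        env.execute("FileSource sketch");
    }
}
```

As far as I can tell, a source built this way is bounded and reads the files once; calling `monitorContinuously(Duration)` on the builder should give the continuous monitoring behaviour that `readFile(...)` offers with `PROCESS_CONTINUOUSLY`.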
+1 to deprecate SourceFunction and the methods in StreamExecutionEnvironment that are based on SourceFunction.

Best,
Lijie

On Fri, Jun 10, 2022 at 05:11, Konstantin Knauf <kna...@apache.org> wrote:

> Hi everyone,
>
> thank you Jing for redirecting the discussion back to the topic at hand. I agree with all of your points.
>
> +1 to deprecate SourceFunction
>
> Is there really no replacement for the StreamExecutionEnvironment#readXXX methods? There is already a FLIP-27 based FileSource, right? What's missing to recommend using that as opposed to the readXXX methods?
>
> Cheers,
>
> Konstantin
>
> On Thu, Jun 9, 2022 at 20:11, Alexander Fedulov <alexan...@ververica.com> wrote:
>
> > Hi all,
> >
> > It seems that there is some understandable cautiousness with regard to deprecating methods and subclasses that do not have alternatives just yet.
> >
> > We should probably first agree if it is in general OK for Flink to use the @Deprecated annotation for parts of the code that do not have alternatives. In that case, we could add a comment along the lines of:
> > "This implementation is based on a deprecated SourceFunction API that will gradually be phased out from Flink. No direct substitute exists at the moment. If you want to have a more future-proof solution, consider helping the project by contributing an implementation based on the new Source API."
> >
> > This should clearly communicate the message that usage of these methods/classes is discouraged and at the same time promote contributions for addressing the gap. What do you think?
> >
> > Best,
> > Alexander Fedulov
> >
> > On Thu, Jun 9, 2022 at 6:27 PM Ingo Bürk <airbla...@apache.org> wrote:
> >
> > > Hi,
> > >
> > > these APIs don't expose the underlying source directly, so I don't think we need to worry about deprecating them as well. There's also nothing inherently wrong with using a deprecated API internally, though even just for the experience of using our own new APIs I would personally say that they should be migrated to the new Source API. It's hard to reason that users must migrate to a new API if we don't do it internally as well.
> > >
> > > Best
> > > Ingo
> > >
> > > On 09.06.22 15:41, Lijie Wang wrote:
> > > > Hi Martijn,
> > > >
> > > > I don't mean it's a blocker. It's just for information. And I'm also +1 for this.
> > > >
> > > > Put another way: should we migrate `#readFile(...)` to the new API, or provide a similar "readXXX" method based on the new Source API?
> > > >
> > > > And if we don't migrate it, does it mean that `#readFile(...)` should also be marked as deprecated?
> > > >
> > > > Best,
> > > > Lijie
> > > >
> > > > On Thu, Jun 9, 2022 at 21:03, Martijn Visser <martijnvis...@apache.org> wrote:
> > > >
> > > >> Hi Lijie,
> > > >>
> > > >> I don't see any problem with deprecating those methods at this moment, as long as we don't remove them until the replacements are available. Besides that, are we sure there are no replacements already, especially with the new FileSource?
> > > >>
> > > >> Best regards,
> > > >>
> > > >> Martijn
> > > >>
> > > >> On Thu, Jun 9, 2022 at 14:23, Lijie Wang <wangdachui9...@gmail.com> wrote:
> > > >>
> > > >>> Hi all,
> > > >>>
> > > >>> FYI, currently, some commonly used methods in StreamExecutionEnvironment are still based on the old SourceFunction (and there is no alternative):
> > > >>> `StreamExecutionEnvironment#readFile(...)`
> > > >>> `StreamExecutionEnvironment#readTextFile(...)`
> > > >>>
> > > >>> I think these should be migrated to the new source API before deprecating the SourceFunction.
> > > >>>
> > > >>> Best,
> > > >>> Lijie
> > > >>>
> > > >>> On Thu, Jun 9, 2022 at 16:05, Martijn Visser <martijnvis...@apache.org> wrote:
> > > >>>
> > > >>>> Hi all,
> > > >>>>
> > > >>>> I think implicitly we've already considered the SourceFunction and SinkFunction as deprecated. They are even marked as such on the Flink roadmap [1]. That also shows that connectors that are using these interfaces are approaching end-of-life. The fact that we're actively migrating connectors from Source/SinkFunction to FLIP-27/FLIP-143 (plus add-on FLIPs) shows that we've already determined that target.
> > > >>>>
> > > >>>> With regards to the motivation of FLIP-27, I think reading up on the original discussion thread is also worthwhile [2] to see more context. FLIP-27 was also very important as it brought a unified connector which can support both streaming and batch (with batch being considered a special case of streaming in Flink's vision).
> > > >>>>
> > > >>>> So +1 to deprecate SourceFunction. I would also argue that we should already mark the SinkFunction as deprecated to avoid having this discussion again in a couple of months.
> > > >>>>
> > > >>>> Best regards,
> > > >>>>
> > > >>>> Martijn
> > > >>>>
> > > >>>> [1] https://flink.apache.org/roadmap.html
> > > >>>> [2] https://lists.apache.org/thread/334co89dbhc8qpr9nvmz8t1gp4sz2c8y
> > > >>>>
> > > >>>> On Thu, Jun 9, 2022 at 09:48, Jing Ge <j...@ververica.com> wrote:
> > > >>>>
> > > >>>>> Hi,
> > > >>>>>
> > > >>>>> I am very happy to see opinions from different perspectives. That will help us understand the problem better. Thanks all for the informative discussion.
> > > >>>>>
> > > >>>>> Let's look at the big picture and check the following facts together:
> > > >>>>>
> > > >>>>> 1. FLIP-27 was intended to solve some technical issues that are very difficult to solve with SourceFunction [1]. When we say "SourceFunction is easy", well, it depends. If we take a look at the implementation of the Kafka connector, we will know how complicated it is to build a serious connector for production with the old SourceFunction. To every problem there is a solution and to every solution there is a problem. The fact is that there is no perfect solution, only a feasible one. If we try to solve complicated problems, we have to expose some complexity. Compared to connectors for POCs, demos, and training (no offense), I would give higher priority to solving issues for connectors like the Kafka connector that are widely used in production. I think that should be one reason why FLIP-27 has been designed and why the new source API went public.
> > > >>>>>
> > > >>>>> 2. FLIP-27 and its implementation were introduced roughly at the end of 2019 and went public on 19.04.2021, which means Flink has provided two different public/graduated source solutions for more than one year. On the day that the new source API went public, there should have been a consensus in the community that we should start the migration. The old SourceFunction interface, in the ideal case, should have been deprecated on that day; otherwise we should not have graduated the new source API, to avoid confusing (connector) developers [2].
> > > >>>>>
> > > >>>>> 3. It is true that the new source API is hard to understand and even hard to implement for simple cases. Thanks for the feedback. That is something we need to improve. The current design and implementation could be considered the low-level API. The next step is to create a high-level API to reduce some unnecessary complexity for those simple cases. But, IMHO, this should not be a prerequisite that postpones the deprecation of the old SourceFunction APIs.
> > > >>>>>
> > > >>>>> 4. As long as the old SourceFunction is not marked as deprecated, developers will continue asking which one should be used. Let's take a concrete example. If a new connector is developed now and the developer asks on the ML for a suggestion on choosing between the old and new source API, which one should we suggest? I think it should be the new Source API. If a brand-new connector has been developed with the old SourceFunction API before asking for consensus in the community and the developer wants to merge it to master, should we allow it? If the answers to all these questions point to the new Source API, the old SourceFunction is de facto already deprecated; it just has not been marked as @deprecated, which confuses developers even more.
> > > >>>>>
> > > >>>>> Best regards,
> > > >>>>> Jing
> > > >>>>>
> > > >>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
> > > >>>>> [2] https://lists.apache.org/thread/7okp4y46n3o3rx5mn0t3qobrof8zxwqs
> > > >>>>>
> > > >>>>> On Wed, Jun 8, 2022 at 2:21 AM Alexander Fedulov <alexan...@ververica.com> wrote:
> > > >>>>>
> > > >>>>>> Hey Austin,
> > > >>>>>>
> > > >>>>>> Since we are getting deeper into the implementation details of the DataGeneratorSource and it is not the main topic of this thread, I propose to move our discussion to where it belongs: [DISCUSS] FLIP-238 [1]. Could you please briefly formulate your requirements to make it easier for the others to follow? I am happy to continue this conversation there.
> > > >>>>>>
> > > >>>>>> [1] https://lists.apache.org/thread/7gjxto1rmkpff4kl54j8nlg5db2rqhkt
> > > >>>>>>
> > > >>>>>> Best,
> > > >>>>>> Alexander Fedulov
> > > >>>>>>
> > > >>>>>> On Tue, Jun 7, 2022 at 6:14 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
> > > >>>>>>
> > > >>>>>>>> @Austin, in the FLIP I mentioned above [1], the user is expected to pass a MapFunction<Long, OUT> to the generator. I wonder if you could have your external client and polling logic wrapped in a custom MapFunction implementation class? Would that answer your needs or do you have some more sophisticated scenario in mind?
> > > >>>>>>>
> > > >>>>>>> At first glance, the FLIP looks good for this case with regard to the map function, but it leaves out 1) the ability to control polling intervals and 2) the ability to produce an unknown number of records, both per poll and in terms of overall boundedness. Do you think something like this could be built from the same pieces?
> > > >>>>>>> I'm also wondering what handles threading: is that on the user, or is that part of the DataGeneratorSource?
> > > >>>>>>>
> > > >>>>>>> Best,
> > > >>>>>>> Austin
> > > >>>>>>>
> > > >>>>>>> On Tue, Jun 7, 2022 at 9:34 AM Alexander Fedulov <alexan...@ververica.com> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi everyone,
> > > >>>>>>>>
> > > >>>>>>>> Thanks for all the input and a lively discussion. It seems that there is a consensus that due to the inherent complexity of FLIP-27 sources we should provide more user-facing utilities to bridge the gap between the existing SourceFunction-based functionality and the new APIs.
> > > >>>>>>>>
> > > >>>>>>>> To start addressing this I picked the issue that David raised and many upvoted. Here is a proposal for the new DataGeneratorSource: FLIP-238 [1]. Please take a look; I am going to open a separate discussion thread on it shortly.
> > > >>>>>>>>
> > > >>>>>>>> Jing also raised some great points regarding the interfaces and subclasses. It seems to me that what might actually help is some sort of a "soft deprecation" concept and annotation. It could be used in places where we do not have an alternative implementation yet, but we clearly want to indicate that continuing to build on top of these interfaces is discouraged. The area of impact of deprecating all SourceFunction subclasses is rather big, and we can expect it to take a while. The hope would be that if in the meantime someone finds themselves using one of these old APIs, the "soft deprecation" annotation will be a clear indication and encouragement to work on introducing an alternative FLIP-27-based implementation instead.
> > > >>>>>>>>
> > > >>>>>>>> @Austin, in the FLIP I mentioned above [1], the user is expected to pass a MapFunction<Long, OUT> to the generator. I wonder if you could have your external client and polling logic wrapped in a custom MapFunction implementation class? Would that answer your needs or do you have some more sophisticated scenario in mind?
> > > >>>>>>>>
> > > >>>>>>>> [1] https://cwiki.apache.org/confluence/x/9Av1D
> > > >>>>>>>> Best,
> > > >>>>>>>> Alexander Fedulov
> > > >>>>>>>>
> > > >>>>>>>> On Mon, Jun 6, 2022 at 7:08 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Thanks for the nice discussion all.
> > > >>>>>>>>>
> > > >>>>>>>>> I was recently trying to implement a very simple polling source and would've loved a higher-level base to work from. I'm wondering if in addition to the data generator use cases, it would be good to support a simple non-parallel polling abstraction to make it easier to, for instance, start prototyping with data in existing APIs without adding a Kafka or such in the middle.
> > > >>>>>>>>>
> > > >>>>>>>>> Best,
> > > >>>>>>>>> Austin
> > > >>>>>>>>>
> > > >>>>>>>>> On Mon, Jun 6, 2022 at 10:02 AM tison <wander4...@gmail.com> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> Well. It's a bit off-topic. For deprecating SourceFunction as the FLIP-27 series of work goes ahead, +1 from my side. It's a significant step towards the unification of batch and streaming :)
> > > >>>>>>>>>>
> > > >>>>>>>>>> Best,
> > > >>>>>>>>>> tison.
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Mon, Jun 6, 2022 at 21:54, tison <wander4...@gmail.com> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> The starting point of the version bump and removal question is that downstream projects may have a tough time adapting to new interfaces while Flink stays on 1.x versions, so users may expect it to be an easy task. From my experience, it's really challenging to maintain compatibility between multiple versions of Flink when significant changes are made within the same 1.x version series; users may not be aware that it's almost a major version bump.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Best,
> > > >>>>>>>>>>> tison.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On Mon, Jun 6, 2022 at 21:51, tison <wander4...@gmail.com> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> One question from my side:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> As SourceFunction is a @Public interface, we cannot remove it before doing a major version bump (Flink 2.0).
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Of course it's not a blocker to make such a deprecation and let the new interface step in. My question is whether we have a plan to finally remove the deprecated interfaces, or whether we postpone it until there is a clear plan for Flink 2.0?
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Best,
> > > >>>>>>>>>>>> tison.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> On Mon, Jun 6, 2022 at 21:35, David Anderson <dander...@apache.org> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>>> David, can you elaborate why you need watermark generation in the source for your data generators?
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> The training exercises should strive to provide examples of best practices. If the exercises and their solutions use
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> env.fromSource(source, WatermarkStrategy.noWatermarks(), "name-of-source")
> > > >>>>>>>>>>>>>     .map(...)
> > > >>>>>>>>>>>>>     .assignTimestampsAndWatermarks(...)
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> this will help establish this anti-pattern as the normal way of doing things.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Most new Flink users are using a KafkaSource with a noWatermarks strategy and a SimpleStringSchema, followed by a map that does the real deserialization, followed by the real watermarking -- because they aren't seeing examples that teach how these interfaces are meant to be used.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> When we redo the sources used in training exercises, I want to avoid these pitfalls.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> David
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> On Mon, Jun 6, 2022 at 9:12 AM Konstantin Knauf <kna...@apache.org> wrote:
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Hi everyone,
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> very interesting thread. The proposal for deprecation seems to have sparked a very important discussion. Do we know what users struggle with specifically?
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Speaking for myself, when I upgraded flink-faker to the new Source API, an unbounded version of the NumberSequenceSource would have been all I needed, but that's just the data generator use case. I think that one could be solved quite easily. David, can you elaborate why you need watermark generation in the source for your data generators?
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Cheers,
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Konstantin
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> On Sun, Jun 5, 2022 at 17:48, Piotr Nowojski <pnowoj...@apache.org> wrote:
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Also +1 to what David has written. But it doesn't mean we should be waiting indefinitely to deprecate SourceFunction.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Best,
> > > >>>>>>>>>>>>>>> Piotrek
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> On Sun, Jun 5, 2022 at 16:46, Jark Wu <imj...@gmail.com> wrote:
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> +1 to David's point.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Usually, when we deprecate some interfaces, we should point users to the recommended alternatives. However, implementing the new Source interface for some simple scenarios is too challenging and complex. We also found it isn't easy to push the internal connector to upgrade to the new Source because "FLIP-27 are hard to understand, while SourceFunction is easy".
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> +1 to make implementing a simple Source easier before deprecating SourceFunction.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Best,
> > > >>>>>>>>>>>>>>>> Jark
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> On Sun, 5 Jun 2022 at 07:29, Jingsong Lee <lzljs3620...@apache.org> wrote:
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> +1 to David and Ingo.
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Before deprecating and removing SourceFunction, we should have some easier APIs to wrap the new Source; the cost of writing a new Source is too high now.
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> On Sun, Jun 5, 2022 at 05:32, Ingo Bürk <airbla...@apache.org> wrote:
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> I +1 everything David said. The new Source API raised the complexity significantly. It's great to have such a rich, powerful API that can do everything, but in the process we lost the ability to onboard people to the APIs.
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> Best
> > > >>>>>>>>>>>>>>>>>> Ingo
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> On 04.06.22 21:21, David Anderson wrote:
> > > >>>>>>>>>>>>>>>>>>> I'm in favor of this, but I think we need to make it easier to implement data generators and test sources. As things stand in 1.15, unless you can be satisfied with using a NumberSequenceSource followed by a map, things get quite complicated. I looked into reworking the data generators used in the training exercises, and got discouraged by the amount of work involved. (The sources used in the training want to be unbounded, and need watermarking in the sources, which means that using NumberSequenceSource isn't an option.)
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> I think the proposed deprecation will be better received if it can be accompanied by something that makes implementing a simple Source easier than it is now. People are continuing to implement new SourceFunctions because the interfaces defined by FLIP-27 are hard to understand, while SourceFunction is easy. Alex, I believe you were looking into implementing an easier-to-use building block that could be used in situations like this. Can we get something like that in place first?
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> David
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> On Fri, Jun 3, 2022 at 4:52 PM Jing Ge <j...@ververica.com> wrote:
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> Hi,
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> Thanks Alex for driving this!
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> +1. To give Flink developers, especially connector developers, a clear signal that the new Source API is recommended according to FLIP-27, we should mark the old interfaces as deprecated.
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> There are some open questions to discuss:
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> 1. Do we need to mark all subinterfaces/subclasses as deprecated, e.g. FromElementsFunction? There are many. What are the replacements?
> > > >>>>>>>>>>>>>>>>>>>> 2. Do we need to mark all subclasses that have a replacement as deprecated? E.g. ExternallyInducedSource, whose replacement class, if I am not mistaken, ExternallyInducedSourceReader, is @Experimental.
> > > >>>>>>>>>>>>>>>>>>>> 3. Do we need to mark all related test utility classes as deprecated?
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> I think it might make sense to create an umbrella ticket to cover all of these with the following process:
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> 1. Mark SourceFunction as deprecated asap.
> > > >>>>>>>>>>>>>>>>>>>> 2. Mark subinterfaces and subclasses as deprecated if there are graduated replacements. A good example is that KafkaSource replaced KafkaConsumer, which has been marked as deprecated.
> > > >>>>>>>>>>>>>>>>>>>> 3. Do not mark subinterfaces and subclasses as deprecated if replacement classes are still experimental; check if it is time to graduate them. After graduation, go to step 2. It might take a while for graduation.
> > > >>>>>>>>>>>>>>>>>>>> 4. Do not mark subinterfaces and subclasses as deprecated if the replacement classes are experimental and are too young to graduate. We have to wait. But in this case we could create new tickets under the umbrella ticket.
> > > >>>>>>>>>>>>>>>>>>>> 5. Do not mark subinterfaces and subclasses as deprecated if there is no replacement at all. We have to create new tickets and wait until the new implementation has been done and graduated. It will take a longer time, roughly 1.5 years.
> > > >>>>>>>>>>>>>>>>>>>> 6. For test classes, we could follow the same rule. But I think for some cases, we could consider doing the replacement directly without going through the deprecation phase.
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> When we look back on all of these, we can realize it is a big epic (even bigger than an epic). It needs someone to drive it and keep focus on it continuously, with support from the community, and push the development towards the new Source API of FLIP-27.
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> If we could have consensus for this, Alex and I could create the umbrella ticket to kick it off.
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> Best regards,
> > > >>>>>>>>>>>>>>>>>>>> Jing
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> On Fri, Jun 3, 2022 at 3:54 PM Alexander Fedulov <alexan...@ververica.com> wrote:
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> Hi everyone,
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> I would like to start the discussion about marking SourceFunction-based interfaces as deprecated. With the FLIP-27 APIs becoming the new standard, the old ones have to be eventually phased out. Although this state is well known within the community and no new connectors based on the old interfaces can be accepted into the project, the footprint of SourceFunction in the user code still keeps growing (primarily for data generators and test utilities). I believe it is best to mark SourceFunction as deprecated as soon as possible. What do you think?
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> Best,
> > > >>>>>>>>>>>>>>>>>>>>> Alexander Fedulov
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> --
> > > >>>>>>>>>>>>>> https://twitter.com/snntrable
> > > >>>>>>>>>>>>>> https://github.com/knaufk
>
> --
> https://twitter.com/snntrable
> https://github.com/knaufk