Hi everyone,

Thank you, Jing, for redirecting the discussion back to the topic at hand. I
agree with all of your points.

+1 to deprecating SourceFunction.

Is there really no replacement for the StreamExecutionEnvironment#readXXX
methods? There is already a FLIP-27-based FileSource, right? What is missing
to recommend using that instead of the readXXX methods?
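
If FileSource already covers these cases, the readXXX methods could be
deprecated with a pointer to it. Below is a minimal plain-Java sketch of that
pattern. It does not use Flink at all: LegacyEnv, readTextFile and readLines
are hypothetical stand-ins made up for illustration, not Flink's actual
classes or signatures.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;

public class DeprecationSketch {

    /** Hypothetical stand-in for an environment class; NOT Flink's real API. */
    public static class LegacyEnv {

        /**
         * @deprecated Based on the legacy source API. Use {@link #readLines(String)},
         *     the stand-in for a replacement built on the new Source API.
         */
        @Deprecated
        public List<String> readTextFile(String content) {
            // The old entry point keeps working by delegating to the replacement.
            return readLines(content);
        }

        /** Stand-in for the replacement: splits the given text into lines. */
        public List<String> readLines(String content) {
            return Arrays.asList(content.split("\n"));
        }
    }

    /** Returns true if the named single-String-argument method carries @Deprecated. */
    public static boolean isDeprecated(String methodName) {
        try {
            Method m = LegacyEnv.class.getMethod(methodName, String.class);
            return m.isAnnotationPresent(Deprecated.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException("No such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("readTextFile deprecated: " + isDeprecated("readTextFile"));
        System.out.println("readLines deprecated: " + isDeprecated("readLines"));
    }
}
```

With this shape, existing callers keep working and only see a deprecation
warning plus a pointer to the alternative.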

Cheers,

Konstantin

On Thu, Jun 9, 2022 at 8:11 PM Alexander Fedulov <
alexan...@ververica.com> wrote:

> Hi all,
>
> It seems that there is some understandable caution with regard to
> deprecating methods and subclasses that do not have alternatives just yet.
>
> We should probably first agree on whether it is in general OK for Flink to
> use the @Deprecated annotation for parts of the code that do not have
> alternatives. In that case, we could add a comment along the lines of:
> "This implementation is based on a deprecated SourceFunction API that
> will gradually be phased out from Flink. No direct substitute exists at the
> moment. If you want to have a more future-proof solution, consider helping
> the project by contributing an implementation based on the new Source API."
>
> This should clearly communicate the message that usage of these
> methods/classes is discouraged and at the same time promote contributions
> for addressing the gap.
> What do you think?
>
> Best,
> Alexander Fedulov
>
>
> On Thu, Jun 9, 2022 at 6:27 PM Ingo Bürk <airbla...@apache.org> wrote:
>
> > Hi,
> >
> > these APIs don't expose the underlying source directly, so I don't think
> > we need to worry about deprecating them as well. There's also nothing
> > inherently wrong with using a deprecated API internally, though even
> > just for the experience of using our own new APIs I would personally say
> > that they should be migrated to the new Source API. It's hard to reason
> > that users must migrate to a new API if we don't do it internally as
> > well.
> >
> >
> > Best
> > Ingo
> >
> > On 09.06.22 15:41, Lijie Wang wrote:
> > >   Hi Martijn,
> > >
> > > > I don't mean it's a blocker, just a piece of information. I'm also +1
> > > > for this.
> > > >
> > > > To put it another way: should we migrate `#readFile(...)` to the new
> > > > API, or provide a similar "readXXX" method based on the new Source API?
> > > >
> > > > And if we don't migrate it, does it mean that `#readFile(...)` should
> > > > also be marked as deprecated?
> > >
> > > Best,
> > > Lijie
> > >
> > > > On Thu, Jun 9, 2022 at 9:03 PM Martijn Visser <martijnvis...@apache.org> wrote:
> > >
> > >> Hi Lijie,
> > >>
> > > >> I don't see any problem with deprecating those methods at this moment,
> > > >> as long as we don't remove them until the replacements are available.
> > > >> Besides that, are we sure there are no replacements already, especially
> > > >> with the new FileSource?
> > >>
> > >> Best regards,
> > >>
> > >> Martijn
> > >>
> > > >> On Thu, Jun 9, 2022 at 2:23 PM Lijie Wang <wangdachui9...@gmail.com> wrote:
> > >>
> > >>> Hi all,
> > >>>
> > > >>> FYI, currently, some commonly used methods in StreamExecutionEnvironment
> > > >>> are still based on the old SourceFunction (and there is no alternative):
> > > >>> `StreamExecutionEnvironment#readFile(...)`
> > > >>> `StreamExecutionEnvironment#readTextFile(...)`
> > > >>>
> > > >>> I think these should be migrated to the new Source API before
> > > >>> deprecating the SourceFunction.
> > >>>
> > >>> Best,
> > >>> Lijie
> > >>>
> > > >>> On Thu, Jun 9, 2022 at 4:05 PM Martijn Visser <martijnvis...@apache.org> wrote:
> > >>>
> > >>>> Hi all,
> > >>>>
> > > >>>> I think implicitly we've already considered the SourceFunction and
> > > >>>> SinkFunction as deprecated. They are even marked as such on the Flink
> > > >>>> roadmap [1]. That also shows that connectors that are using these
> > > >>>> interfaces are either approaching end-of-life. The fact that we're
> > > >>>> actively migrating connectors from Source/SinkFunction to
> > > >>>> FLIP-27/FLIP-143 (plus add-on FLIPs) shows that we've already
> > > >>>> determined that target.
> > > >>>>
> > > >>>> With regard to the motivation of FLIP-27, I think reading up on the
> > > >>>> original discussion thread [2] is also worthwhile to see more context.
> > > >>>> FLIP-27 was also very important as it brought a unified connector which
> > > >>>> can support both streaming and batch (with batch being considered a
> > > >>>> special case of streaming in Flink's vision).
> > > >>>>
> > > >>>> So +1 to deprecate SourceFunction. I would also argue that we should
> > > >>>> already mark the SinkFunction as deprecated to avoid having this
> > > >>>> discussion again in a couple of months.
> > >>>>
> > >>>> Best regards,
> > >>>>
> > >>>> Martijn
> > >>>>
> > >>>> [1] https://flink.apache.org/roadmap.html
> > >>>> [2]
> https://lists.apache.org/thread/334co89dbhc8qpr9nvmz8t1gp4sz2c8y
> > >>>>
> > > >>>> On Thu, Jun 9, 2022 at 9:48 AM Jing Ge <j...@ververica.com> wrote:
> > >>>>
> > >>>>> Hi,
> > >>>>>
> > > >>>>> I am very happy to see opinions from different perspectives. That will
> > > >>>>> help us understand the problem better. Thanks all for the informative
> > > >>>>> discussion.
> > > >>>>>
> > > >>>>> Let's look at the big picture and check the following facts together:
> > >>>>>
> > > >>>>> 1. FLIP-27 was intended to solve some technical issues that are very
> > > >>>>> difficult to solve with SourceFunction [1]. When we say "SourceFunction
> > > >>>>> is easy", well, it depends. If we take a look at the implementation of
> > > >>>>> the Kafka connector, we will see how complicated it is to build a
> > > >>>>> serious connector for production with the old SourceFunction. To every
> > > >>>>> problem there is a solution and to every solution there is a problem.
> > > >>>>> The fact is that there is no perfect solution, only a feasible one. If
> > > >>>>> we try to solve complicated problems, we have to expose some
> > > >>>>> complexity. Compared to connectors for POCs, demos, and training (no
> > > >>>>> offense), I would give higher priority to solving issues for connectors
> > > >>>>> like the Kafka connector that are widely used in production. I think
> > > >>>>> that should be one reason why FLIP-27 was designed and why the new
> > > >>>>> Source API went public.
> > >>>>>
> > > >>>>> 2. FLIP-27 and its implementation were introduced roughly at the end of
> > > >>>>> 2019 and went public on 19.04.2021, which means Flink has provided two
> > > >>>>> different public/graduated source solutions for more than one year. On
> > > >>>>> the day that the new Source API went public, there should have been a
> > > >>>>> consensus in the community that we should start the migration. The old
> > > >>>>> SourceFunction interface, in the ideal case, should have been
> > > >>>>> deprecated on that day; otherwise we should not have graduated the new
> > > >>>>> Source API, to avoid confusing (connector) developers [2].
> > >>>>>
> > > >>>>> 3. It is true that the new Source API is hard to understand and even
> > > >>>>> hard to implement for simple cases. Thanks for the feedback. That is
> > > >>>>> something we need to improve. The current design and implementation
> > > >>>>> could be considered the low-level API. The next step is to create a
> > > >>>>> high-level API to reduce some unnecessary complexity for those simple
> > > >>>>> cases. But, IMHO, this should not be a reason to postpone the
> > > >>>>> deprecation of the old SourceFunction APIs.
> > >>>>>
> > > >>>>> 4. As long as the old SourceFunction is not marked as deprecated,
> > > >>>>> developers will continue asking which one should be used. Let's take a
> > > >>>>> concrete example. If a new connector is being developed now and the
> > > >>>>> developer asks on the ML for a suggestion on whether to use the old or
> > > >>>>> the new source API, which one should we suggest? I think it should be
> > > >>>>> the new Source API. If a brand-new connector has been developed with
> > > >>>>> the old SourceFunction API before asking for consensus in the
> > > >>>>> community, and the developer wants to merge it to master, should we
> > > >>>>> allow it? If the answers to all these questions point to the new Source
> > > >>>>> API, the old SourceFunction is de facto already deprecated; it just has
> > > >>>>> not been marked as @Deprecated, which confuses developers even more.
> > >>>>>
> > >>>>>   Best regards,
> > >>>>> Jing
> > >>>>>
> > >>>>> [1]
> > >>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
> > >>>>> [2]
> https://lists.apache.org/thread/7okp4y46n3o3rx5mn0t3qobrof8zxwqs
> > >>>>>
> > > >>>>> On Wed, Jun 8, 2022 at 2:21 AM Alexander Fedulov <
> > > >>>>> alexan...@ververica.com> wrote:
> > >>>>>
> > >>>>>> Hey Austin,
> > >>>>>>
> > > >>>>>> Since we are getting deeper into the implementation details of the
> > > >>>>>> DataGeneratorSource and it is not the main topic of this thread, I
> > > >>>>>> propose to move our discussion to where it belongs: [DISCUSS] FLIP-238
> > > >>>>>> [1]. Could you please briefly formulate your requirements to make it
> > > >>>>>> easier for the others to follow? I am happy to continue this
> > > >>>>>> conversation there.
> > > >>>>>>
> > > >>>>>> [1] https://lists.apache.org/thread/7gjxto1rmkpff4kl54j8nlg5db2rqhkt
> > >>>>>>
> > >>>>>> Best,
> > >>>>>> Alexander Fedulov
> > >>>>>>
> > >>>>>> On Tue, Jun 7, 2022 at 6:14 PM Austin Cawley-Edwards <
> > >>>>>> austin.caw...@gmail.com> wrote:
> > >>>>>>
> > > >>>>>>>> @Austin, in the FLIP I mentioned above [1], the user is expected to
> > > >>>>>>>> pass a MapFunction<Long, OUT> to the generator. I wonder if you could
> > > >>>>>>>> have your external client and polling logic wrapped in a custom
> > > >>>>>>>> MapFunction implementation class? Would that answer your needs or do
> > > >>>>>>>> you have some more sophisticated scenario in mind?
> > >>>>>>>
> > > >>>>>>> At first glance, the FLIP looks good for this case with regard to the
> > > >>>>>>> map function, but it leaves out 1) the ability to control polling
> > > >>>>>>> intervals and 2) the ability to produce an unknown number of records,
> > > >>>>>>> both per poll and in terms of overall boundedness. Do you think
> > > >>>>>>> something like this could be built from the same pieces?
> > > >>>>>>> I'm also wondering what handles threading: is that on the user, or is
> > > >>>>>>> that part of the DataGeneratorSource?
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>> Austin
> > >>>>>>>
> > > >>>>>>> On Tue, Jun 7, 2022 at 9:34 AM Alexander Fedulov <
> > > >>>>>>> alexan...@ververica.com> wrote:
> > >>>>>>>
> > >>>>>>>> Hi everyone,
> > >>>>>>>>
> > > >>>>>>>> Thanks for all the input and a lively discussion. It seems that there
> > > >>>>>>>> is a consensus that, due to the inherent complexity of FLIP-27
> > > >>>>>>>> sources, we should provide more user-facing utilities to bridge the
> > > >>>>>>>> gap between the existing SourceFunction-based functionality and the
> > > >>>>>>>> new APIs.
> > > >>>>>>>>
> > > >>>>>>>> To start addressing this I picked the issue that David raised and
> > > >>>>>>>> many upvoted. Here is a proposal for the new DataGeneratorSource:
> > > >>>>>>>> FLIP-238 [1]. Please take a look; I am going to open a separate
> > > >>>>>>>> discussion thread on it shortly.
> > >>>>>>>>
> > > >>>>>>>> Jing also raised some great points regarding the interfaces and
> > > >>>>>>>> subclasses. It seems to me that what might actually help is some sort
> > > >>>>>>>> of a "soft deprecation" concept and annotation. It could be used in
> > > >>>>>>>> places where we do not have an alternative implementation yet, but we
> > > >>>>>>>> clearly want to indicate that continuing to build on top of these
> > > >>>>>>>> interfaces is discouraged. The area of impact of deprecating all
> > > >>>>>>>> SourceFunction subclasses is rather big, and we can expect it to take
> > > >>>>>>>> a while. The hope would be that if in the meantime someone finds
> > > >>>>>>>> themselves using one of these old APIs, the "soft deprecation"
> > > >>>>>>>> annotation will be a clear indication and encouragement to work on
> > > >>>>>>>> introducing an alternative FLIP-27-based implementation instead.
> > >>>>>>>>
> > > >>>>>>>> @Austin, in the FLIP I mentioned above [1], the user is expected to
> > > >>>>>>>> pass a MapFunction<Long, OUT> to the generator. I wonder if you could
> > > >>>>>>>> have your external client and polling logic wrapped in a custom
> > > >>>>>>>> MapFunction implementation class? Would that answer your needs or do
> > > >>>>>>>> you have some more sophisticated scenario in mind?
> > >>>>>>>>
> > >>>>>>>> [1] https://cwiki.apache.org/confluence/x/9Av1D
> > >>>>>>>> Best,
> > >>>>>>>> Alexander Fedulov
> > >>>>>>>>
> > >>>>>>>> On Mon, Jun 6, 2022 at 7:08 PM Austin Cawley-Edwards <
> > >>>>>>>> austin.caw...@gmail.com> wrote:
> > >>>>>>>>
> > >>>>>>>>> Thanks for the nice discussion all.
> > >>>>>>>>>
> > > >>>>>>>>> I was recently trying to implement a very simple polling source and
> > > >>>>>>>>> would've loved a higher-level base to work from. I'm wondering if, in
> > > >>>>>>>>> addition to the data generator use cases, it would be good to support
> > > >>>>>>>>> a simple non-parallel polling abstraction to make it easier to, for
> > > >>>>>>>>> instance, start prototyping with data in existing APIs without adding
> > > >>>>>>>>> a Kafka or such in the middle.
> > >>>>>>>>>
> > >>>>>>>>> Best,
> > >>>>>>>>> Austin
> > >>>>>>>>>
> > >>>>>>>>> On Mon, Jun 6, 2022 at 10:02 AM tison <wander4...@gmail.com>
> > >>>>> wrote:
> > >>>>>>>>>
> > > >>>>>>>>>> Well, it's a bit off-topic. For deprecating SourceFunction as the
> > > >>>>>>>>>> FLIP-27 series of work goes ahead, +1 from my side. It's significant
> > > >>>>>>>>>> work towards the unification of batch and streaming :)
> > >>>>>>>>>>
> > >>>>>>>>>> Best,
> > >>>>>>>>>> tison.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > > >>>>>>>>>> On Mon, Jun 6, 2022 at 9:54 PM tison <wander4...@gmail.com> wrote:
> > >>>>>>>>>>
> > > >>>>>>>>>>> The starting point of the version bump and removal question is that
> > > >>>>>>>>>>> downstream projects may have a tough time adapting to new
> > > >>>>>>>>>>> interfaces while Flink stays on 1.x versions, so users may expect
> > > >>>>>>>>>>> it to be an easy task. From my experience, it's really challenging
> > > >>>>>>>>>>> to maintain compatibility between multiple versions of Flink when
> > > >>>>>>>>>>> significant changes are made within the same 1.x version series -
> > > >>>>>>>>>>> users may not be aware that it's almost a major version bump.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Best,
> > >>>>>>>>>>> tison.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > > >>>>>>>>>>> On Mon, Jun 6, 2022 at 9:51 PM tison <wander4...@gmail.com> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> One question from my side:
> > >>>>>>>>>>>>
> > > >>>>>>>>>>>> As SourceFunction is a @Public interface, we cannot remove it
> > > >>>>>>>>>>>> before doing a major version bump (Flink 2.0).
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Of course that's not a blocker for making such a deprecation and
> > > >>>>>>>>>>>> letting the new interface step in. My question is whether we have
> > > >>>>>>>>>>>> a plan to finally remove the deprecated interfaces, or whether we
> > > >>>>>>>>>>>> postpone it until there is a clear plan for Flink 2.0?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Best,
> > >>>>>>>>>>>> tison.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > > >>>>>>>>>>>> On Mon, Jun 6, 2022 at 9:35 PM David Anderson <dander...@apache.org> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> David, can you elaborate why you need watermark generation in the
> > > >>>>>>>>>>>>>> source for your data generators?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> The training exercises should strive to provide examples of best
> > > >>>>>>>>>>>>> practices. If the exercises and their solutions use
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> env.fromSource(source, WatermarkStrategy.noWatermarks(), "name-of-source")
> > > >>>>>>>>>>>>>    .map(...)
> > > >>>>>>>>>>>>>    .assignTimestampsAndWatermarks(...)
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> this will help establish this anti-pattern as the normal way of
> > > >>>>>>>>>>>>> doing things.
> > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Most new Flink users are using a KafkaSource with a noWatermarks
> > > >>>>>>>>>>>>> strategy and a SimpleStringSchema, followed by a map that does the
> > > >>>>>>>>>>>>> real deserialization, followed by the real watermarking -- because
> > > >>>>>>>>>>>>> they aren't seeing examples that teach how these interfaces are
> > > >>>>>>>>>>>>> meant to be used.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> When we redo the sources used in training exercises, I want to
> > > >>>>>>>>>>>>> avoid these pitfalls.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> David
> > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> On Mon, Jun 6, 2022 at 9:12 AM Konstantin Knauf <kna...@apache.org>
> > > >>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Hi everyone,
> > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> very interesting thread. The proposal for deprecation seems to
> > > >>>>>>>>>>>>>> have sparked a very important discussion. Do we know what users
> > > >>>>>>>>>>>>>> struggle with specifically?
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Speaking for myself, when I upgraded flink-faker to the new
> > > >>>>>>>>>>>>>> Source API, an unbounded version of the NumberSequenceSource
> > > >>>>>>>>>>>>>> would have been all I needed, but that's just the data generator
> > > >>>>>>>>>>>>>> use case. I think that one could be solved quite easily. David,
> > > >>>>>>>>>>>>>> can you elaborate why you need watermark generation in the
> > > >>>>>>>>>>>>>> source for your data generators?
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Cheers,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Konstantin
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> On Sun, Jun 5, 2022 at 5:48 PM Piotr Nowojski <
> > > >>>>>>>>>>>>>> pnowoj...@apache.org> wrote:
> > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Also +1 to what David has written. But it doesn't mean we should
> > > >>>>>>>>>>>>>>> be waiting indefinitely to deprecate SourceFunction.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>> Piotrek
> > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> On Sun, Jun 5, 2022 at 4:46 PM Jark Wu <imj...@gmail.com> wrote:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> +1 to David's point.
> > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Usually, when we deprecate some interfaces, we should point
> > > >>>>>>>>>>>>>>>> users to the recommended alternatives.
> > > >>>>>>>>>>>>>>>> However, implementing the new Source interface for some simple
> > > >>>>>>>>>>>>>>>> scenarios is too challenging and complex.
> > > >>>>>>>>>>>>>>>> We also found it isn't easy to push the internal connector to
> > > >>>>>>>>>>>>>>>> upgrade to the new Source because
> > > >>>>>>>>>>>>>>>> "FLIP-27 is hard to understand, while SourceFunction is easy".
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> +1 to make implementing a simple Source easier before
> > > >>>>>>>>>>>>>>>> deprecating SourceFunction.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>> Jark
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> On Sun, 5 Jun 2022 at 07:29, Jingsong Lee <
> > > >>>>>>>>>>>>>>>> lzljs3620...@apache.org> wrote:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> +1 to David and Ingo.
> > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Before deprecating and removing SourceFunction, we should have
> > > >>>>>>>>>>>>>>>>> some easier APIs to wrap the new Source; the cost of writing a
> > > >>>>>>>>>>>>>>>>> new Source is too high now.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> On Sun, Jun 5, 2022 at 5:32 AM Ingo Bürk <airbla...@apache.org> wrote:
> > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> I +1 everything David said. The new Source API raised the
> > > >>>>>>>>>>>>>>>>>> complexity significantly. It's great to have such a rich,
> > > >>>>>>>>>>>>>>>>>> powerful API that can do everything, but in the process we
> > > >>>>>>>>>>>>>>>>>> lost the ability to onboard people to the APIs.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Best
> > >>>>>>>>>>>>>>>>>> Ingo
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> On 04.06.22 21:21, David Anderson wrote:
> > > >>>>>>>>>>>>>>>>>>> I'm in favor of this, but I think we need to make it easier
> > > >>>>>>>>>>>>>>>>>>> to implement data generators and test sources. As things
> > > >>>>>>>>>>>>>>>>>>> stand in 1.15, unless you can be satisfied with using a
> > > >>>>>>>>>>>>>>>>>>> NumberSequenceSource followed by a map, things get quite
> > > >>>>>>>>>>>>>>>>>>> complicated. I looked into reworking the data generators
> > > >>>>>>>>>>>>>>>>>>> used in the training exercises, and got discouraged by the
> > > >>>>>>>>>>>>>>>>>>> amount of work involved. (The sources used in the training
> > > >>>>>>>>>>>>>>>>>>> want to be unbounded, and need watermarking in the sources,
> > > >>>>>>>>>>>>>>>>>>> which means that using NumberSequenceSource isn't an
> > > >>>>>>>>>>>>>>>>>>> option.)
> > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> I think the proposed deprecation will be better received if
> > > >>>>>>>>>>>>>>>>>>> it can be accompanied by something that makes implementing a
> > > >>>>>>>>>>>>>>>>>>> simple Source easier than it is now. People are continuing
> > > >>>>>>>>>>>>>>>>>>> to implement new SourceFunctions because the interfaces
> > > >>>>>>>>>>>>>>>>>>> defined by FLIP-27 are hard to understand, while
> > > >>>>>>>>>>>>>>>>>>> SourceFunction is easy. Alex, I believe you were looking
> > > >>>>>>>>>>>>>>>>>>> into implementing an easier-to-use building block that could
> > > >>>>>>>>>>>>>>>>>>> be used in situations like this. Can we get something like
> > > >>>>>>>>>>>>>>>>>>> that in place first?
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> David
> > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> On Fri, Jun 3, 2022 at 4:52 PM Jing Ge <j...@ververica.com> wrote:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Hi,
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Thanks Alex for driving this!
> > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> +1. To give Flink developers, especially connector
> > > >>>>>>>>>>>>>>>>>>>> developers, the clear signal that the new Source API is
> > > >>>>>>>>>>>>>>>>>>>> recommended according to FLIP-27, we should mark them as
> > > >>>>>>>>>>>>>>>>>>>> deprecated.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> There are some open questions to discuss:
> > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> 1. Do we need to mark all subinterfaces/subclasses as
> > > >>>>>>>>>>>>>>>>>>>> deprecated, e.g. FromElementsFunction, etc.? There are
> > > >>>>>>>>>>>>>>>>>>>> many. What are the replacements?
> > > >>>>>>>>>>>>>>>>>>>> 2. Do we need to mark all subclasses that have a
> > > >>>>>>>>>>>>>>>>>>>> replacement as deprecated, e.g. ExternallyInducedSource,
> > > >>>>>>>>>>>>>>>>>>>> whose replacement class, if I am not mistaken,
> > > >>>>>>>>>>>>>>>>>>>> ExternallyInducedSourceReader, is @Experimental?
> > > >>>>>>>>>>>>>>>>>>>> 3. Do we need to mark all related test utility classes as
> > > >>>>>>>>>>>>>>>>>>>> deprecated?
> > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> I think it might make sense to create an umbrella ticket
> > > >>>>>>>>>>>>>>>>>>>> to cover all of these with the following process:
> > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> 1. Mark SourceFunction as deprecated asap.
> > > >>>>>>>>>>>>>>>>>>>> 2. Mark subinterfaces and subclasses as deprecated if
> > > >>>>>>>>>>>>>>>>>>>> there are graduated replacements. A good example is
> > > >>>>>>>>>>>>>>>>>>>> KafkaSource, which replaced KafkaConsumer, which has been
> > > >>>>>>>>>>>>>>>>>>>> marked as deprecated.
> > > >>>>>>>>>>>>>>>>>>>> 3. Do not mark subinterfaces and subclasses as deprecated
> > > >>>>>>>>>>>>>>>>>>>> if the replacement classes are still experimental; check
> > > >>>>>>>>>>>>>>>>>>>> if it is time to graduate them. After graduation, go to
> > > >>>>>>>>>>>>>>>>>>>> step 2. It might take a while for graduation.
> > > >>>>>>>>>>>>>>>>>>>> 4. Do not mark subinterfaces and subclasses as deprecated
> > > >>>>>>>>>>>>>>>>>>>> if the replacement classes are experimental and are too
> > > >>>>>>>>>>>>>>>>>>>> young to graduate. We have to wait. But in this case we
> > > >>>>>>>>>>>>>>>>>>>> could create new tickets under the umbrella ticket.
> > > >>>>>>>>>>>>>>>>>>>> 5. Do not mark subinterfaces and subclasses as deprecated
> > > >>>>>>>>>>>>>>>>>>>> if there is no replacement at all. We have to create new
> > > >>>>>>>>>>>>>>>>>>>> tickets and wait until the new implementation has been
> > > >>>>>>>>>>>>>>>>>>>> done and graduated. It will take a longer time, roughly
> > > >>>>>>>>>>>>>>>>>>>> 1.5 years.
> > > >>>>>>>>>>>>>>>>>>>> 6. For test classes, we could follow the same rule. But I
> > > >>>>>>>>>>>>>>>>>>>> think for some cases, we could consider doing the
> > > >>>>>>>>>>>>>>>>>>>> replacement directly without going through the
> > > >>>>>>>>>>>>>>>>>>>> deprecation phase.
> > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> When we look back on all of these, we can see that it is
> > > >>>>>>>>>>>>>>>>>>>> a big epic (even bigger than an epic). It needs someone
> > > >>>>>>>>>>>>>>>>>>>> to drive it and keep focusing on it continuously, with
> > > >>>>>>>>>>>>>>>>>>>> support from the community, and push the development
> > > >>>>>>>>>>>>>>>>>>>> towards the new Source API of FLIP-27.
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> If we could have consensus on this, Alex and I could
> > > >>>>>>>>>>>>>>>>>>>> create the umbrella ticket to kick it off.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Best regards,
> > >>>>>>>>>>>>>>>>>>>> Jing
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> On Fri, Jun 3, 2022 at 3:54 PM Alexander Fedulov <
> > > >>>>>>>>>>>>>>>>>>>> alexan...@ververica.com> wrote:
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Hi everyone,
> > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> I would like to start the discussion about marking
> > > >>>>>>>>>>>>>>>>>>>>> SourceFunction-based interfaces as deprecated. With the
> > > >>>>>>>>>>>>>>>>>>>>> FLIP-27 APIs becoming the new standard, the old ones
> > > >>>>>>>>>>>>>>>>>>>>> have to be eventually phased out. Although this state is
> > > >>>>>>>>>>>>>>>>>>>>> well known within the community and no new connectors
> > > >>>>>>>>>>>>>>>>>>>>> based on the old interfaces can be accepted into the
> > > >>>>>>>>>>>>>>>>>>>>> project, the footprint of SourceFunction in the user
> > > >>>>>>>>>>>>>>>>>>>>> code still keeps growing (primarily for data generators
> > > >>>>>>>>>>>>>>>>>>>>> and test utilities). I believe it is best to mark
> > > >>>>>>>>>>>>>>>>>>>>> SourceFunction as deprecated as soon as possible. What
> > > >>>>>>>>>>>>>>>>>>>>> do you think?
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>>>>> Alexander Fedulov
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>> https://twitter.com/snntrable
> > >>>>>>>>>>>>>> https://github.com/knaufk
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> > >
> >
>


-- 
https://twitter.com/snntrable
https://github.com/knaufk
