Thanks for the nice discussion all.

I was recently trying to implement a very simple polling source and
would've loved a higher-level base to work from. I'm wondering if in
addition to the data generator use cases, it would be good to support a
simple non-parallel polling abstraction to make it easier to, for instance,
start prototyping with data from existing APIs without adding Kafka or a
similar system in the middle.
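
A minimal sketch of what such a non-parallel polling abstraction might look
like, written in plain Java with no Flink dependency. The names SimplePoller,
fetch, emit and maxPolls are hypothetical and not part of any Flink API; the
idea is only that the user supplies one "fetch a batch" call and the base
class owns the loop, the pause between polls, and cancellation.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical single-threaded polling helper; not an existing Flink class. */
final class SimplePoller<T> {
    private final Supplier<List<T>> fetch;  // one call to the external API
    private final long intervalMillis;      // pause between two polls
    private volatile boolean running = true;

    SimplePoller(Supplier<List<T>> fetch, long intervalMillis) {
        this.fetch = fetch;
        this.intervalMillis = intervalMillis;
    }

    /** Polls until cancel() is called or maxPolls batches have been fetched. */
    void run(Consumer<T> emit, int maxPolls) {
        for (int i = 0; running && i < maxPolls; i++) {
            for (T record : fetch.get()) {
                emit.accept(record);  // hand each record downstream
            }
            if (running && i + 1 < maxPolls) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    // Treat an interrupt as a request to stop polling.
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    /** Asks the loop to stop after the batch currently being emitted. */
    void cancel() {
        running = false;
    }
}
```

A caller would construct it with something like
new SimplePoller<>(() -> client.fetchNewRecords(), 1000L), where client is
whatever existing API is being prototyped against (also hypothetical here). A
real building block would additionally need checkpointed position tracking and
watermark support, which this sketch deliberately leaves out.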

Best,
Austin

On Mon, Jun 6, 2022 at 10:02 AM tison <wander4...@gmail.com> wrote:

> Well, it's a bit off-topic. As for deprecating SourceFunction as the FLIP-27
> series of work moves ahead, +1 from my side. It's a significant step towards
> the unification of batch and streaming :)
>
> Best,
> tison.
>
>
> tison <wander4...@gmail.com> wrote on Mon, Jun 6, 2022 at 21:54:
>
> > The starting point of the version bump and removal question is that
> > downstream projects may have a tough time adapting to new interfaces
> > while Flink stays on 1.x versions, because users may expect it to be an
> > easy task. From my experience, it's really challenging to maintain
> > compatibility between multiple versions of Flink when significant
> > changes are made within the same 1.x version series - users may not be
> > aware that it's almost a major version bump.
> >
> > Best,
> > tison.
> >
> >
> > tison <wander4...@gmail.com> wrote on Mon, Jun 6, 2022 at 21:51:
> >
> >> One question from my side:
> >>
> >> As SourceFunction is a @Public interface, we cannot remove it before
> >> doing a major version bump (Flink 2.0).
> >>
> >> Of course it's not a blocker to make such a deprecation and let the new
> >> interface step in. My question is whether we have a plan to finally
> >> remove the deprecated interfaces, or whether we postpone that until
> >> there is a clear plan for Flink 2.0?
> >>
> >> Best,
> >> tison.
> >>
> >>
> >> David Anderson <dander...@apache.org> wrote on Mon, Jun 6, 2022 at 21:35:
> >>
> >>> >
> >>> > David, can you elaborate why you need watermark generation in the
> >>> > source for your data generators?
> >>>
> >>>
> >>> The training exercises should strive to provide examples of best
> >>> practices. If the exercises and their solutions use
> >>>
> >>> env.fromSource(source, WatermarkStrategy.noWatermarks(),
> >>> "name-of-source")
> >>>   .map(...)
> >>>   .assignTimestampsAndWatermarks(...)
> >>>
> >>> this will help establish this anti-pattern as the normal way of doing
> >>> things.
> >>>
> >>> Most new Flink users are using a KafkaSource with a noWatermarks
> >>> strategy and a SimpleStringSchema, followed by a map that does the
> >>> real deserialization, followed by the real watermarking -- because
> >>> they aren't seeing examples that teach how these interfaces are meant
> >>> to be used.
> >>>
> >>> When we redo the sources used in training exercises, I want to avoid
> >>> these pitfalls.
> >>>
> >>> David
> >>>
> >>> On Mon, Jun 6, 2022 at 9:12 AM Konstantin Knauf <kna...@apache.org>
> >>> wrote:
> >>>
> >>> > Hi everyone,
> >>> >
> >>> > very interesting thread. The proposal for deprecation seems to have
> >>> > sparked a very important discussion. Do we know what users struggle
> >>> > with specifically?
> >>> >
> >>> > Speaking for myself, when I upgraded flink-faker to the new Source
> >>> > API, an unbounded version of the NumberSequenceSource would have been
> >>> > all I needed, but that's just the data generator use case. I think
> >>> > that one could be solved quite easily. David, can you elaborate why
> >>> > you need watermark generation in the source for your data generators?
> >>> >
> >>> > Cheers,
> >>> >
> >>> > Konstantin
> >>> >
> >>> >
> >>> >
> >>> >
> >>> >
> >>> > On Sun, Jun 5, 2022 at 17:48, Piotr Nowojski <
> >>> > pnowoj...@apache.org> wrote:
> >>> >
> >>> > > Also +1 to what David has written. But it doesn't mean we should be
> >>> > > waiting indefinitely to deprecate SourceFunction.
> >>> > >
> >>> > > Best,
> >>> > > Piotrek
> >>> > >
> >>> > > On Sun, Jun 5, 2022 at 16:46, Jark Wu <imj...@gmail.com> wrote:
> >>> > >
> >>> > > > +1 to David's point.
> >>> > > >
> >>> > > > Usually, when we deprecate some interfaces, we should point users
> >>> > > > to the recommended alternatives.
> >>> > > > However, implementing the new Source interface for some simple
> >>> > > > scenarios is too challenging and complex.
> >>> > > > We also found it isn't easy to push the internal connector to
> >>> > > > upgrade to the new Source because
> >>> > > > "FLIP-27 are hard to understand, while SourceFunction is easy".
> >>> > > >
> >>> > > > +1 to make implementing a simple Source easier before deprecating
> >>> > > > SourceFunction.
> >>> > > >
> >>> > > > Best,
> >>> > > > Jark
> >>> > > >
> >>> > > >
> >>> > > > On Sun, 5 Jun 2022 at 07:29, Jingsong Lee <lzljs3620...@apache.org>
> >>> > > > wrote:
> >>> > > >
> >>> > > > > +1 to David and Ingo.
> >>> > > > >
> >>> > > > > Before deprecating and removing SourceFunction, we should have
> >>> > > > > some easier APIs to wrap the new Source; the cost of writing a
> >>> > > > > new Source is too high now.
> >>> > > > >
> >>> > > > >
> >>> > > > >
> >>> > > > > Ingo Bürk <airbla...@apache.org> wrote on Sun, Jun 5, 2022 at 05:32:
> >>> > > > >
> >>> > > > > > I +1 everything David said. The new Source API raised the
> >>> > > > > > complexity significantly. It's great to have such a rich,
> >>> > > > > > powerful API that can do everything, but in the process we lost
> >>> > > > > > the ability to onboard people to the APIs.
> >>> > > > > >
> >>> > > > > >
> >>> > > > > > Best
> >>> > > > > > Ingo
> >>> > > > > >
> >>> > > > > > On 04.06.22 21:21, David Anderson wrote:
> >>> > > > > > > I'm in favor of this, but I think we need to make it easier to
> >>> > > > > > > implement data generators and test sources. As things stand in
> >>> > > > > > > 1.15, unless you can be satisfied with using a
> >>> > > > > > > NumberSequenceSource followed by a map, things get quite
> >>> > > > > > > complicated. I looked into reworking the data generators used
> >>> > > > > > > in the training exercises, and got discouraged by the amount of
> >>> > > > > > > work involved. (The sources used in the training want to be
> >>> > > > > > > unbounded, and need watermarking in the sources, which means
> >>> > > > > > > that using NumberSequenceSource isn't an option.)
> >>> > > > > > >
> >>> > > > > > > I think the proposed deprecation will be better received if it
> >>> > > > > > > can be accompanied by something that makes implementing a
> >>> > > > > > > simple Source easier than it is now. People are continuing to
> >>> > > > > > > implement new SourceFunctions because the interfaces defined by
> >>> > > > > > > FLIP-27 are hard to understand, while SourceFunction is easy.
> >>> > > > > > > Alex, I believe you were looking into implementing an
> >>> > > > > > > easier-to-use building block that could be used in situations
> >>> > > > > > > like this. Can we get something like that in place first?
> >>> > > > > > >
> >>> > > > > > > David
> >>> > > > > > >
> >>> > > > > > > On Fri, Jun 3, 2022 at 4:52 PM Jing Ge <j...@ververica.com>
> >>> > > > > > > wrote:
> >>> > > > > > >
> >>> > > > > > >> Hi,
> >>> > > > > > >>
> >>> > > > > > >> Thanks Alex for driving this!
> >>> > > > > > >>
> >>> > > > > > >> +1. To give Flink developers, especially connector developers,
> >>> > > > > > >> a clear signal that the new Source API is recommended according
> >>> > > > > > >> to FLIP-27, we should mark the SourceFunction-based interfaces
> >>> > > > > > >> as deprecated.
> >>> > > > > > >>
> >>> > > > > > >> There are some open questions to discuss:
> >>> > > > > > >>
> >>> > > > > > >> 1. Do we need to mark all subinterfaces/subclasses as
> >>> > > > > > >> deprecated, e.g. FromElementsFunction? There are many. What are
> >>> > > > > > >> the replacements?
> >>> > > > > > >> 2. Do we need to mark all subclasses that have a replacement as
> >>> > > > > > >> deprecated? For example, the replacement for
> >>> > > > > > >> ExternallyInducedSource, if I am not mistaken, is
> >>> > > > > > >> ExternallyInducedSourceReader, which is @Experimental.
> >>> > > > > > >> 3. Do we need to mark all related test utility classes as
> >>> > > > > > >> deprecated?
> >>> > > > > > >>
> >>> > > > > > >> I think it might make sense to create an umbrella ticket to
> >>> > > > > > >> cover all of these with the following process:
> >>> > > > > > >>
> >>> > > > > > >> 1. Mark SourceFunction as deprecated asap.
> >>> > > > > > >> 2. Mark subinterfaces and subclasses as deprecated if there are
> >>> > > > > > >> graduated replacements. A good example is FlinkKafkaConsumer,
> >>> > > > > > >> which has been marked as deprecated and replaced by KafkaSource.
> >>> > > > > > >> 3. Do not mark subinterfaces and subclasses as deprecated if the
> >>> > > > > > >> replacement classes are still experimental; check if it is time
> >>> > > > > > >> to graduate them. After graduation, go to step 2. Graduation
> >>> > > > > > >> might take a while.
> >>> > > > > > >> 4. Do not mark subinterfaces and subclasses as deprecated if the
> >>> > > > > > >> replacement classes are experimental and are too young to
> >>> > > > > > >> graduate. We have to wait. But in this case we could create new
> >>> > > > > > >> tickets under the umbrella ticket.
> >>> > > > > > >> 5. Do not mark subinterfaces and subclasses as deprecated if
> >>> > > > > > >> there is no replacement at all. We have to create new tickets
> >>> > > > > > >> and wait until the new implementation has been done and
> >>> > > > > > >> graduated. It will take a longer time, roughly 1.5 years.
> >>> > > > > > >> 6. For test classes, we could follow the same rule. But I think
> >>> > > > > > >> for some cases, we could consider doing the replacement
> >>> > > > > > >> directly without going through the deprecation phase.
> >>> > > > > > >>
> >>> > > > > > >> When we look back on all of these, we can see it is a big epic
> >>> > > > > > >> (even bigger than an epic). It needs someone to drive it and
> >>> > > > > > >> keep focus on it continuously, with support from the community,
> >>> > > > > > >> and push the development towards the new Source API of FLIP-27.
> >>> > > > > > >>
> >>> > > > > > >> If we could have consensus for this, Alex and I could create
> >>> > > > > > >> the umbrella ticket to kick it off.
> >>> > > > > > >>
> >>> > > > > > >> Best regards,
> >>> > > > > > >> Jing
> >>> > > > > > >>
> >>> > > > > > >>
> >>> > > > > > >> On Fri, Jun 3, 2022 at 3:54 PM Alexander Fedulov
> >>> > > > > > >> <alexan...@ververica.com> wrote:
> >>> > > > > > >>
> >>> > > > > > >>> Hi everyone,
> >>> > > > > > >>>
> >>> > > > > > >>> I would like to start the discussion about marking
> >>> > > > > > >>> SourceFunction-based interfaces as deprecated. With the FLIP-27
> >>> > > > > > >>> APIs becoming the new standard, the old ones have to be
> >>> > > > > > >>> eventually phased out. Although this state is well known within
> >>> > > > > > >>> the community and no new connectors based on the old interfaces
> >>> > > > > > >>> can be accepted into the project, the footprint of
> >>> > > > > > >>> SourceFunction in the user code still keeps growing (primarily
> >>> > > > > > >>> for data generators and test utilities). I believe it is best
> >>> > > > > > >>> to mark SourceFunction as deprecated as soon as possible. What
> >>> > > > > > >>> do you think?
> >>> > > > > > >>>
> >>> > > > > > >>> Best,
> >>> > > > > > >>> Alexander Fedulov
> >>> > > > > > >>>
> >>> > > > > > >>
> >>> > > > > > >
> >>> > > > > >
> >>> > > > >
> >>> > > >
> >>> > >
> >>> >
> >>> >
> >>> > --
> >>> > https://twitter.com/snntrable
> >>> > https://github.com/knaufk
> >>> >
> >>>
> >>
>
