Hi Martijn,

Thanks for sharing more information; it helps us get a clearer big picture.

1) I agree with you. We should graduate the SinkV2 API as soon as possible.
I have also tried many times in the past, and each time there were issues
that postponed the graduation, and the postponements made sense. One rule
for checking whether an API is ready to be @Public is not to look at when it
was introduced but to analyse when the last change was made and how big it
was. Most of the SinkV2 implementation was done or updated in early 2022
because of FLIP-191 [1]. Since then there have been many bug fixes, right up
to now, which shows it is not yet stable enough to be graduated.

2,3,4) One reason that FLIP-197 [2] proposed two release cycles for each
graduation (from @PublicEvolving to @Public) is to give both Flink users and
Flink developers enough time to evaluate, improve, and stabilize the API.
What we really check is whether the implementations meet the requirements,
which in turn shows whether the API is a good fit. That's why I said the
Sink API is a special case: connectors are the implementations of the Sink
API. There is a strong dependency between them; they are not orthogonal. I
checked all connectors that implement SinkV2. None of them is @Public. It is
weird to graduate interfaces (the design) without a single successful
graduation of their implementations. For the Sink API, since it will cover
so many heterogeneous downstream systems, as I said, having three graduated
implementations is a safer process.

6) FileSink can be used in both stream and batch mode. JDBC is more or less
a (pure) batch-oriented case.

Long story short, the question is actually about how much risk we can live
with. I personally feel comfortable graduating it along with some
connectors (Kafka, File, and JDBC) in 1.18. Doing it in 1.17 is a little bit
risky. Since @Qingsheng Ren <renqs...@gmail.com> and @Yun Gao
<yungao...@aliyun.com> have been working on Sink, I would like to have
their thoughts. They are still on public holidays and might join this thread
next week. It would be great if we could keep this thread open until then.
If all of you are aware of those risks and are still fine with it, I will
be happy to see the SinkV2 API get graduated with the 1.17 release. :-)

Best regards,
Jing



[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process

On Thu, Jan 19, 2023 at 10:08 AM Martijn Visser <martijnvis...@apache.org>
wrote:

> Hi Jing,
>
> Thanks for your input!
>
> 1) I think that we have had more than two release cycles for the Sink V2
> API. The first release of the Sink API was introduced with Flink 1.12 which
> was December 2020, more than 2 years ago. The additional feature that Sink
> V2 has introduced was the ability to hook custom topologies, everything
> else remained the same. Like you've mentioned, the Sink API is super
> important, which is why I don't understand why we actually shouldn't move it
> to @public. As outlined in FLIP-197, "The community should also try to
> stabilize new APIs as soon as possible so that Flink users can rely on
> them. If we don’t do this, then users start building against
> @PublicEvolving and weaker annotated APIs. This might then lead to
> disappointment."
>
> 2 & 3) The ElasticSearch Sink is also using Sink V2, as is OpenSearch and
> the Sinks that are using the Async Sink from FLIP-171 [1]. So I think
> that's also AWS Firehose, Kinesis, DynamoDB. It's actually weird that
> KafkaSink is not yet at @publicevolving, given that the FlinkKafkaProducer
> has already been marked as deprecated with Flink 1.14. [2]. The FileSink
> being experimental is even weirder, given that that was introduced in Flink
> 1.12 [3] and should also have been @public. I think the fact that some
> functionality might be missing should not be a reason not to graduate
> an API to the next phase (that also applies to nr 5 from your email). I
> believe the intent of the graduation process is about guaranteeing that
> functionality that already exists won't be broken and will remain working
> if you've currently built an implementation with that.
>
> 4) I think the graduation of a connector is orthogonal to the graduation of
> the Flink interfaces that such a connector is using. I don't think that
> we're treating Kafka like a first class connector over the others: there
> are many more connectors that are using the Sink V2 interfaces.
>
> 6) Given that we have the FileSink that already uses the interfaces,
> haven't we already covered the batch oriented implementation?
>
> I noticed that in the discussion thread that you referred to I already was
> OK with promoting SinkV2 to @public in October of last year :) - I'm still
> in favour of making this change sooner rather than later.
>
> Best regards,
>
> Martijn
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink
> [2] https://issues.apache.org/jira/browse/FLINK-23710
> [3] https://issues.apache.org/jira/browse/FLINK-19758
>
> On Thu, Jan 19, 2023 at 05:54, Jing Ge <j...@ververica.com.invalid> wrote:
>
> > Hi,
> >
> > There are some typos in my last email, although hopefully I made myself
> > clear with the context, sorry about that:
> >
> > FLIP-179 -> FLIP-197
> > FlinkKafkaConsumer -> FlinkKafkaProducer
> >
> > Best regards,
> > Jing
> >
> > On Thu, Jan 19, 2023 at 5:30 AM Jing Ge <j...@ververica.com> wrote:
> >
> > > Hi,
> > >
> > > Thanks Martijn for bringing this to our attention. This
> > > discussion deserves much wider attention, because graduation of
> > > the Sink API will have an impact on all connectors.
> > >
> > > For me, briefly speaking, like I already pointed out in another thread
> > > [1], it is too hasty and risky to do it in 1.17 and even in 1.18. Here
> > are
> > > some reasons I knew and there might be more:
> > >
> > > 1. commonly, it is fine to follow FLIP-179 [2], since most API has only
> > > one implementation. Two release cycles for @PublicEvolving should be
> good
> > > enough to test the API design and implementation and get feedback and
> > make
> > > the API stable. But, Connector API (including the Sink API) is a
> special
> > > case. There will be more than 10 different implementations, Some of
> them
> > > are very important for Flink users' daily work. Rules defined in
> FLIP-179
> > > should be adapted in this case.
> > >
> > > 2. Currently, afaik, there are only two Sink API implementations and
> > > none of them is graduated. KafkaSink is @PublicEvolving and FileSink
> > > is @Experimental. It is very risky to graduate the fundamental Sink API
> > > before any of its implementations has been graduated.
> > >
> > > 3. We have an awkward situation with one of the most important
> > > connectors, the Kafka connector. The KafkaSink is @PublicEvolving but the
> > > FlinkKafkaConsumer is already deprecated. Users are confused and some of
> > > them are still sticking to FlinkKafkaConsumer, since it is normally not
> > > recommended to use @PublicEvolving in production. I tried to graduate the
> > > KafkaSink and remove the FlinkKafkaProducer to push users towards KafkaSink
> > > in this thread [3]. But there are some tasks to finish as a prerequisite.
> > > Things got more complicated than expected and some new rules
> > > need to be defined and discussed. I will write a new FLIP to address this
> > > issue later.
> > >
> > > 4. Even if we could graduate the Kafka connector, it is still risky to
> > > graduate the SinkV2 API, which will make the Kafka connector special
> > > compared to others, i.e. the Kafka connector as first class vs. others
> > > as second class. This is not what we want to see.
> > >
> > > 5. Afaik, there are still some fresh construction wrt the SinkV2 @Yun
> Gao
> > > <yungao...@aliyun.com>
> > >
> > > 6. Flink wants to be the unified batch and stream processing framework.
> > It
> > > might be rational, if a more batch oriented Sink implementation could
> be
> > > done and graduated before we graduate the Sink API. The JDBC connector
> > > might be a feasible choice.
> > >
> > > Therefore, a solid solution should be:
> > >
> > > 1. Develop three SinkV2 implementations. Considering batch, stream,
> > > popularity, etc. I personally would choose Kafka, File, and JDBC. But
> the
> > > list deserves further discussion.
> > > 2. graduate all three SinkV2 implementations and remove old
> > > implementations. If there are any objections in the community to
> > > removing the old implementation, it means the related SinkV2
> > > implementation is not ready to graduate. We will need more time.
> > > 3. 1 - 2 release cycles to get feedback from users wrt those SinkV2
> > > implementations.
> > > 4. graduate the SinkV2 API.
> > >
> > > Now we could see, if we could find developers focusing on them and
> > deliver
> > > them on time, we would still need at least  4 release cycles before we
> > > could graduate the Sink API.
> > >
> > > If we want to speed up and could live with some risk, we could go with
> > > the following process:
> > >
> > > 1. Remove FlinkKafkaProducer and graduate KafkaSink in Flink 1.18 in the
> > > ideal case.
> > > 2. Get feedback and stabilize the API while we continue working on
> > > Flink and release 1.19
> > > 3. graduate the Sink API in Flink 1.20
> > > 4. be aware that we might have to develop SinkV3 if any unknown
> > > requirements become known after new Sink implementation is developed.
> > >
> > > Please pay attention, this process is still hasty and risky. Because it
> > > only satisfied Kafka and might have a big impact on other connectors,
> > i.e.
> > > we are treating the Kafka connector as the king.
> > >
> > > Please don't get me wrong. I am not trying to block this change. On the
> > > contrary, I am keen to graduate SinkV2 API and remove old API and
> > > implementations. But I think it is important to be aware of the risk
> and
> > > make sure everyone is on the same page. There might be some other
> > > information I don't know and I am happy to hear any opposite opinions
> > that
> > > can push the SinkV2 API to be graduated asap.
> > >
> > > Last but not least, this discussion starts during the holiday season in
> > > China. @Qingsheng Ren <renqs...@gmail.com> @Jark Wu <imj...@gmail.com>
> > I
> > > would like to have your attention please.
> > >
> > > Best regards,
> > > Jing
> > >
> > >
> > > [1] https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > [2]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > [3] https://lists.apache.org/thread/m3o48c2d8j9g5t9s89hqs6qvr924s71o
> > >
> > > On Thu, Jan 19, 2023 at 3:26 AM Lijie Wang <wangdachui9...@gmail.com>
> > > wrote:
> > >
> > >> Hi Martijn,
> > >>
> > >> Thanks for driving this. I have only one concern, about the
> > >> Sink.InitContext.
> > >>
> > >> Will the Sink.InitContext also be changed to @Public? As described in
> > >> FLIP-287, currently the Sink.InitContext still lacks some necessary
> > >> information to migrate existing connectors to new sinks. If it is
> marked
> > >> as
> > >> public/stable, we can no longer modify it in the future(since most
> > >> connectors are not migrated to SinkV2 currently, we may find we need
> > more
> > >> information via InitContext in the future migrations).
> > >>
> > >> Best,
> > >> Lijie
> > >>
> > >> On Wed, Jan 18, 2023 at 21:13, Yun Tang <myas...@live.com> wrote:
> > >>
> > >> > SinkV2 was introduced in Flink-1.15 and annotated as @PublicEvolving
> > >> > from the 1st day [1]. From FLIP-197, we can promote it to @Public since
> > >> > it has already existed for two releases.
> > >> > And I didn't find a FLIP to discuss the process to deprecate APIs.
> > >> > Considering the SinkFunction has actually been stale for some time, I
> > >> > think we can deprecate it with the @Public SinkV2.
> > >> >
> > >> > Thus, +1 (binding) for this proposal.
> > >> >
> > >> > [1] https://issues.apache.org/jira/browse/FLINK-25555
> > >> >
> > >> > Best
> > >> > Yun Tang
> > >> >
> > >> > ________________________________
> > >> > From: Martijn Visser <martijnvis...@apache.org>
> > >> > Sent: Wednesday, January 18, 2023 18:50
> > >> > To: dev <dev@flink.apache.org>; Jing Ge <j...@ververica.com>; Yun
> > Tang
> > >> <
> > >> > myas...@live.com>
> > >> > Subject: [DISCUSS] Promote SinkV2 to @Public and deprecate
> > SinkFunction
> > >> >
> > >> > Hi all,
> > >> >
> > >> > While discussing FLIP-281 [1] the discussion also turned to the
> > >> > SinkFunction and the SinkV2 API. For a broader discussion I'm
> opening
> > >> up a
> > >> > separate discussion thread.
> > >> >
> > >> > As Yun Tang has mentioned in that discussion thread, it would be a
> > good
> > >> > time to deprecate the SinkFunction to avoid the need to introduce
> new
> > >> > functions towards (to be) deprecated APIs. Jing rightfully mentioned
> > >> that
> > >> > it would be confusing to deprecate the SinkFunction if its successor
> > is
> > >> not
> > >> > yet marked as @Public (it's currently @PublicEvolving).
> > >> >
> > >> > My proposal would be to promote the SinkV2 API to @public in Flink
> > 1.17
> > >> > and mark the SinkFunction as @deprecated in Flink 1.17
> > >> >
> > >> > The original Sink interface was introduced in Flink 1.12 with
> FLIP-143
> > >> [2]
> > >> > and extended with FLIP-177 in Flink 1.14 [3] and has been improved
> on
> > >> > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > >> >
> > >> > Looking at the API stability graduation process [5], the fact that
> > Sink
> > >> V2
> > >> > was introduced in Flink 1.15 would mean that we could warrant a
> > >> promotion
> > >> > to @public already (given that there have been two releases with
> 1.15
> > >> and
> > >> > 1.16 where it was introduced). Combined with the fact that SinkV2
> has
> > >> been
> > >> > the result of iteration over the introduction of the original Sink
> API
> > >> > since Flink 1.12, I would argue that the promotion is overdue.
> > >> >
> > >> > If we promote the Sink API to @public, I think we should also
> > >> immediately
> > >> > mark the SinkFunction as @deprecated.
> > >> >
> > >> > Looking forward to your thoughts.
> > >> >
> > >> > Best regards,
> > >> >
> > >> > Martijn
> > >> >
> > >> >
> > >> > [1]
> https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > >> > [2]
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > >> > [3]
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > >> > [4]
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > >> > [5]
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > >> >
> > >> >
> > >>
> > >
> >
>
