Hi, We have the same goal to let users migrate to SinkV2. Just marking those interfaces as deprecated or public won't achieve this goal. Frankly speaking, users will start to migrate not because those interfaces are marked as deprecated. They will do it because there is a workable implementation which can replace the old one with a clear migration guideline. What Dong Li and Jark Wu said makes a lot of sense. This leads to the "bottom-up" solution - first have workable(graduated, not necessarily means @public, please see below) connectors, then deprecate and graduate top level interfaces.
Hi Konstanin Thanks for sharing. Commonly, you are right, I'm with you. Interface and its (solo) implementation should be graduated together. But SinkV2 is a special case. It has more than 10 implementations. And it is a more customer-facing API which has a big impact on the ecosystem. We should do it with special care. If we go this way, it will be even harder to graduate the SinkV2 API, because many connector implementations should be graduated with SinkV2 interfaces together. Speaking of the issue you mentioned, as far as I am concerned, the current KafkaSink implementation[1] offers an option for this. Conceptually, it is possible to mark KafkaSink as @public without marking SinkV2 as @public. All potential unstable methods coming from SinkV2 interfaces will be kept marked as @internal. We still can make breaking changes with those methods, i.e. with SinkV2 interfaces(still @publicEvolving) after KafkaSink is graduated. For KafkaSink itself, it is also possible to do further backward compatible changes, like adding new methods, removing @internal(once the related SinkV2 interface is graduated), etc. The graduation impact will be limited only within the Kafka connector. Another tiny different option could be that we make KafkaSink be able to completely replace FlinkKafkaProducer, still keep it as @publicEvolving and remove FlinkKafkaProducer. This could also be considered as the (soft) graduation of KafkaSink even if it is still @publicEvolving. Users will not be confused because there is only one Kafka sink solution(FlinkKafkaProducer is gone). It's my belief that this is a more flexible solution with small impact on the ecosystem than directly marking top level interfaces as @deprecated or @public without showing users how to migrate, i.e. without concrete graduated implementations. Best regards, Jing [1] https://github.com/apache/flink/blob/2ae5df278958073fee63b2bf824a53a28a21701b/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/sink/KafkaSink.java#L89 On Mon, Feb 6, 2023 at 2:56 PM Galen Warren <ga...@cvillewarrens.com> wrote: > Recently, a critical bug with the Unified Sink committer was reported: > [FLINK-30238] > Unified Sink committer does not clean up state on final savepoint - ASF > JIRA (apache.org) <https://issues.apache.org/jira/browse/FLINK-30238>. > > Fabian Paul reported: > > Hi folks, >> >> I did some initial investigation, and the problem seems twofold. >> >> If no post-commit topology is used, we do not run into a problem where >> we could lose data but since we do not clean up the state correctly, >> we will hit this [1] when trying to stop the pipeline with a savepoint >> after we have started it from a savepoint. >> AFAICT all two-phase commit sinks are affected Kafka, File etc. >> >> For sinks using the post-commit topology, the same applies. >> Additionally, we might never do the commit from the post-commit >> topology resulting in lost data. >> >> Best, >> Fabian >> > > > Does this need to be addressed before people can safely move to V2 sinks? > I'm using the StreamingFileSink for this reason. > > Thanks, > > Galen > > > > > > On Mon, Feb 6, 2023 at 8:48 AM Konstantin Knauf <kna...@apache.org> wrote: > >> Hi Steven, >> >> Sink is already deprecated. It was deprecated at the moment where we >> introduced SinkV2. >> >> Hi Jark, Hi Dong, >> >> My understanding is the SinkV2 is a workable interface. The most important >> connectors have been migrated (Kafka, Filesystem) and more connectors >> (OpenSearch, ElasticSearch, Kinesis) use it successfully. To make SinkV2 >> public, it does not need to have all possible functionality. Public APIs >> can be extended. That's what we do all the time. There will also always be >> bugs. So, these points can not be categorical blockers to promote the API. >> >> What are the specific issues/tickets that are blocking us? Can we in your >> opinion only deprecate it when every single connector in Apache Flink is >> migrated already? >> >> In my opinion it is the time to ask users to the migrate their connectors. >> More importantly, @Deprecated would signal users not to build new >> connectors on SinkFunction. I would arque its also very misleading to >> users >> to not @Deprecated SinkFunction given that is clearly will be deprecated. >> >> Cheers, >> >> Konstantin >> >> >> Am Mo., 6. Feb. 2023 um 13:26 Uhr schrieb Jark Wu <imj...@gmail.com>: >> >> > I agree with Dong Lin. >> > >> > Oracle explains how to use Deprecate API [1]: >> > >> > You are strongly recommended to use the Javadoc @deprecated tag with >> > > appropriate comments explaining how to use the new API. This ensures >> > > developers will *have a workable migration path from the old API to >> the >> > > new API*. >> > >> > >> > From a user's perspective, the workable migration path is very >> important. >> > Otherwise, it blurs the semantics of API deprecation. The Flink API's >> > compatibility and stability issues in the past left a bad impression on >> the >> > downstream projects. We should be careful when changing and deprecating >> > APIs, especially when there are known migration gaps. I think it's a >> good >> > idea to migrate Flink-owned connectors before marking old API >> deprecated. >> > This ensures downstream projects can migrate to new APIs smoothly. >> > >> > Best, >> > Jark >> > >> > [1]: >> > >> > >> https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html >> > >> > On Mon, 6 Feb 2023 at 10:01, Steven Wu <stevenz...@gmail.com> wrote: >> > >> > > Regarding the discussion on global committer [1] for sinks with global >> > > transactions, there is no consensus on solving that problem in SinkV2. >> > Will >> > > it require any breaking change in SinkV2? >> > > >> > > Also will SinkV1 be deprecated too? or it should happen sometime after >> > > SinkFunction deprecation? >> > > >> > > [1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj >> > > >> > > On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <lindon...@gmail.com> wrote: >> > > >> > > > Hi Konstantin, >> > > > >> > > > Thanks for the comment! Please see my comment inline. >> > > > >> > > > Cheers, >> > > > Dong >> > > > >> > > > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <kna...@apache.org> >> > > wrote: >> > > > >> > > > > Hi everyone, >> > > > > >> > > > > sorry for joining the discussion late. >> > > > > >> > > > > 1) Is there an option to deprecate SinkFunction in Flink 1.17 >> while >> > > > leaving >> > > > > SinkV2 @PublicEvolving in Flink 1.17. We then aim to make SinkV2 >> > > @Public >> > > > in >> > > > > and remove SinkFunction in Flink 1.18. @PublicEvolving are >> intended >> > for >> > > > > public use. So, I don't see it as a blocker for deprecating >> > > SinkFunction >> > > > > that we have to make SinkV2 @PublicEvovling. For reference this is >> > the >> > > > > description of @PublicEvovling: >> > > > > >> > > > > /** >> > > > > * Annotation to mark classes and methods for public use, but with >> > > > > evolving interfaces. >> > > > > * >> > > > > * <p>Classes and methods with this annotation are intended for >> > public >> > > > > use and have stable behavior. >> > > > > * However, their interfaces and signatures are not considered to >> be >> > > > > stable and might be changed >> > > > > * across versions. >> > > > > * >> > > > > * <p>This annotation also excludes methods and classes with >> evolving >> > > > > interfaces / signatures within >> > > > > * classes annotated with {@link Public}. >> > > > > */ >> > > > > >> > > > > >> > > > > Marking SinkFunction @Deprecated would already single everyone to >> > move >> > > to >> > > > > SinkV2, which we as a community, I believe, have a strong interest >> > in. >> > > > Its >> > > > > >> > > > >> > > > Yes, I also believe we all have this strong interest. I just hope >> that >> > > this >> > > > can be done in the best possible way that does not confuse users. >> > > > >> > > > I probably still have the same concern regarding its impact on >> users: >> > if >> > > we >> > > > mark an API as deprecated, it effectively means the users of this >> API >> > > > should start to migrate to another API (e.g. SinkV2) and we might >> > remove >> > > > this API in the future. However, given that we know there are known >> > > > problems preventing users from doing so, it seems that we are not >> ready >> > > to >> > > > send this message to users right. >> > > > >> > > > If I understand correctly, I guess you are suggesting that by >> marking >> > > > SinkFunction as deprecated, we can put higher pressure on Flink >> > > > contributors to update the existing Flink codebase to improve and >> use >> > > > SinkV2. >> > > > >> > > > I am not sure this is the right way to use @deprecated, which has a >> > > > particular meaning for its users rather than contributors. And I am >> > also >> > > > not sure we can even pressure contributors of an open-source project >> > into >> > > > developing a feature (e.g. migrate all existing SinkFunction >> subclasses >> > > to >> > > > SinkV2). IMO, the typical way is for the contributor with >> interest/time >> > > to >> > > > work on the feature, or talk to other contributors whether they are >> > > willing >> > > > to collaborate/work on this, rather than pressuring other >> contributors >> > > into >> > > > working on this. >> > > > >> > > > >> > > > almost comical how long the transition from >> > SourceFurnction/SinkFunction >> > > to >> > > > > Source/Sink takes us. At the same time, we leave ourselves the >> option >> > > to >> > > > to >> > > > > make small changes to SinkV2 if any problems arise during the >> > migration >> > > > of >> > > > > these connector. >> > > > > >> > > > > I think, we have a bit of a chicken/egg problem here. The pressure >> > for >> > > > > >> > > > >> > > > Similar to the reason described above, I am not sure we have a >> > > chicken/egg >> > > > problem here. The issue here is that SinkV2 is not ready and we >> have a >> > > lot >> > > > of existing SinkFunction that is not migrated by ourselves. We >> (Flink >> > > > contributors) probably do not need to mark SinkFunction as >> deprecated >> > in >> > > > order to address these issues in our own codebase. >> > > > >> > > > >> > > > users and contributors is not high enough to move away from >> > SinkFunction >> > > as >> > > > > long as its not deprecated, but at the same time we need people to >> > > > migrate >> > > > > their connectors to see if there are any gaps in SinkV2. I >> believe, >> > the >> > > > > combination proposed above could bridge this problem. >> > > > > >> > > > > 2) I don't understand the argument of waiting until some of the >> > > > > implementations are @Public. How can we make the implementations >> of >> > the >> > > > > SinkV2 API @Public without making SinkV2 @Public? All public >> methods >> > of >> > > > > SinkV2 are part of every implementation. So to me it actually >> seems >> > to >> > > be >> > > > > opposite: in order to make any of the implementation @Public we >> first >> > > > need >> > > > > to make the API @Public. >> > > > > >> > > > >> > > > Yeah I also agree with you. >> > > > >> > > > >> > > > > >> > > > > Cheers, >> > > > > >> > > > > Konstantin >> > > > > >> > > > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin < >> > > lindon...@gmail.com >> > > > >: >> > > > > >> > > > > > Hi Martijn, >> > > > > > >> > > > > > Thanks for driving this effort to clean-up the Flink codebase! >> > > > > > >> > > > > > I like the idea to cleanup Flink codebase to avoid having two >> > Sinks. >> > > On >> > > > > the >> > > > > > other hand, I also thing the concern mentioned by Jing makes >> sense. >> > > In >> > > > > > addition to thinking in terms of the rule proposed in FLIP-197 >> > > > > > < >> > > > > > >> > > > > >> > > > >> > > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process >> > > > > > > >> > > > > > (which >> > > > > > seems to focus mostly on the Flink developers' perspective), it >> > might >> > > > be >> > > > > > useful to also think about the story from users' perspective and >> > make >> > > > > sure >> > > > > > their concerns can be addressed. >> > > > > > >> > > > > > Typically, by marking an API as deprecated, we are effectively >> > > telling >> > > > > > users *they should start to migrate to the new API ASAP and we >> > > reserve >> > > > > the >> > > > > > right to remove this API completely in the 1-2 releases*. Then >> it >> > > might >> > > > > be >> > > > > > reasonable for users to ask questions such as: >> > > > > > >> > > > > > 1) Does SinkV2 public API provides all the functionalities >> needed >> > to >> > > > > > migrate my existing code from subclassing SinkFunction to >> > subclassing >> > > > > > SinkV2? >> > > > > > >> > > > > > 2) Is the amount of migration work reasonable? If yes, why is a >> > class >> > > > > such >> > > > > > as HBaseSinkFunction in Flink's own codebase still depending on >> > > > > > SinkFunction? Maybe Flink developers should eat their own dog >> food >> > > and >> > > > > > migrate (or deprecate) those classes in the Flink codebase >> first? >> > > > > > >> > > > > > Based on the discussion in this thread, I am not sure we have >> good >> > > > > answers >> > > > > > to those questions yet. For the 1st question above, the answer >> is >> > > *no* >> > > > > > because we already know that the SinkV2 is currently not able to >> > > > support >> > > > > > migration for JdbcSink. For the 2nd question above, we know >> there >> > are >> > > > > many >> > > > > > non-deprecated class in Flink codebase that are still depending >> on >> > > > > SinkV2. >> > > > > > It is probably not nice to users if we tell them to migrate >> while >> > we >> > > > know >> > > > > > there are existing issues that can prevent them from doing so >> > easily. >> > > > > > >> > > > > > In order to move forward to deprecate SinkV2, I think it will be >> > > super >> > > > > > useful to first migrate all the connectors managed by Flink >> > community >> > > > > > (including all externalized connectors) to use SinkV2. This work >> > > won't >> > > > be >> > > > > > wasted since we need to do this anyway. And it will also give >> us a >> > > > chance >> > > > > > to validate the capabilities of SinkV2 and fix bugs by >> ourselves as >> > > > much >> > > > > as >> > > > > > possible. >> > > > > > >> > > > > > What do you think? >> > > > > > >> > > > > > Best Regards, >> > > > > > Dong >> > > > > > >> > > > > > >> > > > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser < >> > > > martijnvis...@apache.org >> > > > > > >> > > > > > wrote: >> > > > > > >> > > > > > > Hi all, >> > > > > > > >> > > > > > > While discussing FLIP-281 [1] the discussion also turned to >> the >> > > > > > > SinkFunction and the SinkV2 API. For a broader discussion I'm >> > > opening >> > > > > up >> > > > > > a >> > > > > > > separate discussion thread. >> > > > > > > >> > > > > > > As Yun Tang has mentioned in that discussion thread, it would >> be >> > a >> > > > good >> > > > > > > time to deprecate the SinkFunction to avoid the need to >> introduce >> > > new >> > > > > > > functions towards (to be) deprecated APIs. Jing rightfully >> > > mentioned >> > > > > that >> > > > > > > it would be confusing to deprecate the SinkFunction if its >> > > successor >> > > > is >> > > > > > not >> > > > > > > yet marked as @Public (it's currently @PublicEvolving). >> > > > > > > >> > > > > > > My proposal would be to promote the SinkV2 API to @public in >> > Flink >> > > > 1.17 >> > > > > > and >> > > > > > > mark the SinkFunction as @deprecated in Flink 1.17 >> > > > > > > >> > > > > > > The original Sink interface was introduced in Flink 1.12 with >> > > > FLIP-143 >> > > > > > [2] >> > > > > > > and extended with FLIP-177 in Flink 1.14 [3] and has been >> > improved >> > > on >> > > > > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4]. >> > > > > > > >> > > > > > > Looking at the API stability graduation process [5], the fact >> > that >> > > > Sink >> > > > > > V2 >> > > > > > > was introduced in Flink 1.15 would mean that we could warrant >> a >> > > > > promotion >> > > > > > > to @public already (given that there have been two releases >> with >> > > 1.15 >> > > > > and >> > > > > > > 1.16 where it was introduced). Combined with the fact that >> SinkV2 >> > > has >> > > > > > been >> > > > > > > the result of iteration over the introduction of the original >> > Sink >> > > > API >> > > > > > > since Flink 1.12, I would argue that the promotion is overdue. >> > > > > > > >> > > > > > > If we promote the Sink API to @public, I think we should also >> > > > > immediately >> > > > > > > mark the SinkFunction as @deprecated. >> > > > > > > >> > > > > > > Looking forward to your thoughts. >> > > > > > > >> > > > > > > Best regards, >> > > > > > > >> > > > > > > Martijn >> > > > > > > >> > > > > > > >> > > > > > > [1] >> > > https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1 >> > > > > > > [2] >> > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API >> > > > > > > [3] >> > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API >> > > > > > > [4] >> > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction >> > > > > > > [5] >> > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process >> > > > > > > >> > > > > > >> > > > > >> > > > > >> > > > > -- >> > > > > https://twitter.com/snntrable >> > > > > https://github.com/knaufk >> > > > > >> > > > >> > > >> > >> >> >> -- >> https://twitter.com/snntrable >> https://github.com/knaufk >> >