Hi all, Nice thread and great discussion! Ecosystem is one of the most important things to the Flink community, we should pay more attention to API compatibility.
Marking all connector APIs @Public is a good idea, not only mark the Table/SQL connector APIs public, but also the new Source/Sink APIs as public. Besides, we should also add a check item to the Verify Release documentation[1] to verify the release is backward-compatible for connectors. From my point of view, such backward incompatibility should cancel the vote. Regarding the SourceReaderContext#metricGroup compatibility problem in 1.14.0, I would suggest starting a new discussion thread to see whether we have any idea to fix it. We should fix it ASAP! Otherwise iceberg/hudi/cdc communities will get frustrated again when upgrading to 1.14. Maybe we still have time to release a minor version, because there is no connector upgraded to 1.14.0 yet. What do you think? @Leonard Xu <xbjt...@gmail.com> @Arvid Heise <ar...@ververica.com> @Piotr Nowojski <pi...@ververica.com> Best, Jark [1]: https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Release On Wed, 29 Sept 2021 at 09:46, OpenInx <open...@gmail.com> wrote: > > Apart from this being `@PublicEvolving` > > From my perspective, annotating the 'DynamicTableSink' to be a > 'PublicEvolving' class is not reasonable, because that means devs could > just change the basic API which all downstream connectors are depending on > easily when iterating flink from 1.12 to 1.13 (according to the wiki [1]). > This implies all downstream maintainers must take on this maintenance > burden, and it also makes our flink ecosystem very fragile. Changing the > 'DynamicTableSink' between two major versions sounds reasonable to me, but > unreasonable for uncompatibility changes between two minor versions. I > think we may need to check those API which are annotated 'PublicEnvoling' > while should be 'Public' because of the dependency from all connectors. > We should ensure the stability of those APIs that are necessary to > implement the connector, and at the same time implement the updated v2 > version of the API. After all v2 APIs are considered stable, we will mark > them as stable. Instead of releasing a version of the API, some of the APIs > necessary to implement the connector are marked as stable and some are > marked as unstable, which is very unfriendly to downstream. Because > downstream essentially every upgrade requires refactoring of the code. > > > We are trying to provide forward compatibility: applications using > `@Public` APIs compiled against Flink 1.12.x, should work fine in Flink > 1.13.x > > Thanks for clarifying this. Sounds reasonable to me, then we apache > iceberg could just use flink 1.12.x to build the flink+iceberg connector > and should make all the tests work fine for both flink 1.12 & flink 1.13. > For the `ResolvedCatalogTable` changes, I don't think it guarantees the > forward compatibility as you said, because the flink-iceberg-runtime.jar > compiled by flink 1.12 can still not works fine in flink 1.13 because it > will need the flink1.12's `CatalogTable` to be cast to a flink1.13 's > `ResolvedCatalogTable`. In general, I agree that forward compatibility is > a more clear compatibility guarantee. > > [1]. > https://cwiki.apache.org/confluence/display/FLINK/Stability+Annotations > > > On Tue, Sep 28, 2021 at 10:33 PM David Morávek <d...@apache.org> wrote: > > > > > > > I think we have a compile time checks for breaking changes in `@Public` > > > marked classes/interfaces using japicmp [1]. > > > > > > Nice, thanks for pointing that out, I'll take a closer look at it ;) > > > > Best, > > D. > > > > On Tue, Sep 28, 2021 at 4:14 PM Piotr Nowojski <pnowoj...@apache.org> > > wrote: > > > > > > - We don't have any safeguards for stable API breaks. Big +1 for > Ingo's > > > effort with architectural tests [3]. > > > > > > I think we have a compile time checks for breaking changes in `@Public` > > > marked classes/interfaces using japicmp [1]. > > > > > > Piotrek > > > > > > [1] https://github.com/apache/flink/blob/master/pom.xml#L2014:L2084 > > > > > > wt., 28 wrz 2021 o 15:59 David Morávek <d...@apache.org> napisał(a): > > > > > > > This is a super interesting topic and there is already a great > > > discussion. > > > > Here are few thoughts: > > > > > > > > - There is a delicate balance between fast delivery of the new > features > > > and > > > > API stability. Even though we should be careful with breaking > evolving > > > > interfaces, it shouldn't stop us from making fast progress / iterate > on > > > > features. > > > > - There are two camps of users. One camp prefers more frequent > > releases / > > > > new features (early adopters) and second that prefer less frequent > > stable > > > > releases. There was already a great discussion about this at Flink > 1.14 > > > > thread [1]. > > > > - We're already trying to be explicit about which API's may break via > > > > annotations and the feature radar [2]. Stability annotations are a > well > > > > known concept used by many projects. I think we still can go bit > > further > > > > here and aim for an IDE support (for example usages of guava > > > @Experimental > > > > interfaces get highlighted, raising more awareness about potential > > > issues). > > > > I'm not sure how this IDE integration works though. > > > > - We don't have any safeguards for stable API breaks. Big +1 for > Ingo's > > > > effort with architectural tests [3]. > > > > > > > > [1] > > > > > > > > > > > > > > https://lists.apache.org/thread.html/r76e1cdba577c6f4d6c86b23fdaeb53c4e3744c20d0b3e850fc2e14a7%40%3Cdev.flink.apache.org%3E > > > > [2] https://flink.apache.org/roadmap.html > > > > [3] https://issues.apache.org/jira/browse/FLINK-24138 > > > > > > > > Best, > > > > D. > > > > > > > > On Tue, Sep 28, 2021 at 3:49 PM Leonard Xu <xbjt...@gmail.com> > wrote: > > > > > > > > > Thanks Piotr for the kindly reply, what confused me is that > > > > > `SourceReaderContext` was marked @Public when it was born in flink > > > 1.11, > > > > > and then it was corrected to @PublicEvolving in 1.11 -_-, and > finally > > > it > > > > > was changed to @Public again... > > > > > > > > > > As Flink and Flink ecosystem(Flink CDC connectors) developer, I > think > > > > what > > > > > we're discussing is meaningful and I’d like to help improve those > > > Public > > > > > API check for those changes. > > > > > Chesnay’s tips is a good idea that maybe we can use the tool like > > > japicmp > > > > > to do the check for every PR in CI phase. > > > > > > > > > > Best, > > > > > Leonard > > > > > > > > > > > > > > > > 在 2021年9月28日,21:15,Piotr Nowojski <pnowoj...@apache.org> 写道: > > > > > > > > > > > > Hi Leonard, > > > > > > > > > > > > Sorry for this causing you troubles, however that change in the > > > return > > > > > type > > > > > > was done while this class still has been marked as > > > > `@PublicEvolving`[1]. > > > > > As > > > > > > of 1.13.x `SourceReaderContext` was `@PublicEvolving` and it was > > > marked > > > > > as > > > > > > `@Public` only starting from Flink 1.14.0 [2]. Probably what > > confused > > > > you > > > > > > was that both of those changes (changing the return type and > making > > > it > > > > > > `@Public`) happened in the same release. > > > > > > > > > > > > However, those changes (`SourceReaderContext` and > > > > `ResolvedCatalogTable`) > > > > > > should have been clearly mentioned in the release notes with an > > > upgrade > > > > > > guide. > > > > > > > > > > > > Best, Piotrek > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/flink/commit/7f3636f6b4f8bac415a7db85917ad849636bd730#diff-a247a24ccd1afc07c5d690a8a58b1f6584329925fdf0d7dc89361b90d621b7f2R31 > > > > > > [2] https://issues.apache.org/jira/browse/FLINK-22357 > > > > > > > > > > > > wt., 28 wrz 2021 o 14:49 Leonard Xu <xbjt...@gmail.com> > > napisał(a): > > > > > > > > > > > >>>> > > > > > >>>> Not sure if this will happen in 1.15 already. We will needed > > > > automated > > > > > >>>> compatibility tests and a well-defined list of stable API. > > > > > >> > > > > > >>> We are > > > > > >>> trying to provide forward compatibility: applications using > > > `@Public` > > > > > >> APIs > > > > > >>> compiled against Flink 1.12.x, should work fine in Flink 1.13.x > > > > > >> > > > > > >> Unfortunately, I also meet forward compatibility issue, when I > do > > > the > > > > > >> release 1.14 check, I try to use mysql-cdc connector[1] which > > > compiled > > > > > >> against 1.13.1in SQL Client, but it can not work in flink 1.14.0 > > > > > cluster, > > > > > >> it failed due to the metric API compatibility broken. > > > > > >> > > > > > >> @Public > > > > > >> public interface SourceReaderContext { > > > > > >> > > > > > >> MetricGroup metricGroup(); > > > > > >> > > > > > >> > > > > > >> @Public > > > > > >> public interface SourceReaderContext { > > > > > >> > > > > > >> SourceReaderMetricGroup metricGroup(); > > > > > >> > > > > > >> > > > > > >> Shouldn't we mark it as @Deprecated and then delete it util > 2.0.0 > > > > > version > > > > > >> for @Public API as the our community rule [2] described? At > least > > we > > > > > should > > > > > >> keep them across server minor versions > (<major>.<minor>.<patch>). > > > > > >> > > > > > >> Although these changes can be tracked to voted FLIPs and it’s > not > > > the > > > > > >> fault of a few developers, it show us the fact that we didn’t > pay > > > > enough > > > > > >> attention to back compatibility/forward compatibility. > > > > > >> > > > > > >> Best, > > > > > >> Leonard > > > > > >> [1] > > > > > >> > > > > > > > > > > > > > > > https://github.com/ververica/flink-cdc-connectors/tree/master/flink-connector-mysql-cdc > > > > > >> [2] > > > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/Stability+Annotations > > > > > >> > > > > > >> > > > > > > > > > > > > > > > > > > > >