Hi everyone,

Regarding the stability of the APIs: I think everyone agrees that
connector APIs which are stable across minor versions (1.13 -> 1.14) are
the mid-term goal. But:

a) These APIs are still quite young, and we shouldn't make them @Public
prematurely either.
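
To make the distinction concrete, here is a minimal sketch. The annotations
below are simplified stand-ins that I define locally just for illustration;
Flink's real ones live in org.apache.flink.annotation and are plain markers
without any runtime check like this:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class StabilityCheck {

    // Stand-in for Flink's @Public: stable across minor releases.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Public {}

    // Stand-in for Flink's @PublicEvolving: may still change between minors.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface PublicEvolving {}

    // Hypothetical API class, only marked as evolving.
    @PublicEvolving
    static class NewSinkApi {}

    // Report which stability guarantee a class advertises.
    static String stabilityOf(Class<?> clazz) {
        if (clazz.isAnnotationPresent(Public.class)) {
            return "stable across minor versions";
        }
        if (clazz.isAnnotationPresent(PublicEvolving.class)) {
            return "may still break between minor versions";
        }
        return "internal";
    }

    public static void main(String[] args) {
        System.out.println(stabilityOf(NewSinkApi.class));
    }
}
```

The point of the annotation tiers is exactly this kind of contract: only
once an API is promoted to @Public do connector authors get the
cross-minor-version guarantee discussed above.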

b) Isn't this *mostly* orthogonal to where the connector code lives? Yes,
as long as there are breaking changes, the connectors need to be adapted
and require at least one release per Flink minor release.
Documentation-wise this can be addressed via a compatibility matrix for
each connector as Arvid suggested. IMO we shouldn't block this effort on
the stability of the APIs.
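
For example, such a matrix could be as simple as the following (the
connector name and version ranges are purely made up for illustration,
not actual support statements):

```text
Connector release          | Supported Flink versions
---------------------------+-------------------------
flink-connector-foo 1.0.x  | 1.13.x
flink-connector-foo 1.1.x  | 1.14.x
```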

Cheers,

Konstantin



On Wed, Oct 20, 2021 at 8:56 AM Jark Wu <imj...@gmail.com> wrote:

> Hi,
>
> I think Thomas raised very good questions, and I would like to know your
> opinions on whether we want to move connectors out of Flink in this
> version.
>
> (1) is the connector API already stable?
> > Separate releases would only make sense if the core Flink surface is
> > fairly stable though. As evident from Iceberg (and also Beam), that's
> > not the case currently. We should probably focus on addressing the
> > stability first, before splitting code. A success criterion could be
> > that we are able to build Iceberg and Beam against multiple Flink
> > versions w/o the need to change code. The goal would be that no
> > connector breaks when we make changes to Flink core. Until that's the
> > case, code separation creates a setup where 1+1 or N+1 repositories
> > need to move in lockstep.
>
> From another discussion thread [1], the connector API is far from stable.
> Currently, it's hard to build connectors against multiple Flink versions.
> There are breaking API changes both in 1.12 -> 1.13 and 1.13 -> 1.14, and
> maybe also in future versions, because Table-related APIs are still
> @PublicEvolving and the new Sink API is still @Experimental.
>
>
> (2) Flink testability without connectors.
> > Flink w/o Kafka connector (and a few others) isn't
> > viable. Testability of Flink was already brought up, can we really
> > certify a Flink core release without Kafka connector? Maybe those
> > connectors that are used in Flink e2e tests to validate functionality
> > of core Flink should not be broken out?
>
> This is a very good question. How can we guarantee the new Source and Sink
> APIs are stable with only test implementations?
>
>
> Best,
> Jark
>
>
>
>
>
> On Tue, 19 Oct 2021 at 23:56, Chesnay Schepler <ches...@apache.org> wrote:
>
> > Could you clarify what release cadence you're thinking of? There's quite
> > a big range that fits "more frequent than Flink" (per-commit, daily,
> > weekly, bi-weekly, monthly, even bi-monthly).
> >
> > On 19/10/2021 14:15, Martijn Visser wrote:
> > > Hi all,
> > >
> > > I think it would be a huge benefit if we can achieve more frequent
> > > releases of connectors, which are not bound to the release cycle of
> > > Flink itself. I agree that in order to get there, we need to have
> > > stable interfaces which are trustworthy and reliable, so they can be
> > > safely used by those connectors. I do think that work still needs to
> > > be done on those interfaces, but I am confident that we can get there
> > > from a Flink perspective.
> > >
> > > I am worried that we would not be able to achieve those frequent
> > > releases of connectors if we are putting these connectors under the
> > > Apache umbrella, because that means that for each connector release
> > > we have to follow the Apache release creation process. This requires
> > > a lot of manual steps and prohibits automation, and I think it would
> > > be hard to scale out frequent releases of connectors. I'm curious how
> > > others think this challenge could be solved.
> > >
> > > Best regards,
> > >
> > > Martijn
> > >
> > > On Mon, 18 Oct 2021 at 22:22, Thomas Weise <t...@apache.org> wrote:
> > >
> > >> Thanks for initiating this discussion.
> > >>
> > >> There are definitely a few things that are not optimal with our
> > >> current management of connectors. I would not necessarily characterize
> > >> it as a "mess" though. As the points raised so far show, it isn't easy
> > >> to find a solution that balances competing requirements and leads to a
> > >> net improvement.
> > >>
> > >> It would be great if we can find a setup that allows for connectors to
> > >> be released independently of core Flink and that each connector can be
> > >> released separately. Flink already has separate releases
> > >> (flink-shaded), so that by itself isn't a new thing. Per-connector
> > >> releases would need to allow for more frequent releases (without the
> > >> baggage that a full Flink release comes with).
> > >>
> > >> Separate releases would only make sense if the core Flink surface is
> > >> fairly stable though. As evident from Iceberg (and also Beam), that's
> > >> not the case currently. We should probably focus on addressing the
> > >> stability first, before splitting code. A success criterion could be
> > >> that we are able to build Iceberg and Beam against multiple Flink
> > >> versions w/o the need to change code. The goal would be that no
> > >> connector breaks when we make changes to Flink core. Until that's the
> > >> case, code separation creates a setup where 1+1 or N+1 repositories
> > >> need to move in lockstep.
> > >>
> > >> Regarding some connectors being more important for Flink than others:
> > >> That's a fact. Flink w/o Kafka connector (and a few others) isn't
> > >> viable. Testability of Flink was already brought up, can we really
> > >> certify a Flink core release without Kafka connector? Maybe those
> > >> connectors that are used in Flink e2e tests to validate functionality
> > >> of core Flink should not be broken out?
> > >>
> > >> Finally, I think that the connectors that move into separate repos
> > >> should remain part of the Apache Flink project. Larger organizations
> > >> tend to approve the use of and contribution to open source at the
> > >> project level. Sometimes it is everything ASF. More often it is
> > >> "Apache Foo". It would be fatal to end up with a patchwork of projects
> > >> with potentially different licenses and governance to arrive at a
> > >> working Flink setup. This may mean we prioritize usability over
> > >> developer convenience, if that's in the best interest of Flink as a
> > >> whole.
> > >>
> > >> Thanks,
> > >> Thomas
> > >>
> > >>
> > >>
> > >> On Mon, Oct 18, 2021 at 6:59 AM Chesnay Schepler <ches...@apache.org>
> > >> wrote:
> > >>> Generally, the issues are reproducibility and control.
> > >>>
> > >>> Stuff's completely broken on the Flink side for a week? Well then
> > >>> so are the connector repos.
> > >>> (As-is) You can't go back to a previous version of the snapshot.
> > >>> Which also means that checking out older commits can be problematic
> > >>> because you'd still work against the latest snapshots, and they may
> > >>> not be compatible with each other.
> > >>>
> > >>>
> > >>> On 18/10/2021 15:22, Arvid Heise wrote:
> > >>>> I was actually betting on snapshot versions. What are the limits?
> > >>>> Obviously, we can only do a release of a 1.15 connector after 1.15
> > >>>> is released.
> > >>>
> >
> >
>


-- 

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk
