Re: [DISCUSS] Planning Flink 2.0

2023-05-12 Thread Xintong Song

Re: [DISCUSS] Planning Flink 2.0

2023-05-03 Thread Matthias Pohl

Re: [DISCUSS] Planning Flink 2.0

2023-04-28 Thread John Roesler
see that things you've listed have
>> > a lot in common with what we put in our list. I believe that's a good
>> > signal that we share similar opinions on what is good and important for
>> > the project and the release.
>> >
>> > @Sai,
>> >
>> > Welcome to the community. And thanks for offering to help.
>> >
>> > At the moment, this discussion is only happening on this mailing list. We
>> > may consider setting up online meetings or dedicated Slack channels in the
>> > future. If so, the information will also be posted to the mailing list.
>> >
>> > Best,
>> >
>> > Xintong

Re: [DISCUSS] Planning Flink 2.0

2023-04-28 Thread Jing Ge

Re: [DISCUSS] Planning Flink 2.0

2023-04-28 Thread Xintong Song

Re: [DISCUSS] Planning Flink 2.0

2023-04-28 Thread Xintong Song
> > > > @Max
> > > >
> > > > > When I look at
> > > > > https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit
> > > > > , I'm a bit skeptical we will even be able to reach all these goals. I
> > > > > think we have to prioritize and try to establish a deadline. Otherwise
> > > > > we will end up never releasing 2.0.
> > > >
> > > > Sorry for the confusion. I should have explained this more clearly. We
> > > > are not planning to finish all the items in the list. It's more like a
> > > > brainstorm, a list of candidates. We are also expecting to collect more

Re: [DISCUSS] Planning Flink 2.0

2023-04-28 Thread Saichandrasekar TM
Hi All,

Awesome...I see this as a great opportunity for newcomers like me to
contribute.

Is this discussion happening in a Slack or Discord forum too? If so, please
include me.

Thanks,
Sai


Re: [DISCUSS] Planning Flink 2.0

2023-04-27 Thread Martijn Visser
Hi all,

I think the proposal is a good starting point. We should aim to make Flink
a unified data-processing, cloud-friendly / cloud-native technology, with
proper low-level and high-level interfaces (DataStream API, Table API,
SQL). I think it would make a lot of sense to write down a long-term vision
for Flink. That would also mean sharing and discussing more
insights and having conversations around some of the long-term directions
from the proposal.

In order to achieve that vision, I believe that we need a Flink 2.0, which I
consider a long-overdue clean-up. That version should be the foundation for
Flink that allows the above-mentioned vision to become actual proposals and
implementations.

As a foundation, I would be inclined to say Flink 2.0 should:

- Remove all deprecated APIs, including the DataSet API, Scala API,
Queryable State, legacy Source and Sink implementations, legacy SQL
functions, etc.
- Add support for Java 17 and 21, and make 17 the default (given that the
next Java LTS, 21, is released in September this year and the timeline is
set for 2024)
- Drop support for Java 8 and 11
- Refactor the configuration layer
- Refactor the DataStream API, for example:
** Having a coherent and well-designed API
** Decoupling the API into API-only modules, so there are no more cyclic
dependencies and no leaking of non-APIs, including Kryo
** Reorganizing APIs and modules
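
To make the decoupling bullet concrete in Maven terms: user programs would
compile against a pure-API artifact, while the runtime implementation (and
its transitive dependencies such as Kryo) stays off the user classpath. This
is a sketch only; `flink-datastream-api` is a hypothetical artifact name,
not an existing Flink module:

```xml
<!-- Hypothetical sketch of an API-only dependency split; the artifact
     name flink-datastream-api does not exist in Flink today. -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-datastream-api</artifactId> <!-- interfaces only -->
  <version>2.0-SNAPSHOT</version>
</dependency>
<!-- The runtime implementation (and transitive dependencies, e.g. Kryo)
     would be provided by the cluster rather than bundled in user jars,
     which is what removes the non-API leakage. -->
```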

I think these are some of the must-haves. Curious about the thoughts of the
community.
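
As a pure-JDK illustration of what the Java 17 baseline would buy (nothing
here is Flink-specific; `Event` is a made-up example type): records and
switch expressions become usable in user code and API design once Java 8/11
support is dropped.

```java
// Pure-JDK illustration of Java 17 language features (records, switch
// expressions) that a Java 17 baseline would make available to Flink
// user code. Event is a made-up example type, not a Flink API.
public class Java17Baseline {

    // Records (Java 16+) give compact, immutable event/POJO types.
    record Event(String key, long count) {}

    // Switch expressions (Java 14+) replace verbose if/else chains.
    static String classify(Event e) {
        return switch ((int) Math.min(e.count(), 2L)) {
            case 0 -> "empty";
            case 1 -> "single";
            default -> "many";
        };
    }

    public static void main(String[] args) {
        System.out.println(classify(new Event("clicks", 5)));  // prints "many"
    }
}
```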

Thanks, Martijn

On Thu, Apr 27, 2023 at 10:16 David Morávek wrote:


Re: [DISCUSS] Planning Flink 2.0

2023-04-27 Thread David Morávek
Hi,

Great to see this topic moving forward; I agree it's long overdue.

I keep thinking about 2.0 as a chance to eliminate things that didn't work,
make the feature set denser, and fix rough edges and APIs that hold us back.

Some items in the doc (Key Features section) don't tick these boxes for me,
as they could also be implemented in the 1.x branch. We should consider
whether we need a backward-incompatible release to introduce each feature.
This should help us keep the discussion more focused.

Best,
D.


On Wed, Apr 26, 2023 at 2:33 PM DONG Weike  wrote:

> Hi,
>
> It is thrilling to see the foreseeable upcoming rollouts of Flink 2.x
> releases, and I believe that this roadmap can take Flink to the next stage
> of a top-notch unified streaming & batch computing engine.
>
> Given that all of the existing user programs are written and run on Flink
> 1.x versions as of now, and some of them are very complex and rely on
> various third-party connectors written with legacy APIs, one thing that I
> have concerns about is this: if, one day in the future, the community decides
> that new features are only given to 2.x releases, could the last release of
> Flink 1.x be converted into an LTS version (backporting severe bug fixes and
> critical security patches), so that existing users have enough time
> to wait for third-party connectors to upgrade, test their programs on the
> new Flink APIs, and avoid a sudden loss of community support?
>
> Just my two cents : )
>
> Best,
> Weike
>
> 
> From: Xintong Song 
> Sent: April 26, 2023 20:01
> To: dev 
> Subject: Re: [DISCUSS] Planning Flink 2.0
>
> @Chesnay
>
>
> > Technically this implies that every minor release may contain breaking
> > changes, which is exactly what users don't want.
>
>
> It's not necessary to introduce the breaking changes immediately upon
> reaching the minimum guaranteed stable time. If there are multiple changes
> waiting for the stable time, we can still gather them in 1 minor release.
> But I see your point, from the user's perspective, the mechanism does not
> provide any guarantees for the compatibility of minor releases.
>
> What problems do you see in creating major releases every N years?
> >
>
> It might not be a concrete problem, but I'm a bit concerned by the
> uncertainty. I assume N should not be too small, e.g., at least 3. I'd
> expect the decision to ship a major release would be made based on
> comprehensive considerations of the situation at that time. Making a
> decision now that we would ship a major release 3 years later seems a bit
> aggressive to me.
>
> We need to figure out what this release means for connectors
> > compatibility-wise.
> >
>
> +1
>
>
> > What process are you thinking of for deciding what breaking changes to
> > make? The obvious choice would be FLIPs, but I'm worried that this will
> > overload the mailing list / wiki for lots of tiny changes.
> >
>
> This should be a community decision. What I have in mind would be: (1)
> collect a wish list on wiki, (2) schedule a series of online meetings (like
> the release syncs) to get an agreed set of must-have items, (3) develop and
> polish the detailed plans of items via FLIPs, and (4) if the plan for a
> must-have item does not work out then go back to (2) for an update. I'm
> also open to other opinions.
>
> Would we wait a few months for people to prepare/agree on changes so we
> > reduce the time we need to merge things into 2 branches?
> >
>
> That's what I had in mind. Hopefully after 1.18.
>
> @Max
>
> When I look at
> >
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit
> > , I'm a bit skeptical we will even be able to reach all these goals. I
> > think we have to prioritize and try to establish a deadline. Otherwise we
> > will end up never releasing 2.0.
>
>
> Sorry for the confusion. I should have explained this more clearly. We are
> not planning to finish all the items in the list. It's more like a
> brainstorm, a list of candidates. We are also expecting to collect more
> ideas from the community. And after collecting the ideas, we should
> prioritize them and decide on a subset of must-have items, following the
> consensus decision making.
>
> +1 on Flink 2.0 by May 2024 (not a hard deadline but I think having a
> > deadline helps).
> >
>
> I agree that having a deadline helps. I proposed mid 2024, which is similar
> to but not as explicit as what you proposed. We may start with having a
> deadline for deciding the must-have items (e.g., by the end of June). That
> should make it easier to estimate the overall time needed

Re: [DISCUSS] Planning Flink 2.0

2023-04-26 Thread DONG Weike
Hi,

It is thrilling to see the upcoming rollouts of Flink 2.x releases, and I
believe this roadmap can take Flink to the next stage as a top-notch
unified streaming & batch computing engine.

Given that all existing user programs are written and run on Flink 1.x
versions for now, and some of them are very complex and rely on various
third-party connectors written with legacy APIs, one concern I have is this:
if, one day in the future, the community decides that new features are only
given to 2.x releases, could the last release of Flink 1.x be converted into
an LTS version (backporting severe bug fixes and critical security patches)?
That would give existing users enough time to wait for third-party connectors
to upgrade, test their programs on the Flink APIs, and avoid a sudden loss of
community support.

Just my two cents : )

Best,
Weike


From: Xintong Song 
Sent: April 26, 2023, 20:01
To: dev 
Subject: Re: [DISCUSS] Planning Flink 2.0

@Chesnay


> Technically this implies that every minor release may contain breaking
> changes, which is exactly what users don't want.


It's not necessary to introduce the breaking changes immediately upon
reaching the minimum guaranteed stable time. If there are multiple changes
waiting for the stable time, we can still gather them in 1 minor release.
But I see your point, from the user's perspective, the mechanism does not
provide any guarantees for the compatibility of minor releases.

What problems do you see in creating major releases every N years?
>

It might not be a concrete problem, but I'm a bit concerned by the
uncertainty. I assume N should not be too small, e.g., at least 3. I'd
expect the decision to ship a major release would be made based on
comprehensive considerations of the situation at that time. Making a
decision now that we would ship a major release 3 years later seems a bit
aggressive to me.

We need to figure out what this release means for connectors
> compatibility-wise.
>

+1


> What process are you thinking of for deciding what breaking changes to
> make? The obvious choice would be FLIPs, but I'm worried that this will
> overload the mailing list / wiki for lots of tiny changes.
>

This should be a community decision. What I have in mind would be: (1)
collect a wish list on wiki, (2) schedule a series of online meetings (like
the release syncs) to get an agreed set of must-have items, (3) develop and
polish the detailed plans of items via FLIPs, and (4) if the plan for a
must-have item does not work out then go back to (2) for an update. I'm
also open to other opinions.

Would we wait a few months for people to prepare/agree on changes so we
> reduce the time we need to merge things into 2 branches?
>

That's what I had in mind. Hopefully after 1.18.

@Max

When I look at
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit
> , I'm a bit skeptical we will even be able to reach all these goals. I
> think we have to prioritize and try to establish a deadline. Otherwise we
> will end up never releasing 2.0.


Sorry for the confusion. I should have explained this more clearly. We are
not planning to finish all the items in the list. It's more like a
brainstorm, a list of candidates. We are also expecting to collect more
ideas from the community. And after collecting the ideas, we should
prioritize them and decide on a subset of must-have items, following the
consensus decision making.

+1 on Flink 2.0 by May 2024 (not a hard deadline but I think having a
> deadline helps).
>

I agree that having a deadline helps. I proposed mid 2024, which is similar
to but not as explicit as what you proposed. We may start with having a
deadline for deciding the must-have items (e.g., by the end of June). That
should make it easier to estimate the overall time needed for preparing
the release.

Best,

Xintong



On Wed, Apr 26, 2023 at 6:57 PM Gyula Fóra  wrote:

> +1 to everything Max said.
>
> Gyula
>
> On Wed, 26 Apr 2023 at 11:42, Maximilian Michels  wrote:
>
> > Thanks for starting the discussion, Jark and Xintong!
> >
> > Flink 2.0 is long overdue. In the past, the expectations for such a
> > release were unreasonably high. I think everybody had a different
> > understanding of what exactly the criteria were. This led to releasing
> > 18 minor releases for the current major version.
> >
> > What I'm most excited about for Flink 2.0 is removal of baggage that
> > Flink has accumulated over the years:
> >
> > - Removal of Scala, deprecated interfaces, unmaintained libraries and
> > APIs (DataSet)
> > - Consolidation of configuration
> > - Merging of multiple scheduler implementations
> > - Ability to freely combine batch / streaming tasks in the runtime
> >
> > When 

Re: [DISCUSS] Planning Flink 2.0

2023-04-26 Thread Xintong Song
@Chesnay


> Technically this implies that every minor release may contain breaking
> changes, which is exactly what users don't want.


It's not necessary to introduce the breaking changes immediately upon
reaching the minimum guaranteed stable time. If there are multiple changes
waiting for the stable time, we can still gather them in 1 minor release.
But I see your point, from the user's perspective, the mechanism does not
provide any guarantees for the compatibility of minor releases.

What problems do you see in creating major releases every N years?
>

It might not be a concrete problem, but I'm a bit concerned by the
uncertainty. I assume N should not be too small, e.g., at least 3. I'd
expect the decision to ship a major release would be made based on
comprehensive considerations of the situation at that time. Making a
decision now that we would ship a major release 3 years later seems a bit
aggressive to me.

We need to figure out what this release means for connectors
> compatibility-wise.
>

+1


> What process are you thinking of for deciding what breaking changes to
> make? The obvious choice would be FLIPs, but I'm worried that this will
> overload the mailing list / wiki for lots of tiny changes.
>

This should be a community decision. What I have in mind would be: (1)
collect a wish list on wiki, (2) schedule a series of online meetings (like
the release syncs) to get an agreed set of must-have items, (3) develop and
polish the detailed plans of items via FLIPs, and (4) if the plan for a
must-have item does not work out then go back to (2) for an update. I'm
also open to other opinions.

Would we wait a few months for people to prepare/agree on changes so we
> reduce the time we need to merge things into 2 branches?
>

That's what I had in mind. Hopefully after 1.18.

@Max

When I look at
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit
> , I'm a bit skeptical we will even be able to reach all these goals. I
> think we have to prioritize and try to establish a deadline. Otherwise we
> will end up never releasing 2.0.


Sorry for the confusion. I should have explained this more clearly. We are
not planning to finish all the items in the list. It's more like a
brainstorm, a list of candidates. We are also expecting to collect more
ideas from the community. And after collecting the ideas, we should
prioritize them and decide on a subset of must-have items, following the
consensus decision making.

+1 on Flink 2.0 by May 2024 (not a hard deadline but I think having a
> deadline helps).
>

I agree that having a deadline helps. I proposed mid 2024, which is similar
to but not as explicit as what you proposed. We may start with having a
deadline for deciding the must-have items (e.g., by the end of June). That
should make it easier to estimate the overall time needed for preparing
the release.

Best,

Xintong



On Wed, Apr 26, 2023 at 6:57 PM Gyula Fóra  wrote:

> +1 to everything Max said.
>
> Gyula
>
> On Wed, 26 Apr 2023 at 11:42, Maximilian Michels  wrote:
>
> > Thanks for starting the discussion, Jark and Xintong!
> >
> > Flink 2.0 is long overdue. In the past, the expectations for such a
> > release were unreasonably high. I think everybody had a different
> > understanding of what exactly the criteria were. This led to releasing
> > 18 minor releases for the current major version.
> >
> > What I'm most excited about for Flink 2.0 is removal of baggage that
> > Flink has accumulated over the years:
> >
> > - Removal of Scala, deprecated interfaces, unmaintained libraries and
> > APIs (DataSet)
> > - Consolidation of configuration
> > - Merging of multiple scheduler implementations
> > - Ability to freely combine batch / streaming tasks in the runtime
> >
> > When I look at
> >
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit
> > , I'm a bit skeptical we will even be able to reach all these goals. I
> > think we have to prioritize and try to establish a deadline. Otherwise
> > we will end up never releasing 2.0.
> >
> > +1 on Flink 2.0 by May 2024 (not a hard deadline but I think having a
> > deadline helps).
> >
> > -Max
> >
> >
> > On Wed, Apr 26, 2023 at 10:08 AM Chesnay Schepler 
> > wrote:
> > >
> > >  > /Instead of defining compatibility guarantees as "this API won't
> > > change in all 1.x/2.x series", what if we define it as "this API won't
> > > change in the next 2/3 years"./
> > >
> > > I can see some benefits to this approach (all APIs having a fixed
> > > minimum lifetime) but it's just gonna be difficult to communicate.
> > > Technically this implies that every minor release may contain breaking
> > > changes, which is exactly what users don't want.
> > >
> > > What problems do you see in creating major releases every N years?
> > >
> > >  > /IIUC, the milestone releases are a breakdown of the 2.0 release,
> > > while we are free to introduce breaking changes between them. And you
> > > 

Re: [DISCUSS] Planning Flink 2.0

2023-04-26 Thread Gyula Fóra
+1 to everything Max said.

Gyula

On Wed, 26 Apr 2023 at 11:42, Maximilian Michels  wrote:

> Thanks for starting the discussion, Jark and Xintong!
>
> Flink 2.0 is long overdue. In the past, the expectations for such a
> release were unreasonably high. I think everybody had a different
> understanding of what exactly the criteria were. This led to releasing
> 18 minor releases for the current major version.
>
> What I'm most excited about for Flink 2.0 is removal of baggage that
> Flink has accumulated over the years:
>
> - Removal of Scala, deprecated interfaces, unmaintained libraries and
> APIs (DataSet)
> - Consolidation of configuration
> - Merging of multiple scheduler implementations
> - Ability to freely combine batch / streaming tasks in the runtime
>
> When I look at
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit
> , I'm a bit skeptical we will even be able to reach all these goals. I
> think we have to prioritize and try to establish a deadline. Otherwise
> we will end up never releasing 2.0.
>
> +1 on Flink 2.0 by May 2024 (not a hard deadline but I think having a
> deadline helps).
>
> -Max
>
>
> On Wed, Apr 26, 2023 at 10:08 AM Chesnay Schepler 
> wrote:
> >
> >  > /Instead of defining compatibility guarantees as "this API won't
> > change in all 1.x/2.x series", what if we define it as "this API won't
> > change in the next 2/3 years"./
> >
> > I can see some benefits to this approach (all APIs having a fixed
> > minimum lifetime) but it's just gonna be difficult to communicate.
> > Technically this implies that every minor release may contain breaking
> > changes, which is exactly what users don't want.
> >
> > What problems do you see in creating major releases every N years?
> >
> >  > /IIUC, the milestone releases are a breakdown of the 2.0 release,
> > while we are free to introduce breaking changes between them. And you
> > suggest using longer-living feature branches to keep the master branch
> > in a releasable state (in terms of milestone releases). Am I
> > understanding it correctly?/
> >
> > I think you got the general idea. There are a lot of details to be
> > ironed out though (e.g., do we release connectors for each
> milestone?...).
> >
> > Conflicts in the long-lived branches are certainly a concern, but I
> > think those will be inevitable. Right now I'm not _too_ worried about
> > them, at least based on my personal wish-list.
> > Maybe the milestones could even help with that, as we could preemptively
> > decide on an order for certain changes that have a high chance of
> > conflicting with each other?
> > I guess we could do that anyway.
> > Maybe we should explicitly evaluate how invasive a change is (in
> > relation to other breaking changes!) and manage things accordingly
> >
> >
> > Other thoughts:
> >
> > We need to figure out what this release means for connectors
> > compatibility-wise. The current rules for which versions a connector
> > must support don't cover major releases at all.
> > (This depends a bit on the scope of 2.0; if we add binary compatibility
> > to Public APIs and promote a few Evolving ones then compatibility across
> > minor releases becomes trivial)
> >
> > What process are you thinking of for deciding what breaking changes to
> > make? The obvious choice would be FLIPs, but I'm worried that this will
> > overload the mailing list / wiki for lots of tiny changes.
> >
> > Provided that we agree on doing 2.0, when would we cut the 2.0 branch?
> > Would we wait a few months for people to prepare/agree on changes so we
> > reduce the time we need to merge things into 2 branches?
> >
> > On 26/04/2023 05:51, Xintong Song wrote:
> > > Thanks all for the positive feedback.
> > >
> > > @Martijn
> > >
> > > If we want to have that roadmap, should we consolidate this into a
> > >> dedicated Confluence page over storing it in a Google doc?
> > >>
> > > Having a dedicated wiki page is definitely a good way for the roadmap
> > > discussion. I haven't created one yet because it's still a proposal to
> have
> > > such roadmap discussion. If the community agrees with our proposal, the
> > > release manager team can decide how they want to drive and track the
> > > roadmap discussion.
> > >
> > > @Chesnay
> > >
> > > We should discuss how regularly we will ship major releases from now
> on.
> > >> Let's avoid again making breaking changes because we "gotta do it now
> > >> because 3.0 isn't happening anytime soon". (e.g., every 2 years or
> > >> something)
> > >
> > > I'm not entirely sure about shipping major releases regularly. But I do
> > > agree that we may want to avoid the situation that "breaking changes
> can
> > > only happen now, or no idea when". Instead of defining compatibility
> > > guarantees as "this API won't change in all 1.x/2.x series", what if we
> > > define it as "this API won't change in the next 2/3 years". That should
> > > allow us to incrementally iterate the APIs.
> > >
> > > E.g., 

Re: [DISCUSS] Planning Flink 2.0

2023-04-26 Thread Maximilian Michels
Thanks for starting the discussion, Jark and Xintong!

Flink 2.0 is long overdue. In the past, the expectations for such a
release were unreasonably high. I think everybody had a different
understanding of what exactly the criteria were. This led to releasing
18 minor releases for the current major version.

What I'm most excited about for Flink 2.0 is removal of baggage that
Flink has accumulated over the years:

- Removal of Scala, deprecated interfaces, unmaintained libraries and
APIs (DataSet)
- Consolidation of configuration
- Merging of multiple scheduler implementations
- Ability to freely combine batch / streaming tasks in the runtime

When I look at 
https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit
, I'm a bit skeptical we will even be able to reach all these goals. I
think we have to prioritize and try to establish a deadline. Otherwise
we will end up never releasing 2.0.

+1 on Flink 2.0 by May 2024 (not a hard deadline but I think having a
deadline helps).

-Max


On Wed, Apr 26, 2023 at 10:08 AM Chesnay Schepler  wrote:
>
>  > /Instead of defining compatibility guarantees as "this API won't
> change in all 1.x/2.x series", what if we define it as "this API won't
> change in the next 2/3 years"./
>
> I can see some benefits to this approach (all APIs having a fixed
> minimum lifetime) but it's just gonna be difficult to communicate.
> Technically this implies that every minor release may contain breaking
> changes, which is exactly what users don't want.
>
> What problems do you see in creating major releases every N years?
>
>  > /IIUC, the milestone releases are a breakdown of the 2.0 release,
> while we are free to introduce breaking changes between them. And you
> suggest using longer-living feature branches to keep the master branch
> in a releasable state (in terms of milestone releases). Am I
> understanding it correctly?/
>
> I think you got the general idea. There are a lot of details to be
> ironed out though (e.g., do we release connectors for each milestone?...).
>
> Conflicts in the long-lived branches are certainly a concern, but I
> think those will be inevitable. Right now I'm not _too_ worried about
> them, at least based on my personal wish-list.
> Maybe the milestones could even help with that, as we could preemptively
> decide on an order for certain changes that have a high chance of
> conflicting with each other?
> I guess we could do that anyway.
> Maybe we should explicitly evaluate how invasive a change is (in
> relation to other breaking changes!) and manage things accordingly
>
>
> Other thoughts:
>
> We need to figure out what this release means for connectors
> compatibility-wise. The current rules for which versions a connector
> must support don't cover major releases at all.
> (This depends a bit on the scope of 2.0; if we add binary compatibility
> to Public APIs and promote a few Evolving ones then compatibility across
> minor releases becomes trivial)
>
> What process are you thinking of for deciding what breaking changes to
> make? The obvious choice would be FLIPs, but I'm worried that this will
> overload the mailing list / wiki for lots of tiny changes.
>
> Provided that we agree on doing 2.0, when would we cut the 2.0 branch?
> Would we wait a few months for people to prepare/agree on changes so we
> reduce the time we need to merge things into 2 branches?
>
> On 26/04/2023 05:51, Xintong Song wrote:
> > Thanks all for the positive feedback.
> >
> > @Martijn
> >
> > If we want to have that roadmap, should we consolidate this into a
> >> dedicated Confluence page over storing it in a Google doc?
> >>
> > Having a dedicated wiki page is definitely a good way for the roadmap
> > discussion. I haven't created one yet because it's still a proposal to have
> > such roadmap discussion. If the community agrees with our proposal, the
> > release manager team can decide how they want to drive and track the
> > roadmap discussion.
> >
> > @Chesnay
> >
> > We should discuss how regularly we will ship major releases from now on.
> >> Let's avoid again making breaking changes because we "gotta do it now
> >> because 3.0 isn't happening anytime soon". (e.g., every 2 years or
> >> something)
> >
> > I'm not entirely sure about shipping major releases regularly. But I do
> > agree that we may want to avoid the situation that "breaking changes can
> > only happen now, or no idea when". Instead of defining compatibility
> > guarantees as "this API won't change in all 1.x/2.x series", what if we
> > define it as "this API won't change in the next 2/3 years". That should
> > allow us to incrementally iterate the APIs.
> >
> > E.g., in 2.a, all APIs annotated as `@Stable` will be guaranteed compatible
> > until 2 years after 2.a is shipped, and in 2.b if the API is still
> > annotated `@Stable` it extends the compatibility guarantee to 2 years after
> > 2.b is shipped. To remove an API, we would need to mark it as `@Deprecated`
> > and 

Re: [DISCUSS] Planning Flink 2.0

2023-04-26 Thread Chesnay Schepler
> /Instead of defining compatibility guarantees as "this API won't 
change in all 1.x/2.x series", what if we define it as "this API won't 
change in the next 2/3 years"./


I can see some benefits to this approach (all APIs having a fixed 
minimum lifetime) but it's just gonna be difficult to communicate. 
Technically this implies that every minor release may contain breaking 
changes, which is exactly what users don't want.


What problems do you see in creating major releases every N years?

> /IIUC, the milestone releases are a breakdown of the 2.0 release, 
while we are free to introduce breaking changes between them. And you 
suggest using longer-living feature branches to keep the master branch 
in a releasable state (in terms of milestone releases). Am I 
understanding it correctly?/


I think you got the general idea. There are a lot of details to be 
ironed out though (e.g., do we release connectors for each milestone?...).


Conflicts in the long-lived branches are certainly a concern, but I 
think those will be inevitable. Right now I'm not _too_ worried about 
them, at least based on my personal wish-list.
Maybe the milestones could even help with that, as we could preemptively 
decide on an order for certain changes that have a high chance of 
conflicting with each other?

I guess we could do that anyway.
Maybe we should explicitly evaluate how invasive a change is (in 
relation to other breaking changes!) and manage things accordingly



Other thoughts:

We need to figure out what this release means for connectors 
compatibility-wise. The current rules for which versions a connector 
must support don't cover major releases at all.
(This depends a bit on the scope of 2.0; if we add binary compatibility 
to Public APIs and promote a few Evolving ones then compatibility across 
minor releases becomes trivial)


What process are you thinking of for deciding what breaking changes to 
make? The obvious choice would be FLIPs, but I'm worried that this will 
overload the mailing list / wiki for lots of tiny changes.


Provided that we agree on doing 2.0, when would we cut the 2.0 branch? 
Would we wait a few months for people to prepare/agree on changes so we 
reduce the time we need to merge things into 2 branches?


On 26/04/2023 05:51, Xintong Song wrote:

Thanks all for the positive feedback.

@Martijn

If we want to have that roadmap, should we consolidate this into a

dedicated Confluence page over storing it in a Google doc?


Having a dedicated wiki page is definitely a good way for the roadmap
discussion. I haven't created one yet because it's still a proposal to have
such roadmap discussion. If the community agrees with our proposal, the
release manager team can decide how they want to drive and track the
roadmap discussion.

@Chesnay

We should discuss how regularly we will ship major releases from now on.

Let's avoid again making breaking changes because we "gotta do it now
because 3.0 isn't happening anytime soon". (e.g., every 2 years or
something)


I'm not entirely sure about shipping major releases regularly. But I do
agree that we may want to avoid the situation that "breaking changes can
only happen now, or no idea when". Instead of defining compatibility
guarantees as "this API won't change in all 1.x/2.x series", what if we
define it as "this API won't change in the next 2/3 years". That should
allow us to incrementally iterate the APIs.

E.g., in 2.a, all APIs annotated as `@Stable` will be guaranteed compatible
until 2 years after 2.a is shipped, and in 2.b if the API is still
annotated `@Stable` it extends the compatibility guarantee to 2 years after
2.b is shipped. To remove an API, we would need to mark it as `@Deprecated`
and wait for 2 years after the last release in which it was marked
`@Stable`.

My thinking goes rather in the area of defining Milestone releases, each

Milestone targeting specific changes.


I'm trying to understand what you are suggesting here. IIUC, the milestone
releases are a breakdown of the 2.0 release, while we are free to introduce
breaking changes between them. And you suggest using longer-living feature
branches to keep the master branch in a releasable state (in terms of
milestone releases). Am I understanding it correctly?

I haven't thought this through. My gut feeling is this might be a good
direction to go, in terms of keeping things organized. The risk is the cost
of merging feature branches and rebasing feature branches after other
features are merged. That depends on how closely the features are related to
each other. E.g., reorganization of the project modules and dependencies
may change the project structure a lot, which may significantly affect most
of the feature branches. Maybe we can identify such widely-affecting
changes and perform them at the beginning or end of the release cycle.

Best,

Xintong



On Wed, Apr 26, 2023 at 8:23 AM ConradJam  wrote:


Thanks Xintong and Jark for kicking off the great discussion!

I checked 

Re: [DISCUSS] Planning Flink 2.0

2023-04-25 Thread Xintong Song
Thanks all for the positive feedback.

@Martijn

If we want to have that roadmap, should we consolidate this into a
> dedicated Confluence page over storing it in a Google doc?
>

Having a dedicated wiki page is definitely a good way for the roadmap
discussion. I haven't created one yet because it's still a proposal to have
such roadmap discussion. If the community agrees with our proposal, the
release manager team can decide how they want to drive and track the
roadmap discussion.

@Chesnay

We should discuss how regularly we will ship major releases from now on.
> Let's avoid again making breaking changes because we "gotta do it now
> because 3.0 isn't happening anytime soon". (e.g., every 2 years or
> something)


I'm not entirely sure about shipping major releases regularly. But I do
agree that we may want to avoid the situation that "breaking changes can
only happen now, or no idea when". Instead of defining compatibility
guarantees as "this API won't change in all 1.x/2.x series", what if we
define it as "this API won't change in the next 2/3 years". That should
allow us to incrementally iterate the APIs.

E.g., in 2.a, all APIs annotated as `@Stable` will be guaranteed compatible
until 2 years after 2.a is shipped, and in 2.b if the API is still
annotated `@Stable` it extends the compatibility guarantee to 2 years after
2.b is shipped. To remove an API, we would need to mark it as `@Deprecated`
and wait for 2 years after the last release in which it was marked
`@Stable`.
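The time-based guarantee described in this proposal could be sketched as an annotation that carries its own compatibility window. This is a hypothetical illustration only, not Flink's actual annotation set (Flink's real annotations such as `@Public` and `@PublicEvolving` carry no time component); the names `StabilitySketch`, `Stable`, and `guaranteedYears` are invented for the example.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical sketch of a time-based stability guarantee. As long as a
// release ships a type with @Stable, the clock restarts: compatibility is
// guaranteed until guaranteedYears after that release. Removing the API
// would require dropping @Stable (deprecating it) and then waiting
// guaranteedYears past the last release where it was still @Stable.
public class StabilitySketch {

    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Stable {
        /** Minimum years of compatibility after the release carrying this annotation. */
        int guaranteedYears() default 2;
    }

    @Stable(guaranteedYears = 2)
    static class ExampleApi {}

    public static void main(String[] args) {
        // Read the window back reflectively, e.g. for tooling that checks
        // whether an API is already eligible for removal.
        Stable s = ExampleApi.class.getAnnotation(Stable.class);
        System.out.println("guaranteed years: " + s.guaranteedYears());
    }
}
```

In the "2.a / 2.b" example above, shipping 2.b with the annotation still present would simply restart the two-year window from 2.b's release date.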

My thinking goes rather in the area of defining Milestone releases, each
> Milestone targeting specific changes.
>

I'm trying to understand what you are suggesting here. IIUC, the milestone
releases are a breakdown of the 2.0 release, while we are free to introduce
breaking changes between them. And you suggest using longer-living feature
branches to keep the master branch in a releasable state (in terms of
milestone releases). Am I understanding it correctly?

I haven't thought this through. My gut feeling is this might be a good
direction to go, in terms of keeping things organized. The risk is the cost
of merging feature branches and rebasing feature branches after other
features are merged. That depends on how closely the features are related to
each other. E.g., reorganization of the project modules and dependencies
may change the project structure a lot, which may significantly affect most
of the feature branches. Maybe we can identify such widely-affecting
changes and perform them at the beginning or end of the release cycle.

Best,

Xintong



On Wed, Apr 26, 2023 at 8:23 AM ConradJam  wrote:

> Thanks Xintong and Jark for kicking off the great discussion!
>
> I checked the list carefully. The plans are detailed and cover most of the
> problems.
> Regarding some of the ideas Chesnay mentioned, I think we should iterate in
> small steps and collect feedback in time.
> Looking forward to the start of the Flink 2.0 work; I am willing to
> provide assistance ~
>
> On Tue, Apr 25, 2023 at 19:10, Xintong Song wrote:
> >
> > Hi everyone,
> >
> > I'd like to start a discussion on planning for a Flink 2.0 release.
> >
> > AFAIK, in the past years this topic has been mentioned from time to time,
> > in mailing lists, jira tickets and offline discussions. However, few
> > concrete steps have been taken, due to the significant determination and
> > efforts it requires and distractions from other prioritized focuses.
> After
> a series of offline discussions in recent weeks, with folks mostly
> from
> > our team internally as well as a few from outside Alibaba / Ververica
> > (thanks for insights from Becket and Robert), we believe it's time to
> kick
> > this off in the community.
> >
> > Below are some of our thoughts about the 2.0 release. Looking forward to
> > your opinions and feedback.
> >
> >
> > ## Why plan for release 2.0?
> >
> >
> > Flink 1.0.0 was released in March 2016. In the past 7 years, many new
> > features have been added and the project has become different from what
> it
> > used to be. So what is Flink now? What will it become in the next 3-5
> > years? What do we think of Flink's position in the industry? We believe
> > it's time to rethink these questions, and draw a roadmap towards another
> > milestone, a milestone that warrants a new major release.
> >
> >
> > In addition, we are still providing backwards compatibility (maybe not
> > perfectly but largely) with APIs that we designed and claimed stable 7
> > years ago. While such backwards compatibility helps users to stick with
> the
> > latest Flink releases more easily, it sometimes, and more and more over
> > time, also becomes a burden for maintenance and a limitation for new
> > features and improvements. It's probably time to have a comprehensive
> > review and clean-up over all the public APIs.
> >
> >
> > Furthermore, next year is the 10th year for Flink as an Apache project.
> > Flink joined the Apache incubator in April 2014, and became a top-level
> > project in 

Re: [DISCUSS] Planning Flink 2.0

2023-04-25 Thread ConradJam
Thanks Xintong and Jark for kicking off the great discussion!

I checked the list carefully. The plans are detailed and most of the
problems are covered. As for some of the ideas Chesnay mentioned, I think
we should iterate in small steps and collect feedback in time.
Looking forward to the start of the work on Flink 2.0; I am willing to
provide assistance ~

Xintong Song  wrote on Tue, Apr 25, 2023 at 19:10:
>
> Hi everyone,
>
> I'd like to start a discussion on planning for a Flink 2.0 release.
>
> AFAIK, in the past years this topic has been mentioned from time to time,
> in mailing lists, jira tickets and offline discussions. However, few
> concrete steps have been taken, due to the significant determination and
> efforts it requires and distractions from other prioritized focuses. After
> a series of offline discussions in the recent weeks, with folks mostly from
> our team internally as well as a few from outside Alibaba / Ververica
> (thanks for insights from Becket and Robert), we believe it's time to kick
> this off in the community.
>
> Below are some of our thoughts about the 2.0 release. Looking forward to
> your opinions and feedback.
>
>
> ## Why plan for release 2.0?
>
>
> Flink 1.0.0 was released in March 2016. In the past 7 years, many new
> features have been added and the project has become different from what it
> used to be. So what is Flink now? What will it become in the next 3-5
> years? What do we think of Flink's position in the industry? We believe
> it's time to rethink these questions, and draw a roadmap towards another
> milestone, a milestone that warrants a new major release.
>
>
> In addition, we are still providing backwards compatibility (maybe not
> perfectly but largely) with APIs that we designed and claimed stable 7
> years ago. While such backwards compatibility helps users to stick with the
> latest Flink releases more easily, it sometimes, and more and more over
> time, also becomes a burden for maintenance and a limitation for new
> features and improvements. It's probably time to have a comprehensive
> review and clean-up over all the public APIs.
>
>
> Furthermore, next year is the 10th year for Flink as an Apache project.
> Flink joined the Apache incubator in April 2014, and became a top-level
> project in December 2014. That makes 2024 a perfect time for bringing out
> the release 2.0 milestone. And for such a major release, we'd expect it
> takes one year or even longer to prepare for, which means we probably
> should start now.
>
>
> ## What should we focus on in release 2.0?
>
>
>- Roadmap discussion - How do we define and position Flink for now and
>in future? This is probably something we lacked. I believe some people have
>thought about it, but at least it's not explicitly discussed and aligned in
>the community. Ideally, the 2.0 release should be a result of the roadmap.
>- Breaking changes - Important improvements, bugfixes, technical debts
>that involve breaking of API backwards compatibility, which can only be
>carried out in major releases.
>   - With breaking API changes, we may need multiple 2.0-alpha/beta
>   versions to collect feedback.
>- Key features - Significant features and improvements (e.g., new user
>stories, architectural upgrades) that may change how users use Flink and
>its position in the industry. Some items from this category may also
>involve API breaking changes or significant behavior changes.
>   - There are also opinions that we should stay focused as much as
>   possible on the breaking changes only. Incremental / non-breaking
>   improvements and features, or anything that can be added in 2.x minor
>   releases, should not block the 2.0 release.
>
> It might be better to discuss the detailed technical items later in another
> thread, to keep the current discussion focused on the overall proposal, and
> to leave time for all parties to think about their technical plans. For
> your reference, I've attached a preliminary list of work items proposed by
> Alibaba / Ververica [1]. Note that the listed items are still being
> carefully evaluated and prioritized, and may change in the future.
>
>
> ## How do we manage the release?
>
>
>  Release Process
>
>
> We'd expect the release process for Flink 2.0 to be different from the 1.x
> releases.
>
>
> A major difference is that, we think the timeline-based release management
> may not be suitable. The idea behind the timeline-based approach is that we
> can have more frequent releases and deliver completed features to users
> earlier, while incomplete features can be postponed to the next release
> which won't be too late with the short release cycle. However, for breaking
> changes that can only take place in major releases, the price for missing a
> release is too high.
>
>
> Alternatively, we probably should discuss and agree on a list of must-have
> work items. That doesn't mean we should keep postponing the release upon
> a few delayed 

Re: [DISCUSS] Planning Flink 2.0

2023-04-25 Thread Jing Ge
Thanks Xintong and Jark for kicking off and driving the discussion! It is
really good to see we finally start talking about Flink 2.0. There are so
many great ideas that require breaking API changes, and so many tech debts
that need to be cleaned up. With Flink 2.0 ahead, we can move at a faster
pace and bring Flink to the next level. +1 for your proposal.

Best regards,
Jing



On Tue, Apr 25, 2023 at 3:55 PM Chesnay Schepler  wrote:

> This is definitely a good discussion to have.
>
> Some thoughts:
>
> One aspect that wasn't mentioned is what this release means going
> forward. I already waited a decade for 2.0; don't really want to wait
> another one to see Flink 3.0.
> We should discuss how regularly we will ship major releases from now on.
> Let's avoid again making breaking changes because we "gotta do it now
> because 3.0 isn't happening anytime soon".
> (e.g., every 2 years or something)
>
> Related to that we need to figure out how long 1.x will be supported and
> in what way (features+patches vs only patches).
>
> The timeline/branch/release-manager bits sound good to me.
>
>  > There are also opinions that we should stay focused as much as
>  > possible on the breaking changes only. Incremental / non-breaking
>  > improvements and features, or anything that can be added in 2.x minor
>  > releases, should not block the 2.0 release.
>
> I would definitely agree with this. I'd much rather focus on resolving
> technical debt and setting us up for improvements later than trying to
> tackle both at the same time.
> The "marketing perspective" of having big key features to me just
> doesn't make sense considering what features we shipped with 1.x
> releases in the past years.
> If that means 2.0 comes along faster, then that's a bonus in my book.
> We may of course ship features (e.g., Java 17 which basically comes for
> free if we drop the Scala APIs), but they shouldn't be a focus.
>
>  > With breaking API changes, we may need multiple 2.0-alpha/beta
>  > versions to collect feedback.
>
> Personally I wouldn't even aim for a big 2.0 release. I think that will
> become quite a mess and very difficult to actually get feedback on.
> My thinking goes rather in the area of defining Milestone releases, each
> Milestone targeting specific changes.
> For example, one milestone could clean up the REST API (+ X,Y,Z), while
> another removes deprecated APIs, etc etc.
> Depending on the scope we could iterate quite fast on these.
> (Note that I haven't thought this through yet from the dev workflow
> perspective, but it'd likely require longer-living feature branches)
>
> There are some clear benefits to this approach; if we'd drop deprecated
> APIs in M1 then we could already offer users a version of Flink that
> works with Java 17.
>
> On 25/04/2023 13:09, Xintong Song wrote:
> > Hi everyone,
> >
> > I'd like to start a discussion on planning for a Flink 2.0 release.
> >
> > AFAIK, in the past years this topic has been mentioned from time to time,
> > in mailing lists, jira tickets and offline discussions. However, few
> > concrete steps have been taken, due to the significant determination and
> > efforts it requires and distractions from other prioritized focuses.
> After
> > a series of offline discussions in the recent weeks, with folks mostly
> from
> > our team internally as well as a few from outside Alibaba / Ververica
> > (thanks for insights from Becket and Robert), we believe it's time to
> kick
> > this off in the community.
> >
> > Below are some of our thoughts about the 2.0 release. Looking forward to
> > your opinions and feedback.
> >
> >
> > ## Why plan for release 2.0?
> >
> >
> > Flink 1.0.0 was released in March 2016. In the past 7 years, many new
> > features have been added and the project has become different from what
> it
> > used to be. So what is Flink now? What will it become in the next 3-5
> > years? What do we think of Flink's position in the industry? We believe
> > it's time to rethink these questions, and draw a roadmap towards another
> > milestone, a milestone that warrants a new major release.
> >
> >
> > In addition, we are still providing backwards compatibility (maybe not
> > perfectly but largely) with APIs that we designed and claimed stable 7
> > years ago. While such backwards compatibility helps users to stick with
> the
> > latest Flink releases more easily, it sometimes, and more and more over
> > time, also becomes a burden for maintenance and a limitation for new
> > features and improvements. It's probably time to have a comprehensive
> > review and clean-up over all the public APIs.
> >
> >
> > Furthermore, next year is the 10th year for Flink as an Apache project.
> > Flink joined the Apache incubator in April 2014, and became a top-level
> > project in December 2014. That makes 2024 a perfect time for bringing out
> > the release 2.0 milestone. And for such a major release, we'd expect it
> > takes one year or even longer to prepare for, which means we probably

Re: [DISCUSS] Planning Flink 2.0

2023-04-25 Thread Chesnay Schepler

This is definitely a good discussion to have.

Some thoughts:

One aspect that wasn't mentioned is what this release means going 
forward. I already waited a decade for 2.0; don't really want to wait 
another one to see Flink 3.0.
We should discuss how regularly we will ship major releases from now on. 
Let's avoid again making breaking changes because we "gotta do it now 
because 3.0 isn't happening anytime soon".

(e.g., every 2 years or something)

Related to that we need to figure out how long 1.x will be supported and 
in what way (features+patches vs only patches).


The timeline/branch/release-manager bits sound good to me.

> There are also opinions that we should stay focused as much as
possible on the breaking changes only. Incremental / non-breaking
improvements and features, or anything that can be added
in 2.x minor releases, should not block the 2.0 release.


I would definitely agree with this. I'd much rather focus on resolving 
technical debt and setting us up for improvements later than trying to 
tackle both at the same time.
The "marketing perspective" of having big key features to me just 
doesn't make sense considering what features we shipped with 1.x 
releases in the past years.

If that means 2.0 comes along faster, then that's a bonus in my book.
We may of course ship features (e.g., Java 17 which basically comes for 
free if we drop the Scala APIs), but they shouldn't be a focus.


> With breaking API changes, we may need multiple 2.0-alpha/beta
versions to collect feedback.


Personally I wouldn't even aim for a big 2.0 release. I think that will 
become quite a mess and very difficult to actually get feedback on.
My thinking goes rather in the area of defining Milestone releases, each 
Milestone targeting specific changes.
For example, one milestone could clean up the REST API (+ X,Y,Z), while 
another removes deprecated APIs, etc etc.

Depending on the scope we could iterate quite fast on these.
(Note that I haven't thought this through yet from the dev workflow 
perspective, but it'd likely require longer-living feature branches)


There are some clear benefits to this approach; if we'd drop deprecated 
APIs in M1 then we could already offer users a version of Flink that 
works with Java 17.


On 25/04/2023 13:09, Xintong Song wrote:

Hi everyone,

I'd like to start a discussion on planning for a Flink 2.0 release.

AFAIK, in the past years this topic has been mentioned from time to time,
in mailing lists, jira tickets and offline discussions. However, few
concrete steps have been taken, due to the significant determination and
efforts it requires and distractions from other prioritized focuses. After
a series of offline discussions in the recent weeks, with folks mostly from
our team internally as well as a few from outside Alibaba / Ververica
(thanks for insights from Becket and Robert), we believe it's time to kick
this off in the community.

Below are some of our thoughts about the 2.0 release. Looking forward to
your opinions and feedback.


## Why plan for release 2.0?


Flink 1.0.0 was released in March 2016. In the past 7 years, many new
features have been added and the project has become different from what it
used to be. So what is Flink now? What will it become in the next 3-5
years? What do we think of Flink's position in the industry? We believe
it's time to rethink these questions, and draw a roadmap towards another
milestone, a milestone that warrants a new major release.


In addition, we are still providing backwards compatibility (maybe not
perfectly but largely) with APIs that we designed and claimed stable 7
years ago. While such backwards compatibility helps users to stick with the
latest Flink releases more easily, it sometimes, and more and more over
time, also becomes a burden for maintenance and a limitation for new
features and improvements. It's probably time to have a comprehensive
review and clean-up over all the public APIs.


Furthermore, next year is the 10th year for Flink as an Apache project.
Flink joined the Apache incubator in April 2014, and became a top-level
project in December 2014. That makes 2024 a perfect time for bringing out
the release 2.0 milestone. And for such a major release, we'd expect it
takes one year or even longer to prepare for, which means we probably
should start now.


## What should we focus on in release 2.0?


- Roadmap discussion - How do we define and position Flink for now and
in future? This is probably something we lacked. I believe some people have
thought about it, but at least it's not explicitly discussed and aligned in
the community. Ideally, the 2.0 release should be a result of the roadmap.
- Breaking changes - Important improvements, bugfixes, technical debts
that involve breaking of API backwards compatibility, which can only be
carried out in major releases.
   - With breaking API changes, we may need multiple 2.0-alpha/beta
   versions to collect 

Re: [DISCUSS] Planning Flink 2.0

2023-04-25 Thread Martijn Visser
Hi all,

I think it's a great idea to have a concrete planning and roadmap
discussion on Flink 2.0. I've also thought about this topic previously and
would like to volunteer as one of the release managers.

A couple of initial thoughts:

* I'm assuming that as a desired outcome of this discussion thread, we
would like to get to a roadmap that the community thinks should be
pursued for a Flink 2.0
* If we want to have that roadmap, should we consolidate this into a
dedicated Confluence page rather than storing it in a Google doc? Or do you
propose that different contributors all write down their individual
thoughts on what they would like to see in a Flink 2.0?

Best regards,

Martijn

On Tue, Apr 25, 2023 at 1:54 PM Leonard Xu  wrote:

> Thanks Xintong and Jark for kicking off the great discussion!
>
> Time goes so fast; it is already almost the 10th anniversary of Flink as
> an Apache project. Although I haven't gone through the proposal carefully,
> +1 for the perfect release timing and the release manager candidates.
>
> Best,
> Leonard
>
> > Furthermore, next year is the 10th year for Flink as an Apache project.
> > Flink joined the Apache incubator in April 2014, and became a top-level
> > project in December 2014. That makes 2024 a perfect time for bringing out
> > the release 2.0 milestone.
>
>


Re: [DISCUSS] Planning Flink 2.0

2023-04-25 Thread Becket Qin
Hi Xintong and Jark,

Thanks for starting the discussion about Flink 2.0. This is indeed
something that people talk about all the time, but without material action
taken. It is good timing to kick off this effort, so we can bring Flink to
the next stage and move faster.

I'd also volunteer to be a release manager of the Flink 2.0 release.

Cheers,

Jiangjie (Becket) Qin

On Tue, Apr 25, 2023 at 7:53 PM Leonard Xu  wrote:

> Thanks Xintong and Jark for kicking off the great discussion!
>
> Time goes so fast; it is already almost the 10th anniversary of Flink as
> an Apache project. Although I haven't gone through the proposal carefully,
> +1 for the perfect release timing and the release manager candidates.
>
> Best,
> Leonard
>
> > Furthermore, next year is the 10th year for Flink as an Apache project.
> > Flink joined the Apache incubator in April 2014, and became a top-level
> > project in December 2014. That makes 2024 a perfect time for bringing out
> > the release 2.0 milestone.
>
>


Re: [DISCUSS] Planning Flink 2.0

2023-04-25 Thread Leonard Xu
Thanks Xintong and Jark for kicking off the great discussion!  

Time goes so fast; it is already almost the 10th anniversary of Flink as an
Apache project. Although I haven't gone through the proposal carefully, +1
for the perfect release timing and the release manager candidates.

Best,
Leonard

> Furthermore, next year is the 10th year for Flink as an Apache project.
> Flink joined the Apache incubator in April 2014, and became a top-level
> project in December 2014. That makes 2024 a perfect time for bringing out
> the release 2.0 milestone.



[DISCUSS] Planning Flink 2.0

2023-04-25 Thread Xintong Song
Hi everyone,

I'd like to start a discussion on planning for a Flink 2.0 release.

AFAIK, in the past years this topic has been mentioned from time to time,
in mailing lists, jira tickets and offline discussions. However, few
concrete steps have been taken, due to the significant determination and
efforts it requires and distractions from other prioritized focuses. After
a series of offline discussions in the recent weeks, with folks mostly from
our team internally as well as a few from outside Alibaba / Ververica
(thanks for insights from Becket and Robert), we believe it's time to kick
this off in the community.

Below are some of our thoughts about the 2.0 release. Looking forward to
your opinions and feedback.


## Why plan for release 2.0?


Flink 1.0.0 was released in March 2016. In the past 7 years, many new
features have been added and the project has become different from what it
used to be. So what is Flink now? What will it become in the next 3-5
years? What do we think of Flink's position in the industry? We believe
it's time to rethink these questions, and draw a roadmap towards another
milestone, a milestone that warrants a new major release.


In addition, we are still providing backwards compatibility (maybe not
perfectly but largely) with APIs that we designed and claimed stable 7
years ago. While such backwards compatibility helps users to stick with the
latest Flink releases more easily, it sometimes, and more and more over
time, also becomes a burden for maintenance and a limitation for new
features and improvements. It's probably time to have a comprehensive
review and clean-up over all the public APIs.


Furthermore, next year is the 10th year for Flink as an Apache project.
Flink joined the Apache incubator in April 2014, and became a top-level
project in December 2014. That makes 2024 a perfect time for bringing out
the release 2.0 milestone. And for such a major release, we'd expect it
takes one year or even longer to prepare for, which means we probably
should start now.


## What should we focus on in release 2.0?


   - Roadmap discussion - How do we define and position Flink for now and
   in future? This is probably something we lacked. I believe some people have
   thought about it, but at least it's not explicitly discussed and aligned in
   the community. Ideally, the 2.0 release should be a result of the roadmap.
   - Breaking changes - Important improvements, bugfixes, technical debts
   that involve breaking of API backwards compatibility, which can only be
   carried out in major releases.
  - With breaking API changes, we may need multiple 2.0-alpha/beta
  versions to collect feedback.
   - Key features - Significant features and improvements (e.g., new user
   stories, architectural upgrades) that may change how users use Flink and
   its position in the industry. Some items from this category may also
   involve API breaking changes or significant behavior changes.
  - There are also opinions that we should stay focused as much as
  possible on the breaking changes only. Incremental / non-breaking
  improvements and features, or anything that can be added in 2.x minor
  releases, should not block the 2.0 release.

It might be better to discuss the detailed technical items later in another
thread, to keep the current discussion focused on the overall proposal, and
to leave time for all parties to think about their technical plans. For
your reference, I've attached a preliminary list of work items proposed by
Alibaba / Ververica [1]. Note that the listed items are still being
carefully evaluated and prioritized, and may change in the future.


## How do we manage the release?


 Release Process


We'd expect the release process for Flink 2.0 to be different from the 1.x
releases.


A major difference is that, we think the timeline-based release management
may not be suitable. The idea behind the timeline-based approach is that we
can have more frequent releases and deliver completed features to users
earlier, while incomplete features can be postponed to the next release
which won't be too late with the short release cycle. However, for breaking
changes that can only take place in major releases, the price for missing a
release is too high.


Alternatively, we probably should discuss and agree on a list of must-have
work items. That doesn't mean we should keep postponing the release upon a
few delayed features. In fact, we would need to closely monitor the progress of
the must-have items during the entire release cycle, making sure they are
taken care of by contributors with enough expertise and capacities.


 Timeline


The release cycle should be decided according to the feature list,
especially the must-have items that we plan to do in the release. However,
a target feature freeze date would still be helpful when making the plan,
so that we don't pack too many things into the release. We propose to aim
for a feature freeze around mid 2024, so that