Hi ConradJam,

I think Chesnay has already put his name as the Contributor for the two
tasks you listed. Maybe you can reach out to him to see if you can
collaborate on this.

In general, I don't think contributing to a release 2.0 issue is much
different from contributing to a regular issue. We haven't yet created JIRA
tickets for all the listed tasks because many of them need further
discussion and / or FLIPs to decide whether and how they should be
done.

Best,

Xintong



On Mon, Jul 3, 2023 at 10:37 PM ConradJam <jam.gz...@gmail.com> wrote:

> Hi Community:
>   I see some tasks in the 2.0 list that haven't been assigned yet, and I'd
> like to take on some that I can complete. How do I ask the community to
> assign them to me? I am interested in the following parts of FLINK-32377
> <https://issues.apache.org/jira/browse/FLINK-32377>. Do I need to create
> the issues myself and assign them to myself?
>
> - the current timestamp, which is problematic w.r.t. caching and testing,
> while providing no value.
> - Remove JarRequestBody#programArgs in favor of #programArgsList.
>
> [1] FLINK-32377 <https://issues.apache.org/jira/browse/FLINK-32377>
>
> Teoh, Hong <lian...@amazon.co.uk.invalid> wrote on Fri, Jun 30, 2023 at 00:53:
>
> > Thanks Xintong for driving the effort.
> >
> > I’d add a +1 to reworking configs, as suggested by @Jark and @Chesnay,
> > especially the types. We have various configs that encode Time /
> > MemorySize values but are typed as Long instead!
> >
> > Regards,
> > Hong
> >
> >
> >
> > > On 29 Jun 2023, at 16:19, Yuan Mei <yuanmei.w...@gmail.com> wrote:
> > >
> > >
> > > Thanks for driving this effort, Xintong!
> > >
> > > To Chesnay
> > >> I'm curious as to why the "Disaggregated State Management" item is
> > >> marked as a must-have; will it require changes that break something?
> > >> What prevents it from being added in 2.1?
> > >
> > > As to "Disaggregated State Management".
> > >
> > > We plan to provide a new type of state backend to support DFS as
> > > primary storage.
> > > To achieve this, we need at least two kinds of changes (not entirely
> > > sure yet, since we are still in the design and prototyping phase):
> > >
> > > 1. Statebackend Change
> > > 2. State Access Change
> > >
> > > Not all of the related interfaces are `@Internal`. Some of them,
> > > like `StateBackend`, are `@PublicEvolving`.
> > > So, you are right in the sense that "Disaggregated State Management"
> > > itself probably does not need to be a "Must Have".
> > >
> > > But I was hoping that the changes related to public APIs can be finalized
> > > and merged in Flink 2.0 (I will fix the wiki accordingly).
> > >
> > > I also agree with Jark that 2.0 is a good chance to rework the default
> > > value of configurations.
> > >
> > > Best
> > > Yuan
> > >
> > >
> > > On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <ches...@apache.org> wrote:
> > >
> > >> Something else configuration-related is that there are a bunch of
> > >> options where the type isn't quite correct (e.g., a String where it
> > >> could be an enum, a string where it should be an int or something).
> > >> Could do a pass over those as well.
> > >>
> > >> On 29/06/2023 13:50, Jark Wu wrote:
> > >>> Hi,
> > >>>
> > >>> I think one more thing we need to consider doing in 2.0 is changing
> > >>> the default values of configuration options to improve the out-of-box
> > >>> user experience.
> > >>>
> > >>> Currently, in order to run a Flink job, users may need to set
> > >>> a bunch of configurations, such as minibatch, checkpoint interval,
> > >>> exactly-once, incremental-checkpoint, etc. It's very verbose and hard
> > >>> to use for beginners.
> > >>> Most of them can have a universally applicable value. Because changing
> > >>> a default value is a breaking change, I think it's worth considering
> > >>> changing them in 2.0.
> > >>>
> > >>> What do you think?
> > >>>
> > >>> Best,
> > >>> Jark
> > >>>
> > >>>
> > >>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <snuyan...@gmail.com> wrote:
> > >>>
> > >>>> Hi Chesnay
> > >>>>
> > >>>>> "Move Calcite rules from Scala to Java": I would hope that this
> > >>>>> would be an entirely internal change, and could thus be an
> > >>>>> incremental process independent of major releases.
> > >>>>> What is the actual scale of this item; how much are we actually
> > >>>>> re-writing?
> > >>>>
> > >>>> Thanks for asking.
> > >>>> Yes, you're right, that should be an internal change.
> > >>>> I was also thinking about an incremental change (rule by rule, or in
> > >>>> reasonably small groups of rules).
> > >>>> And yes, this could be an activity independent of major releases.
> > >>>>
> > >>>> The problem is actually with the children of RelOptRule.
> > >>>> Currently I see 60+ such rules (in Scala) using the mentioned
> > >>>> deprecated API.
> > >>>> There are also children of ConverterRule (50+) which do not have such
> > >>>> issues.
> > >>>> Maybe they could be considered as the next step towards having all
> > >>>> the rules in Java.
> > >>>>
> > >>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <tonysong...@gmail.com> wrote:
> > >>>>
> > >>>>> Hi Alex & Gyula,
> > >>>>>
> > >>>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> > >>>>>> Introduce an API deprecation process" thread [1]?
> > >>>>>
> > >>>>> Yes, I meant the FLIP-321 discussion. I just noticed I pasted the
> > >>>>> wrong url in my previous email. Sorry for the mistake.
> > >>>>>
> > >>>>>> I am also curious to know if the rationale behind this new API has
> > >>>>>> been previously discussed on the mailing list. Do we have a list of
> > >>>>>> shortcomings in the current DataStream API that it tries to
> > >>>>>> resolve? How does the current ProcessFunction functionality fit
> > >>>>>> into the picture? Will it be kept as is or subsumed by new API?
> > >>>>>
> > >>>>>> I don't think we should create a replacement for the DataStream API
> > >>>>>> unless we have a very good reason to do so and with a proper
> > >>>>>> discussion about this as Alex said.
> > >>>>>
> > >>>>> The ProcessFunction API, which is intended to replace the DataStream
> > >>>>> API, is still a proposal, not a decision. Sorry for the confusion. I
> > >>>>> should have been more careful with my words and not given the
> > >>>>> impression that this is something we'll do anyway.
> > >>>>>
> > >>>>> There will be a FLIP describing the motivations and designs in
> > >>>>> detail, for the community to discuss and vote on. We are still
> > >>>>> working on it. TBH, this is not trivial and we would need more time
> > >>>>> on it.
> > >>>>>
> > >>>>> Just to quickly share some background:
> > >>>>>
> > >>>>>    - We see quite some problems with the current DataStream APIs
> > >>>>>       - Users are working with concrete classes rather than
> > >>>>>       interfaces, which means
> > >>>>>          - Users can access methods that are designed to be used by
> > >>>>>          internal classes, even though they are annotated with
> > >>>>>          `@Internal`. E.g., `DataStream#getTransformation`.
> > >>>>>          - Changes to the non-API implementations (e.g.,
> > >>>>>          `Transformation`) would affect the API classes (e.g.,
> > >>>>>          `DataStream`), which makes it hard to provide binary
> > >>>>>          compatibility.
> > >>>>>       - Internal classes are used as parameters / return values of
> > >>>>>       public APIs. E.g., while `AbstractStreamOperator` is
> > >>>>>       PublicEvolving, `StreamTask`, which is returned from
> > >>>>>       `AbstractStreamOperator#getContainingTask`, is Internal.
> > >>>>>       - In many cases, users are asked to extend the API classes,
> > >>>>>       rather than implementing interfaces. E.g.,
> > >>>>>       `AbstractStreamOperator`.
> > >>>>>          - Any changes to the base classes, even the internal part,
> > >>>>>          may affect the behavior of the user-provided sub-classes
> > >>>>>          - Users can override the behavior of the base classes
> > >>>>>       - The API module `flink-streaming-java` contains non-API
> > >>>>>       classes, and depends on internal modules such as
> > >>>>>       `flink-runtime`, which means
> > >>>>>          - Changes to the internal modules may affect the API
> > >>>>>          modules, which requires users to re-build their
> > >>>>>          applications upon upgrading
> > >>>>>          - The artifact users need for building their applications
> > >>>>>          is larger than necessary.
> > >>>>>       - We probably should not expose operators (e.g.,
> > >>>>>       `AbstractStreamOperator`) to users. Functions should be enough
> > >>>>>       for users to define their data processing logic. Exposing
> > >>>>>       operator-level concepts (e.g., mailbox thread model,
> > >>>>>       checkpoint barrier alignment, etc.) is unnecessary, and
> > >>>>>       compatibility considerations then limit how far the exposed
> > >>>>>       mechanisms can be improved.
> > >>>>>       - The current DataStream API seems to be a mixture of many
> > >>>>>       things, making it hard to understand, especially for
> > >>>>>       newcomers. It might be better to re-organize it into several
> > >>>>>       parts (the taxonomy below is just an example; we are still
> > >>>>>       working on this):
> > >>>>>          - The most fundamental stateful stream processing:
> > >>>>>          streams, partitions / key, process functions, state,
> > >>>>>          timeline-service
> > >>>>>          - An extension for common batch-streaming unified
> > >>>>>          functions: map, flatmap, filter, agg, reduce, join, etc.
> > >>>>>          - An extension for windowing support: window, triggering
> > >>>>>          - An extension for event-time support: event time,
> > >>>>>          watermark
> > >>>>>          - The extensions are like short-cuts / sugar, without which
> > >>>>>          users can probably still achieve the same behavior by
> > >>>>>          working with the fundamental APIs, but it would be a lot
> > >>>>>          easier with the extensions
> > >>>>>    - The original plan was to do in-place refactors / changes on
> > >>>>>    the DataStream API. Some related items are listed in this doc [2]
> > >>>>>    attached to the kick-off email [3]. Not all of the above issues
> > >>>>>    are listed, because at that time we hadn't looked into this as
> > >>>>>    deeply as we have now.
> > >>>>>    - We proposed this as a new API rather than in-place refactors in
> > >>>>>    the 2.0 work item list, because we realized the changes might be
> > >>>>>    too big for an in-place change. First having a new API, then
> > >>>>>    gradually retiring the old one, would help users to smoothly
> > >>>>>    migrate between them.
> > >>>>>
> > >>>>> A thorough discussion is definitely needed once the FLIP is out. And
> > >>>>> of course it's possible that the FLIP might be rejected. Given that
> > >>>>> we are planning for release 2.0, I just feel it would be better to
> > >>>>> bring this up early, even though the concrete plan is not yet ready.
> > >>>>>
> > >>>>> Best,
> > >>>>>
> > >>>>> Xintong
> > >>>>>
> > >>>>>
> > >>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > >>>>> [2] https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> > >>>>> [3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> > >>>>>
> > >>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gyf...@apache.org> wrote:
> > >>>>>
> > >>>>>> Hey!
> > >>>>>>
> > >>>>>> I share the same concerns mentioned above regarding the
> > >>>>>> "ProcessFunction API".
> > >>>>>>
> > >>>>>> I don't think we should create a replacement for the DataStream API
> > >>>>>> unless we have a very good reason to do so and with a proper
> > >>>>>> discussion about this as Alex said.
> > >>>>>>
> > >>>>>> Cheers,
> > >>>>>> Gyula
> > >>>>>>
> > >>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <alexander.fedu...@gmail.com> wrote:
> > >>>>>>
> > >>>>>>> Hi Xintong,
> > >>>>>>>
> > >>>>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> > >>>>>>> Introduce an API deprecation process" thread [1]?
> > >>>>>>>
> > >>>>>>> I am also curious to know if the rationale behind this new API has
> > >>>>>>> been previously discussed on the mailing list. Do we have a list of
> > >>>>>>> shortcomings in the current DataStream API that it tries to
> > >>>>>>> resolve? How does the current ProcessFunction functionality fit
> > >>>>>>> into the picture? Will it be kept as is or subsumed by new API?
> > >>>>>>>
> > >>>>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>> Alex
> > >>>>>>>
> > >>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <tonysong...@gmail.com> wrote:
> > >>>>>>>>> The ProcessFunction API item is giving me the most headaches
> > >>>>>>>>> because it's very unclear what it actually entails; like is it
> > >>>>>>>>> an entirely separate API to DataStream (sounds like it is!) or
> > >>>>>>>>> an extension of DataStream. How much will it share the internals
> > >>>>>>>>> with DataStream etc.; how does it relate to the Table API
> > >>>>>>>>> (w.r.t. switching APIs / what Table API uses underneath).
> > >>>>>>>>
> > >>>>>>>> I totally understand your confusion. We started planning this
> > >>>>>>>> after kicking off the release 2.0, so there's still a lot to be
> > >>>>>>>> explored and the plan keeps changing.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>    - In the beginning, we planned to do an in-place refactor of the
> > >>>>>>>>    DataStream API, until the API migration period was proposed.
> > >>>>>>>>    - Then we wanted to make it an entirely separate API from
> > >>>>>>>>    DataStream, and listed it as a must-have for release 2.0 so that
> > >>>>>>>>    we can remove DataStream once it's ready.
> > >>>>>>>>    - However, depending on the outcome of the API compatibility
> > >>>>>>>>    discussion [1], we may not be able to remove DataStream in 2.0
> > >>>>>>>>    anyway, which means we might need to re-evaluate the necessity
> > >>>>>>>>    of this item for 2.0.
> > >>>>>>>>
> > >>>>>>>> I'd say we wait a bit longer for the compatibility discussion [1]
> > >>>>>>>> and decide the priority for this item afterwards.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Best,
> > >>>>>>>>
> > >>>>>>>> Xintong
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> [1] https://lists.apache.org/list.html?dev@flink.apache.org
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <ches...@apache.org> wrote:
> > >>>>>>>>
> > >>>>>>>>> By and large I'm quite happy with the list of items.
> > >>>>>>>>>
> > >>>>>>>>> I'm curious as to why the "Disaggregated State Management" item
> > >>>>>>>>> is marked as a must-have; will it require changes that break
> > >>>>>>>>> something? What prevents it from being added in 2.1?
> > >>>>>>>>>
> > >>>>>>>>> We may want to update the Java 17 item to "Make Java 17 the
> > >>>>>>>>> default, drop Java 8/11". Maybe even split it into a must-have
> > >>>>>>>>> "Drop Java 8" and a nice-to-have "Drop Java 11"?
> > >>>>>>>>>
> > >>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that this
> > >>>>>>>>> would be an entirely internal change, and could thus be an
> > >>>>>>>>> incremental process independent of major releases.
> > >>>>>>>>> What is the actual scale of this item; how much are we actually
> > >>>>>>>>> re-writing?
> > >>>>>>>>>
> > >>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
> > >>>>>>>>> must-have; I think I marked it down as nice-to-have only because
> > >>>>>>>>> it depends on another item.
> > >>>>>>>>>
> > >>>>>>>>> The ProcessFunction API item is giving me the most headaches
> > >>>>>>>>> because it's very unclear what it actually entails; like is it
> > >>>>>>>>> an entirely separate API to DataStream (sounds like it is!) or
> > >>>>>>>>> an extension of DataStream. How much will it share the internals
> > >>>>>>>>> with DataStream etc.; how does it relate to the Table API
> > >>>>>>>>> (w.r.t. switching APIs / what Table API uses underneath).
> > >>>>>>>>>
> > >>>>>>>>> There are a few items I added as ideas which don't have a
> > >>>>>>>>> priority yet; would love to get some feedback on those.
> > >>>>>>>>>
> > >>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> > >>>>>>>>>
> > >>>>>>>>> Hi devs,
> > >>>>>>>>>
> > >>>>>>>>> As previously discussed in [1], we had been collecting work item
> > >>>>>>>>> proposals for the 2.0 release until June 15th, on the wiki page
> > >>>>>>>>> [2].
> > >>>>>>>>>
> > >>>>>>>>>    - As we have passed the due date, I'd like to kindly remind
> > >>>>>>>>>    everyone *not to add / remove items directly on the wiki
> > >>>>>>>>>    page*. If needed, please post in this thread or reach out to
> > >>>>>>>>>    the release managers instead.
> > >>>>>>>>>    - I've reached out to some folks for clarifications about
> > >>>>>>>>>    their proposals. Some of them mentioned that they cannot yet
> > >>>>>>>>>    tell whether we should do an item or not, and would need more
> > >>>>>>>>>    time / discussions to make the decision. So I added a new
> > >>>>>>>>>    symbol for items whose priorities are `TBD`.
> > >>>>>>>>>
> > >>>>>>>>> Now it's time to collaboratively decide a minimum set of
> > >>>>>>>>> must-have items. I've gone through the entire list of proposed
> > >>>>>>>>> items, and found that most of them make a lot of sense. So I
> > >>>>>>>>> think an online sync might not be necessary for this. I'd like
> > >>>>>>>>> to go with this DISCUSS thread, where everyone can comment on
> > >>>>>>>>> how they think the list can be improved, followed by a VOTE to
> > >>>>>>>>> formally make the decision.
> > >>>>>>>>>
> > >>>>>>>>> Any feedback and opinions, including but not limited to the
> > >>>>>>>>> following aspects, will be appreciated.
> > >>>>>>>>>
> > >>>>>>>>>    - Important items that are missing from the list
> > >>>>>>>>>    - Concerns regarding the listed items or their priorities
> > >>>>>>>>>
> > >>>>>>>>> Looking forward to your feedback.
> > >>>>>>>>>
> > >>>>>>>>> Best,
> > >>>>>>>>>
> > >>>>>>>>> Xintong
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> [1] https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > >>>>>>>>> [2] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>
> > >>>> --
> > >>>> Best regards,
> > >>>> Sergey
> > >>>>
> > >>
> > >>
> >
> >
>
> --
> Best
>
> ConradJam
>
