Hi Xintong,

If the current implementation of queryable state would not block the
implementation of disaggregated state backends, I would prefer not to remove
it until we have a better solution (maybe based on the queryable snapshot).
cc @Yuan.

If being on the "Remove deprecated APIs" list means that we must remove the
code in the initial Flink 2.0 release, I would vote -1 on removing queryable
state until we have an alternative.
I will also raise this concern in the Flink roadmap discussion.


Best
Yun Tang
________________________________
From: Xintong Song <tonysong...@gmail.com>
Sent: Monday, July 17, 2023 10:07
To: dev@flink.apache.org <dev@flink.apache.org>
Subject: Re: [VOTE] Release 2.0 must-have work items

@Yun,
I see your point that the capability queryable state is trying to provide is
meaningful, but the current implementation of the feature is problematic. So
what's your opinion on deprecating the current queryable state? Do you
think we need to wait until there is a new implementation of queryable
state before removing the current one? Or, since the current implementation
does not function well anyway, can we treat its removal as
independent of introducing a new one?

> However, I don't want to make users feel that this feature cannot be done
> well, and maybe we can redesign this feature.
>
TBH, the impression that I got from the roadmap [1] is that queryable
state is being retired and will be replaced by the State Processor API. If this
is not the impression we want users to have, you probably also need to
raise it in the roadmap discussion [2].

Best,

Xintong


[1] https://flink.apache.org/roadmap

[2] https://lists.apache.org/thread/szdr4ngrfcmo7zko4917393zbqhgw0v5



On Mon, Jul 17, 2023 at 9:53 AM Xintong Song <tonysong...@gmail.com> wrote:

> I'd propose to downgrade "Refactor the API modules" to TBD. The original
> proposal was based on the condition that we are allowed to introduce
> in-place API breaking changes in release 2.0. As the migration period is
> introduced, and we are no longer planning to do in-place changes /
> removal for DataStream (and same for APIs in `flink-core`), we need to
> re-evaluate whether it's feasible to do things like moving classes to
> different modules / packages, or turning concrete API classes into
> interfaces.
>
> Best,
>
> Xintong
>
>
>
> On Mon, Jul 17, 2023 at 1:10 AM Yun Tang <myas...@live.com> wrote:
>
>> I agree that we could downgrade "Eager state declaration" to a
>> nice-to-have feature.
>>
>> For the deprecation of "queryable state", can we just rename the item to
>> "deprecate the current implementation of queryable state"? The ability to
>> query internal state is actually very useful for debugging, and could open
>> up more possibilities for extending Flink SQL to behave more like a database.
>>
>> Just as Yuan replied in the previous email [1], the current implementation
>> of queryable state has many design problems. However, I don't want to make
>> users feel that this feature cannot be done well, and maybe we can redesign
>> this feature. As far as I know, RisingWave already supports queryable state
>> with a better user experience [2].
>>
>>
>> [1] https://lists.apache.org/thread/9hmwcjb3q5c24pk3qshjvybfqk62v17m
>> [2] https://syntaxbug.com/06a3e7c554/
>>
>> Best
>> Yun Tang
>> ________________________________
>> From: Xintong Song <tonysong...@gmail.com>
>> Sent: Friday, July 14, 2023 13:51
>> To: dev@flink.apache.org <dev@flink.apache.org>
>> Subject: Re: [VOTE] Release 2.0 must-have work items
>>
>> Thanks for the support, Yu.
>>
>> We will have the guideline before removing DataSet. We are currently
>> prioritizing work that needs to be done before the 1.18 feature freeze,
>> and will soon get back to working on the guideline. We expect to get the
>> guideline ready before or soon after the 1.18 release, which will
>> definitely be before removing DataSet in 2.0.
>>
>> Best,
>>
>> Xintong
>>
>>
>>
>> On Fri, Jul 14, 2023 at 1:06 PM Yu Li <car...@gmail.com> wrote:
>>
>> > It's great to see the discussion about what we need to improve on
>> > (completely) switching from DataSet API to DataStream API from the user
>> > perspective. I feel that these improvements would happen faster (only)
>> > when we seriously prepare to remove the DataSet APIs with a target
>> > release, just like what we are doing now. And the same applies to the
>> > SinkV1 related discussions (smile).
>> >
>> > I support Xintong's opinion on keeping "Remove the DataSet APIs" a
>> > must-have item. Meanwhile, I support Yuxia's opinion that we should
>> > explicitly let our users know how to migrate their existing DataSet API
>> > based applications afterwards, meaning that the guideline Xintong
>> > mentioned is a must-have (rather than best effort) before removing the
>> > DataSet APIs.
>> >
>> > Best Regards,
>> > Yu
>> >
>> >
>> > On Wed, 12 Jul 2023 at 14:00, yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
>> >
>> > > Thanks Xintong for the clarification. A guideline to help users migrate
>> > > from DataSet to DataStream will definitely be helpful.
>> > >
>> > > Best regards,
>> > > Yuxia
>> > >
>> > > ----- Original Message -----
>> > > From: "Xintong Song" <tonysong...@gmail.com>
>> > > To: "dev" <dev@flink.apache.org>
>> > > Sent: Wednesday, July 12, 2023, 11:40:12 AM
>> > > Subject: Re: [VOTE] Release 2.0 must-have work items
>> > >
>> > > @Yuxia,
>> > >
>> > > We are aware of the issue that you mentioned. Actually, I don't think
>> > > the DataStream API can cover everything in the DataSet API in exactly
>> > > the same way, because the fundamental model, concepts and primitives of
>> > > the two sets of APIs are completely different. Many of the DataSet APIs,
>> > > especially those accessing the full data set at once, do not fit in the
>> > > DataStream concepts at all. I think what's important is that users can
>> > > achieve the same function, even if they may need to code in a different
>> > > way.
>> > >
>> > > We have gone through all the existing DataSet APIs, and categorized
>> > > them into 3 kinds:
>> > > - APIs that are well supported by DataStream API as is. E.g., map,
>> > > reduce on grouped dataset, etc.
>> > > - APIs that can be achieved by DataStream API as is, but with a price
>> > > (programming complexity, or computation efficiency). E.g., reduce on
>> > > full dataset, sort partition, etc. Admittedly, there is room for
>> > > improvement on these. We may keep improving these for the DataStream
>> > > API, or we can concentrate on supporting them better in the new
>> > > ProcessFunction API. Either way, I don't think we should block the
>> > > retiring of DataSet API on them.
>> > > - There are also a few APIs that cannot be supported by the DataStream
>> > > API as is, unless users write their custom operators from the ground
>> > > up. Only left/rightOuterJoin and combineGroup fall into this category.
>> > > I think combineGroup is probably not a problem, because it is more like
>> > > a variant of reduceGroup that allows the framework to execute more
>> > > efficiently. As for the outer joins, depending on how badly this is
>> > > needed, it can be supported by emitting the non-joined entries upon
>> > > triggering a window join.
>> > >
>> > > We are also planning to draft a guideline to help users migrate from
>> > > DataSet to DataStream, which should demonstrate how users can achieve
>> > > things like sort-partition with DataStream API.
>> > >
>> > > Last but not least, I'd like to point out that the decision to
>> > > deprecate and eventually remove the DataSet API was approved in
>> > > FLIP-131 [1], and all the prerequisites mentioned in the FLIP have been
>> > > completed.
>> > >
>> > > Best,
>> > >
>> > > Xintong
>> > >
>> > >
>> > > [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866741
>> > >
>> > >
>> > >
>> > > On Wed, Jul 12, 2023 at 10:20 AM Jingsong Li <jingsongl...@gmail.com> wrote:
>> > >
>> > > > +1 to Leonard and Galen and Jing.
>> > > >
>> > > > About Source and Sink.
>> > > > We're still missing quite a bit of work, including functionality,
>> > > > ease of use, and bug fixes, and I'm not sure we'll be completely done
>> > > > by 2.0.
>> > > > Until that's done, we won't be in a position to clean up the old APIs.
>> > > >
>> > > > Best,
>> > > > Jingsong
>> > > >
>> > > > On Wed, Jul 12, 2023 at 9:41 AM yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
>> > > > >
>> > > > > Hi Xintong,
>> > > > > Sorry to disturb the voting. I just found an email [1] about the
>> > > > > DataSet API from the flink-user-zh channel. And I think it's not
>> > > > > just a single case, according to my observation.
>> > > > >
>> > > > > Removing DataSet is a must-have item in release 2.0. But as the user
>> > > > > email said, if we remove DataSet, how can users implement
>> > > > > Sort/PartitionBy, etc. as they did with DataSet?
>> > > > > Will we also provide a similar API in DataStream, or something else,
>> > > > > before we remove DataSet?
>> > > > > Btw, as far as I can see, with regard to replacing DataSet with
>> > > > > DataStream, DataStream is missing many APIs. I think it may well
>> > > > > take much effort to fully cover the missing APIs.
>> > > > >
>> > > > > [1] https://lists.apache.org/thread/syjmt8f74gh8ok3z4lhgt95zl4dzn168
>> > > > >
>> > > > > Best regards,
>> > > > > Yuxia
>> > > > >
>> > > > > ----- Original Message -----
>> > > > > From: "Jing Ge" <j...@ververica.com.INVALID>
>> > > > > To: "dev" <dev@flink.apache.org>
>> > > > > Sent: Wednesday, July 12, 2023, 1:23:40 AM
>> > > > > Subject: Re: [VOTE] Release 2.0 must-have work items
>> > > > >
>> > > > > I agree with what Leonard said. There are actually more issues wrt
>> > > > > the new Source and SinkV2 [1].
>> > > > >
>> > > > > Speaking of must-have vs nice-to-have, I think it depends on the
>> > > > > priority. If removing them has higher priority, we should keep the
>> > > > > related tasks as must-have and make sure enough effort is put into
>> > > > > solving those issues, so that we are able to remove those APIs.
>> > > > >
>> > > > > Best regards,
>> > > > > Jing
>> > > > >
>> > > > > [1] https://lists.apache.org/thread/90qc9nrlzf0vbvg92klzp9ftxxc43nbk
>> > > > >
>> > > > > On Tue, Jul 11, 2023 at 10:26 AM Leonard Xu <xbjt...@gmail.com> wrote:
>> > > > >
>> > > > > > Thanks Xintong for driving this great work! But I have to give my
>> > > > > > -1 (binding) here:
>> > > > > >
>> > > > > > -1 to marking the "Deprecate SourceFunction/SinkFunction/SinkV1"
>> > > > > > item as a must-have for release 2.0.
>> > > > > >
>> > > > > > I do a lot of connector work in the community, and I have two
>> > > > > > insights from past experience:
>> > > > > >
>> > > > > > 1. Many developers reported that it is very difficult to migrate
>> > > > > > from SourceFunction to the new Source [1]. Migrating existing
>> > > > > > connectors once SourceFunction is deprecated is very difficult.
>> > > > > > Some developers (Flavio Pompermaier) reported that they gave up
>> > > > > > the migration because it was too complicated. I believe these are
>> > > > > > not isolated cases. This means that deprecating the
>> > > > > > SourceFunction-related interfaces requires community contributors
>> > > > > > to reduce the migration cost before the migration work starts.
>> > > > > >
>> > > > > > 2. IIRC, the functionality of SinkV2 cannot currently cover
>> > > > > > SinkFunction, as described in FLIP-287 [2]. This means that no
>> > > > > > migration path exists once SinkFunction/SinkV1 is deprecated, so
>> > > > > > we cannot mark the related SinkFunction/SinkV1 interfaces as
>> > > > > > deprecated in 1.18.
>> > > > > >
>> > > > > > Based on these two observations, I think we should not mark
>> > > > > > these interfaces as must-have in 2.0. Maintaining the two sets of
>> > > > > > source/sink interfaces is not a concern for me; users can choose
>> > > > > > which interface to implement according to their energy and needs.
>> > > > > >
>> > > > > > Btw, some work items in 2.0 are marked as must-have, but no
>> > > > > > contributor has claimed them yet. I think this is a risk and hope
>> > > > > > the Release Managers could pay attention to it.
>> > > > > >
>> > > > > > Thank you to all the RMs for your work, and sorry again for
>> > > > > > interrupting the vote.
>> > > > > >
>> > > > > > Best,
>> > > > > > Leonard
>> > > > > >
>> > > > > > [1] https://lists.apache.org/thread/sqq26s9rorynr4vx4nhxz3fmmxpgtdqp
>> > > > > > [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240880853
>> > > > > >
>> > > > > > > On Jul 11, 2023, at 4:11 PM, Yuan Mei <yuanmei.w...@gmail.com> wrote:
>> > > > > > >
>> > > > > > > On second thought, I think "Eager State Declaration" is
>> > > > > > > probably not a must-have.
>> > > > > > >
>> > > > > > > I was originally thinking it is a prerequisite for "state
>> > > > > > > querying for disaggregated state management".
>> > > > > > >
>> > > > > > > Since disaggregated state management itself is not a must-have,
>> > > > > > > "Eager State Declaration" is not one either. We can downgrade it
>> > > > > > > to "nice to have" if there is no objection.
>> > > > > > >
>> > > > > > > Best
>> > > > > > >
>> > > > > > > Yuan
>> > > > > > >
>> > > > > > > On Mon, Jul 10, 2023 at 7:02 PM Jing Ge <j...@ververica.com.invalid> wrote:
>> > > > > > >
>> > > > > > >> +1
>> > > > > > >>
>> > > > > > >> On Mon, Jul 10, 2023 at 12:52 PM Yu Li <car...@gmail.com> wrote:
>> > > > > > >>
>> > > > > > >>> +1 (binding)
>> > > > > > >>>
>> > > > > > >>> Thanks for driving this and great to see us moving forward.
>> > > > > > >>>
>> > > > > > >>> Best Regards,
>> > > > > > >>> Yu
>> > > > > > >>>
>> > > > > > >>>
>> > > > > > >>> On Mon, 10 Jul 2023 at 11:59, Feng Wang <wangfeng...@gmail.com> wrote:
>> > > > > > >>>
>> > > > > > >>>> +1
>> > > > > > >>>> Thanks for driving this, looking forward to the next stage of
>> > > > > > >>>> Flink.
>> > > > > > >>>>
>> > > > > > >>>> On Fri, Jul 7, 2023 at 5:31 PM Xintong Song <tonysong...@gmail.com> wrote:
>> > > > > > >>>>
>> > > > > > >>>>> Hi all,
>> > > > > > >>>>>
>> > > > > > >>>>> I'd like to start the VOTE for the must-have work items for
>> > > > > > >>>>> release 2.0 [1]. The corresponding discussion thread is [2].
>> > > > > > >>>>>
>> > > > > > >>>>> Please note that once the vote is approved, any changes to the
>> > > > > > >>>>> must-have items (adding / removing must-have items, changing
>> > > > > > >>>>> the priority) require another vote. Assigning contributors /
>> > > > > > >>>>> reviewers, updating descriptions / progress, and changes to
>> > > > > > >>>>> nice-to-have items do not require another vote.
>> > > > > > >>>>>
>> > > > > > >>>>> The vote will be open until at least July 12, following the
>> > > > > > >>>>> consensus voting process. Votes of PMC members are binding.
>> > > > > > >>>>>
>> > > > > > >>>>> Best,
>> > > > > > >>>>>
>> > > > > > >>>>> Xintong
>> > > > > > >>>>>
>> > > > > > >>>>>
>> > > > > > >>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
>> > > > > > >>>>>
>> > > > > > >>>>> [2] https://lists.apache.org/thread/l3dkdypyrovd3txzodn07lgdwtwvhgk4
>> > > > > > >>>>>
>> > > > > > >>>>
>> > > > > > >>>
>> > > > > > >>
>> > > > > >
>> > > > > >
>> > > >
>> > >
>> >
>>
>
