Hi, Weijie

Thank you very much for proposing this series of improvements, especially
the complete decoupling of the user interface from the implementation.
The lack of such decoupling is a serious problem that troubles downstream
users in the community. I hope it can be completely solved in the future.

However, regarding the API decoupling part, I have a question: Do
connectors and SQL currently have similar problems? If so, will similar
methods be used to solve them?

Best,
Guowei


On Tue, Feb 20, 2024 at 3:10 PM weijie guo <guoweijieres...@gmail.com>
wrote:

> Hi All,
>
> Thanks for all the feedback.
>
> If there are no more comments, I would like to start the vote thread,
> thanks again!
>
> Best regards,
>
> Weijie
>
>
> On Tue, Jan 30, 2024 at 11:04 AM Xintong Song <tonysong...@gmail.com> wrote:
>
> > Thanks for working on this, Weijie.
> >
> > The design flaws of the current DataStream API (i.e., V1) have been a pain
> > for a long time. It's great to see efforts going on trying to resolve them.
> >
> > Significant changes to such an important and comprehensive set of public
> > APIs deserve caution. From that perspective, the ideas of introducing a
> > new set of APIs that gradually replace the current one, splitting the
> > introduction of the new APIs into many separate FLIPs, and making
> > intermediate APIs @Experimental until all of them are completed make
> > great sense to me.
> >
> > Besides, the ideas of generalized watermarks and execution hints sound
> > quite interesting. Looking forward to more detailed discussions in the
> > corresponding sub-FLIPs.
> >
> > +1 for the roadmap.
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Tue, Jan 30, 2024 at 11:00 AM weijie guo <guoweijieres...@gmail.com>
> > wrote:
> >
> > > Hi Wencong:
> > >
> > > > The Processing TimerService is currently defined as one of the basic
> > > > primitives, partly because it's understood that you have to choose
> > > > between processing time and event time. The other part of the reason
> > > > is that it needs to work based on the task's mailbox thread model to
> > > > avoid concurrency issues. Could you clarify the second part of the
> > > > reason?
> > >
> > > Since the processing logic of the operators takes place in the mailbox
> > > thread, the processing timer's callback function must also be executed
> > > in the mailbox thread to ensure thread safety.
> > > If we do not define the Processing TimerService as a primitive, there is
> > > no way for the user to dispatch custom logic to the mailbox thread.
> > >
> > >
> > > Best regards,
> > >
> > > Weijie
> > >
> > >
> > > On Mon, Jan 29, 2024 at 5:12 PM Xuannan Su <suxuanna...@gmail.com> wrote:
> > >
> > > > Hi Weijie,
> > > >
> > > > Thanks for driving the work! There are indeed many pain points in the
> > > > current DataStream API, which are challenging to resolve with its
> > > > existing design. It is a great opportunity to propose a new DataStream
> > > > API that tackles these issues. I like the way we've divided the FLIP
> > > > into multiple sub-FLIPs; the roadmap is clear and comprehensible. +1
> > > > for the umbrella FLIP. I am eager to see the sub-FLIPs!
> > > >
> > > > Best regards,
> > > > Xuannan
> > > >
> > > >
> > > >
> > > >
> > > > On Wed, Jan 24, 2024 at 8:55 PM Wencong Liu <liuwencle...@163.com>
> > > > wrote:
> > > > >
> > > > > Hi Weijie,
> > > > >
> > > > >
> > > > > Thank you for the effort you've put into the DataStream API! By
> > > > > reorganizing and redesigning the DataStream API, as well as addressing
> > > > > some of the unreasonable designs within it, we can enhance the
> > > > > efficiency of job development for developers. It also allows developers
> > > > > to design more flexible Flink jobs to meet business requirements.
> > > > >
> > > > >
> > > > > I have conducted a comprehensive review of the DataStream API design in
> > > > > versions 1.18 and 1.19. I found quite a few functional defects in the
> > > > > DataStream API, such as the lack of corresponding APIs in batch
> > > > > processing scenarios. In the upcoming 1.20 version, I will further
> > > > > improve the DataStream API in batch computing scenarios.
> > > > >
> > > > >
> > > > > The issues existing in the old DataStream API (which can be referred to
> > > > > as V1) can be addressed from a design perspective in the initial version
> > > > > of V2. I hope to also have the opportunity to participate in the
> > > > > development of DataStream V2 and make my contribution.
> > > > >
> > > > >
> > > > > Regarding FLIP-408, I have a question: The Processing TimerService is
> > > > > currently defined as one of the basic primitives, partly because it's
> > > > > understood that you have to choose between processing time and event
> > > > > time. The other part of the reason is that it needs to work based on the
> > > > > task's mailbox thread model to avoid concurrency issues. Could you
> > > > > clarify the second part of the reason?
> > > > >
> > > > > Best,
> > > > > Wencong Liu
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > At 2023-12-26 14:42:20, "weijie guo" <guoweijieres...@gmail.com>
> > > > > wrote:
> > > > > >Hi devs,
> > > > > >
> > > > > >
> > > > > >I'd like to start a discussion about FLIP-408: [Umbrella] Introduce
> > > > > >DataStream API V2 [1].
> > > > > >
> > > > > >
> > > > > >The DataStream API is one of the two main APIs that Flink provides for
> > > > > >writing data processing programs. As an API that was introduced
> > > > > >practically on day 1 of the project and has evolved for nearly a
> > > > > >decade, it has accumulated more and more problems. Fixing these
> > > > > >problems requires significant breaking changes, which makes in-place
> > > > > >refactoring impractical. Therefore, we propose to introduce a new set
> > > > > >of APIs, the DataStream API V2, to gradually replace the original
> > > > > >DataStream API.
> > > > > >
> > > > > >
> > > > > >The proposal to introduce a whole new set of APIs is complex and
> > > > > >includes massive changes. We are planning to break it down into
> > > > > >multiple sub-FLIPs for incremental discussion. This FLIP is only used
> > > > > >as an umbrella, mainly focusing on motivation, goals, and overall
> > > > > >planning. That is to say, more design and implementation details will
> > > > > >be discussed in other FLIPs.
> > > > > >
> > > > > >
> > > > > >It's hard to imagine the detailed design of the new API, and therefore
> > > > > >hard to give an opinion on it, if we only discuss this umbrella FLIP.
> > > > > >Therefore, I have prepared two sub-FLIPs [2][3] at the same time; the
> > > > > >discussions of them will be posted later in separate threads.
> > > > > >
> > > > > >
> > > > > >Looking forward to hearing from you, thanks!
> > > > > >
> > > > > >
> > > > > >Best regards,
> > > > > >
> > > > > >Weijie
> > > > > >
> > > > > >
> > > > > >
> > > > > >[1]
> > > > > >https://cwiki.apache.org/confluence/display/FLINK/FLIP-408%3A+%5BUmbrella%5D+Introduce+DataStream+API+V2
> > > > > >
> > > > > >[2]
> > > > > >https://cwiki.apache.org/confluence/display/FLINK/FLIP-409%3A+DataStream+V2+Building+Blocks%3A+DataStream%2C+Partitioning+and+ProcessFunction
> > > > > >
> > > > > >[3]
> > > > > >https://cwiki.apache.org/confluence/display/FLINK/FLIP-410%3A++Config%2C+Context+and+Processing+Timer+Service+of+DataStream+API+V2
