Re: [DISCUSS] Make window state queryable

vino yang Thu, 04 Jul 2019 20:49:45 -0700

Hi all,

Thanks to Kostas for reminding me that as early as March 2017, the
community had a thread called "Future of Queryable State Feature". [1]


That thread already discussed queryable state and how to make window state
queryable. I still think it can offer many advantages, especially for
ad-hoc queries.

Best,
Vino

[1]:
http://mail-archives.apache.org/mod_mbox/flink-dev/201703.mbox/%3C362C780C-9672-4DBD-B3F1-4EE7D1DB4CA6%40apache.org%3E

mayozhang <18124766...@163.com> wrote on Thu, Jul 4, 2019 at 10:21 PM:

> It's a good idea to be able to get progress information for a large,
> ongoing window. +1 from my side.
>
> > On Jul 4, 2019, at 11:41, vino yang <yanghua1...@gmail.com> wrote:
> >
> > Hi folks,
> >
> > Currently, queryable state is not widely used in production. IMO, there
> > are two key reasons for this. 1) The queryable state client is hard to
> > use, because it requires users to know the address of the TaskManager
> > and the port of the proxy. In practice, most business users do not have
> > good knowledge of Flink's internals and runtime in production. 2) The
> > benefit of this feature has not been fully exploited. In the Flink
> > DataStream API, state is a first-class citizen; it is Flink's key
> > advantage over other compute engines. Queryable state is the most
> > effective way to look into the latest computing progress.
> >
> > Three months ago, I started a discussion about improving queryable
> > state and introducing a proxy component.[1] It attracted a lot of
> > attention and discussion. Recently, I submitted a design document for
> > the proposal.[2] These efforts aim to address the first problem.
> >
> > As for the second problem, the most essential solution is to really
> > make queryable state work. The window operator is one of the most
> > valuable and most frequently used Flink operators, and it also uses
> > keyed state, which is queryable. So we propose making the state of the
> > window operator queryable. This would not only increase the value of
> > queryable state but also serve real business needs.
> >
> > IMO, allowing window state to be queried will provide great value. In
> > many scenarios, we use large windows for aggregate calculations. A very
> > common example is a day-level window that counts the PV (page views) of
> > a day. But users are usually not satisfied with waiting until the end
> > of the window to get the result. They want to get "intermediate
> > results" at a smaller time granularity to analyze trends. Because Flink
> > does not provide periodic triggers for fixed windows, we extended it
> > and implemented an "incremental window". It can fire a fixed window at
> > a smaller interval and emit intermediate results. However, we believe
> > that this approach is still not flexible enough. We should let the user
> > query the current calculation result of the window through the API at
> > any time.
> >
> > However, I know that if we want to implement this, some details still
> > need to be discussed, such as how to let users know the window's state
> > descriptors, the namespace, and so on.
> >
> > This discussion thread is mainly to listen to the community's opinion on
> > this proposal.
> >
> > Any feedback and ideas are welcome and appreciated.
> >
> > Best,
> > Vino
> >
> > [1]:
> > http://mail-archives.apache.org/mod_mbox/flink-dev/201907.mbox/%3ctencent_35a56d6858408be2e2064...@qq.com%3E
> > [2]:
> > https://docs.google.com/document/d/181qYVIiHQGrc3hCj3QBn1iEHF4bUztdw4XO8VSaf_uI/edit?usp=sharing
>
>
>
