I don't think it is a good idea to expose the internal state of a query as
queryable state for the following reasons:

1. plan generation: the streaming programs and operators are created based
on the optimized plan. A user cannot know which operators a query will run
on. Naming the queryable states would be another issue.
2. query internals: an operator should use the most efficient state
representation. A user cannot easily know how the state is organized
without looking into the internals of the operator. Hence, querying the
state would not provide much value because the result will be hard to
interpret without knowing the details.
3. backwards compatibility: we need to be able to reimplement operators
(which also includes the design of state). If we expose state as queryable,
that would not be possible without breaking compatibility. Better
optimization might even remove or merge certain operators.
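
To illustrate points 2 and 3, here is a minimal sketch in plain Python. It
is not Flink code, and all class and function names are hypothetical. It
models two interchangeable implementations of a per-key average. Both emit
the same results, but their internal state has a different meaning, so a
client that queried the operator state directly would break when one
implementation is replaced by the other.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class AvgOperatorV1:
    """Hypothetical first implementation: keeps (sum, count) per key."""
    state: Dict[str, Tuple[float, int]] = field(default_factory=dict)

    def process(self, key: str, value: float) -> float:
        total, count = self.state.get(key, (0.0, 0))
        total, count = total + value, count + 1
        self.state[key] = (total, count)
        return total / count


@dataclass
class AvgOperatorV2:
    """Hypothetical reimplementation: keeps (running mean, count) per key."""
    state: Dict[str, Tuple[float, int]] = field(default_factory=dict)

    def process(self, key: str, value: float) -> float:
        mean, count = self.state.get(key, (0.0, 0))
        count += 1
        mean += (value - mean) / count  # incremental mean update
        self.state[key] = (mean, count)
        return mean


def run(operator, events):
    """Feed events through an operator; return the latest result per key,
    which is what a downstream sink would hold."""
    results = {}
    for key, value in events:
        results[key] = operator.process(key, value)
    return results


events = [("a", 2.0), ("a", 4.0), ("b", 10.0)]
v1, v2 = AvgOperatorV1(), AvgOperatorV2()
out1, out2 = run(v1, events), run(v2, events)

print(out1 == out2)                    # emitted results agree
print(v1.state["a"] == v2.state["a"])  # internal state does not
```

The emitted results are the stable contract; the tuples in `state` are an
implementation detail that only makes sense if you know which version of
the operator produced them.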

Regarding the queryable state table sink: Yes, in the initial design the
result might be replicated in the state, but it also might not be. This
depends on the preceding operation. We can still add optimizations later
that unify replicated state.
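
As a rough sketch of why replication depends on the preceding operation,
consider the following toy model in plain Python. It is not Flink code, and
`QueryableSink`, `count_per_key`, and `to_upper` are hypothetical names used
only for illustration. A stateful aggregation already holds the result in
its own state, so the sink keeps a second copy. A stateless projection holds
nothing, so the sink keeps the only copy.

```python
from collections import defaultdict


class QueryableSink:
    """Hypothetical sink: keeps the latest result row per key for point lookups."""

    def __init__(self):
        self.rows = {}

    def upsert(self, key, row):
        self.rows[key] = row

    def get(self, key):
        return self.rows.get(key)


def count_per_key(keys, sink):
    """Stateful upstream operation: its state already holds the count per key."""
    state = defaultdict(int)
    for key in keys:
        state[key] += 1
        sink.upsert(key, state[key])
    return dict(state)  # operator state after processing


def to_upper(records, sink):
    """Stateless upstream operation: it keeps no state at all."""
    for key, value in records:
        sink.upsert(key, value.upper())
    return {}  # no operator state


agg_sink = QueryableSink()
agg_state = count_per_key(["a", "b", "a"], agg_sink)

map_sink = QueryableSink()
map_state = to_upper([("a", "x"), ("b", "y")], map_sink)

print(agg_state == agg_sink.rows)  # result is replicated in operator state
print(map_state == map_sink.rows)  # result lives only in the sink
```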

Best,
Fabian



2018-03-08 7:35 GMT-08:00 Stefano Bortoli <stefano.bort...@huawei.com>:

> Hi Timo, Renjie,
>
> Well, the idea is that stream processing could become a complex pipeline
> of multiple queries, and sinking data to a separate sink for monitoring does
> not seem efficient. In fact, pulling state values on demand would let us
> monitor the values of different parts of the stream processing pipeline
> without also having to deal with arrival/output rates at the monitoring
> level in a sink.
>
> I'm aware it would require some modifications at the table level.
> However, wouldn't a separate sink duplicate part of the data with
> respect to the state and the output sink? In fact, you would also have to
> keep this data in state for stateful processing.
>
> As you are anyway going to have a configuration parameter that is
> interpreted during the StreamSQL query compilation, perhaps hooking the
> queryable state in as a parameter for the processing functions (and other
> streaming operators) would be easier and would not create any overhead
> (besides the query to the state, of course).
>
> My2c.
>
> Best,
> Stefano
>
> -----Original Message-----
> From: Renjie Liu [mailto:liurenjie2...@gmail.com]
> Sent: Monday, March 05, 2018 3:52 AM
> To: dev@flink.apache.org
> Subject: Re: StreamSQL queriable state
>
> Hi, Timo:
> I've read your QueryableStateTableSink implementation and it basically
> implements what I want to do. I also want to extend the SQL client so that
> users can run point queries against the table sink. Do we still need a design
> doc for that? It seems that I just need to finish the remaining part and
> test it.
>
> Hi, Stefano:
> Your requirement needs some changes to the Flink table implementation, but
> I don't understand why you need that. For debugging? The operator state is
> internal and subject to optimization logic, so I think it may be meaningless
> to expose it.
>
> On Fri, Mar 2, 2018 at 9:37 PM Stefano Bortoli <stefano.bort...@huawei.com> wrote:
>
> > Hi Timo, Renjie,
> >
> > What I was thinking of did not include the QueryableStateTableSink, but
> > rather tapping directly into the state of a streaming operator. Perhaps
> > it is the same thing, but it does not sound intuitive to consider it a
> > sink.
> >
> > So, we would need a way to configure the environment for the query to
> > share the "state name" before the query is executed, and then use this
> > to create the hook for the queryable state in the operator. Perhaps we
> > could extend the current codegen and operator implementations to take
> > the StateDescriptor to be queried as a parameter.
> >
> > Looking forward to the design document; I will be happy to give you
> > feedback.
> >
> > Best,
> > Stefano
> >
> > -----Original Message-----
> > From: Renjie Liu [mailto:liurenjie2...@gmail.com]
> > Sent: Friday, March 02, 2018 11:42 AM
> > To: dev@flink.apache.org
> > Subject: Re: StreamSQL queriable state
> >
> > Great, thank you.
> > I'll start by writing a design doc.
> >
> > On Fri, Mar 2, 2018 at 6:40 PM Timo Walther <twal...@apache.org> wrote:
> >
> > > I gave you contributor permissions in Jira. You should be able to
> > > assign it to yourself now.
> > >
> > > On 3/2/18 at 11:33 AM, Renjie Liu wrote:
> > > > Hi, Timo:
> > > > It seems that I can't assign it to myself. Could you please help
> > > > to assign that to me?
> > > > My Jira username is liurenjie1024 and my email is
> > > > liurenjie2...@gmail.com
> > > >
> > > > On Fri, Mar 2, 2018 at 6:24 PM Timo Walther <twal...@apache.org> wrote:
> > > >
> > > >> Hi Renjie,
> > > >>
> > > >> that would be great. There is already a Jira issue for it:
> > > >> https://issues.apache.org/jira/browse/FLINK-6968
> > > >>
> > > >> Feel free to assign it to yourself. You can reuse parts of my
> > > >> code if you want. But maybe it would make sense to have a little
> > > >> design document first about what we want to support.
> > > >>
> > > >> Regards,
> > > >> Timo
> > > >>
> > > >>
> > > > >> On 3/2/18 at 11:10 AM, Renjie Liu wrote:
> > > > >>> Hi, Timo, I've been planning on the same thing and would like to
> > > > >>> contribute that.
> > > > >>>
> > > > >>> On Fri, Mar 2, 2018 at 6:05 PM Timo Walther <twal...@apache.org> wrote:
> > > >>>
> > > >>>> Hi Stefano,
> > > >>>>
> > > > >>>> yes, there are plans in this direction. Actually, I already
> > > > >>>> worked on such a QueryableStateTableSink [1] in the past but
> > > > >>>> never finished it because of priority shifts. It would be great
> > > > >>>> if somebody wants to contribute this functionality :)
> > > >>>>
> > > >>>> Regards,
> > > >>>> Timo
> > > >>>>
> > > >>>> [1] https://github.com/twalthr/flink/tree/QueryableTableSink
> > > >>>>
> > > > >>>> On 3/2/18 at 10:58 AM, Stefano Bortoli wrote:
> > > > >>>>> Hi guys,
> > > > >>>>>
> > > > >>>>> I am checking out the queryable state API, and it seems that
> > > > >>>>> most of the tooling is already available. However, queryable
> > > > >>>>> state is available only for the streaming API, not at the
> > > > >>>>> StreamSQL API level. In principle, as flink-table is aware of
> > > > >>>>> the query semantics and output data type, it should be possible
> > > > >>>>> to configure the query compilation to nest queryable state in
> > > > >>>>> the process/window functions. Is there any plan in this
> > > > >>>>> direction?
> > > >>>>> Best,
> > > >>>>> Stefano
> > > >>>>>
> > > >>>> --
> > > >>> Liu, Renjie
> > > >>> Software Engineer, MVAD
> > > >>>
> > > >> --
> > > > Liu, Renjie
> > > > Software Engineer, MVAD
> > > >
> > >
> > > --
> > Liu, Renjie
> > Software Engineer, MVAD
> >
> --
> Liu, Renjie
> Software Engineer, MVAD
>
