Re: Queryable State Deprecation

2022-03-11 Thread Ron Crocker
Hi Dawid -

I’m pretty keen on keeping it alive. Do we have a sense of what it would take 
to get it to a “production-ready state”?

Thanks!

Ron

> On Feb 4, 2022, at 5:06 AM, Dawid Wysakowicz  wrote:
> 
> Hi Karthik,
> 
> We deprecated it because we lacked committers who could spend time on 
> getting queryable state to a production-ready state. I might be speaking 
> only for myself here, but I think the main use case for queryable state is 
> to gain insight into the current state of the application for debugging 
> purposes. If it is used for data serving, we believe it's better to sink the 
> data into an external store, which can provide better discoverability and 
> more user-friendly APIs for querying the results.
> 
> For debugging and tracking insights, you may be able to achieve similar 
> results with metrics.
> 
> Best,
> 
> Dawid
> 
> On 01/02/2022 16:36, Jatti, Karthik wrote:
>> Hi, 
>> 
>> I see on the Flink Roadmap that the Queryable State API is scheduled to be 
>> deprecated, but I couldn’t find much information on Confluence or in this 
>> mailing list’s archives explaining why it’s being deprecated and what an 
>> alternative would be. Any pointers to more information would be great.
>>  
>> Thanks,
>> Karthik 
>> 
>> 
>> The information in the email message containing a link to this page, 
>> including any attachments thereto (collectively, “the e-mail”), is only for 
>> use by the intended recipient(s). The e-mail may contain information that is 
>> confidential, proprietary and/or privileged. If you have reason to believe 
>> that you are not the intended recipient, please notify the sender that you 
>> may have received this e-mail in error and delete all copies of it, 
>> including attachments, from your computer. Any viewing, copying, disclosure 
>> or distribution of this information by an unintended recipient is prohibited 
>> and by an intended recipient may be governed by arrangements in place 
>> between the sender’s and recipient’s respective firms. Eze Software does not 
>> represent that the e-mail is virus-free, complete or accurate. Eze Software 
>> accepts no liability for any damage sustained in connection with the content 
>> or transmission of the e-mail.
>> Copyright © 2013 Eze Castle Software LLC. All Rights Reserved.



Re: Queryable State Deprecation

2022-02-12 Thread Frank Dekervel

Hello,

This is what we did, but I'm not quite convinced that it's the best way 
(maybe others could chime in?).


 * We have a Zalando Postgres cluster running next to the Flink
   cluster, so we can just use a JDBC sink for the state. In theory we
   should be able to switch to exactly-once (we haven't done this so far).
 * Our stateful processor is a state machine that emits outgoing
   messages based on incoming messages. Sometimes we need to "rewind"
   the state machine to correctly process an incoming message. This
   forces us to keep some history of past messages.
 * We don't materialize the state directly; we only materialize the
   state changes, which are then re-materialized in Postgres. It took
   us some time to make this bug-free. While we were still debugging
   this, we read a savepoint to inspect the state and compare it with
   what we had in Postgres.

In a Zalando Postgres cluster you can only write to the master. But for 
readers, if a small delay is acceptable, you can load-balance across the 
replicas.
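
One way to keep re-materialized state changes consistent when the sink is not exactly-once is to make each write idempotent. The following is only a minimal sketch under assumptions, not the code described above: it uses Python's built-in sqlite3 as a stand-in for Postgres, and the table `machine_state`, the per-key sequence number `seq`, and the functions `open_store`, `apply_change` and `read_state` are all hypothetical names chosen for illustration.

```python
import sqlite3


def open_store():
    """Create an in-memory table that holds the latest state per key.

    sqlite3 is used here only as a stand-in for an external Postgres store.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE machine_state ("
        " state_key TEXT PRIMARY KEY,"
        " seq INTEGER NOT NULL,"
        " state_value TEXT NOT NULL)"
    )
    return conn


def apply_change(conn, state_key, seq, state_value):
    """Apply one state change idempotently.

    The row is only overwritten when the incoming sequence number is newer
    than the stored one, so a change that is replayed after a restart
    leaves the table untouched.
    """
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO machine_state (state_key, seq, state_value) "
            "VALUES (?, ?, ?) "
            "ON CONFLICT(state_key) DO UPDATE SET "
            " seq = excluded.seq, state_value = excluded.state_value "
            "WHERE excluded.seq > machine_state.seq",
            (state_key, seq, state_value),
        )


def read_state(conn, state_key):
    """Return (seq, state_value) for a key, or None if the key is unknown."""
    return conn.execute(
        "SELECT seq, state_value FROM machine_state WHERE state_key = ?",
        (state_key,),
    ).fetchone()
```

With this shape, replaying an older change (for example `seq` 1 after `seq` 2 has already been stored) does not roll the row back. It assumes the job can attach a monotonically increasing sequence number to every change for a given key.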


Greetings,

Frank



Re: Queryable State Deprecation

2022-02-12 Thread Jatti, Karthik
Hi Frank,

What sink did you end up choosing for materializing the state?

Our reason for looking at queryable state is that we have many readers and 
very few writers (a reader-to-writer ratio in the 1000s). Each consuming 
application (reader) needs a live view of a subset of the state, and these 
applications come online and go offline many times a day. What would be a good 
sink in such a scenario?

For example, suppose the state of the Flink app is a dynamic table of product 
inventory built from Kafka streams of purchases and sales. A subset of this 
state needs to be available to 1000s of readers, who each have a live view of 
what is in stock with different aggregations and filters. These consumers come 
online and go offline, so they need to be able to restore their substate and 
continue to receive updates for it.
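
To make the requirement concrete, here is a minimal sketch of the "restore a subset, then keep receiving updates" pattern. It is an in-memory illustration only and does not correspond to any particular sink: the classes `InventoryView` and `Reader`, the global version counter, and the method names are all hypothetical.

```python
class InventoryView:
    """In-memory stand-in for an external store that serves many readers."""

    def __init__(self):
        self.version = 0   # increases by one on every write
        self.stock = {}    # product -> current quantity
        self.log = []      # (version, product, quantity) change history

    def write(self, product, quantity):
        """Record a new quantity for a product (the single-writer side)."""
        self.version += 1
        self.stock[product] = quantity
        self.log.append((self.version, product, quantity))

    def snapshot(self, products):
        """Return the current version and the requested subset of the state."""
        subset = {p: q for p, q in self.stock.items() if p in products}
        return self.version, subset

    def updates_since(self, version, products):
        """Return changes newer than `version` for the requested products."""
        return [
            (v, p, q) for v, p, q in self.log
            if v > version and p in products
        ]


class Reader:
    """A consumer that restores its substate and then applies later updates."""

    def __init__(self, view, products):
        self.view = view
        self.products = set(products)
        # Coming online: restore the subset as of a known version.
        self.version, self.stock = view.snapshot(self.products)

    def catch_up(self):
        """Apply every change recorded after the last version we have seen."""
        for v, p, q in self.view.updates_since(self.version, self.products):
            self.stock[p] = q
            self.version = v
```

A reader created for `["apple"]` sees only the apple quantity at start-up and, after `catch_up()`, only later apple changes. Whatever sink is chosen would need to offer an equivalent of both `snapshot` and `updates_since`.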

We are evaluating sinks but haven’t narrowed it down to anything that looks 
like an obvious choice.

Thanks,
Karthik




Re: Queryable State Deprecation

2022-02-11 Thread Jatti, Karthik
Thank you Frank and Dawid for providing the context here.



Re: Queryable State Deprecation

2022-02-04 Thread Frank Dekervel

Hello,

To give an extra data point: after a not-so-successful experiment with 
faust-streaming, we moved our application to Flink. Since Flink's 
queryable state was apparently stagnant, we implemented what was needed 
to sink the state to an external data store for querying.


However, if queryable state had been in good shape we would definitely have 
used it. Making sure that the state is always reflected correctly in our 
external system turned out to be non-trivial for a number of reasons: 
our state is not trivially convertible to rows in a table, and sometimes 
we had (due to our own bugs, but still) inconsistencies between the 
internal Flink state and the externally materialized state, especially 
after replaying from a checkpoint/savepoint after a crash (we cannot use 
exactly-once sinks in all cases).
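
A check for this kind of inconsistency can be sketched as a plain comparison of two key/value views. This is only an illustration under assumptions: it presumes both the internal state and the external store have already been exported to Python dicts (how that export is done is out of scope here), and the function name `diff_states` and its result keys are hypothetical.

```python
def diff_states(internal, external):
    """Compare the job's internal state with the externally materialized copy.

    Both arguments are plain dicts mapping a state key to its value.
    Returns three dicts:
      missing  - keys present internally but absent from the external store
      stale    - keys present in both but with different values,
                 mapped to (internal_value, external_value)
      orphaned - keys present externally but no longer in the internal state
    """
    missing = {k: v for k, v in internal.items() if k not in external}
    stale = {
        k: (internal[k], external[k])
        for k in internal.keys() & external.keys()
        if internal[k] != external[k]
    }
    orphaned = {k: v for k, v in external.items() if k not in internal}
    return {"missing": missing, "stale": stale, "orphaned": orphaned}
```

An empty result in all three categories means the two views agree for the compared keys. It says nothing about changes that were still in flight when either side was exported.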


Also, obviously, we could not use Flink's partitioning/parallelism to 
help make state querying more scalable.


Greetings,
Frank




Re: Queryable State Deprecation

2022-02-04 Thread Dawid Wysakowicz

Hi Karthik,

We deprecated it because we lacked committers who could spend time on 
getting queryable state to a production-ready state. I might be speaking 
only for myself here, but I think the main use case for queryable state 
is to gain insight into the current state of the application for 
debugging purposes. If it is used for data serving, we believe it's 
better to sink the data into an external store, which can provide better 
discoverability and more user-friendly APIs for querying the results.


For debugging and tracking insights, you may be able to achieve similar 
results with metrics.


Best,

Dawid



Queryable State Deprecation

2022-02-01 Thread Jatti, Karthik
Hi,

I see on the Flink Roadmap that the Queryable State API is scheduled to be 
deprecated, but I couldn’t find much information on Confluence or in this 
mailing list’s archives explaining why it’s being deprecated and what an 
alternative would be. Any pointers to more information would be great.

Thanks,
Karthik


