Hi Kant,

Checkpointing interval is configurable, but I wouldn’t count on it working well 
with even 10s intervals. 
 
I think what you are this is not supported by Flink generically. Maybe 
Queryable state I mentioned before? But I have never used it.

I think you would have to implement your own custom operator that would output 
changes to it’s internal state as a side output.

Piotrek

> On 30 Oct 2019, at 16:14, kant kodali <kanth...@gmail.com> wrote:
> 
> Hi Piotr,
> 
> I am talking about the internal state. How often this state gets 
> checkpointed? if it is every few seconds then it may not meet our real-time 
> requirement(sub second).
> 
> The question really is can I read this internal state in a streaming fashion 
> in an update mode? The state processor API seems to expose DataSet but not 
> DataStream so I am not sure how to read internal state in streaming fashion 
> in an update made?
> 
> Thanks!
> 
> On Wed, Oct 30, 2019 at 7:25 AM Piotr Nowojski <pi...@ververica.com 
> <mailto:pi...@ververica.com>> wrote:
> Hi,
> 
> I’m not sure what are you trying to achieve. What do you mean by “state of 
> full outer join”? The result of it? Or it’s internal state? Also keep in 
> mind, that internal state of the operators in Flink is already 
> snapshoted/written down to an external storage during checkpointing mechanism.
> 
> The result should be simple, just write it to some Sink.
> 
> For the internal state, it sounds like you are doing something not the way it 
> was intended… having said that, you can try one of the following options:
> a) Implement your own outer join operator (might not be as easy if you are 
> using Table API/SQL) and just create a side output for the state changes.
> b) Use state processor API to read the content of a savepoint/checkpoint 
> [1][2]
> c) Use queryable state [3] (I’m not sure about this, I have never used 
> queryable state)
> 
> Piotrek
> 
> [1] 
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/libs/state_processor_api.html
>  
> <https://ci.apache.org/projects/flink/flink-docs-stable/dev/libs/state_processor_api.html>
> [2] https://flink.apache.org/feature/2019/09/13/state-processor-api.html 
> <https://flink.apache.org/feature/2019/09/13/state-processor-api.html>
> [3] 
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/queryable_state.html
>  
> <https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/queryable_state.html>
> 
>> On 29 Oct 2019, at 16:42, kant kodali <kanth...@gmail.com 
>> <mailto:kanth...@gmail.com>> wrote:
>> 
>> Hi All,
>> 
>> I want to do a full outer join on two streaming data sources and store the 
>> state of full outer join in some external storage like rocksdb or something 
>> else. And then want to use this intermediate state as a streaming source 
>> again, do some transformation and write it to some external store. is that 
>> possible with Flink 1.9?
>> 
>> Also what storage systems support push mechanism for the intermediate data? 
>> For example, In the use case above does rocksdb support push/emit events in 
>> a streaming fashion?
>> 
>> Thanks!
> 

Reply via email to