Hi,

I am interested in dumping Flink state from RocksDB to a data lake using
queryable state:
https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/queryable_state/.
My map state could have 200 million key-value pairs with a total size of
around 150 GB. My batch job, scheduled with Airflow, will have one task
that uses Flink queryable state to dump the state to the data lake in
Parquet format so that other Spark tasks can use it.
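For context, here is a minimal sketch of how that task would read state via
Flink's QueryableStateClient, per the linked docs. The hostname, proxy port,
job ID, and state/query names are placeholders; I have not run this against a
real cluster. Note that the client fetches state one stream key at a time:

```java
import org.apache.flink.api.common.JobID;
import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.queryablestate.client.QueryableStateClient;
import java.util.concurrent.CompletableFuture;

public class StateDump {
    public static void main(String[] args) throws Exception {
        // Placeholder host/port: the queryable state proxy on a TaskManager.
        QueryableStateClient client =
            new QueryableStateClient("proxy-host", 9069);

        // Descriptor must match the one registered in the running job
        // (names and types below are placeholders).
        MapStateDescriptor<String, Long> descriptor = new MapStateDescriptor<>(
            "my-map-state",
            BasicTypeInfo.STRING_TYPE_INFO,
            BasicTypeInfo.LONG_TYPE_INFO);

        JobID jobId = JobID.fromHexString(args[0]); // the running job's ID

        // One RPC per stream key: returns the MapState for that key.
        CompletableFuture<MapState<String, Long>> future = client.getKvState(
            jobId, "query-name", "some-key",
            BasicTypeInfo.STRING_TYPE_INFO, descriptor);

        MapState<String, Long> state = future.get();
        // ... iterate state.entries() and write rows to Parquet here ...

        client.shutdownAndWait();
    }
}
```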

Is there any scalability concern with using queryable state in this way?
I'd appreciate any insight. Thanks!
