Yuchen Liu created SPARK-48588:
----------------------------------
Summary: Fine-grained State Data Source
Key: SPARK-48588
URL: https://issues.apache.org/jira/browse/SPARK-48588
Project: Spark
Issue Type: Epic
Components: Structured Streaming
Affects Versions: 4.0.0
Reporter: Yuchen Liu
The current state reader API replays the state store rows from the latest
snapshot and any newer delta files. This mechanism falls short in two cases:
a snapshot file may have been constructed incorrectly, or users may want to
see how the state changes across batches. We need to improve the State Reader
so that it can handle a variety of fine-grained requirements, for example
reconstructing the state from an arbitrary snapshot, or supporting a CDC
(change data capture) mode for state evolution.
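
To make the two requested capabilities concrete, here is a minimal conceptual
sketch in plain Python. It is not Spark's implementation and does not use any
Spark API: the in-memory snapshot/delta model and the names `replay`,
`reconstruct`, and `change_feed` are all hypothetical, chosen only to
illustrate "start from an arbitrary snapshot" and "read state changes per
batch".

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model (an assumption, not Spark's on-disk format):
# a snapshot is the full key -> value map as of one batch, and a delta is
# the list of (key, value) changes made by one batch; value None = delete.
Snapshot = Dict[str, int]
Delta = List[Tuple[str, Optional[int]]]


def replay(snapshot: Snapshot, deltas: List[Delta]) -> Snapshot:
    """Apply deltas in order on top of a copy of the snapshot."""
    state = dict(snapshot)
    for delta in deltas:
        for key, value in delta:
            if value is None:
                state.pop(key, None)  # a delete removes the key if present
            else:
                state[key] = value    # an update inserts or overwrites
    return state


def reconstruct(snapshots: Dict[int, Snapshot], deltas: Dict[int, Delta],
                start_batch: int, end_batch: int) -> Snapshot:
    """Rebuild the state at end_batch from the snapshot taken at start_batch,
    instead of always starting from the latest snapshot."""
    if start_batch not in snapshots:
        raise KeyError(f"no snapshot for batch {start_batch}")
    later = [deltas.get(b, []) for b in range(start_batch + 1, end_batch + 1)]
    return replay(snapshots[start_batch], later)


def change_feed(deltas: Dict[int, Delta], start_batch: int,
                end_batch: int) -> List[Tuple[int, str, str, Optional[int]]]:
    """CDC-style view: one (batch, change_type, key, value) row per change."""
    rows = []
    for b in range(start_batch, end_batch + 1):
        for key, value in deltas.get(b, []):
            change_type = "delete" if value is None else "update"
            rows.append((b, change_type, key, value))
    return rows
```

In this model, if the snapshot for a later batch is suspected to be wrong,
`reconstruct` can be pointed at an earlier, trusted snapshot and the
intervening deltas are replayed on top of it, while `change_feed` exposes
the per-batch changes rather than only the final state.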