Hi dev,

I'd like to start a discussion on "State Data Source - Reader".

This proposal aims to introduce a new data source, "statestore", which
enables reading the state rows from an existing checkpoint via an offline
(batch) query. This will enable users to 1) write unit tests against a
stateful query that verify the state values (especially for
flatMapGroupsWithState), and 2) gather more context when an incident
occurs, especially one involving incorrect output.
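
As a rough illustration of the kind of batch query this would allow, here
is a minimal sketch. It assumes the source is registered under the short
name "statestore" and takes the checkpoint location as the load path; the
helper name read_state and the example path are hypothetical, and any
reader options are left out because the SPIP is where they get defined.

```python
def read_state(spark, checkpoint_location):
    """Load state rows from a streaming checkpoint as a batch DataFrame.

    `spark` is expected to be a pyspark.sql.SparkSession. The format name
    "statestore" comes from this proposal; passing the checkpoint location
    as the load path is an assumption of this sketch.
    """
    return spark.read.format("statestore").load(checkpoint_location)


# Hypothetical usage inside a unit test or an incident investigation:
#   state_df = read_state(spark, "/tmp/my-query-checkpoint")
#   state_df.show(truncate=False)
```

Since the result is an ordinary batch DataFrame, a test could assert on the
state rows with the usual DataFrame operations.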

*SPIP*:
https://docs.google.com/document/d/1HjEupRv8TRFeULtJuxRq_tEG1Wq-9UNu-ctGgCYRke0/edit?usp=sharing
*JIRA*: https://issues.apache.org/jira/browse/SPARK-45511

Looking forward to your feedback!

Thanks,
Jungtaek Lim (HeartSaVioR)

P.S. The scope of this SPIP is narrowed to the reader, since the writer
requires us to consider more cases. We plan to work on the writer as well.
