Shixiong Zhu created SPARK-36519:
------------------------------------
Summary: Store the RocksDB format in the checkpoint for a streaming query
Key: SPARK-36519
URL: https://issues.apache.org/jira/browse/SPARK-36519
Project: Spark
Issue Type: Improvement
Components: Structured Streaming
Affects Versions: 3.2.0
Reporter: Shixiong Zhu
Assignee: Shixiong Zhu
RocksDB provides backward compatibility but does not always provide forward
compatibility. Storing the RocksDB format version in the checkpoint gives us
the information needed to provide a rollback guarantee when a new Spark
version upgrades to a RocksDB version that may introduce an incompatible
format change.
A typical case is when a user upgrades their query to a new Spark version, and
this new Spark version ships a new RocksDB version which may use a new format.
The user then hits a bug and decides to roll back, but the old RocksDB version
in the old Spark version cannot read the new format.
In order to handle this case, we will write the RocksDB format version to the
checkpoint. When restarting from a checkpoint, we will force RocksDB to use the
format version stored in the checkpoint. This will ensure the user can roll
back their Spark version if needed.
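
The write-once, read-on-restart behavior described above can be sketched
roughly as follows. This is only an illustration in Python, not the Spark
implementation: the file name `rocksdb_format.json`, the `formatVersion` key,
the function `resolve_format_version`, and the version numbers 5 and 6 are all
hypothetical and chosen for the example.

```python
import json
import os
import tempfile

# Hypothetical metadata file name; not the layout Spark actually uses.
METADATA_FILE = "rocksdb_format.json"


def resolve_format_version(checkpoint_dir: str, default_version: int) -> int:
    """Return the format version a run should force RocksDB to use.

    Fresh checkpoint: record default_version and return it.
    Restart: return the recorded version and ignore default_version.
    """
    path = os.path.join(checkpoint_dir, METADATA_FILE)
    if os.path.exists(path):
        # Restarting from a checkpoint: the stored version wins.
        with open(path) as f:
            return int(json.load(f)["formatVersion"])
    # First run: pin the current default in the checkpoint.
    os.makedirs(checkpoint_dir, exist_ok=True)
    with open(path, "w") as f:
        json.dump({"formatVersion": default_version}, f)
    return default_version


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as ckpt:
        # Query first starts on the old Spark version (assumed default: 5).
        print(resolve_format_version(ckpt, 5))
        # After the upgrade the default is 6, but the checkpoint pins 5.
        print(resolve_format_version(ckpt, 6))
        # After a rollback the old version still sees a format it can read.
        print(resolve_format_version(ckpt, 5))
```

Because the version is pinned when the checkpoint is first created, the
upgraded run in this sketch keeps writing the old format, which is what makes
the later rollback safe.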