Bhuwan Sahni created SPARK-47568:
------------------------------------

             Summary: Fix race condition between maintenance thread and task 
thread for RocksDB snapshot
                 Key: SPARK-47568
                 URL: https://issues.apache.org/jira/browse/SPARK-47568
             Project: Spark
          Issue Type: Bug
          Components: Structured Streaming
    Affects Versions: 3.5.1, 3.5.0, 4.0.0, 3.5.2
            Reporter: Bhuwan Sahni


There are currently race conditions between the maintenance thread and the 
task thread that can result in corrupted checkpoint state.
 # The maintenance thread relies on the class variable {{latestSnapshot}} to 
find the latest checkpoint and upload it to DFS. The task thread can modify 
this checkpoint at commit time if a new snapshot is created.
 # The task thread does not reset {{latestSnapshot}} at load time. If an older 
version is loaded, a snapshot of a newer version can still be considered 
valid and uploaded to DFS. This results in VersionIdMismatch errors.

This issue proposes fixing both problems by guarding modification of the 
{{latestSnapshot}} variable and by setting {{latestSnapshot}} correctly at 
load time.
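
As a rough illustration of the proposed fix (not the actual Spark change), the 
sketch below guards a shared pending snapshot with a lock and clears it at 
load time. It is written in Python for brevity. The names SnapshotHolder, 
Snapshot, load, commit and take_for_upload are hypothetical, and treating 
"setting latestSnapshot properly at load time" as "discard the pending 
snapshot" is an assumption.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for a local snapshot waiting to be uploaded."""
    version: int


class SnapshotHolder:
    """Holds the pending snapshot shared by the task and maintenance threads."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._latest: Optional[Snapshot] = None
        self._loaded_version = 0

    def load(self, version: int) -> None:
        # Task thread. Assumption: any pending snapshot is discarded, because
        # it may belong to a newer version than the one being loaded.
        with self._lock:
            self._latest = None
            self._loaded_version = version

    def commit(self, make_snapshot: bool) -> int:
        # Task thread. Advance the version and, if requested, replace the
        # pending snapshot while holding the lock.
        with self._lock:
            self._loaded_version += 1
            if make_snapshot:
                self._latest = Snapshot(self._loaded_version)
            return self._loaded_version

    def take_for_upload(self) -> Optional[Snapshot]:
        # Maintenance thread. Claim the pending snapshot atomically so that a
        # concurrent commit cannot change it between the read and the upload.
        with self._lock:
            snapshot, self._latest = self._latest, None
            return snapshot
```

With this shape, loading version 3 after a snapshot was created for version 6 
leaves nothing for the maintenance thread to upload, so a stale snapshot of a 
newer version cannot reach DFS.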


