Juliusz Sompolski created SPARK-45435:
-----------------------------------------

             Summary: Document that lazy checkpoint may cause non-determinism
                 Key: SPARK-45435
                 URL: https://issues.apache.org/jira/browse/SPARK-45435
             Project: Spark
          Issue Type: Documentation
          Components: Spark Core, SQL
    Affects Versions: 4.0.0
            Reporter: Juliusz Sompolski


Some users call checkpoint expecting a consistent snapshot of the Dataset / RDD.
The documentation should warn that a lazy checkpoint does not provide this: the
checkpoint is computed only at the end of the first action, so the data that the
first action used may differ from the checkpointed data because of
non-determinism and retries.
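
To make the hazard concrete, here is a minimal toy model in plain Python. It is
not Spark code: ToyDataset, make_source, and the way the checkpoint is written
by a separate evaluation at the end of the first action are all assumptions made
for illustration, and the "non-determinism" is simulated with a counter so the
result is reproducible.

```python
import itertools


def make_source():
    """Hypothetical non-deterministic source: every evaluation yields new values."""
    runs = itertools.count()

    def source():
        run = next(runs)  # stands in for a retry or recomputation
        return [run * 10 + i for i in range(3)]

    return source


class ToyDataset:
    """Toy stand-in for a Dataset / RDD (illustrative only, not the Spark API)."""

    def __init__(self, compute):
        self._compute = compute
        self._checkpointed = None
        self._lazy_pending = False

    def checkpoint(self, eager):
        if eager:
            # Eager: materialize the snapshot now, before any action uses the data.
            self._checkpointed = list(self._compute())
        else:
            # Lazy: only mark the dataset; nothing is computed yet.
            self._lazy_pending = True
        return self

    def collect(self):
        if self._checkpointed is not None:
            return list(self._checkpointed)
        result = list(self._compute())  # the first action evaluates the lineage
        if self._lazy_pending:
            # Assumed model: the checkpoint is produced at the end of the first
            # action by a separate evaluation, so it can hold different data.
            self._checkpointed = list(self._compute())
            self._lazy_pending = False
        return result


lazy = ToyDataset(make_source()).checkpoint(eager=False)
lazy_first = lazy.collect()    # data the first action saw
lazy_second = lazy.collect()   # data read back from the checkpoint

eager = ToyDataset(make_source()).checkpoint(eager=True)
eager_first = eager.collect()
eager_second = eager.collect()

print("lazy :", lazy_first, lazy_second)
print("eager:", eager_first, eager_second)
```

In this model the first action on the lazily checkpointed dataset sees different
rows than every later action, while the eagerly checkpointed dataset returns the
same rows every time. That difference is what the documentation should call out.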



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
