[
https://issues.apache.org/jira/browse/SPARK-55058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jerry Zheng updated SPARK-55058:
--------------------------------
Description: The {{metadata}} file holds the streaming query ID, and should
exist whenever the offset or commit directories are non-empty. If this file is
missing, exactly-once sinks such as DeltaSink, which use the streaming query ID
to deduplicate commits for the same batch, can produce duplicates and incorrect
results downstream. If the metadata file is missing but the offset or commit
directories are non-empty, we should throw an error, because the checkpoint is
in an inconsistent state. (was: The {{metadata}} file holds the streaming query ID,
and should be existent if the commit and offset files are non-empty. This file
not existing will result in duplicates and incorrectness downstream if using
DeltaSink which uses the streaming query ID to dedup commits for the same
batch. If the metadata file isn’t there, but the commit and offset files are
there, we should throw an error as the checkpoint is in an inconsistent state.)
> Throw an error if the /metadata file is not present, but offset or commit
> directories are non-empty
> ---------------------------------------------------------------------------------------------------
>
> Key: SPARK-55058
> URL: https://issues.apache.org/jira/browse/SPARK-55058
> Project: Spark
> Issue Type: Task
> Components: Structured Streaming
> Affects Versions: 4.2.0
> Reporter: Jerry Zheng
> Priority: Major
>
> The {{metadata}} file holds the streaming query ID, and should exist whenever
> the offset or commit directories are non-empty. If this file is missing,
> exactly-once sinks such as DeltaSink, which use the streaming query ID to
> deduplicate commits for the same batch, can produce duplicates and incorrect
> results downstream. If the metadata file is missing but the offset or commit
> directories are non-empty, we should throw an error, because the checkpoint
> is in an inconsistent state.
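
For illustration only, the proposed check could be sketched roughly as below. This is not Spark's implementation: it assumes a checkpoint on the local filesystem, assumes the directory names {{offsets}} and {{commits}} next to the {{metadata}} file, and uses a hypothetical {{InconsistentCheckpointError}} and {{validate_checkpoint}} that do not exist in Spark.

```python
import os


class InconsistentCheckpointError(Exception):
    """Hypothetical error type for this sketch; not a real Spark exception."""


def _is_non_empty_dir(path):
    # True only if the path is an existing directory with at least one entry.
    return os.path.isdir(path) and len(os.listdir(path)) > 0


def validate_checkpoint(checkpoint_dir):
    """Raise if progress was recorded but the metadata file is missing.

    Assumed layout (local filesystem only):
      <checkpoint_dir>/metadata   file holding the streaming query ID
      <checkpoint_dir>/offsets/   per-batch offset files
      <checkpoint_dir>/commits/   per-batch commit files
    """
    metadata_path = os.path.join(checkpoint_dir, "metadata")
    offsets_dir = os.path.join(checkpoint_dir, "offsets")
    commits_dir = os.path.join(checkpoint_dir, "commits")

    has_progress = _is_non_empty_dir(offsets_dir) or _is_non_empty_dir(commits_dir)
    if has_progress and not os.path.isfile(metadata_path):
        # Without the metadata file the query would restart under a new ID,
        # so a sink that deduplicates by query ID could accept a batch twice.
        raise InconsistentCheckpointError(
            "metadata file is missing but offsets or commits are non-empty: "
            + checkpoint_dir
        )
```

A brand-new checkpoint (no metadata, no offsets, no commits) passes the check, so first-time startup is unaffected; only the inconsistent combination is rejected.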