[
https://issues.apache.org/jira/browse/SPARK-48931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun updated SPARK-48931:
----------------------------------
Parent: SPARK-44111
Issue Type: Sub-task (was: Improvement)
> Reduce Cloud Store List API cost for state store maintenance task
> -----------------------------------------------------------------
>
> Key: SPARK-48931
> URL: https://issues.apache.org/jira/browse/SPARK-48931
> Project: Spark
> Issue Type: Sub-task
> Components: Structured Streaming
> Affects Versions: 3.4.3
> Reporter: Riya Verma
> Assignee: Riya Verma
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Currently, the state store maintenance task decides which old RocksDB state
> store version files to delete by listing all snapshotted version files in
> the checkpoint directory. By default this listing runs every minute. On
> cloud object stores, such frequent list calls can be costly. To reduce the
> cost of state store maintenance, we should list the object store less often
> inside the maintenance task. The plan is to accumulate the versions to
> delete and call list only when their number reaches a configured threshold.
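
The batching idea described above can be sketched roughly as follows. This is
an illustration only, not the code from the actual change: the class name,
the callbacks, and the `min_versions_to_delete` parameter are all
hypothetical, and an in-memory set stands in for the checkpoint directory.

```python
from typing import Callable, Iterable, List, Set


class DeferredVersionCleaner:
    """Hypothetical helper: accumulate stale versions and list storage
    only once enough of them are pending."""

    def __init__(
        self,
        list_versions: Callable[[], Iterable[int]],  # assumed: the expensive list call
        delete_version: Callable[[int], None],       # assumed: deletes one version file
        min_versions_to_delete: int,                 # assumed threshold setting
    ) -> None:
        self._list_versions = list_versions
        self._delete_version = delete_version
        self.min_versions_to_delete = min_versions_to_delete
        self.pending: Set[int] = set()
        self.list_calls = 0  # counts how often the store was listed

    def run_maintenance(self, stale_versions: Iterable[int]) -> List[int]:
        """Record newly stale versions; list and delete only past the threshold."""
        self.pending.update(stale_versions)
        if len(self.pending) < self.min_versions_to_delete:
            return []  # below threshold: skip the list call entirely

        self.list_calls += 1
        existing = set(self._list_versions())
        # Delete only versions that are both pending and actually present.
        to_delete = sorted(self.pending & existing)
        for version in to_delete:
            self._delete_version(version)
        self.pending.clear()
        return to_delete


# Usage with an in-memory stand-in for the object store.
store = {1, 2, 3, 4}
cleaner = DeferredVersionCleaner(
    list_versions=lambda: list(store),
    delete_version=store.discard,
    min_versions_to_delete=3,
)
cleaner.run_maintenance([1])  # pending only, no list call
cleaner.run_maintenance([2])  # pending only, no list call
cleaner.run_maintenance([3])  # threshold reached: one list call, then deletes
```

With a threshold of 3, three maintenance runs cost one list call instead of
three. The trade-off is that old version files stay in the checkpoint
directory somewhat longer before they are removed.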