[ 
https://issues.apache.org/jira/browse/SPARK-48931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-48931:
-----------------------------------
    Labels: pull-request-available  (was: )

> Reduce Cloud Store List API cost for state store maintenance task
> -----------------------------------------------------------------
>
>                 Key: SPARK-48931
>                 URL: https://issues.apache.org/jira/browse/SPARK-48931
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.4.3
>            Reporter: Riya Verma
>            Priority: Major
>              Labels: pull-request-available
>
> Currently, the state store maintenance task decides which old RocksDB state 
> store version files to delete by listing all snapshotted version files in 
> the checkpoint directory, by default once every minute. On cloud object 
> stores, these frequent list calls can be costly. To reduce that cost, the 
> maintenance task should list the object store less often: it will 
> accumulate the versions to delete and call list only when the number of 
> versions to delete reaches a configured threshold. 
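
The accumulate-then-list idea in the description could be sketched roughly as
follows. This is a minimal illustration in Python, not the Spark
implementation: the class name ThresholdedCleaner, the callbacks
list_versions and delete_version, and the min_versions_to_delete parameter
are all hypothetical names chosen for the example, and it assumes versions
are consecutive integers.

```python
from typing import Callable, List


class ThresholdedCleaner:
    """Hypothetical sketch: defer the costly list call until enough
    stale versions have accumulated."""

    def __init__(
        self,
        list_versions: Callable[[], List[int]],   # assumed: costly cloud list call
        delete_version: Callable[[int], None],    # assumed: deletes one version's files
        min_versions_to_delete: int,              # assumed: the configured threshold
    ) -> None:
        self._list_versions = list_versions
        self._delete_version = delete_version
        self._threshold = min_versions_to_delete
        # Every version below this mark has already been cleaned up.
        self._cleaned_up_to = 0

    def run_maintenance(self, min_version_to_retain: int) -> List[int]:
        """Called on every maintenance tick; returns the versions deleted."""
        # Versions in [cleaned_up_to, min_version_to_retain) are stale
        # but have not been deleted yet.
        pending = min_version_to_retain - self._cleaned_up_to
        if pending < self._threshold:
            return []  # too few stale versions: skip the list call entirely

        # Threshold reached: pay for one list call and delete in bulk.
        stale = sorted(
            v for v in self._list_versions() if v < min_version_to_retain
        )
        for version in stale:
            self._delete_version(version)
        self._cleaned_up_to = min_version_to_retain
        return stale
```

With a threshold of 3, a tick that finds only two stale versions returns
without listing anything; the list call happens on the first later tick where
three or more stale versions have accumulated. The trade-off is that stale
files stay in the checkpoint directory somewhat longer.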



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

