[ https://issues.apache.org/jira/browse/SPARK-56498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18089251#comment-18089251 ]
Livia Zhu commented on SPARK-56498:
-----------------------------------
I will work on this.
> Implement decoupled state store maintenance
> -------------------------------------------
>
> Key: SPARK-56498
> URL: https://issues.apache.org/jira/browse/SPARK-56498
> Project: Spark
> Issue Type: Task
> Components: Structured Streaming
> Affects Versions: 4.2.0
> Reporter: Jerry Zheng
> Priority: Major
>
> In our StateStore, we have a maintenance task that is responsible for
> clearing old state files, uploading snapshots to the cloud, and unloading old
> providers. We want to revamp the state store maintenance model to solve
> various issues stemming from maintenance starvation. The new design should:
> * Decouple cleanup and snapshot operations
> * Eliminate starvation for cleanup and snapshot operations even under very
> long cleanup/snapshot times and resource constraints
> * Allow at most one cleanup operation and one snapshot upload operation per
> provider at a time
> * Not reintroduce any earlier maintenance issues, such as close-safety
> violations
> * Run snapshot and cleanup each at least once before unloading a provider
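
For discussion, the per-provider constraints listed above (cleanup and snapshot tracked independently, at most one of each in flight, both run at least once before unloading) could be sketched roughly as follows. This is only an illustrative sketch in Python, not Spark's actual StateStore code: the class `ProviderMaintenance` and its methods are hypothetical names, and it does not address scheduling or starvation.

```python
import threading
from dataclasses import dataclass, field

# Hypothetical operation names; the issue only calls them "cleanup" and "snapshot".
REQUIRED_OPS = {"cleanup", "snapshot"}


@dataclass
class ProviderMaintenance:
    """Hypothetical per-provider bookkeeping; not part of Spark's API."""
    _lock: threading.Lock = field(default_factory=threading.Lock)
    _running: set = field(default_factory=set)    # operations currently in flight
    _completed: set = field(default_factory=set)  # operations finished at least once

    def try_start(self, op: str) -> bool:
        """Claim `op` unless the same operation is already in flight.

        Cleanup and snapshot are tracked separately, so a long-running
        cleanup does not block a snapshot for the same provider.
        """
        with self._lock:
            if op in self._running:
                return False
            self._running.add(op)
            return True

    def finish(self, op: str) -> None:
        """Mark `op` as no longer running and as completed at least once."""
        with self._lock:
            self._running.discard(op)
            self._completed.add(op)

    def can_unload(self) -> bool:
        """Unloading is allowed only when nothing is in flight and both
        cleanup and snapshot have completed at least once."""
        with self._lock:
            return not self._running and REQUIRED_OPS <= self._completed
```

A caller would check `try_start("cleanup")` before submitting cleanup work for a provider and call `finish("cleanup")` when it completes; the unloading path would consult `can_unload()` first.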