[ 
https://issues.apache.org/jira/browse/SPARK-56044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-56044:
-------------------------------------

    Assignee: Shuai Lu

> HistoryServerDiskManager does not delete app store on release when app is not 
> in active map
> -------------------------------------------------------------------------------------------
>
>                 Key: SPARK-56044
>                 URL: https://issues.apache.org/jira/browse/SPARK-56044
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.1.1, 4.1.0, 3.5.7, 4.2.0, 4.1.1
>            Reporter: Shuai Lu
>            Assignee: Shuai Lu
>            Priority: Major
>              Labels: pull-request-available
>
> In {{HistoryServerDiskManager.release()}}, the store directory deletion is 
> gated inside an {{oldSizeOpt.foreach}} block, which only executes when the 
> application is present in the in-memory {{active}} map:
> {code:scala}
> val oldSizeOpt = active.synchronized {
>   active.remove(appId -> attemptId)
> }
> oldSizeOpt.foreach { oldSize =>
>   val path = appStorePath(appId, attemptId)
>   updateUsage(-oldSize, committed = true)
>   if (path.isDirectory()) {
>     if (delete) {
>       deleteStore(path)   // never reached if app is not in active map
>     }
>     ...
>   }
> }
> {code}
> The {{active}} map is in-memory only and is empty after a History Server 
> restart. When log expiration triggers 
> {{release(appId, attemptId, delete = true)}} for an app that was never 
> reopened after a restart, {{oldSizeOpt}} is {{None}}, the block is skipped 
> entirely, and the on-disk store directory (.ldb / .rdb) is never deleted. 
> Over time these orphaned store directories accumulate, consuming disk space 
> indefinitely.
> *Fix:*
> Separate the {{updateUsage}} deduction (which correctly applies only to 
> actively tracked apps) from the directory operation (which should apply 
> whenever the directory exists on disk). When deleting an app that was not in 
> the active map, derive its size directly from disk before deducting it from 
> usage to keep accounting accurate.
> *Steps to Reproduce:*
> # Start History Server with a non-trivial max disk usage setting.
> # Load several applications (their .ldb/.rdb stores are created on disk).
> # Close the application UIs (release without delete -- stores remain on disk).
> # Restart the History Server (active map is now empty).
> # Wait for or trigger log expiration cleanup.
> # Observe that the .ldb/.rdb store directories are NOT deleted despite 
> {{release(delete=true)}} being called.
> *Expected:* Store directories are deleted when {{release(delete=true)}} is 
> called.
> *Actual:* Store directories are silently left on disk when the app is not in 
> the active map.
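
The fix described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the actual patch: ToyDiskManager, storePath, sizeOf, deleteRecursively and the usage counter are hypothetical stand-ins for the real HistoryServerDiskManager internals, and thread safety of the usage counter is deliberately ignored. It shows the usage deduction for tracked apps kept separate from the directory operation, with the size derived from disk when the app is not in the active map.

```scala
import java.io.File
import java.nio.file.Files
import scala.collection.mutable

// Toy model only: hypothetical names, NOT Spark's HistoryServerDiskManager.
class ToyDiskManager(root: File) {
  // In-memory only, so it starts empty after every (simulated) restart.
  private val active = mutable.HashMap.empty[(String, Option[String]), Long]

  // Assumption: usage starts as the size of whatever stores already exist on disk.
  private var usage: Long = sizeOf(root)

  def currentUsage: Long = usage

  def storePath(appId: String, attemptId: Option[String]): File =
    new File(root, appId + attemptId.map("_" + _).getOrElse("") + ".ldb")

  // Recursively sums the sizes of regular files under f.
  private def sizeOf(f: File): Long =
    if (f.isDirectory) Option(f.listFiles()).map(_.map(sizeOf).sum).getOrElse(0L)
    else f.length()

  private def deleteRecursively(f: File): Unit = {
    if (f.isDirectory) Option(f.listFiles()).foreach(_.foreach(deleteRecursively))
    f.delete()
  }

  // Marks a store as open, remembering its size at open time.
  def open(appId: String, attemptId: Option[String]): Unit = active.synchronized {
    active(appId -> attemptId) = sizeOf(storePath(appId, attemptId))
  }

  def release(appId: String, attemptId: Option[String], delete: Boolean): Unit = {
    val oldSizeOpt = active.synchronized { active.remove(appId -> attemptId) }
    val path = storePath(appId, attemptId)

    // Deduct the tracked size only for apps that were in the active map.
    oldSizeOpt.foreach(oldSize => usage -= oldSize)

    // The directory operation no longer depends on the active map.
    if (path.isDirectory) {
      if (delete) {
        // Untracked app: derive its size from disk before deducting it.
        if (oldSizeOpt.isEmpty) usage -= sizeOf(path)
        deleteRecursively(path)
      } else if (oldSizeOpt.isDefined) {
        // Store kept on disk: re-add its current size to usage.
        usage += sizeOf(path)
      }
    }
  }
}

object ReleaseDemo {
  def main(args: Array[String]): Unit = {
    val root = Files.createTempDirectory("toy-store").toFile
    val store = new File(root, "app-1.ldb")
    store.mkdirs()
    Files.write(new File(store, "data").toPath, new Array[Byte](128))

    // A fresh manager models a restarted History Server: active map is empty.
    val mgr = new ToyDiskManager(root)
    mgr.release("app-1", None, delete = true)

    assert(!store.exists(), "store directory should be deleted")
    assert(mgr.currentUsage == 0L, "usage should no longer count the deleted store")
  }
}
```

In this model, the pre-fix behavior corresponds to wrapping the whole directory branch inside oldSizeOpt.foreach, in which case the release call in ReleaseDemo would leave app-1.ldb on disk.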



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
