[jira] [Comment Edited] (HUDI-3425) Clean up spill path created by Hudi during uneventful shutdown

Xinglong Wang (Jira) Wed, 09 Aug 2023 01:58:06 -0700


    [ 
https://issues.apache.org/jira/browse/HUDI-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752332#comment-17752332
 ]


Xinglong Wang edited comment on HUDI-3425 at 8/9/23 8:57 AM:
-------------------------------------------------------------

{{I have encountered the same problem. I am using Flink on Yarn. If a job 
hits an abnormal condition during compaction (for example, the container 
runs beyond its physical memory limit, or some other exception occurs) and 
performs a full restart while `HoodieMergedLogRecordScanner` is still scanning 
log files, `ExternalSpillableMap#close()` is never executed to clean up. 
Spillable map files then accumulate in the /tmp directory until the disk is 
exhausted.}}
{{As a workaround, I now set `hoodie.memory.spillable.map.path` to the 
`$PWD/spillable-map/` directory when the Yarn container launches. The 
environment variable `PWD` is exported in `launch_container.sh`, so the 
spillable map files live under the container's working directory and are 
cleaned up when the container is closed.}}
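
A minimal sketch of how the path in this workaround could be derived, assuming the code runs inside the Yarn container process so that `PWD` refers to the container's working directory. The helper name `spill_path_options` and the fallback to the process working directory are illustrative assumptions, not part of Hudi or Flink; only the option key `hoodie.memory.spillable.map.path` and the `spillable-map/` directory name come from the comment above.

```python
import os

# Option key quoted in the comment above.
SPILL_PATH_KEY = "hoodie.memory.spillable.map.path"


def spill_path_options(env=None, subdir="spillable-map"):
    """Build an options dict that points the spill path at <PWD>/<subdir>/.

    Hypothetical helper: `env` defaults to the real process environment.
    """
    env = os.environ if env is None else env
    pwd = env.get("PWD")
    if not pwd:
        # Assumption: fall back to the current working directory
        # when PWD is not exported.
        pwd = os.getcwd()
    return {SPILL_PATH_KEY: os.path.join(pwd, subdir) + "/"}


if __name__ == "__main__":
    # Print the option that would be merged into the job's Hudi options.
    for key, value in spill_path_options().items():
        print(f"{key}={value}")
```

The resulting key/value pair would then be merged into whatever Hudi options the job already passes. How those options reach the writer depends on the job setup and is not shown here.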



> Clean up spill path created by Hudi during uneventful shutdown
> --------------------------------------------------------------
>
>                 Key: HUDI-3425
>                 URL: https://issues.apache.org/jira/browse/HUDI-3425
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: compaction
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Critical
>             Fix For: 0.12.1
>
>
> h1. Hudi spill path is not cleared when containers are killed abruptly
>  
> When Yarn abruptly kills containers for any reason while a Hudi stage is in 
> progress, the spill path that Hudi created on disk is not cleaned up, and 
> as a result the nodes in the cluster start running out of space. We 
> need to clear the spill path manually to free up disk space.
>  
> Ref issue: https://github.com/apache/hudi/issues/4771
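
The manual cleanup mentioned in the description could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name `remove_stale_entries`, the `prefix` filter, and the age threshold are hypothetical, and the naming of Hudi's spill directories is not assumed here. Note that a directory's modification time does not change when files inside it are appended to, so a spill directory that is still in use may look stale. Run this only when no Hudi job is using the node, and check the dry-run output first.

```python
import os
import shutil
import time


def remove_stale_entries(base_dir, max_age_seconds, prefix="", dry_run=True, now=None):
    """Remove entries directly under base_dir whose mtime is older than max_age_seconds.

    Returns the list of paths that were (or, with dry_run=True, would be) removed.
    """
    now = time.time() if now is None else now
    removed = []
    for name in sorted(os.listdir(base_dir)):
        if not name.startswith(prefix):
            continue  # skip entries that do not match the caller-supplied prefix
        path = os.path.join(base_dir, name)
        if now - os.lstat(path).st_mtime < max_age_seconds:
            continue  # modified recently, leave it alone
        removed.append(path)
        if dry_run:
            continue  # only report, do not delete
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path, ignore_errors=True)
        else:
            os.remove(path)
    return removed
```

For example, `remove_stale_entries("/tmp", 24 * 3600, prefix="...")` lists candidates older than one day without deleting anything. The prefix is left elided here and has to be chosen to match the spill directories actually present on the node.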



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
