[ 
https://issues.apache.org/jira/browse/TEZ-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293186#comment-16293186
 ] 

Jason Lowe commented on TEZ-3877:
---------------------------------

TEZ-3350 tracks the problem where intermediate spills are not placed in the 
container-specific directory.  Even if the task deletes intermediate spills 
once they are consumed, we still need TEZ-3350 to solve the problem of 
"leaking" spill files if the task crashes or is killed during a shuffle/merge.
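
To illustrate the ordering described above, here is a minimal sketch that
uses only java.nio.file, not Tez's actual merge code. The class name
SpillCleanupSketch and the method mergeAndDeleteSpills are hypothetical, and
plain concatenation stands in for the real sorted merge. It deletes each
spill only after the merged output has been written and closed.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical illustration only; these names are not part of Tez.
public class SpillCleanupSketch {

    /**
     * Concatenates the spill files into mergedOutput (a stand-in for a real
     * sorted merge), then deletes every spill that was consumed.
     */
    static void mergeAndDeleteSpills(List<Path> spills, Path mergedOutput)
            throws IOException {
        try (OutputStream out = Files.newOutputStream(mergedOutput)) {
            for (Path spill : spills) {
                Files.copy(spill, out); // append this spill's bytes
            }
        }
        // Reached only if the merge finished and the output was closed.
        for (Path spill : spills) {
            Files.deleteIfExists(spill);
        }
    }
}
```

If the process is killed before the second loop runs, the spills stay on
disk, which is why the container-specific directory from TEZ-3350 is still
needed.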


> Delete spill files once merge is done
> -------------------------------------
>
>                 Key: TEZ-3877
>                 URL: https://issues.apache.org/jira/browse/TEZ-3877
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>
>   I see that spill files are not deleted right after the merge completes. We 
> should delete them at that point, because they take up a lot of space and we 
> can't afford that waste when Tez uses a lot of shuffle space with complex 
> DAGs. [~jlowe] told me they are only cleaned up after the application 
> completes, because they are written to the app directory and not the 
> container directory. That also has to be fixed so that the node manager 
> cleans them up after task failures or container crashes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
