[ https://issues.apache.org/jira/browse/TEZ-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293186#comment-16293186 ]
Jason Lowe commented on TEZ-3877:
---------------------------------
TEZ-3350 tracks the problem where intermediate spills are not placed in the
container-specific directory. Even if the task deletes intermediate spills
once they are consumed, we still need TEZ-3350 to solve the problem of
"leaking" spill files if the task crashes or is killed during a shuffle/merge.
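
To illustrate the distinction, here is a minimal sketch of "delete each spill once it is consumed". It is not Tez code: the class SpillMergeSketch and the method mergeAndDelete are hypothetical names, and plain concatenation stands in for the real merge of sorted runs.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class SpillMergeSketch {

    /**
     * Hypothetical helper, not part of Tez: copies every spill file into
     * mergedOutput and deletes each spill as soon as it has been consumed.
     */
    static void mergeAndDelete(List<Path> spills, Path mergedOutput) throws IOException {
        try (OutputStream out = Files.newOutputStream(mergedOutput)) {
            for (Path spill : spills) {
                // Stand-in for the real merge: append the spill's bytes to the output.
                Files.copy(spill, out);
                // Reclaim the disk space immediately instead of waiting for
                // the application to finish.
                Files.deleteIfExists(spill);
            }
        }
    }
}
```

If the process dies partway through the loop, the spills that have not been consumed yet stay on disk. No in-task cleanup can cover that case, which is why the files also need to live in a directory that is removed along with the container.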
> Delete spill files once merge is done
> -------------------------------------
>
> Key: TEZ-3877
> URL: https://issues.apache.org/jira/browse/TEZ-3877
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rohini Palaniswamy
>
> I see that spill files are not deleted right after the merge completes. We
> should delete them then, as they take up a lot of space and we can't afford
> that waste when Tez uses a lot of shuffle space with complex DAGs. [~jlowe]
> told me they are only cleaned up after the application completes, because they
> are written to the app directory and not the container directory. That also
> has to change so that they are cleaned up by the node manager during task
> failures or container crashes.