[jira] [Updated] (MAPREDUCE-7337) Task fails while deleting spill files on slow disk

Scott Oaks (Jira) Mon, 19 Apr 2021 15:39:10 -0700


     [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Scott Oaks updated MAPREDUCE-7337:
----------------------------------
    Summary: Task fails while deleting spill files on slow disk  (was: Task 
files while deleting spill files on slow disk)

> Task fails while deleting spill files on slow disk
> --------------------------------------------------
>
>                 Key: MAPREDUCE-7337
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7337
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: performance
>            Reporter: Scott Oaks
>            Priority: Minor
>
> We sometimes have tasks fail when deleting spill files in this loop (line 
> 2005 of MapTask.java):
> {code:java}
> for (int i = 0; i < numSpills; i++) {
>   rfs.delete(filename[i], true);
> }
> {code}
> During this loop, the task does not communicate with the master server, so 
> if the loop takes too long, the master server assumes the child has timed 
> out and tells the nodeagent to kill the YARN child.
> Typically this is linked to storage issues; we have seen it most often with 
> an underlying filesystem bug that causes contention in the delete path when 
> several files are deleted. Even though there is usually an underlying 
> issue, it would still help to report progress from the task periodically 
> during this loop.
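
The suggestion in the description could be sketched roughly as follows. This is
a standalone illustration, not the actual Hadoop change: the class
SpillCleanup, the method deleteWithProgress, and the Runnable callback are
hypothetical names, and the sketch uses java.nio.file rather than Hadoop's
FileSystem so that it runs on its own. In MapTask the callback would
presumably be a progress call on the task's reporter, which is an assumption
here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical helper (not part of Hadoop): deletes files while signalling liveness. */
public class SpillCleanup {

  /**
   * Deletes each file in turn and invokes reportProgress after every file,
   * so the liveness signal is not withheld for the duration of the whole loop.
   *
   * @return the number of files that existed and were deleted
   */
  public static int deleteWithProgress(List<Path> files, Runnable reportProgress)
      throws IOException {
    int deleted = 0;
    for (Path file : files) {
      if (Files.deleteIfExists(file)) {
        deleted++;
      }
      // Stand-in for the task's progress report (assumed, not taken from the issue).
      reportProgress.run();
    }
    return deleted;
  }
}
```

Reporting after every file keeps the sketch simple. It only helps when the
loop as a whole is slow; a single delete that blocks for longer than the task
timeout would still cause the task to be killed.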



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
