[jira] [Updated] (MAPREDUCE-6489) Fail fast rogue tasks that write too much to local disk

Jason Lowe (JIRA) Wed, 21 Oct 2015 07:14:39 -0700

     [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Jason Lowe updated MAPREDUCE-6489:
----------------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 2.8.0
           Status: Resolved  (was: Patch Available)

Thanks, Maysam!  I committed this to trunk and branch-2.

> Fail fast rogue tasks that write too much to local disk
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-6489
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6489
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: task
>    Affects Versions: 2.7.1
>            Reporter: Maysam Yabandeh
>            Assignee: Maysam Yabandeh
>             Fix For: 2.8.0
>
>         Attachments: MAPREDUCE-6489-branch-2.003.patch, 
> MAPREDUCE-6489.001.patch, MAPREDUCE-6489.002.patch, MAPREDUCE-6489.003.patch
>
>
> Tasks of rogue jobs can write too much to local disk, negatively 
> affecting jobs running in collocated containers. Ideally, YARN will be 
> able to limit the amount of local disk used by each task: YARN-4011. Until then, 
> the MapReduce task can fail fast if it writes too much (above a 
> configured threshold) to local disk.
> As we discussed 
> [here|https://issues.apache.org/jira/browse/YARN-4011?focusedCommentId=14902750&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14902750]
>  the suggested approach is that the MapReduce task checks the BYTES_WRITTEN 
> counter for the local disk and throws an exception when it goes beyond a 
> configured value.  It is true that the number of bytes written can be larger 
> than the disk space actually used, but detecting a rogue task does not require 
> the exact value, and a very large value for bytes written to local disk is a 
> good indication that the task is misbehaving.
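
The check described above can be sketched roughly as follows. This is only an
illustration of the idea, not the committed patch: the class names, the
exception type, and the convention that a negative limit disables the check
are all assumptions made for this example, and the counter is modeled as a
plain LongSupplier rather than a real Hadoop counter.

```java
import java.util.function.LongSupplier;

public class LocalWriteLimitSketch {

    /** Hypothetical exception type; the real patch may use a different one. */
    public static class WriteLimitExceededException extends RuntimeException {
        public WriteLimitExceededException(String message) {
            super(message);
        }
    }

    /** Compares a bytes-written reading against a configured threshold. */
    public static final class LocalWriteLimiter {
        private final long limitBytes;           // assumed: negative means "no limit"
        private final LongSupplier bytesWritten; // stand-in for the local-disk BYTES_WRITTEN counter

        public LocalWriteLimiter(long limitBytes, LongSupplier bytesWritten) {
            this.limitBytes = limitBytes;
            this.bytesWritten = bytesWritten;
        }

        /** Intended to be called periodically by the task; throws once the limit is exceeded. */
        public void check() {
            if (limitBytes < 0) {
                return; // check disabled
            }
            long written = bytesWritten.getAsLong();
            if (written > limitBytes) {
                throw new WriteLimitExceededException(
                    "Wrote " + written + " bytes to local disk, limit is " + limitBytes);
            }
        }
    }

    public static void main(String[] args) {
        long[] counter = {0L};                   // simulated counter value
        LocalWriteLimiter limiter = new LocalWriteLimiter(1024L, () -> counter[0]);

        counter[0] = 512L;
        limiter.check();                         // below the limit, nothing happens

        counter[0] = 4096L;
        try {
            limiter.check();                     // above the limit, the task would fail here
        } catch (WriteLimitExceededException e) {
            System.out.println("Task failed fast: " + e.getMessage());
        }
    }
}
```

Because the counter only grows, the check is cheap and needs no scan of the
local directories, at the cost of over-counting data that was written and
later deleted.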



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
