[
https://issues.apache.org/jira/browse/MAPREDUCE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065540#comment-13065540
]
Robert Joseph Evans commented on MAPREDUCE-2324:
------------------------------------------------
Looking back I realize that I probably have not answered Todd's question
satisfactorily. Yes, there are out-of-band heartbeats, and in fact not every TT
heartbeat will make it all the way through to this piece of code, because the
node may have no slots available by the time it gets to this job. The intention
was not to verify that the job has been tried on every TT before giving up; the
idea was to make a reasonable effort to schedule the job before giving up. I
suspect that the amount of free disk space on a node may vary quite a bit
between heartbeats, simply because jobs use disk space that is later freed,
HDFS deletes a file it was storing, or several new blocks are added. So even if
we give every node a chance at this job before giving up, there is still a
possibility that it would succeed later on. We cannot predict the future, but
we do need to put an upper bound on how long we try to do something; otherwise
there will always be corner cases where we can get starvation.
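To make that bound concrete, here is a rough, self-contained sketch of the idea
(plain Java, not the actual JobInProgress or patch code; the class name, method
names, and attempt limit are all made up for illustration): count placement
attempts that fail for lack of local disk space, and fail the job once an upper
bound is reached so the task cannot starve.
{code:java}
// Sketch only -- not the code in MR-2324-security-v1.txt.
public class ReduceSchedulingLimit {

    // Hypothetical threshold; a real patch would likely make this configurable.
    private static final int MAX_FAILED_SCHEDULING_ATTEMPTS = 100;

    private int failedAttempts = 0;

    /**
     * Called when a TT heartbeat reaches this job but the reduce cannot be
     * placed because no mapred.local.dir has enough free space.
     * Returns true if the job should be failed.
     */
    public boolean recordAttempt(long neededBytes, long freeBytes) {
        if (freeBytes >= neededBytes) {
            // The node could host the task after all; reset the counter.
            failedAttempts = 0;
            return false;
        }
        failedAttempts++;
        // Free space varies between heartbeats (jobs finish, HDFS blocks come
        // and go), so we keep retrying -- but only up to the bound.
        return failedAttempts >= MAX_FAILED_SCHEDULING_ATTEMPTS;
    }

    public static void main(String[] args) {
        ReduceSchedulingLimit limit = new ReduceSchedulingLimit();
        boolean fail = false;
        // Simulate heartbeats from nodes that never have 1 TB free.
        for (int i = 0; i < 150 && !fail; i++) {
            fail = limit.recordAttempt(1L << 40, 200L << 30);
        }
        System.out.println("Fail the job? " + fail);
    }
}
{code}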
It may also make sense to use some statistical heuristics in MR-279 to try and
give up sooner rather than later if someone is asking for something that is
really far outside the norm. But that is just an optimization.
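Purely as an illustration of that optimization (nothing like this is in the
attached patch; the class name, slack factor, and method are hypothetical), one
such heuristic might compare the requested space against the largest free space
any node has recently reported and give up early when no node comes close:
{code:java}
import java.util.Arrays;
import java.util.Collection;

// Sketch of a "give up sooner" heuristic, for illustration only.
public class EarlyGiveUpHeuristic {

    // Hypothetical slack factor; a real heuristic might look at a distribution
    // of recent reports rather than a single maximum.
    private static final double SLACK = 1.25;

    public static boolean hopeless(long neededBytes, Collection<Long> reportedFreeBytes) {
        long maxFree = 0;
        for (long free : reportedFreeBytes) {
            maxFree = Math.max(maxFree, free);
        }
        // If the request exceeds the best recent report by more than the slack,
        // retrying across heartbeats is unlikely to help.
        return neededBytes > maxFree * SLACK;
    }

    public static void main(String[] args) {
        // 1 TB requested, but the best node only reported ~500 GB free.
        long needed = 1L << 40;
        boolean giveUp = hopeless(needed, Arrays.asList(300L << 30, 500L << 30));
        System.out.println("Give up early? " + giveUp);
    }
}
{code}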
> Job should fail if a reduce task can't be scheduled anywhere
> ------------------------------------------------------------
>
> Key: MAPREDUCE-2324
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2324
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.20.2, 0.20.205.0
> Reporter: Todd Lipcon
> Assignee: Robert Joseph Evans
> Attachments: MR-2324-security-v1.txt
>
>
> If there's a reduce task that needs more disk space than is available on any
> mapred.local.dir in the cluster, that task will stay pending forever. For
> example, we produced this in a QA cluster by accidentally running terasort
> with one reducer - since no mapred.local.dir had 1T free, the job remained in
> pending state for several days. The reason the task was "stuck" wasn't clear
> from a user perspective until we looked at the JT logs.
> Probably better to just fail the job if a reduce task goes through all TTs
> and finds that there isn't enough space.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira