[ https://issues.apache.org/jira/browse/MAPREDUCE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955750#comment-13955750 ]

Sangjin Lee commented on MAPREDUCE-5817:
----------------------------------------

We're talking about two options for this: (1) modify 
JobImpl.actOnUnusableNode() so that it does not reschedule mappers if all 
reducers are completed, and (2) modify checkReadyForCommit() so that it 
transitions to COMMITTING once all reducers are completed (if the job has 
reducers) instead of requiring that all tasks are completed.

Either approach seems to have some downsides.

For (1), the change is pretty narrow (it only affects the rescheduling 
scenario). However, it still lets mapper tasks that were rescheduled prior to 
reducer completion run, so the job may linger until those mapper tasks run to 
completion. And if those mapper tasks fail for any reason, they may cause the 
job to be marked failed (even though all reducers actually succeeded).
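
To make (1) concrete, here is a rough sketch of what the guard might look 
like in JobImpl.actOnUnusableNode(). This is only an illustration: 
allReducersComplete() is a hypothetical helper (JobImpl's existing task 
bookkeeping could back such a check), and the rest of the body approximates 
the current method, which kills the successful map attempts that ran on the 
unusable node so they get rescheduled elsewhere.

{code:java}
// Sketch only: inside org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.
// allReducersComplete() is a hypothetical helper; nodesToSucceededTaskAttempts
// and eventHandler are existing members of the class.
private void actOnUnusableNode(NodeId nodeId, NodeState unusableState) {
  if (getTotalReduces() > 0 && allReducersComplete()) {
    // All reducers are done, so nothing will fetch map output again.
    // Skip the reschedule entirely instead of prolonging the job.
    return;
  }
  // Existing behavior: kill the successful map attempts that ran on this
  // node so that they get rescheduled on a usable node.
  List<TaskAttemptId> attempts = nodesToSucceededTaskAttempts.get(nodeId);
  if (attempts != null) {
    for (TaskAttemptId id : attempts) {
      if (TaskType.MAP == id.getTaskId().getTaskType()) {
        eventHandler.handle(new TaskAttemptKillEvent(id,
            "TaskAttempt killed because it ran on unusable node " + nodeId));
      }
    }
  }
}
{code}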

For (2), it would be effective and would make the job finish much more 
quickly. On the other hand, we'd need to do something about the mapper tasks 
that are still running at that point; they would likely need to be killed. 
Also, if the original mapper tasks were successful, we may need to 
"resurrect" their status from KILLED to SUCCEEDED to avoid confusion.
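
And a comparable sketch for (2) against checkReadyForCommit(). The 
succeededReduceTaskCount counter and the allReducersDone condition are 
assumptions about how the check could be expressed, and the kill/"resurrect" 
handling discussed above is not shown:

{code:java}
// Sketch only: inside JobImpl. succeededReduceTaskCount is a hypothetical
// counter; numReduceTasks, completedTaskCount, tasks, jobId, and eventHandler
// are existing members, and the commit event dispatch below is assumed to
// mirror the existing one.
protected JobStateInternal checkReadyForCommit() {
  JobStateInternal currentState = getInternalState();
  boolean allTasksDone = completedTaskCount == tasks.size();
  // New condition: if the job has reducers and all of them have succeeded,
  // the map outputs are no longer needed and we can start committing.
  boolean allReducersDone =
      numReduceTasks > 0 && succeededReduceTaskCount == numReduceTasks;
  if ((allTasksDone || allReducersDone)
      && currentState == JobStateInternal.RUNNING) {
    // Before committing we would also need to kill the still-running mapper
    // attempts and restore previously successful ones (not shown here).
    eventHandler.handle(new CommitterJobCommitEvent(jobId, getJobContext()));
    return JobStateInternal.COMMITTING;
  }
  return currentState;
}
{code}

One appeal of this shape is that the commit decision stays in one place, so 
the mapper cleanup could be triggered from the same transition.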



> mappers get rescheduled on node transition even after all reducers are 
> completed
> --------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5817
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5817
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster
>    Affects Versions: 2.3.0
>            Reporter: Sangjin Lee
>
> We're seeing a behavior where a job runs long after all reducers were already 
> finished. We found that the job was rescheduling and running a number of 
> mappers beyond the point of reducer completion. In one situation, the job ran 
> for some 9 more hours after all reducers completed!
> This happens because whenever a node transition (to an unusable state) comes 
> into the app master, it just reschedules all mappers that already ran on the 
> node in all cases.
> Therefore, any node transition has the potential to extend the job's 
> runtime. Once this window opens, another node transition can prolong it, 
> and in theory this can happen indefinitely.
> If there is some instability in the node pool (unhealthy nodes, etc.) for a 
> period of time, any big job is severely vulnerable to this problem.
> If all reducers have completed, JobImpl.actOnUnusableNode() should not 
> reschedule mapper tasks: at that point the mapper outputs are no longer 
> needed, and the rescheduled mappers' output would not be consumed anyway.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
