[ https://issues.apache.org/jira/browse/MAPREDUCE-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900410#comment-14900410 ]

Rohith Sharma K S commented on MAPREDUCE-6485:
----------------------------------------------

Got the logs from [~Jobo] offline. One potential issue arises when a speculative 
task attempt is launched. Below is the flow of events that causes the problem.
# There are map task attempts T1..Tn. Attempts T1..Tn-1 have finished and attempt Tn is still running.
# Reducers are scheduled (their ResourceRequests are sent to the RM) and some reducer attempts are already running, so the Tn map task plus the reducer tasks occupy the full cluster. 50 more reducer requests are still queued, i.e. the RM has to assign 50 containers at priority 10, the reducer priority.
# A speculative attempt for the Tn map task is spawned with priority 20 and its ResourceRequest is sent to the RM.
# Since the reducers have higher priority, the RM will not assign any container to the mapper (speculative attempt) resource request; see the sketch after this list.
# The Tn map attempt times out for some reason, so the task attempt is marked as failed.
# But a new attempt at the failed-map priority WILL NOT BE created, since one attempt (the speculative attempt) is already there in scheduledMap. {color:red}The MR job hangs forever here.{color}
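
To make the starvation concrete, the following is a minimal, self-contained sketch in plain Java (it is not the actual RMContainerAllocator code; the Ask class, names and counts are illustrative assumptions). It mimics an allocator that serves outstanding asks strictly by priority, lower number first: with 50 reducer asks queued at priority 10, any container that frees up goes to a reducer, and the lone speculative map ask at priority 20 is never reached.

{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Toy model of the hang: containers are handed out strictly by priority
// (lower number = higher priority), so the 50 queued reducer asks at
// priority 10 always win over the speculative map ask at priority 20.
public class PriorityStarvationSketch {

    // Illustrative stand-in for an outstanding ResourceRequest.
    static class Ask {
        final int priority;
        final String name;
        Ask(int priority, String name) { this.priority = priority; this.name = name; }
    }

    public static void main(String[] args) {
        PriorityQueue<Ask> pendingAsks =
            new PriorityQueue<>(Comparator.comparingInt((Ask a) -> a.priority));

        // 50 queued reducer asks at the reducer priority (10).
        for (int i = 0; i < 50; i++) {
            pendingAsks.add(new Ask(10, "reducer-" + i));
        }
        // The speculative attempt for the Tn map task, at the map priority (20).
        pendingAsks.add(new Ask(20, "speculative-map-Tn"));

        // Pretend a handful of containers free up; each one is given to the
        // highest-priority pending ask, which is always a reducer.
        int freedContainers = 10;
        List<String> assigned = new ArrayList<>();
        for (int i = 0; i < freedContainers && !pendingAsks.isEmpty(); i++) {
            assigned.add(pendingAsks.poll().name);
        }

        System.out.println("Assigned: " + assigned);
        System.out.println("Speculative map still pending: "
            + pendingAsks.stream().anyMatch(a -> a.name.equals("speculative-map-Tn")));
        // All freed containers go to reducers; the map ask is still queued.
    }
}
{code}

In the real job not even these containers free up, because the running reducers are themselves waiting on the last map's output, which is why the cluster deadlocks instead of merely starving the map.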

Correct me if I am wrong, but I think the MAPREDUCE-6302 solution will not fully 
solve this: even if reducer preemption happens, it has to preempt all the 
scheduled reducer containers, which means the RM first has to assign containers 
to those reducers and then preempt them. Only after all the queued reducer 
requests are served (since reducers have higher priority) will the RM pick up 
the mapper resource request for assignment.
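
To illustrate that ordering concern, here is another hedged, self-contained sketch in plain Java (it is not the MAPREDUCE-6302 patch and uses no Hadoop API; the single circulating container and the ask names are assumptions). It shows that if a scheduled reducer has to be assigned a container before it can be preempted, the map ask is reached only after every queued reducer ask has gone through an assign-then-preempt cycle.

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

// Conceptual sketch only: even with reducer preemption, a scheduled reducer
// has to be ASSIGNED a container before it can be preempted. Each freed
// container therefore goes to the next queued priority-10 reducer ask, and
// the priority-20 map ask is only reached after every queued reducer ask has
// been assigned and then preempted.
public class AssignThenPreemptSketch {

    public static void main(String[] args) {
        // Pending asks at the RM in priority order: 50 reducer asks ahead of
        // the single speculative map ask (names are illustrative).
        Deque<String> pendingAsks = new ArrayDeque<>();
        for (int i = 0; i < 50; i++) {
            pendingAsks.add("reducer-" + i);
        }
        pendingAsks.add("speculative-map-Tn");

        int assignPreemptCycles = 0;
        // One free container circulates: assign it to the highest-priority
        // ask, then preempt it so it becomes free again.
        while (pendingAsks.peekFirst().startsWith("reducer")) {
            pendingAsks.pollFirst();  // container assigned to this reducer...
            assignPreemptCycles++;    // ...and then preempted back
        }

        System.out.println("Assign/preempt cycles before the map ask is even "
            + "considered: " + assignPreemptCycles);  // 50
        System.out.println("Next ask served: " + pendingAsks.peekFirst());
        // Only now would the RM look at "speculative-map-Tn".
    }
}
{code}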

> MR job hung forever because all resources are taken up by reducers and the 
> last map attempt never gets resources to run
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6485
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6485
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster
>    Affects Versions: 3.0.0, 2.4.1, 2.6.0, 2.7.1
>            Reporter: Bob
>            Priority: Critical
>
> The scenario is like this:
> With mapreduce.job.reduce.slowstart.completedmaps=0.8 configured, reduces 
> take resources and start to run before all the maps have finished. 
> But it can happen that all the resources are taken up by running reduces 
> while there is still one map unfinished. 
> Under this condition, the last map has two task attempts.
> The first attempt was killed due to timeout (mapreduce.task.timeout) and its 
> state transitioned from RUNNING to FAIL_CONTAINER_CLEANUP, so no new 
> failed-map attempt was started. 
> The second attempt, which was started because map task speculation is 
> enabled, is pending in the UNASSIGNED state because no resources are 
> available. 
> But the second map attempt's request has lower priority than the reduces, so 
> preemption does not happen.
> As a result none of the reduces can finish because one map is left, and the 
> last map hangs there with no resources available, so the job never finishes.


