[ https://issues.apache.org/jira/browse/MAPREDUCE-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900410#comment-14900410 ]
Rohith Sharma K S commented on MAPREDUCE-6485:
----------------------------------------------

Got the logs from [~Jobo] offline. One potential issue arises when a speculative task attempt is launched. The flow of events that causes the problem is:
# There are map task attempts T1..Tn. Attempts T1..Tn-1 have finished and Tn is still running.
# Reducers are scheduled (ResourceRequests are sent to the RM) and some reducer attempts are running, so the Tn map task plus the reducer tasks occupy the full cluster. 50 more reducer requests are still queued, i.e. the RM has to assign 50 containers at priority 10, the reducer priority.
# A speculative attempt for the Tn map task is spawned at priority 20 and its ResourceRequest is sent to the RM.
# Since the reducers have higher priority, the RM never assigns a container to the speculative map attempt's resource request.
# The Tn map attempt times out for some reason, so the task attempt is marked as failed.
# But a new attempt at the failed-map priority WILL NOT BE created, because one attempt (the speculative attempt) already exists in scheduledMap. {color:red}The MR job hangs forever here.{color}

Correct me if I am wrong, but I think the MAPREDUCE-6302 solution will not fully solve this: even if reducer preemption happens, it has to preempt all the scheduled reducer containers, which means the RM must first assign containers to the reducers (since they have higher priority) and then preempt them. Only after all the queued reducer requests are served does the RM pick up the mapper resource request for assignment.
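The starvation in steps 2-4 above can be sketched outside Hadoop. The toy allocator below (not Hadoop code; class and method names are invented for illustration) hands out free containers strictly by ascending priority number, mirroring the YARN convention that a lower number means higher priority. With 50 reducer requests queued at priority 10 and one speculative map request at priority 20, the map request is never served:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class PriorityStarvationSketch {
    // Priority numbers from the comment above: lower number = higher priority.
    static final int PRIORITY_REDUCE = 10;
    static final int PRIORITY_MAP = 20;

    record Request(String task, int priority) {}

    /** Hand out freeContainers strictly by ascending priority number. */
    static List<String> assign(List<Request> pending, int freeContainers) {
        List<Request> sorted = new ArrayList<>(pending);
        sorted.sort(Comparator.comparingInt(Request::priority));
        List<String> assigned = new ArrayList<>();
        for (Request r : sorted) {
            if (assigned.size() == freeContainers) break;
            assigned.add(r.task());
        }
        return assigned;
    }

    public static void main(String[] args) {
        List<Request> pending = new ArrayList<>();
        for (int i = 0; i < 50; i++) {
            pending.add(new Request("reduce-" + i, PRIORITY_REDUCE));
        }
        pending.add(new Request("map-speculative", PRIORITY_MAP));

        // Only a handful of containers ever free up; all of them go to
        // reducers, so the speculative map request starves indefinitely.
        List<String> assigned = assign(pending, 5);
        System.out.println(assigned.contains("map-speculative")); // prints false
    }
}
```

This is why preempting one reducer at a time does not help: each freed container is immediately re-offered to the highest-priority (reducer) request still in the queue.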
> MR job hanged forever because all resources are taken up by reducers and the
> last map attempt never get resource to run
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6485
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6485
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster
>    Affects Versions: 3.0.0, 2.4.1, 2.6.0, 2.7.1
>            Reporter: Bob
>            Priority: Critical
>
> The scenario is as follows:
> With mapreduce.job.reduce.slowstart.completedmaps=0.8 configured, reducers
> take resources and start running before all maps have finished.
> It can then happen that all resources are taken up by running reducers while
> one map has still not finished.
> Under this condition, the last map has two task attempts.
> The first attempt was killed due to timeout (mapreduce.task.timeout), and its
> state transitioned from RUNNING to FAIL_CONTAINER_CLEANUP, so a replacement
> for the failed map attempt would not be started.
> The second attempt, started because map task speculation is enabled, is stuck
> in the UNASSIGNED state because no resource is available.
> But the second map attempt's request has lower priority than the reducers, so
> preemption does not happen.
> As a result no reducer can finish because one map is left, and the last map
> hangs because no resource is available, so the job never finishes.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
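For reference, the property the reporter mentions is set in mapred-site.xml. A sketch (values are illustrative): 0.8 means reducers may be scheduled once 80% of maps complete, which enables this scenario; raising it to 1.0 makes reducers wait for all maps, sidestepping the hang at the cost of losing map/shuffle overlap.

```xml
<!-- Fraction of maps that must complete before reducers are scheduled.
     0.8 reproduces the reported hang; 1.0 is a coarse workaround. -->
<property>
  <name>mapreduce.job.reduce.slowstart.completedmaps</name>
  <value>0.8</value>
</property>
```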