[ https://issues.apache.org/jira/browse/MAPREDUCE-3186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555502#comment-16555502 ]
Avdhesh kumar commented on MAPREDUCE-3186:
------------------------------------------
Hello bro,
2018-07-25 10:02:39,659 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Error communicating with RM: Resource Manager doesn't recognize AttemptId: appattempt_1532512489575_0001_000002
2018-07-25 09:58:22,587 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Error communicating with RM: Resource Manager doesn't recognize AttemptId: appattempt_1532512489575_0001_000001
> User jobs are getting hanged if the Resource manager process goes down and
> comes up while job is getting executed.
> ------------------------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-3186
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3186
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 0.23.0
> Environment: linux
> Reporter: Ramgopal N
> Assignee: Eric Payne
> Priority: Blocker
> Labels: test
> Fix For: 0.23.0
>
> Attachments: MAPREDUCE-3186.v1.txt, MAPREDUCE-3186.v2.txt,
> MR3186_v3.txt
>
>
> If the resource manager is restarted while a job is executing, the job
> hangs.
> The UI shows the job as running.
> The RM log shows the error "ERROR
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
> AppAttemptId doesnt exist in cache appattempt_1318579738195_0004_000001"
> On the console, the MRAppMaster and RunJar processes are not killed.