[ https://issues.apache.org/jira/browse/MAPREDUCE-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663096#comment-13663096 ]
Daryn Sharp commented on MAPREDUCE-5262:
----------------------------------------
Also, the web UI reports the job as having started at the epoch:
{noformat}
Started: Wed Dec 31 23:59:59 UTC 1969
Finished: Tue May 21 00:32:03 UTC 2013
Elapsed: 380304hrs, 32mins, 3sec
{noformat}
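The elapsed figure is what one would expect if the start time was never set and the UI measured from the Unix epoch. A minimal sketch of that arithmetic follows. It assumes the start is exactly epoch 0; the UI shows one second earlier, which may point to an unset or negative sentinel value, but that is a guess. The class name {{ElapsedCheck}} and the {{format}} helper are hypothetical and not part of Hadoop.

```java
import java.time.Duration;
import java.time.Instant;

public class ElapsedCheck {
    // Formats a duration in the same "hrs, mins, sec" shape as the web UI output above.
    static String format(Duration d) {
        long total = d.getSeconds();
        long hours = total / 3600;
        long mins = (total % 3600) / 60;
        long secs = total % 60;
        return hours + "hrs, " + mins + "mins, " + secs + "sec";
    }

    public static void main(String[] args) {
        // Assumption: an unset start time is treated as the epoch itself.
        Instant started = Instant.EPOCH;
        // Finish time as reported by the web UI.
        Instant finished = Instant.parse("2013-05-21T00:32:03Z");
        System.out.println(format(Duration.between(started, finished)));
    }
}
```

Under that assumption the computed value matches the "Elapsed" line reported by the UI, which supports the reading that the start time was simply never recorded.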
> AM generates NPEs when RM connection fails
> ------------------------------------------
>
> Key: MAPREDUCE-5262
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5262
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: applicationmaster
> Affects Versions: 3.0.0, 2.0.4-alpha
> Reporter: Daryn Sharp
>
> If the AM fails to connect to the RM, it causes a cascade of NPEs as the AM
> attempts to shut down and exit.
> {noformat}
> 2013-05-21 00:31:56,153 ERROR [main]
> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
> as:hadoopqa (auth:SIMPLE)
> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
> Password not found for ApplicationAttempt
> appattempt_1367605529307_0034_000001
> 2013-05-21 00:31:56,154 WARN [main] org.apache.hadoop.ipc.Client: Exception
> encountered while connecting to the server :
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
> Password not found for ApplicationAttempt
> appattempt_1367605529307_0034_000001
> 2013-05-21 00:31:56,154 ERROR [main]
> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
> as:hadoopqa (auth:SIMPLE)
> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
> Password not found for ApplicationAttempt
> appattempt_1367605529307_0034_000001
> 2013-05-21 00:31:56,156 ERROR [main]
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Exception while
> registering
> java.lang.reflect.UndeclaredThrowableException
> at
> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:128)
> at
> org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:103)
> at
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:153)
> at
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.start(RMCommunicator.java:112)
> at
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.start(RMContainerAllocator.java:211)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.start(MRAppMaster.java:797)
> at
> org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.start(MRAppMaster.java:1014)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1374)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1370)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1318)
> Caused by:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
> Password not found for ApplicationAttempt
> appattempt_1367605529307_0034_000001
> at org.apache.hadoop.ipc.Client.call(Client.java:1266)
> at org.apache.hadoop.ipc.Client.call(Client.java:1218)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
> at com.sun.proxy.$Proxy28.registerApplicationMaster(Unknown Source)
> at
> org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:100)
> ... 12 more
> 2013-05-21 00:31:56,158 ERROR [main]
> org.apache.hadoop.yarn.service.CompositeService: Error starting services
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster
> org.apache.hadoop.yarn.YarnException:
> java.lang.reflect.UndeclaredThrowableException
> at
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:166)
> at
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.start(RMCommunicator.java:112)
> at
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.start(RMContainerAllocator.java:211)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.start(MRAppMaster.java:797)
> at
> org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.start(MRAppMaster.java:1014)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1374)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1370)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1318)
> Caused by: java.lang.reflect.UndeclaredThrowableException
> at
> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:128)
> at
> org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:103)
> at
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:153)
> ... 11 more
> Caused by:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
> Password not found for ApplicationAttempt
> appattempt_1367605529307_0034_000001
> at org.apache.hadoop.ipc.Client.call(Client.java:1266)
> at org.apache.hadoop.ipc.Client.call(Client.java:1218)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
> at com.sun.proxy.$Proxy28.registerApplicationMaster(Unknown Source)
> at
> org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:100)
> ... 12 more
> 2013-05-21 00:31:56,158 INFO [main]
> org.apache.hadoop.yarn.service.CompositeService: Error stopping
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
> java.lang.NullPointerException
> at
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.stop(RMCommunicator.java:219)
> at
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.stop(RMContainerAllocator.java:251)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.stop(MRAppMaster.java:803)
> at
> org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:99)
> at
> org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:77)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.start(MRAppMaster.java:1014)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1374)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1370)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1318)
> 2013-05-21 00:31:56,158 INFO [main] org.apache.hadoop.ipc.Server: Stopping
> server on 39121
> 2013-05-21 00:31:56,160 INFO [main]
> org.apache.hadoop.yarn.service.AbstractService: Service:TaskHeartbeatHandler
> is stopped.
> 2013-05-21 00:31:56,160 INFO [IPC Server listener on 39121]
> org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 39121
> 2013-05-21 00:31:56,160 INFO [IPC Server Responder]
> org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
> 2013-05-21 00:31:56,160 INFO [TaskHeartbeatHandler PingChecker]
> org.apache.hadoop.mapreduce.v2.app.TaskHeartbeatHandler: TaskHeartbeatHandler
> thread interrupted
> 2013-05-21 00:31:56,160 INFO [main]
> org.apache.hadoop.yarn.service.AbstractService:
> Service:org.apache.hadoop.mapred.TaskAttemptListenerImpl is stopped.
> 2013-05-21 00:31:56,161 INFO [main]
> org.apache.hadoop.yarn.service.AbstractService: Service:CommitterEventHandler
> is stopped.
> 2013-05-21 00:31:56,161 INFO [main] org.apache.hadoop.ipc.Server: Stopping
> server on 50500
> 2013-05-21 00:31:56,161 INFO [IPC Server listener on 50500]
> org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 50500
> 2013-05-21 00:31:56,161 INFO [IPC Server Responder]
> org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
> 2013-05-21 00:31:56,164 INFO [main] org.mortbay.log: Stopped
> [email protected]:0
> 2013-05-21 00:31:56,264 INFO [main]
> org.apache.hadoop.yarn.service.AbstractService: Service:MRClientService is
> stopped.
> 2013-05-21 00:31:56,264 INFO [main]
> org.apache.hadoop.yarn.service.AbstractService: Service:Dispatcher is stopped.
> 2013-05-21 00:31:56,264 FATAL [main]
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> org.apache.hadoop.yarn.YarnException: Failed to Start
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster
> at
> org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:78)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.start(MRAppMaster.java:1014)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1374)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1370)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1318)
> Caused by: org.apache.hadoop.yarn.YarnException:
> java.lang.reflect.UndeclaredThrowableException
> at
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:166)
> at
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.start(RMCommunicator.java:112)
> at
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.start(RMContainerAllocator.java:211)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.start(MRAppMaster.java:797)
> at
> org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
> ... 7 more
> Caused by: java.lang.reflect.UndeclaredThrowableException
> at
> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:128)
> at
> org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:103)
> at
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:153)
> ... 11 more
> Caused by:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
> Password not found for ApplicationAttempt
> appattempt_1367605529307_0034_000001
> at org.apache.hadoop.ipc.Client.call(Client.java:1266)
> at org.apache.hadoop.ipc.Client.call(Client.java:1218)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
> at com.sun.proxy.$Proxy28.registerApplicationMaster(Unknown Source)
> at
> org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:100)
> ... 12 more
> 2013-05-21 00:31:56,266 INFO [Thread-1]
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster received a
> signal. Signaling RMCommunicator and JobHistoryEventHandler.
> 2013-05-21 00:31:56,266 INFO [Thread-1]
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: RMCommunicator
> notified that iSignalled is: true
> 2013-05-21 00:31:56,266 INFO [Thread-1]
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Notify RMCommunicator
> isAMLastRetry: false
> 2013-05-21 00:31:56,266 INFO [Thread-1]
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: RMCommunicator
> notified that shouldUnregistered is: false
> 2013-05-21 00:31:56,267 INFO [Thread-1]
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Notify JHEH isAMLastRetry:
> false
> 2013-05-21 00:31:56,267 INFO [Thread-1]
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler:
> JobHistoryEventHandler notified that forceJobCompletion is false
> 2013-05-21 00:31:56,267 INFO [Thread-1]
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopping
> JobHistoryEventHandler. Size of the outstanding queue size is 3
> 2013-05-21 00:31:56,267 INFO [Thread-1]
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: In stop,
> writing event AM_STARTED
> 2013-05-21 00:31:56,347 INFO [Thread-1]
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer
> setup for JobId: job_1367605529307_0035, File:
> hdfs://hdfs-server:8020/user/hadoopqa/.staging/job_1367605529307_0035/job_1367605529307_0035_2.jhist
> 2013-05-21 00:31:56,356 WARN [Thread-1] org.apache.hadoop.conf.Configuration:
> user.name is deprecated. Instead, use mapreduce.job.user.name
> 2013-05-21 00:31:56,570 INFO [Thread-1]
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: In stop,
> writing event AM_STARTED
> 2013-05-21 00:31:56,571 INFO [Thread-1]
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: In stop,
> writing event JOB_SUBMITTED
> 2013-05-21 00:31:56,588 INFO [Thread-1]
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopped
> JobHistoryEventHandler. super.stop()
> 2013-05-21 00:31:56,588 INFO [Thread-1]
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Skipping cleaning up the
> staging dir. assuming AM will be retried.
> 2013-05-21 00:31:56,588 INFO [Thread-1]
> org.apache.hadoop.yarn.service.CompositeService: Error stopping
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
> java.lang.NullPointerException
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter.stop(MRAppMaster.java:865)
> at
> org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:99)
> at
> org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:89)
> at
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1343)
> at
> org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
> 2013-05-21 00:31:56,588 INFO [Thread-1] org.apache.hadoop.ipc.Server:
> Stopping server on 39121
> 2013-05-21 00:31:56,588 INFO [Thread-1] org.apache.hadoop.ipc.Server:
> Stopping server on 50500
> {noformat}
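Both NPEs in the log above are thrown from a stop() call that runs after start() has failed. One plausible cause, though the log alone does not confirm it, is that stop() dereferences a field that start() never got as far as initializing. The sketch below illustrates a defensive stop() under that assumption. {{NullSafeStopSketch}}, {{Communicator}} and the {{heartbeat}} field are hypothetical stand-ins and not the actual Hadoop classes or fields.

```java
public class NullSafeStopSketch {
    // Hypothetical stand-in for a service whose start() can fail part-way through.
    static class Communicator {
        private final boolean failRegistration;
        private Thread heartbeat;   // only assigned once registration succeeds
        boolean stopped = false;

        Communicator(boolean failRegistration) {
            this.failRegistration = failRegistration;
        }

        void start() {
            if (failRegistration) {
                // Simulates a failed registration call; heartbeat stays null.
                throw new IllegalStateException("registration failed");
            }
            heartbeat = new Thread(() -> {
                try {
                    Thread.sleep(Long.MAX_VALUE);
                } catch (InterruptedException e) {
                    // Interrupted by stop(); let the thread exit.
                }
            });
            heartbeat.setDaemon(true);
            heartbeat.start();
        }

        void stop() {
            // Guard: start() may have thrown before the thread was created.
            if (heartbeat != null) {
                heartbeat.interrupt();
            }
            stopped = true;
        }
    }

    // Starts the service and always stops it, mirroring a parent that
    // stops its children after a failed start. Returns whether stop() completed.
    static boolean startThenStop(Communicator c) {
        try {
            c.start();
        } catch (RuntimeException e) {
            // Start failed; fall through to stop() as the parent service would.
        }
        c.stop();
        return c.stopped;
    }

    public static void main(String[] args) {
        System.out.println(startThenStop(new Communicator(true)));
        System.out.println(startThenStop(new Communicator(false)));
    }
}
```

With the null check in place, stop() completes whether or not start() succeeded, so a failed start would surface only the original registration error instead of a follow-on NPE.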