Daryn Sharp created MAPREDUCE-5262:
--------------------------------------

             Summary: AM generates NPEs when RM connection fails
                 Key: MAPREDUCE-5262
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5262
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: applicationmaster
    Affects Versions: 2.0.4-alpha, 3.0.0
            Reporter: Daryn Sharp


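If the AM fails to connect to the RM, a cascade of NPEs follows as the AM 
attempts to shut down and exit. In the log below, registerApplicationMaster is 
rejected with InvalidToken, and NPEs are then thrown from RMCommunicator.stop 
and MRAppMaster$ContainerLauncherRouter.stop.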

{noformat}
2013-05-21 00:31:56,153 ERROR [main] 
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException 
as:hadoopqa (auth:SIMPLE) 
cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
 Password not found for ApplicationAttempt appattempt_1367605529307_0034_000001
2013-05-21 00:31:56,154 WARN [main] org.apache.hadoop.ipc.Client: Exception 
encountered while connecting to the server : 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
 Password not found for ApplicationAttempt appattempt_1367605529307_0034_000001
2013-05-21 00:31:56,154 ERROR [main] 
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException 
as:hadoopqa (auth:SIMPLE) 
cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
 Password not found for ApplicationAttempt appattempt_1367605529307_0034_000001
2013-05-21 00:31:56,156 ERROR [main] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Exception while 
registering
java.lang.reflect.UndeclaredThrowableException
        at 
org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:128)
        at 
org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:103)
        at 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:153)
        at 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.start(RMCommunicator.java:112)
        at 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.start(RMContainerAllocator.java:211)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.start(MRAppMaster.java:797)
        at 
org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.start(MRAppMaster.java:1014)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1374)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1370)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1318)
Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
 Password not found for ApplicationAttempt appattempt_1367605529307_0034_000001
        at org.apache.hadoop.ipc.Client.call(Client.java:1266)
        at org.apache.hadoop.ipc.Client.call(Client.java:1218)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        at com.sun.proxy.$Proxy28.registerApplicationMaster(Unknown Source)
        at 
org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:100)
        ... 12 more
2013-05-21 00:31:56,158 ERROR [main] 
org.apache.hadoop.yarn.service.CompositeService: Error starting services 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster
org.apache.hadoop.yarn.YarnException: 
java.lang.reflect.UndeclaredThrowableException
        at 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:166)
        at 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.start(RMCommunicator.java:112)
        at 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.start(RMContainerAllocator.java:211)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.start(MRAppMaster.java:797)
        at 
org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.start(MRAppMaster.java:1014)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1374)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1370)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1318)
Caused by: java.lang.reflect.UndeclaredThrowableException
        at 
org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:128)
        at 
org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:103)
        at 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:153)
        ... 11 more
Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
 Password not found for ApplicationAttempt appattempt_1367605529307_0034_000001
        at org.apache.hadoop.ipc.Client.call(Client.java:1266)
        at org.apache.hadoop.ipc.Client.call(Client.java:1218)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        at com.sun.proxy.$Proxy28.registerApplicationMaster(Unknown Source)
        at 
org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:100)
        ... 12 more
2013-05-21 00:31:56,158 INFO [main] 
org.apache.hadoop.yarn.service.CompositeService: Error stopping 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
java.lang.NullPointerException
        at 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.stop(RMCommunicator.java:219)
        at 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.stop(RMContainerAllocator.java:251)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.stop(MRAppMaster.java:803)
        at 
org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:99)
        at 
org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:77)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.start(MRAppMaster.java:1014)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1374)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1370)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1318)
2013-05-21 00:31:56,158 INFO [main] org.apache.hadoop.ipc.Server: Stopping 
server on 39121
2013-05-21 00:31:56,160 INFO [main] 
org.apache.hadoop.yarn.service.AbstractService: Service:TaskHeartbeatHandler is 
stopped.
2013-05-21 00:31:56,160 INFO [IPC Server listener on 39121] 
org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 39121
2013-05-21 00:31:56,160 INFO [IPC Server Responder] 
org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2013-05-21 00:31:56,160 INFO [TaskHeartbeatHandler PingChecker] 
org.apache.hadoop.mapreduce.v2.app.TaskHeartbeatHandler: TaskHeartbeatHandler 
thread interrupted
2013-05-21 00:31:56,160 INFO [main] 
org.apache.hadoop.yarn.service.AbstractService: 
Service:org.apache.hadoop.mapred.TaskAttemptListenerImpl is stopped.
2013-05-21 00:31:56,161 INFO [main] 
org.apache.hadoop.yarn.service.AbstractService: Service:CommitterEventHandler 
is stopped.
2013-05-21 00:31:56,161 INFO [main] org.apache.hadoop.ipc.Server: Stopping 
server on 50500
2013-05-21 00:31:56,161 INFO [IPC Server listener on 50500] 
org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 50500
2013-05-21 00:31:56,161 INFO [IPC Server Responder] 
org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2013-05-21 00:31:56,164 INFO [main] org.mortbay.log: Stopped 
SelectChannelConnector@0.0.0.0:0
2013-05-21 00:31:56,264 INFO [main] 
org.apache.hadoop.yarn.service.AbstractService: Service:MRClientService is 
stopped.
2013-05-21 00:31:56,264 INFO [main] 
org.apache.hadoop.yarn.service.AbstractService: Service:Dispatcher is stopped.
2013-05-21 00:31:56,264 FATAL [main] 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
org.apache.hadoop.yarn.YarnException: Failed to Start 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster
        at 
org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:78)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.start(MRAppMaster.java:1014)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1374)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1370)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1318)
Caused by: org.apache.hadoop.yarn.YarnException: 
java.lang.reflect.UndeclaredThrowableException
        at 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:166)
        at 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.start(RMCommunicator.java:112)
        at 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.start(RMContainerAllocator.java:211)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.start(MRAppMaster.java:797)
        at 
org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
        ... 7 more
Caused by: java.lang.reflect.UndeclaredThrowableException
        at 
org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:128)
        at 
org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:103)
        at 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:153)
        ... 11 more
Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
 Password not found for ApplicationAttempt appattempt_1367605529307_0034_000001
        at org.apache.hadoop.ipc.Client.call(Client.java:1266)
        at org.apache.hadoop.ipc.Client.call(Client.java:1218)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        at com.sun.proxy.$Proxy28.registerApplicationMaster(Unknown Source)
        at 
org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:100)
        ... 12 more
2013-05-21 00:31:56,266 INFO [Thread-1] 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster received a signal. 
Signaling RMCommunicator and JobHistoryEventHandler.
2013-05-21 00:31:56,266 INFO [Thread-1] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: RMCommunicator 
notified that iSignalled is: true
2013-05-21 00:31:56,266 INFO [Thread-1] 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Notify RMCommunicator 
isAMLastRetry: false
2013-05-21 00:31:56,266 INFO [Thread-1] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: RMCommunicator 
notified that shouldUnregistered is: false
2013-05-21 00:31:56,267 INFO [Thread-1] 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Notify JHEH isAMLastRetry: false
2013-05-21 00:31:56,267 INFO [Thread-1] 
org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: 
JobHistoryEventHandler notified that forceJobCompletion is false
2013-05-21 00:31:56,267 INFO [Thread-1] 
org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopping 
JobHistoryEventHandler. Size of the outstanding queue size is 3
2013-05-21 00:31:56,267 INFO [Thread-1] 
org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: In stop, writing 
event AM_STARTED
2013-05-21 00:31:56,347 INFO [Thread-1] 
org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer 
setup for JobId: job_1367605529307_0035, File: 
hdfs://hdfs-server:8020/user/hadoopqa/.staging/job_1367605529307_0035/job_1367605529307_0035_2.jhist
2013-05-21 00:31:56,356 WARN [Thread-1] org.apache.hadoop.conf.Configuration: 
user.name is deprecated. Instead, use mapreduce.job.user.name
2013-05-21 00:31:56,570 INFO [Thread-1] 
org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: In stop, writing 
event AM_STARTED
2013-05-21 00:31:56,571 INFO [Thread-1] 
org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: In stop, writing 
event JOB_SUBMITTED
2013-05-21 00:31:56,588 INFO [Thread-1] 
org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopped 
JobHistoryEventHandler. super.stop()
2013-05-21 00:31:56,588 INFO [Thread-1] 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Skipping cleaning up the 
staging dir. assuming AM will be retried.
2013-05-21 00:31:56,588 INFO [Thread-1] 
org.apache.hadoop.yarn.service.CompositeService: Error stopping 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
java.lang.NullPointerException
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter.stop(MRAppMaster.java:865)
        at 
org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:99)
        at 
org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:89)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1343)
        at 
org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
2013-05-21 00:31:56,588 INFO [Thread-1] org.apache.hadoop.ipc.Server: Stopping 
server on 39121
2013-05-21 00:31:56,588 INFO [Thread-1] org.apache.hadoop.ipc.Server: Stopping 
server on 50500
{noformat}
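
One plausible reading of the traces above is that stop() is invoked on services whose start() never completed, so a field that start() would have set is still null. The following is only a minimal sketch of a defensive stop() under that assumption. Service, Communicator, and stopAll are hypothetical stand-ins written for illustration; they are not the Hadoop classes named in the log, and the actual null reference at RMCommunicator.java:219 has not been confirmed here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SafeStopSketch {

    /** Hypothetical stand-in for a component with a start/stop lifecycle. */
    public interface Service {
        void start();
        void stop();
    }

    /** Hypothetical communicator: its worker thread exists only after a successful start(). */
    public static class Communicator implements Service {
        private final boolean registrationFails;
        private volatile Thread worker; // stays null if start() throws early

        public Communicator(boolean registrationFails) {
            this.registrationFails = registrationFails;
        }

        @Override
        public void start() {
            if (registrationFails) {
                // Simulates registration being rejected before the worker is created.
                throw new IllegalStateException("registration failed");
            }
            worker = new Thread(() -> {
                try {
                    Thread.sleep(Long.MAX_VALUE);
                } catch (InterruptedException e) {
                    // Interrupted by stop(); exit quietly.
                }
            });
            worker.setDaemon(true);
            worker.start();
        }

        @Override
        public void stop() {
            Thread t = worker;
            if (t == null) {
                return; // start() never got far enough: nothing to tear down
            }
            t.interrupt();
        }
    }

    /** Stops services in reverse order and collects, rather than propagates, any failures. */
    public static List<Throwable> stopAll(List<? extends Service> services) {
        List<Throwable> errors = new ArrayList<>();
        for (int i = services.size() - 1; i >= 0; i--) {
            try {
                services.get(i).stop();
            } catch (RuntimeException e) {
                errors.add(e);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        Communicator c = new Communicator(true);
        try {
            c.start();
        } catch (IllegalStateException e) {
            System.out.println("start failed: " + e.getMessage());
        }
        System.out.println("errors while stopping: " + stopAll(Arrays.asList(c)).size());
    }
}
```

With the null check in place, stopping a component whose start() failed becomes a no-op instead of a second exception that obscures the original failure.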
