We are seeing erratic behavior where every so often the RM will shut down with
an UnknownHostException. The odd thing is, the host it complains about has been
in use for days at that point without a problem. Any ideas?
Thanks,
John
2014-03-13 14:38:14,746 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(578)) -
application_1394204725813_0220 State change from ACCEPTED to RUNNING
2014-03-13 14:38:15,794 FATAL resourcemanager.ResourceManager
(ResourceManager.java:run(449)) - Error in handling event type NODE_UPDATE to
the scheduler
java.lang.IllegalArgumentException: java.net.UnknownHostException: skitzo.office.datalever.com
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
    at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerToken(BuilderUtils.java:247)
    at org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.createContainerToken(RMContainerTokenSecretManager.java:195)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.createContainerToken(LeafQueue.java:1297)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1345)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1211)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1170)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:871)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:645)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:559)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:690)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:734)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:86)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:440)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.UnknownHostException: skitzo.office.datalever.com
    ... 15 more
2014-03-13 14:38:15,794 INFO resourcemanager.ResourceManager
(ResourceManager.java:run(453)) - Exiting, bbye..
2014-03-13 14:38:15,911 INFO mortbay.log (Slf4jLog.java:info(67)) - Stopped
[email protected]:8088
2014-03-13 14:38:16,013 ERROR delegation.AbstractDelegationTokenSecretManager
(AbstractDelegationTokenSecretManager.java:run(557)) - InterruptedExcpetion
recieved for ExpiredTokenRemover thread java.lang.InterruptedException: sleep
interrupted
2014-03-13 14:38:16,013 INFO impl.MetricsSystemImpl
(MetricsSystemImpl.java:stop(200)) - Stopping ResourceManager metrics system...
2014-03-13 14:38:16,014 INFO impl.MetricsSystemImpl
(MetricsSystemImpl.java:stop(206)) - ResourceManager metrics system stopped.
2014-03-13 14:38:16,014 INFO impl.MetricsSystemImpl
(MetricsSystemImpl.java:shutdown(572)) - ResourceManager metrics system
shutdown complete.
2014-03-13 14:38:16,015 WARN amlauncher.ApplicationMasterLauncher
(ApplicationMasterLauncher.java:run(98)) -
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher$LauncherThread
interrupted. Returning.
2014-03-13 14:38:16,015 INFO ipc.Server (Server.java:stop(2442)) - Stopping
server on 8141
2014-03-13 14:38:16,017 INFO ipc.Server (Server.java:stop(2442)) - Stopping
server on 8050
... and so on, until the RM has shut down completely.
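
In case it helps anyone reproduce this: as far as I can tell from the trace, the
failure comes from a hostname that could not be resolved at the moment the
container token was built, so a small probe run on the RM host might show
whether name resolution fails intermittently. This is only a sketch. The class
name ResolveProbe, its method names, the port value 0, and the attempt count
and pause are all made up for illustration and are not part of Hadoop. It
relies on the fact that constructing a java.net.InetSocketAddress from a
hostname attempts a lookup and marks the address as unresolved if that fails.

```java
import java.net.InetSocketAddress;

// Hypothetical diagnostic, not part of Hadoop: repeatedly resolves a hostname
// and counts how often the lookup fails.
public class ResolveProbe {

    // Returns true if the host resolves right now. Creating an
    // InetSocketAddress from a hostname attempts a lookup; on failure the
    // address is flagged as unresolved instead of throwing.
    static boolean resolves(String host, int port) {
        return !new InetSocketAddress(host, port).isUnresolved();
    }

    // Tries the lookup `attempts` times, pausing between tries, and returns
    // the number of failed lookups.
    static int countFailures(String host, int port, int attempts, long pauseMillis)
            throws InterruptedException {
        int failures = 0;
        for (int i = 0; i < attempts; i++) {
            if (!resolves(host, port)) {
                failures++;
                System.err.println("attempt " + i + ": could not resolve " + host);
            }
            Thread.sleep(pauseMillis);
        }
        return failures;
    }

    public static void main(String[] args) throws InterruptedException {
        // Host to probe; defaults to localhost when no argument is given.
        String host = args.length > 0 ? args[0] : "localhost";
        // The port plays no role in name resolution, so 0 is a placeholder.
        int failures = countFailures(host, 0, 10, 1000L);
        System.out.println(failures + " of 10 lookups failed for " + host);
    }
}
```

Running it with the NodeManager hostname as the only argument prints a summary
of failed lookups. Note that the JVM caches lookup results for a while, so the
pause between attempts probably needs to be longer than that cache lifetime for
the probe to see fresh lookups.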