-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49884/
-----------------------------------------------------------

Review request for Ambari and Myroslav Papirkovskyy.


Bugs: AMBARI-17646
    https://issues.apache.org/jira/browse/AMBARI-17646


Repository: ambari


Description
-------

NodeManager is down on one of the nodes after installation. This has affected
most of the splits in today's run (ambari-2.4.0.0-817).  
In a 3-node cluster, NodeManager was found to be down on one node while it is
running on the other two.  
A live cluster is available at <https://172.22.66.85:8443/> and will stay up for
another 24 hours.

The following error appears in nodemanager.log:

2016-07-10 04:40:59,678 INFO recovery.NMLeveldbStateStoreService (NMLeveldbStateStoreService.java:checkVersion(1022)) - Loaded NM state version info 1.0  
2016-07-10 04:40:59,889 WARN nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:init(195)) - Exit code from container executor initialization is : 24  
ExitCodeException exitCode=24: File /etc/hadoop/2.4.2.0-258/0 must be owned by root, but is owned by 2530

at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)  
at org.apache.hadoop.util.Shell.run(Shell.java:487)  
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)  
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:192)  
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:236)  
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)  
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:547)  
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:595)  
2016-07-10 04:40:59,893 INFO nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(322)) -  
2016-07-10 04:40:59,893 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service NodeManager failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor  
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor  
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:238)  
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)  
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:547)  
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:595)  
Caused by: java.io.IOException: Linux container executor not configured properly (error=24)  
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:198)  
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:236)  
... 3 more  
Caused by: ExitCodeException exitCode=24: File /etc/hadoop/2.4.2.0-258/0 must be owned by root, but is owned by 2530

at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)  
at org.apache.hadoop.util.Shell.run(Shell.java:487)  
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)  
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:192)  
... 4 more  
2016-07-10 04:40:59,895 FATAL nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(550)) - Error starting NodeManager  
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor  
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:238)  
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)  
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:547)  
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:595)  
Caused by: java.io.IOException: Linux container executor not configured properly (error=24)  
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:198)  
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:236)  
... 3 more  
Caused by: ExitCodeException exitCode=24: File /etc/hadoop/2.4.2.0-258/0 must be owned by root, but is owned by 2530

at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)  
at org.apache.hadoop.util.Shell.run(Shell.java:487)  
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)  
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:192)  
... 4 more  
2016-07-10 04:40:59,898 INFO nodemanager.NodeManager (LogAdapter.java:info(45)) - SHUTDOWN_MSG:  
/************************************************************  
SHUTDOWN_MSG: Shutting down NodeManager at nat-d7-xals-ambarieu-newamb-242-1-1/172.22.66.62
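
To help reproduce the ownership problem reported in the log, here is a minimal Python sketch that lists a path and any of its parent directories not owned by a given uid (root by default). It is only an illustration: the function name `find_wrongly_owned` and its `expected_uid` parameter are hypothetical, and the assumption that parent directories matter as well as the file itself is mine. It is not the container-executor check and not part of the attached patch.

```python
import os


def find_wrongly_owned(path, expected_uid=0):
    """Return (path, uid) pairs for `path` and each of its parent
    directories whose owner differs from `expected_uid`.

    Hypothetical helper for diagnosis only; paths that cannot be
    stat'ed (missing or unreadable) are skipped.
    """
    offenders = []
    current = os.path.abspath(path)
    while True:
        try:
            uid = os.stat(current).st_uid
        except OSError:
            uid = None  # missing or unreadable, nothing to report
        if uid is not None and uid != expected_uid:
            offenders.append((current, uid))
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return offenders


if __name__ == "__main__":
    # Path taken from the error message in nodemanager.log.
    for bad_path, owner in find_wrongly_owned("/etc/hadoop/2.4.2.0-258/0"):
        print("%s is owned by uid %d" % (bad_path, owner))
```

Run on the affected node, this would be expected to report `/etc/hadoop/2.4.2.0-258/0` with uid 2530, matching the message in the log.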


Diffs
-----

  
  ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/scripts/yarn.py ed5d85b 
  ambari-server/src/test/python/stacks/2.0.6/YARN/test_resourcemanager.py 5ecab1e 

Diff: https://reviews.apache.org/r/49884/diff/


Testing
-------

mvn clean test


Thanks,

Andrew Onischuk
