ChanaLii opened a new issue #198: Error when restarting Hadoop services 
after finishing the GPU settings for RM, NM, and container-executor.cfg
URL: https://github.com/apache/submarine/issues/198
 
 
   I am following the documentation to set up GPU support for the 
ResourceManager, NodeManager, and container-executor.cfg in my environment.
   I then restarted Hadoop with the following commands:
   `
   YARN_LOGFILE=resourcemanager.log ./sbin/yarn-daemon.sh start resourcemanager
   YARN_LOGFILE=nodemanager.log ./sbin/yarn-daemon.sh start nodemanager
   YARN_LOGFILE=timeline.log ./sbin/yarn-daemon.sh start timelineserver
   YARN_LOGFILE=mr-historyserver.log ./sbin/mr-jobhistory-daemon.sh start historyserver
   `
   
   I ran **jps** to check whether the services were running and found that 
the NodeManager had not started. The log file 
hadoop-root-nodemanager-71192c388b55.log contains the following errors:
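
   The manual jps check can also be scripted. The following is a minimal 
sketch, assuming the four daemons started above show up in jps output under 
the class names ResourceManager, NodeManager, ApplicationHistoryServer, and 
JobHistoryServer. The function name missing_daemons is made up for 
illustration and is not part of Hadoop.

```shell
#!/bin/sh
# Hypothetical helper: reads `jps` output on stdin and prints every expected
# daemon that is NOT listed. The expected names are an assumption based on
# the four services started above.
missing_daemons() {
  jps_output="$(cat)"
  for name in ResourceManager NodeManager ApplicationHistoryServer JobHistoryServer; do
    # jps prints "<pid> <main class>", so compare against column 2.
    if ! printf '%s\n' "$jps_output" |
        awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      printf '%s\n' "$name"
    fi
  done
}

# Usage (on the Hadoop host):
#   jps | missing_daemons
```

   An empty result means all four daemons were found. In the situation 
described here, the expected output would be the single line NodeManager.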
   
   `2020-03-01 09:52:38,744 ERROR org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler: Failed to bootstrap configured resource subsystems!
   org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerException: Controller devices not mounted. You either need to mount it with yarn.nodemanager.linux-container-executor.cgroups.mount or mount cgroups before launching Yarn
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.initializePreMountedCGroupController(CGroupsHandlerImpl.java:392)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.initializeCGroupController(CGroupsHandlerImpl.java:370)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.gpu.GpuResourceHandlerImpl.bootstrap(GpuResourceHandlerImpl.java:93)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerChain.bootstrap(ResourceHandlerChain.java:58)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler.serviceInit(ContainerScheduler.java:146)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:323)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:516)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:974)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1054)
   2020-03-01 09:52:38,744 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler failed in state INITED
   java.io.IOException: Failed to bootstrap configured resource subsystems!
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler.serviceInit(ContainerScheduler.java:150)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:323)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:516)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:974)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1054)
   2020-03-01 09:52:38,745 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl failed in state INITED
   org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed to bootstrap configured resource subsystems!
        at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:173)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:323)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:516)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:974)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1054)
   Caused by: java.io.IOException: Failed to bootstrap configured resource subsystems!
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler.serviceInit(ContainerScheduler.java:150)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        ... 8 more
   `
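
   The key message in the log is "Controller devices not mounted". A quick 
way to confirm this from inside the environment is to look for the devices 
controller in the mount table. The following is a minimal sketch, assuming a 
cgroup v1 layout in which each controller appears as a mount option on a 
mount of type cgroup. The function name cgroup_controller_mounted and its 
optional second argument are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if a cgroup v1 controller such as
# "devices" is listed in a mounts table. The second argument defaults to
# /proc/mounts; it is only there so the check can be tried on a sample file.
# Assumption: cgroup v1. Mounts of type "cgroup2" are not matched.
cgroup_controller_mounted() {
  controller="$1"
  mounts_file="${2:-/proc/mounts}"
  awk -v c="$controller" '
    $3 == "cgroup" {
      # Field 4 holds the comma-separated mount options, which include
      # the controller names for cgroup v1 mounts.
      n = split($4, opts, ",")
      for (i = 1; i <= n; i++) if (opts[i] == c) found = 1
    }
    END { exit !found }
  ' "$mounts_file"
}

# Usage:
#   cgroup_controller_mounted devices || echo "devices controller not mounted"
```

   If the check fails inside the container but succeeds on the host, that 
would point at how the container was started rather than at the YARN 
configuration.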
   
   It seems that /sys/fs/cgroup is not mounted in the environment. Here is 
the command I used to start the Docker container:
   `
   ➜  Downloads docker run -it -v /data/docker-images/:/sys/fs/cgroup -m 10G 968d612886ee bash
   `
   Can somebody help me?
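
   One detail in the docker command may be relevant: the option 
-v /data/docker-images/:/sys/fs/cgroup binds the host directory 
/data/docker-images/ over the container's /sys/fs/cgroup, so the container 
sees that directory instead of the host's cgroup hierarchy. The following is 
a minimal sketch that classifies a -v specification along those lines. The 
function name check_cgroup_bind and its three result words are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: given a `docker run -v` spec (host:container[:opts]),
# print one of:
#   unrelated - the spec does not target /sys/fs/cgroup
#   ok        - the host's /sys/fs/cgroup is bound to /sys/fs/cgroup
#   shadowed  - some other host path is bound over /sys/fs/cgroup
check_cgroup_bind() {
  spec="$1"
  host_path="${spec%%:*}"        # text before the first ':'
  rest="${spec#*:}"              # text after the first ':'
  container_path="${rest%%:*}"   # drop optional ":ro" / ":rw" options
  # Strip one trailing slash so "/sys/fs/cgroup/" equals "/sys/fs/cgroup".
  host_path="${host_path%/}"
  container_path="${container_path%/}"
  if [ "$container_path" != "/sys/fs/cgroup" ]; then
    echo "unrelated"
  elif [ "$host_path" = "/sys/fs/cgroup" ]; then
    echo "ok"
  else
    echo "shadowed"
  fi
}

# Usage with the spec from the command above:
#   check_cgroup_bind "/data/docker-images/:/sys/fs/cgroup"
```

   For a shadowed result, a variant of the original command that uses 
-v /sys/fs/cgroup:/sys/fs/cgroup:ro may be worth trying. Whether that alone 
is enough for the NodeManager, or whether the container also needs extra 
privileges, is not something this sketch establishes.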
