Wilfred Spiegelenburg created YARN-7585:
-------------------------------------------

             Summary: NodeManager should go unhealthy when state store throws DBException
                 Key: YARN-7585
                 URL: https://issues.apache.org/jira/browse/YARN-7585
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager
            Reporter: Wilfred Spiegelenburg
            Assignee: Wilfred Spiegelenburg


If work-preserving recovery is enabled, the NM will not start up if the state 
store fails to initialise. However, if the state store becomes unavailable 
after startup for any reason, the NM does not go unhealthy. 
Since the state store is not available, new containers cannot be started any 
more, so the NM should become unhealthy:
{code}
AMLauncher: Error launching appattempt_1508806289867_268617_000001. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: java.io.IOException: org.iq80.leveldb.DBException: IO error: /dsk/app/var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/028269.log: Read-only file system
at o.a.h.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
at o.a.h.y.s.n.cm.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:721)
...
Caused by: java.io.IOException: org.iq80.leveldb.DBException: IO error: /dsk/app/var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/028269.log: Read-only file system
at o.a.h.y.s.n.r.NMLeveldbStateStoreService.storeApplication(NMLeveldbStateStoreService.java:374)
at o.a.h.y.s.n.cm.ContainerManagerImpl.startContainerInternal(ContainerManagerImpl.java:848)
at o.a.h.y.s.n.cm.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:712)
{code}
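
One possible shape for the requested behaviour is sketched below. It is only 
an illustration, not a proposed patch: StateStore, StoreHealth and 
MonitoredStateStore are hypothetical names invented for this example and are 
not the NodeManager's real classes or APIs. The idea is that a wrapper around 
the state store records the first write failure, and a health check can then 
report the node as unhealthy.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

public class StateStoreHealthSketch {

    /** Hypothetical stand-in for the NM state store (not the real Hadoop interface). */
    interface StateStore {
        void storeApplication(String appId) throws IOException;
    }

    /** Hypothetical health holder: healthy until a store failure is recorded. */
    static final class StoreHealth {
        private final AtomicReference<String> failure = new AtomicReference<>();

        // Keep only the first failure so the original cause is not overwritten.
        void reportFailure(Exception e) {
            failure.compareAndSet(null, String.valueOf(e.getMessage()));
        }

        boolean isHealthy() {
            return failure.get() == null;
        }

        String getReport() {
            String f = failure.get();
            return f == null ? "" : "State store failure: " + f;
        }
    }

    /** Wraps a store, records any failure, then rethrows it unchanged. */
    static final class MonitoredStateStore implements StateStore {
        private final StateStore delegate;
        private final StoreHealth health;

        MonitoredStateStore(StateStore delegate, StoreHealth health) {
            this.delegate = delegate;
            this.health = health;
        }

        @Override
        public void storeApplication(String appId) throws IOException {
            try {
                delegate.storeApplication(appId);
            } catch (IOException | RuntimeException e) {
                health.reportFailure(e);
                throw e;
            }
        }
    }

    public static void main(String[] args) {
        StoreHealth health = new StoreHealth();
        // Simulated store on a file system that has gone read-only.
        StateStore failing = appId -> {
            throw new IOException("Read-only file system");
        };
        StateStore store = new MonitoredStateStore(failing, health);

        try {
            store.storeApplication("application_1");
        } catch (IOException e) {
            // The container start still fails, as in the trace above.
        }
        System.out.println("healthy=" + health.isHealthy());
        System.out.println(health.getReport());
    }
}
```

In this sketch the exception is still propagated to the caller, so the 
failing container start behaves as it does today; the only addition is that 
the failure is remembered. How such a flag would be wired into the NM's 
actual health reporting is left open here.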


