Bharat Viswanadham created HDDS-3066:
----------------------------------------

             Summary: SCM crash during loading containers to DB
                 Key: HDDS-3066
                 URL: https://issues.apache.org/jira/browse/HDDS-3066
             Project: Hadoop Distributed Data Store
          Issue Type: New Feature
            Reporter: Bharat Viswanadham
            Assignee: Bharat Viswanadham


This happens when the pipeline scrubber closes a pipeline, removes it from the 
DB, and triggers close-container commands to move its containers to CLOSING. 
If SCM is restarted before a close-container command has been handled and the 
container state has changed to CLOSING, the issue below can occur.

 

The same can happen in other scenarios, for example when safeModeHandler calls 
finalizeAndDestroyPipeline and SCM is then restarted.

 

The root cause is that the pipeline has been removed from the DB while the 
container is still in the OPEN state. When SCM tries to look up that pipeline 
while loading containers at startup, it crashes with a 
{{PipelineNotFoundException}}.

{code:java}
2020-02-21 13:57:34,888 [main] ERROR org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter: SCM start failed with exception
org.apache.hadoop.hdds.scm.pipeline.PipelineNotFoundException: PipelineID=35dff62d-9bfa-449b-b6e8-6f00cc8c1b6e not found
    at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.getPipeline(PipelineStateMap.java:133)
    at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.addContainerToPipeline(PipelineStateMap.java:110)
    at org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager.addContainerToPipeline(PipelineStateManager.java:59)
    at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.addContainerToPipeline(SCMPipelineManager.java:309)
    at org.apache.hadoop.hdds.scm.container.SCMContainerManager.loadExistingContainers(SCMContainerManager.java:121)
    at org.apache.hadoop.hdds.scm.container.SCMContainerManager.<init>(SCMContainerManager.java:107)
    at org.apache.hadoop.hdds.scm.server.StorageContainerManager.initializeSystemManagers(StorageContainerManager.java:412)
    at org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:283)
    at org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:215)
    at org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:612)
    at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter$SCMStarterHelper.start(StorageContainerManagerStarter.java:142)
    at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.startScm(StorageContainerManagerStarter.java:117)
    at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.call(StorageContainerManagerStarter.java:66)
    at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.call(StorageContainerManagerStarter.java:42)
    at picocli.CommandLine.execute(CommandLine.java:1173)
    at picocli.CommandLine.access$800(CommandLine.java:141)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:1367)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:1335)
    at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:1243)
    at picocli.CommandLine.parseWithHandlers(CommandLine.java:1526)
    at picocli.CommandLine.parseWithHandler(CommandLine.java:1465)
    at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:65)
    at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:56)
    at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.main(StorageContainerManagerStarter.java:55)
2020-02-21 13:57:34,892 [shutdown-hook-0] INFO org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down StorageContainerManager at om-ha-1.vpc.cloudera.com/10.65.51.49
************************************************************/
{code}
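
One possible way to avoid the crash, sketched here only as an illustration and not as the actual patch: when loading containers at startup, treat a missing pipeline for an OPEN container as a recoverable condition and move the container to CLOSING instead of letting the exception propagate. All class and method names below ({{ContainerLoadSketch}}, {{PipelineMap}}, {{PipelineNotFound}}, {{loadExistingContainers}}) are hypothetical stand-ins and are not the real SCM classes or signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of the startup path; not the real SCM code.
final class ContainerLoadSketch {

    enum State { OPEN, CLOSING }

    static final class Container {
        final long id;
        final String pipelineId;
        State state;

        Container(long id, String pipelineId, State state) {
            this.id = id;
            this.pipelineId = pipelineId;
            this.state = state;
        }
    }

    // Stand-in for PipelineNotFoundException.
    static final class PipelineNotFound extends Exception {
        PipelineNotFound(String pipelineId) {
            super("PipelineID=" + pipelineId + " not found");
        }
    }

    // Stand-in for the in-memory pipeline-to-containers map.
    static final class PipelineMap {
        private final Map<String, Set<Long>> containersByPipeline = new HashMap<>();

        void addPipeline(String pipelineId) {
            containersByPipeline.putIfAbsent(pipelineId, new HashSet<>());
        }

        void addContainer(String pipelineId, long containerId) throws PipelineNotFound {
            Set<Long> containers = containersByPipeline.get(pipelineId);
            if (containers == null) {
                throw new PipelineNotFound(pipelineId);
            }
            containers.add(containerId);
        }
    }

    // Re-attaches OPEN containers to their pipelines. A container whose pipeline
    // no longer exists is moved to CLOSING rather than aborting startup.
    // Returns the ids of the containers that were moved to CLOSING.
    static List<Long> loadExistingContainers(List<Container> persisted, PipelineMap pipelines) {
        List<Long> movedToClosing = new ArrayList<>();
        for (Container container : persisted) {
            if (container.state != State.OPEN) {
                continue; // only OPEN containers are attached to a pipeline
            }
            try {
                pipelines.addContainer(container.pipelineId, container.id);
            } catch (PipelineNotFound e) {
                // Pipeline was removed before the close command was handled.
                container.state = State.CLOSING;
                movedToClosing.add(container.id);
            }
        }
        return movedToClosing;
    }

    public static void main(String[] args) {
        PipelineMap pipelines = new PipelineMap();
        pipelines.addPipeline("p1"); // the pipeline "scrubbed" was removed before restart

        List<Container> persisted = Arrays.asList(
            new Container(1L, "p1", State.OPEN),
            new Container(2L, "scrubbed", State.OPEN));

        List<Long> moved = loadExistingContainers(persisted, pipelines);
        System.out.println("Moved to CLOSING: " + moved);
    }
}
```

In this sketch, container 2 references a pipeline that is no longer present, so it ends up in CLOSING while container 1 is attached normally. Whether CLOSING is the right target state, and whether a close command must be re-sent to datanodes, is left open here.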



