[ 
https://issues.apache.org/jira/browse/HDDS-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-3066:
-------------------------------------
    Status: Patch Available  (was: Open)

> SCM startup failed  during loading containers from DB
> -----------------------------------------------------
>
>                 Key: HDDS-3066
>                 URL: https://issues.apache.org/jira/browse/HDDS-3066
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Bharat Viswanadham
>            Assignee: Bharat Viswanadham
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This happens because the pipeline scrubber closed a pipeline, removed it 
> from the DB, and triggered close-container commands to move its containers 
> to CLOSING. If SCM is restarted before the close-container command is 
> handled and the container state changes to CLOSING, the issue below can occur.
>  
> The same can happen in other scenarios, for example when safeModeHandler 
> calls finalizeAndDestroyPipeline and SCM is then restarted. 
>  
> The root cause is that the pipeline has been removed from the DB while the 
> container is still in the OPEN state. When SCM loads that container at 
> startup and looks up its pipeline, it crashes with a 
> {{PipelineNotFoundException}}:
> {code:java}
> 2020-02-21 13:57:34,888 [main] ERROR org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter: SCM start failed with exception
> org.apache.hadoop.hdds.scm.pipeline.PipelineNotFoundException: PipelineID=35dff62d-9bfa-449b-b6e8-6f00cc8c1b6e not found
>     at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.getPipeline(PipelineStateMap.java:133)
>     at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.addContainerToPipeline(PipelineStateMap.java:110)
>     at org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager.addContainerToPipeline(PipelineStateManager.java:59)
>     at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.addContainerToPipeline(SCMPipelineManager.java:309)
>     at org.apache.hadoop.hdds.scm.container.SCMContainerManager.loadExistingContainers(SCMContainerManager.java:121)
>     at org.apache.hadoop.hdds.scm.container.SCMContainerManager.<init>(SCMContainerManager.java:107)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManager.initializeSystemManagers(StorageContainerManager.java:412)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:283)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:215)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:612)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter$SCMStarterHelper.start(StorageContainerManagerStarter.java:142)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.startScm(StorageContainerManagerStarter.java:117)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.call(StorageContainerManagerStarter.java:66)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.call(StorageContainerManagerStarter.java:42)
>     at picocli.CommandLine.execute(CommandLine.java:1173)
>     at picocli.CommandLine.access$800(CommandLine.java:141)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:1367)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:1335)
>     at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:1243)
>     at picocli.CommandLine.parseWithHandlers(CommandLine.java:1526)
>     at picocli.CommandLine.parseWithHandler(CommandLine.java:1465)
>     at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:65)
>     at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:56)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.main(StorageContainerManagerStarter.java:55)
> 2020-02-21 13:57:34,892 [shutdown-hook-0] INFO org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down StorageContainerManager at om-ha-1.vpc.cloudera.com/10.65.51.49
> ************************************************************/
> {code}
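
The failure mode described above (an OPEN container whose pipeline no longer exists in the DB) can be illustrated with a small, self-contained sketch. This is not the patch attached to HDDS-3066 and it does not use the real Ozone classes: PipelineMap, Container, State, and loadExistingContainers below are hypothetical, simplified stand-ins. The sketch assumes one plausible way to tolerate the condition, namely catching the not-found error during container loading and moving the orphaned container to CLOSING instead of aborting startup.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of the scenario; not the real Ozone code.
class TolerantContainerLoadSketch {

  enum State { OPEN, CLOSING, CLOSED }

  // Stand-in for the exception seen in the stack trace.
  static class PipelineNotFoundException extends Exception {
    PipelineNotFoundException(String message) {
      super(message);
    }
  }

  // Stand-in for the in-memory pipeline state: pipeline ID -> container IDs.
  static class PipelineMap {
    private final Map<String, Set<Long>> pipelines = new HashMap<>();

    void addPipeline(String pipelineId) {
      pipelines.put(pipelineId, new HashSet<>());
    }

    void addContainerToPipeline(String pipelineId, long containerId)
        throws PipelineNotFoundException {
      Set<Long> containers = pipelines.get(pipelineId);
      if (containers == null) {
        throw new PipelineNotFoundException(
            "PipelineID=" + pipelineId + " not found");
      }
      containers.add(containerId);
    }

    boolean contains(String pipelineId, long containerId) {
      Set<Long> containers = pipelines.get(pipelineId);
      return containers != null && containers.contains(containerId);
    }
  }

  // Stand-in for a container record read back from the DB.
  static class Container {
    final long id;
    final String pipelineId;
    State state;

    Container(long id, String pipelineId, State state) {
      this.id = id;
      this.pipelineId = pipelineId;
      this.state = state;
    }
  }

  // Re-attaches OPEN containers to their pipelines. An OPEN container whose
  // pipeline is missing is moved to CLOSING (an assumption of this sketch)
  // rather than letting the exception abort startup.
  // Returns the number of such orphaned containers.
  static int loadExistingContainers(List<Container> fromDb, PipelineMap map) {
    int orphaned = 0;
    for (Container c : fromDb) {
      if (c.state != State.OPEN) {
        continue; // only OPEN containers are attached to a pipeline here
      }
      try {
        map.addContainerToPipeline(c.pipelineId, c.id);
      } catch (PipelineNotFoundException e) {
        c.state = State.CLOSING;
        orphaned++;
      }
    }
    return orphaned;
  }

  public static void main(String[] args) {
    PipelineMap map = new PipelineMap();
    map.addPipeline("pipeline-a"); // "pipeline-b" was scrubbed before restart

    List<Container> fromDb = Arrays.asList(
        new Container(1L, "pipeline-a", State.OPEN),
        new Container(2L, "pipeline-b", State.OPEN));

    int orphaned = loadExistingContainers(fromDb, map);
    System.out.println("orphaned containers: " + orphaned);
    System.out.println("container 2 state: " + fromDb.get(1).state);
  }
}
```

Without the try/catch in loadExistingContainers, the exception for the second container would propagate out of the loading loop, which corresponds to the startup crash in the log above.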



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
