[ 
https://issues.apache.org/jira/browse/HDDS-5284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee resolved HDDS-5284.
---------------------------------------
    Resolution: Fixed

> [SCM-HA] SCM start failed with PipelineNotFoundException
> --------------------------------------------------------
>
>                 Key: HDDS-5284
>                 URL: https://issues.apache.org/jira/browse/HDDS-5284
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: SCM HA
>            Reporter: Nilotpal Nandi
>            Assignee: Shashikant Banerjee
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.2.0
>
>
> {code:java}
> scm.log 
> 2021-05-27 09:55:42,189 INFO 
> org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Sending 
> CreatePipelineCommand for 
> pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 to 
> datanode:028fed4a-0087-4b70-b6e3-11f18d739094
> 2021-05-27 09:55:42,189 INFO 
> org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Sending 
> CreatePipelineCommand for 
> pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 to 
> datanode:a4b76016-dc24-47f2-a3ff-03c309fdcf9b
> 2021-05-27 09:55:42,189 INFO 
> org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Sending 
> CreatePipelineCommand for 
> pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 to 
> datanode:ed9d4872-166d-41c6-96ab-437a44e4168b
> 2021-05-27 09:55:42,199 INFO 
> org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager: Created pipeline 
> Pipeline[ Id: 875b2073-4034-4374-bba6-39011294a280, Nodes: 
> 028fed4a-0087-4b70-b6e3-11f18d739094{ip: 172.27.167.6, host: 
> quasar-wudsvy-6.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886, 
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], 
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE, 
> persistedOpStateExpiryEpochSec: 0}a4b76016-dc24-47f2-a3ff-03c309fdcf9b{ip: 
> 172.27.12.201, host: quasar-wudsvy-4.quasar-wudsvy.root.hwx.site, ports: 
> [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, 
> STANDALONE=9859], networkLocation: /default, certSerialId: null, 
> persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 
> 0}ed9d4872-166d-41c6-96ab-437a44e4168b{ip: 172.27.74.4, host: 
> quasar-wudsvy-1.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886, 
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], 
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE, 
> persistedOpStateExpiryEpochSec: 0}, ReplicationConfig: RATIS/THREE, 
> State:ALLOCATED, leaderId:, CreationTimestamp2021-05-27T09:55:42.189Z].
> 2021-05-27 09:55:54,426 INFO 
> org.apache.hadoop.hdds.scm.pipeline.PipelineManagerV2Impl: Pipeline Pipeline[ 
> Id: 875b2073-4034-4374-bba6-39011294a280, Nodes: 
> 028fed4a-0087-4b70-b6e3-11f18d739094{ip: 172.27.167.6, host: 
> quasar-wudsvy-6.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886, 
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], 
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE, 
> persistedOpStateExpiryEpochSec: 0}a4b76016-dc24-47f2-a3ff-03c309fdcf9b{ip: 
> 172.27.12.201, host: quasar-wudsvy-4.quasar-wudsvy.root.hwx.site, ports: 
> [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, 
> STANDALONE=9859], networkLocation: /default, certSerialId: null, 
> persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 
> 0}ed9d4872-166d-41c6-96ab-437a44e4168b{ip: 172.27.74.4, host: 
> quasar-wudsvy-1.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886, 
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], 
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE, 
> persistedOpStateExpiryEpochSec: 0}, ReplicationConfig: RATIS/THREE, 
> State:ALLOCATED, leaderId:028fed4a-0087-4b70-b6e3-11f18d739094, 
> CreationTimestamp2021-05-27T09:55:42.189Z] moved to OPEN state
> 2021-05-27 10:06:45,920 INFO 
> org.apache.hadoop.hdds.scm.node.StaleNodeHandler: Datanode 
> 028fed4a-0087-4b70-b6e3-11f18d739094{ip: 172.27.167.6, host: 
> quasar-wudsvy-6.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886, 
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], 
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE, 
> persistedOpStateExpiryEpochSec: 0} moved to stale state. Finalizing its 
> pipelines [PipelineID=cd4a2a77-9715-4437-8d1d-3618a2c93103, 
> PipelineID=ca6100b9-b42c-4b77-bef5-35a9b1e725f2, 
> PipelineID=875b2073-4034-4374-bba6-39011294a280]
> 2021-05-27 10:06:45,932 INFO 
> org.apache.hadoop.hdds.scm.pipeline.PipelineManagerV2Impl: Pipeline Pipeline[ 
> Id: 875b2073-4034-4374-bba6-39011294a280, Nodes: 
> 028fed4a-0087-4b70-b6e3-11f18d739094{ip: 172.27.167.6, host: 
> quasar-wudsvy-6.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886, 
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], 
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE, 
> persistedOpStateExpiryEpochSec: 0}a4b76016-dc24-47f2-a3ff-03c309fdcf9b{ip: 
> 172.27.12.201, host: quasar-wudsvy-4.quasar-wudsvy.root.hwx.site, ports: 
> [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, 
> STANDALONE=9859], networkLocation: /default, certSerialId: null, 
> persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 
> 0}ed9d4872-166d-41c6-96ab-437a44e4168b{ip: 172.27.74.4, host: 
> quasar-wudsvy-1.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886, 
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], 
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE, 
> persistedOpStateExpiryEpochSec: 0}, ReplicationConfig: RATIS/THREE, 
> State:DORMANT, leaderId:a4b76016-dc24-47f2-a3ff-03c309fdcf9b, 
> CreationTimestamp2021-05-27T09:55:42.189Z] moved to CLOSED state
> 2021-05-27 10:06:57,921 INFO 
> org.apache.hadoop.hdds.scm.node.StaleNodeHandler: Datanode 
> a4b76016-dc24-47f2-a3ff-03c309fdcf9b{ip: 172.27.12.201, host: 
> quasar-wudsvy-4.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886, 
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], 
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE, 
> persistedOpStateExpiryEpochSec: 0} moved to stale state. Finalizing its 
> pipelines [PipelineID=cd4a2a77-9715-4437-8d1d-3618a2c93103, 
> PipelineID=875b2073-4034-4374-bba6-39011294a280, 
> PipelineID=2878c722-84dc-40f9-b1c1-46ed0f8bcdd7]
> 2021-05-27 10:07:41,073 INFO 
> org.apache.hadoop.hdds.scm.pipeline.PipelineManagerV2Impl: Scrubbing 
> pipeline: id: PipelineID=875b2073-4034-4374-bba6-39011294a280 since it stays 
> at CLOSED stage.
> 2021-05-27 10:07:41,073 INFO 
> org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Send 
> pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 close command to 
> datanode 028fed4a-0087-4b70-b6e3-11f18d739094
> 2021-05-27 10:07:41,073 INFO 
> org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Send 
> pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 close command to 
> datanode a4b76016-dc24-47f2-a3ff-03c309fdcf9b
> 2021-05-27 10:07:41,073 INFO 
> org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Send 
> pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 close command to 
> datanode ed9d4872-166d-41c6-96ab-437a44e4168b
> 2021-05-27 10:07:41,075 INFO 
> org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager: Pipeline Pipeline[ 
> Id: 875b2073-4034-4374-bba6-39011294a280, Nodes: 
> 028fed4a-0087-4b70-b6e3-11f18d739094{ip: 172.27.167.6, host: 
> quasar-wudsvy-6.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886, 
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], 
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE, 
> persistedOpStateExpiryEpochSec: 0}a4b76016-dc24-47f2-a3ff-03c309fdcf9b{ip: 
> 172.27.12.201, host: quasar-wudsvy-4.quasar-wudsvy.root.hwx.site, ports: 
> [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, 
> STANDALONE=9859], networkLocation: /default, certSerialId: null, 
> persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 
> 0}ed9d4872-166d-41c6-96ab-437a44e4168b{ip: 172.27.74.4, host: 
> quasar-wudsvy-1.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886, 
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], 
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE, 
> persistedOpStateExpiryEpochSec: 0}, ReplicationConfig: RATIS/THREE, 
> State:CLOSED, leaderId:a4b76016-dc24-47f2-a3ff-03c309fdcf9b, 
> CreationTimestamp2021-05-27T09:55:42.189Z] removed.
> {code}
> The logs show that a pipeline was created and moved to the OPEN state. One 
> of its datanodes then went stale, so the pipeline moved to the CLOSED state. 
> The pipeline scrubber then scrubbed the pipeline and deleted it. 
> {code:java}
> 2021-05-27 10:07:41,073 INFO 
> org.apache.hadoop.hdds.scm.pipeline.PipelineManagerV2Impl: Scrubbing 
> pipeline: id: PipelineID=875b2073-4034-4374-bba6-39011294a280 since it stays 
> at CLOSED stage.{code}
> The next update that tries to move the pipeline to the CLOSED state, 
> triggered by a report from the other datanodes in the pipeline, fails because 
> the pipeline has already been removed from SCM memory and the DB, and SCM 
> therefore terminates.
> The solution would be to ignore PipelineNotFoundException in 
> PipelineStateManagerV2Impl#updatePipelineState.
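
The proposed fix can be illustrated with a minimal, self-contained sketch. This is not the actual Ozone code or the committed patch: SketchPipelineStateManager, its methods, the simplified PipelineState enum and the local PipelineNotFoundException class are all hypothetical stand-ins, assumed only to show the idea of tolerating a state update for a pipeline that the scrubber has already removed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for the exception named in the issue (assumption).
class PipelineNotFoundException extends Exception {
  PipelineNotFoundException(String message) {
    super(message);
  }
}

// Simplified set of pipeline states, for illustration only.
enum PipelineState { ALLOCATED, OPEN, DORMANT, CLOSED }

// Illustrative state manager; NOT the real PipelineStateManagerV2Impl.
class SketchPipelineStateManager {
  private final Map<UUID, PipelineState> pipelines = new HashMap<>();

  synchronized void addPipeline(UUID id) {
    pipelines.put(id, PipelineState.ALLOCATED);
  }

  // Models the scrubber removing a pipeline that stayed CLOSED.
  synchronized void removePipeline(UUID id) {
    pipelines.remove(id);
  }

  private PipelineState getState(UUID id) throws PipelineNotFoundException {
    PipelineState state = pipelines.get(id);
    if (state == null) {
      throw new PipelineNotFoundException("PipelineID=" + id + " not found");
    }
    return state;
  }

  // Returns true if the update was applied, false if the pipeline was
  // already gone. The missing pipeline is logged and ignored instead of
  // being propagated as a fatal error.
  synchronized boolean updatePipelineState(UUID id, PipelineState newState) {
    try {
      getState(id); // throws if the pipeline was already removed
      pipelines.put(id, newState);
      return true;
    } catch (PipelineNotFoundException e) {
      System.out.println("Ignoring state update: " + e.getMessage());
      return false;
    }
  }
}
```

With this sketch, replaying the sequence from the logs (create, open, close, remove, then a late close request) makes the last call return false rather than throw, which is the behaviour the proposed solution asks for.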



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
