[
https://issues.apache.org/jira/browse/HDDS-5284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shashikant Banerjee resolved HDDS-5284.
---------------------------------------
Resolution: Fixed
> [SCM-HA] SCM start failed with PipelineNotFoundException
> --------------------------------------------------------
>
> Key: HDDS-5284
> URL: https://issues.apache.org/jira/browse/HDDS-5284
> Project: Apache Ozone
> Issue Type: Sub-task
> Components: SCM HA
> Reporter: Nilotpal Nandi
> Assignee: Shashikant Banerjee
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.2.0
>
>
> {code:java}
> scm.log
> 2021-05-27 09:55:42,189 INFO
> org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Sending
> CreatePipelineCommand for
> pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 to
> datanode:028fed4a-0087-4b70-b6e3-11f18d739094
> 2021-05-27 09:55:42,189 INFO
> org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Sending
> CreatePipelineCommand for
> pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 to
> datanode:a4b76016-dc24-47f2-a3ff-03c309fdcf9b
> 2021-05-27 09:55:42,189 INFO
> org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Sending
> CreatePipelineCommand for
> pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 to
> datanode:ed9d4872-166d-41c6-96ab-437a44e4168b
> 2021-05-27 09:55:42,199 INFO
> org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager: Created pipeline
> Pipeline[ Id: 875b2073-4034-4374-bba6-39011294a280, Nodes:
> 028fed4a-0087-4b70-b6e3-11f18d739094{ip: 172.27.167.6, host:
> quasar-wudsvy-6.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
> persistedOpStateExpiryEpochSec: 0}a4b76016-dc24-47f2-a3ff-03c309fdcf9b{ip:
> 172.27.12.201, host: quasar-wudsvy-4.quasar-wudsvy.root.hwx.site, ports:
> [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856,
> STANDALONE=9859], networkLocation: /default, certSerialId: null,
> persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec:
> 0}ed9d4872-166d-41c6-96ab-437a44e4168b{ip: 172.27.74.4, host:
> quasar-wudsvy-1.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
> persistedOpStateExpiryEpochSec: 0}, ReplicationConfig: RATIS/THREE,
> State:ALLOCATED, leaderId:, CreationTimestamp2021-05-27T09:55:42.189Z].
> 2021-05-27 09:55:54,426 INFO
> org.apache.hadoop.hdds.scm.pipeline.PipelineManagerV2Impl: Pipeline Pipeline[
> Id: 875b2073-4034-4374-bba6-39011294a280, Nodes:
> 028fed4a-0087-4b70-b6e3-11f18d739094{ip: 172.27.167.6, host:
> quasar-wudsvy-6.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
> persistedOpStateExpiryEpochSec: 0}a4b76016-dc24-47f2-a3ff-03c309fdcf9b{ip:
> 172.27.12.201, host: quasar-wudsvy-4.quasar-wudsvy.root.hwx.site, ports:
> [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856,
> STANDALONE=9859], networkLocation: /default, certSerialId: null,
> persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec:
> 0}ed9d4872-166d-41c6-96ab-437a44e4168b{ip: 172.27.74.4, host:
> quasar-wudsvy-1.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
> persistedOpStateExpiryEpochSec: 0}, ReplicationConfig: RATIS/THREE,
> State:ALLOCATED, leaderId:028fed4a-0087-4b70-b6e3-11f18d739094,
> CreationTimestamp2021-05-27T09:55:42.189Z] moved to OPEN state
> 2021-05-27 10:06:45,920 INFO
> org.apache.hadoop.hdds.scm.node.StaleNodeHandler: Datanode
> 028fed4a-0087-4b70-b6e3-11f18d739094{ip: 172.27.167.6, host:
> quasar-wudsvy-6.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
> persistedOpStateExpiryEpochSec: 0} moved to stale state. Finalizing its
> pipelines [PipelineID=cd4a2a77-9715-4437-8d1d-3618a2c93103,
> PipelineID=ca6100b9-b42c-4b77-bef5-35a9b1e725f2,
> PipelineID=875b2073-4034-4374-bba6-39011294a280]
> 2021-05-27 10:06:45,932 INFO
> org.apache.hadoop.hdds.scm.pipeline.PipelineManagerV2Impl: Pipeline Pipeline[
> Id: 875b2073-4034-4374-bba6-39011294a280, Nodes:
> 028fed4a-0087-4b70-b6e3-11f18d739094{ip: 172.27.167.6, host:
> quasar-wudsvy-6.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
> persistedOpStateExpiryEpochSec: 0}a4b76016-dc24-47f2-a3ff-03c309fdcf9b{ip:
> 172.27.12.201, host: quasar-wudsvy-4.quasar-wudsvy.root.hwx.site, ports:
> [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856,
> STANDALONE=9859], networkLocation: /default, certSerialId: null,
> persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec:
> 0}ed9d4872-166d-41c6-96ab-437a44e4168b{ip: 172.27.74.4, host:
> quasar-wudsvy-1.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
> persistedOpStateExpiryEpochSec: 0}, ReplicationConfig: RATIS/THREE,
> State:DORMANT, leaderId:a4b76016-dc24-47f2-a3ff-03c309fdcf9b,
> CreationTimestamp2021-05-27T09:55:42.189Z] moved to CLOSED state
> 2021-05-27 10:06:57,921 INFO
> org.apache.hadoop.hdds.scm.node.StaleNodeHandler: Datanode
> a4b76016-dc24-47f2-a3ff-03c309fdcf9b{ip: 172.27.12.201, host:
> quasar-wudsvy-4.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
> persistedOpStateExpiryEpochSec: 0} moved to stale state. Finalizing its
> pipelines [PipelineID=cd4a2a77-9715-4437-8d1d-3618a2c93103,
> PipelineID=875b2073-4034-4374-bba6-39011294a280,
> PipelineID=2878c722-84dc-40f9-b1c1-46ed0f8bcdd7]
> 2021-05-27 10:07:41,073 INFO
> org.apache.hadoop.hdds.scm.pipeline.PipelineManagerV2Impl: Scrubbing
> pipeline: id: PipelineID=875b2073-4034-4374-bba6-39011294a280 since it stays
> at CLOSED stage.
> 2021-05-27 10:07:41,073 INFO
> org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Send
> pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 close command to
> datanode 028fed4a-0087-4b70-b6e3-11f18d739094
> 2021-05-27 10:07:41,073 INFO
> org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Send
> pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 close command to
> datanode a4b76016-dc24-47f2-a3ff-03c309fdcf9b
> 2021-05-27 10:07:41,073 INFO
> org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Send
> pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 close command to
> datanode ed9d4872-166d-41c6-96ab-437a44e4168b
> 2021-05-27 10:07:41,075 INFO
> org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager: Pipeline Pipeline[
> Id: 875b2073-4034-4374-bba6-39011294a280, Nodes:
> 028fed4a-0087-4b70-b6e3-11f18d739094{ip: 172.27.167.6, host:
> quasar-wudsvy-6.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
> persistedOpStateExpiryEpochSec: 0}a4b76016-dc24-47f2-a3ff-03c309fdcf9b{ip:
> 172.27.12.201, host: quasar-wudsvy-4.quasar-wudsvy.root.hwx.site, ports:
> [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856,
> STANDALONE=9859], networkLocation: /default, certSerialId: null,
> persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec:
> 0}ed9d4872-166d-41c6-96ab-437a44e4168b{ip: 172.27.74.4, host:
> quasar-wudsvy-1.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
> networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
> persistedOpStateExpiryEpochSec: 0}, ReplicationConfig: RATIS/THREE,
> State:CLOSED, leaderId:a4b76016-dc24-47f2-a3ff-03c309fdcf9b,
> CreationTimestamp2021-05-27T09:55:42.189Z] removed.
> {code}
> The logs show that a pipeline was created and moved to the OPEN state. One of
> its datanodes then went stale, so the pipeline moved to the CLOSED state. The
> pipeline scrubber subsequently scrubbed the pipeline and removed it.
> {code:java}
> 2021-05-27 10:07:41,073 INFO
> org.apache.hadoop.hdds.scm.pipeline.PipelineManagerV2Impl: Scrubbing
> pipeline: id: PipelineID=875b2073-4034-4374-bba6-39011294a280 since it stays
> at CLOSED stage.{code}
> The next update that tries to move the pipeline to the CLOSED state, triggered
> by a report from the other datanodes in the pipeline, fails because the
> pipeline has already been removed from SCM memory/DB, and hence SCM terminates.
> The solution would be to ignore PipelineNotFoundException in
> PipelineStateManagerV2Impl#updatePipelineState.
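
The race described above can be illustrated with a small, self-contained sketch. This is not the actual Ozone code or the committed patch: ToyStateManager, its methods, the State enum, and the local PipelineNotFoundException class are hypothetical stand-ins, assumed here only to show the idea of tolerating an update that arrives after the scrubber has already removed the pipeline.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class PipelineUpdateSketch {

  /** Hypothetical subset of pipeline states, named after those in the log. */
  enum State { ALLOCATED, OPEN, CLOSED }

  /** Hypothetical stand-in for the exception named in the issue. */
  static class PipelineNotFoundException extends Exception {
    PipelineNotFoundException(String id) {
      super("Pipeline not found: " + id);
    }
  }

  /** Toy in-memory state manager; an assumption, not PipelineStateManagerV2Impl. */
  static class ToyStateManager {
    private final Map<String, State> pipelines = new ConcurrentHashMap<>();

    void addPipeline(String id) {
      pipelines.put(id, State.ALLOCATED);
    }

    /** Models the scrubber deleting a pipeline that stayed CLOSED. */
    void removePipeline(String id) {
      pipelines.remove(id);
    }

    State getState(String id) throws PipelineNotFoundException {
      State state = pipelines.get(id);
      if (state == null) {
        throw new PipelineNotFoundException(id);
      }
      return state;
    }

    /** Strict update: throws if the pipeline no longer exists. */
    void updateStateStrict(String id, State newState)
        throws PipelineNotFoundException {
      // computeIfPresent is atomic, so a concurrent remove cannot slip in
      // between the existence check and the write.
      if (pipelines.computeIfPresent(id, (key, old) -> newState) == null) {
        throw new PipelineNotFoundException(id);
      }
    }

    /**
     * Tolerant update: a missing pipeline is logged and ignored instead of
     * being propagated to the caller. Returns whether the update was applied.
     */
    boolean updateStateTolerant(String id, State newState) {
      try {
        updateStateStrict(id, newState);
        return true;
      } catch (PipelineNotFoundException e) {
        System.out.println("Ignoring state update: " + e.getMessage());
        return false;
      }
    }
  }

  public static void main(String[] args) {
    ToyStateManager manager = new ToyStateManager();
    String id = "875b2073-4034-4374-bba6-39011294a280";

    manager.addPipeline(id);
    manager.updateStateTolerant(id, State.OPEN);
    manager.updateStateTolerant(id, State.CLOSED); // first stale-node event
    manager.removePipeline(id);                    // scrubber removes it

    // A later report asks for CLOSED again; the tolerant path does not throw.
    boolean applied = manager.updateStateTolerant(id, State.CLOSED);
    System.out.println("Late update applied: " + applied);
  }
}
```

In this sketch the strict variant corresponds to the behaviour that terminated SCM, while the tolerant variant treats the late update as a no-op, which is safe here because the pipeline it targets no longer exists.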