[
https://issues.apache.org/jira/browse/HDDS-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shashikant Banerjee updated HDDS-5356:
--------------------------------------
Description:
{code:java}
{code}
was:
{code:java}
scm.log
2021-05-27 09:55:42,189 INFO
org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Sending
CreatePipelineCommand for
pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 to
datanode:028fed4a-0087-4b70-b6e3-11f18d739094
2021-05-27 09:55:42,189 INFO
org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Sending
CreatePipelineCommand for
pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 to
datanode:a4b76016-dc24-47f2-a3ff-03c309fdcf9b
2021-05-27 09:55:42,189 INFO
org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Sending
CreatePipelineCommand for
pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 to
datanode:ed9d4872-166d-41c6-96ab-437a44e4168b
2021-05-27 09:55:42,199 INFO
org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager: Created pipeline
Pipeline[ Id: 875b2073-4034-4374-bba6-39011294a280, Nodes:
028fed4a-0087-4b70-b6e3-11f18d739094{ip: 172.27.167.6, host:
quasar-wudsvy-6.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
persistedOpStateExpiryEpochSec: 0}a4b76016-dc24-47f2-a3ff-03c309fdcf9b{ip:
172.27.12.201, host: quasar-wudsvy-4.quasar-wudsvy.root.hwx.site, ports:
[REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856,
STANDALONE=9859], networkLocation: /default, certSerialId: null,
persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec:
0}ed9d4872-166d-41c6-96ab-437a44e4168b{ip: 172.27.74.4, host:
quasar-wudsvy-1.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
persistedOpStateExpiryEpochSec: 0}, ReplicationConfig: RATIS/THREE,
State:ALLOCATED, leaderId:, CreationTimestamp2021-05-27T09:55:42.189Z].
2021-05-27 09:55:54,426 INFO
org.apache.hadoop.hdds.scm.pipeline.PipelineManagerV2Impl: Pipeline Pipeline[
Id: 875b2073-4034-4374-bba6-39011294a280, Nodes:
028fed4a-0087-4b70-b6e3-11f18d739094{ip: 172.27.167.6, host:
quasar-wudsvy-6.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
persistedOpStateExpiryEpochSec: 0}a4b76016-dc24-47f2-a3ff-03c309fdcf9b{ip:
172.27.12.201, host: quasar-wudsvy-4.quasar-wudsvy.root.hwx.site, ports:
[REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856,
STANDALONE=9859], networkLocation: /default, certSerialId: null,
persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec:
0}ed9d4872-166d-41c6-96ab-437a44e4168b{ip: 172.27.74.4, host:
quasar-wudsvy-1.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
persistedOpStateExpiryEpochSec: 0}, ReplicationConfig: RATIS/THREE,
State:ALLOCATED, leaderId:028fed4a-0087-4b70-b6e3-11f18d739094,
CreationTimestamp2021-05-27T09:55:42.189Z] moved to OPEN state
2021-05-27 10:06:45,920 INFO org.apache.hadoop.hdds.scm.node.StaleNodeHandler:
Datanode 028fed4a-0087-4b70-b6e3-11f18d739094{ip: 172.27.167.6, host:
quasar-wudsvy-6.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
persistedOpStateExpiryEpochSec: 0} moved to stale state. Finalizing its
pipelines [PipelineID=cd4a2a77-9715-4437-8d1d-3618a2c93103,
PipelineID=ca6100b9-b42c-4b77-bef5-35a9b1e725f2,
PipelineID=875b2073-4034-4374-bba6-39011294a280]
2021-05-27 10:06:45,932 INFO
org.apache.hadoop.hdds.scm.pipeline.PipelineManagerV2Impl: Pipeline Pipeline[
Id: 875b2073-4034-4374-bba6-39011294a280, Nodes:
028fed4a-0087-4b70-b6e3-11f18d739094{ip: 172.27.167.6, host:
quasar-wudsvy-6.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
persistedOpStateExpiryEpochSec: 0}a4b76016-dc24-47f2-a3ff-03c309fdcf9b{ip:
172.27.12.201, host: quasar-wudsvy-4.quasar-wudsvy.root.hwx.site, ports:
[REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856,
STANDALONE=9859], networkLocation: /default, certSerialId: null,
persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec:
0}ed9d4872-166d-41c6-96ab-437a44e4168b{ip: 172.27.74.4, host:
quasar-wudsvy-1.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
persistedOpStateExpiryEpochSec: 0}, ReplicationConfig: RATIS/THREE,
State:DORMANT, leaderId:a4b76016-dc24-47f2-a3ff-03c309fdcf9b,
CreationTimestamp2021-05-27T09:55:42.189Z] moved to CLOSED state
2021-05-27 10:06:57,921 INFO org.apache.hadoop.hdds.scm.node.StaleNodeHandler:
Datanode a4b76016-dc24-47f2-a3ff-03c309fdcf9b{ip: 172.27.12.201, host:
quasar-wudsvy-4.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
persistedOpStateExpiryEpochSec: 0} moved to stale state. Finalizing its
pipelines [PipelineID=cd4a2a77-9715-4437-8d1d-3618a2c93103,
PipelineID=875b2073-4034-4374-bba6-39011294a280,
PipelineID=2878c722-84dc-40f9-b1c1-46ed0f8bcdd7]
2021-05-27 10:07:41,073 INFO
org.apache.hadoop.hdds.scm.pipeline.PipelineManagerV2Impl: Scrubbing pipeline:
id: PipelineID=875b2073-4034-4374-bba6-39011294a280 since it stays at CLOSED
stage.
2021-05-27 10:07:41,073 INFO
org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Send
pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 close command to
datanode 028fed4a-0087-4b70-b6e3-11f18d739094
2021-05-27 10:07:41,073 INFO
org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Send
pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 close command to
datanode a4b76016-dc24-47f2-a3ff-03c309fdcf9b
2021-05-27 10:07:41,073 INFO
org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Send
pipeline:PipelineID=875b2073-4034-4374-bba6-39011294a280 close command to
datanode ed9d4872-166d-41c6-96ab-437a44e4168b
2021-05-27 10:07:41,075 INFO
org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager: Pipeline Pipeline[
Id: 875b2073-4034-4374-bba6-39011294a280, Nodes:
028fed4a-0087-4b70-b6e3-11f18d739094{ip: 172.27.167.6, host:
quasar-wudsvy-6.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
persistedOpStateExpiryEpochSec: 0}a4b76016-dc24-47f2-a3ff-03c309fdcf9b{ip:
172.27.12.201, host: quasar-wudsvy-4.quasar-wudsvy.root.hwx.site, ports:
[REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856,
STANDALONE=9859], networkLocation: /default, certSerialId: null,
persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec:
0}ed9d4872-166d-41c6-96ab-437a44e4168b{ip: 172.27.74.4, host:
quasar-wudsvy-1.quasar-wudsvy.root.hwx.site, ports: [REPLICATION=9886,
RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE,
persistedOpStateExpiryEpochSec: 0}, ReplicationConfig: RATIS/THREE,
State:CLOSED, leaderId:a4b76016-dc24-47f2-a3ff-03c309fdcf9b,
CreationTimestamp2021-05-27T09:55:42.189Z] removed.
{code}
The logs indicate that a pipeline was created and moved to the OPEN state. One
of its datanodes then went stale, so the pipeline was moved to the CLOSED state.
The pipeline scrubber then scrubbed the pipeline and removed it.
{code:java}
2021-05-27 10:07:41,073 INFO
org.apache.hadoop.hdds.scm.pipeline.PipelineManagerV2Impl: Scrubbing pipeline:
id: PipelineID=875b2073-4034-4374-bba6-39011294a280 since it stays at CLOSED
stage.{code}
The next update that tries to move the pipeline to the CLOSED state, triggered
by a report from the other datanodes in the pipeline, fails because the pipeline
has already been removed from SCM memory/DB, and hence SCM terminates.
The solution would be to ignore PipelineNotFoundException in
PipelineStateManagerV2Impl#updatePipelineState.
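A minimal sketch of that idea, not the actual Ozone patch. The classes below
(SketchPipelineStateManager, a local PipelineNotFoundException and a
three-value PipelineState enum) are hypothetical in-memory stand-ins, and the
sketch assumes that a state update for a pipeline the scrubber has already
removed can simply be skipped:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the exception named in the description. */
class PipelineNotFoundException extends Exception {
    PipelineNotFoundException(String message) {
        super(message);
    }
}

/** Simplified set of pipeline states (assumption: only these three matter here). */
enum PipelineState { ALLOCATED, OPEN, CLOSED }

/** Hypothetical in-memory stand-in for the SCM pipeline state store. */
class SketchPipelineStateManager {
    private final Map<UUID, PipelineState> pipelines = new ConcurrentHashMap<>();

    void addPipeline(UUID id) {
        pipelines.put(id, PipelineState.ALLOCATED);
    }

    /** Models the scrubber removing a pipeline that stayed CLOSED. */
    void removePipeline(UUID id) {
        pipelines.remove(id);
    }

    /** Strict update: throws if the pipeline is no longer tracked. */
    private void doUpdate(UUID id, PipelineState newState)
            throws PipelineNotFoundException {
        // computeIfPresent is atomic, so a concurrently removed pipeline
        // is never re-created by a late update.
        if (pipelines.computeIfPresent(id, (key, old) -> newState) == null) {
            throw new PipelineNotFoundException("Pipeline " + id + " not found");
        }
    }

    /**
     * Tolerant update: returns true if the state was applied, false if the
     * pipeline had already been removed. The exception is logged and ignored
     * instead of being propagated to the caller.
     */
    boolean updatePipelineState(UUID id, PipelineState newState) {
        try {
            doUpdate(id, newState);
            return true;
        } catch (PipelineNotFoundException e) {
            System.err.println("Ignoring state update: " + e.getMessage());
            return false;
        }
    }
}

class UpdateAfterScrubDemo {
    public static void main(String[] args) {
        SketchPipelineStateManager mgr = new SketchPipelineStateManager();
        UUID id = UUID.fromString("875b2073-4034-4374-bba6-39011294a280");
        mgr.addPipeline(id);
        mgr.updatePipelineState(id, PipelineState.OPEN);
        mgr.updatePipelineState(id, PipelineState.CLOSED); // datanode went stale
        mgr.removePipeline(id);                            // scrubber removes it
        // Late report from another datanode: skipped instead of failing.
        boolean applied = mgr.updatePipelineState(id, PipelineState.CLOSED);
        System.out.println("late update applied: " + applied);
    }
}
```

In this sketch the late update returns false and the process keeps running,
which mirrors the sequence in the logs above: create, OPEN, CLOSED, scrubbed,
then a late close request for a pipeline that no longer exists.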
> [SCM-HA] SCM start failed with PipelineNotFoundException
> --------------------------------------------------------
>
> Key: HDDS-5356
> URL: https://issues.apache.org/jira/browse/HDDS-5356
> Project: Apache Ozone
> Issue Type: Sub-task
> Components: SCM HA
> Reporter: Nilotpal Nandi
> Assignee: Shashikant Banerjee
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.2.0
>
>
> {code:java}
> {code}