[ https://issues.apache.org/jira/browse/HDDS-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arpit Agarwal updated HDDS-3277:
--------------------------------
Target Version/s: 0.6.0
Labels: MiniOzoneChaosCluster Triaged (was: MiniOzoneChaosCluster)
> Datanodes do not close pipeline when pipeline directory is deleted.
> -------------------------------------------------------------------
>
> Key: HDDS-3277
> URL: https://issues.apache.org/jira/browse/HDDS-3277
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Datanode
> Affects Versions: 0.6.0
> Reporter: Mukul Kumar Singh
> Priority: Major
> Labels: MiniOzoneChaosCluster, Triaged
>
> First, the pipeline directory was deleted on datanode-0, datanode-3, and datanode-5:
> {code}
> 2020-03-25 19:44:22,669 [pool-22-thread-1] INFO failure.Failures (FailureManager.java:fail(49)) - failing with, DeletePipelineFailure
> 2020-03-25 19:44:22,669 [pool-22-thread-1] INFO failure.Failures (Failures.java:fail(118)) - deleteing pipeline directory /tmp/chaos-2020-03-25-19-42-52-IST/MiniOzoneClusterImpl-ef9b224f-a403-4e9b-a27a-ed38f46700c5/datanode-0/data/ratis/c4275846-2a44-4f53-b00d-c95a81785df9
> 2020-03-25 19:44:22,679 [pool-22-thread-1] INFO failure.Failures (Failures.java:fail(118)) - deleteing pipeline directory /tmp/chaos-2020-03-25-19-42-52-IST/MiniOzoneClusterImpl-ef9b224f-a403-4e9b-a27a-ed38f46700c5/datanode-3/data/ratis/c4275846-2a44-4f53-b00d-c95a81785df9
> 2020-03-25 19:44:22,681 [pool-22-thread-1] INFO failure.Failures (Failures.java:fail(118)) - deleteing pipeline directory /tmp/chaos-2020-03-25-19-42-52-IST/MiniOzoneClusterImpl-ef9b224f-a403-4e9b-a27a-ed38f46700c5/datanode-5/data/ratis/c4275846-2a44-4f53-b00d-c95a81785df9
> {code}
> However, no pipeline failure was reported to SCM, so the pipeline was never closed. The datanode only logged a failed snapshot:
> {code}
> 2020-03-25 19:44:24,532 [b5d165bc-d2b3-497c-ae38-10f649674a3f@group-C95A81785DF9-StateMachineUpdater] ERROR ratis.ContainerStateMachine (ContainerStateMachine.java:takeSnapshot(302)) - group-C95A81785DF9: Failed to write snapshot at:(t:1, i:2037) file /tmp/chaos-2020-03-25-19-42-52-IST/MiniOzoneClusterImpl-ef9b224f-a403-4e9b-a27a-ed38f46700c5/datanode-3/data/ratis/c4275846-2a44-4f53-b00d-c95a81785df9/sm/snapshot.1_2037
> 2020-03-25 19:44:24,532 [b5d165bc-d2b3-497c-ae38-10f649674a3f@group-C95A81785DF9-StateMachineUpdater] ERROR impl.StateMachineUpdater (StateMachineUpdater.java:takeSnapshot(269)) - b5d165bc-d2b3-497c-ae38-10f649674a3f@group-C95A81785DF9-StateMachineUpdater: Failed to take snapshot
> java.io.FileNotFoundException: /tmp/chaos-2020-03-25-19-42-52-IST/MiniOzoneClusterImpl-ef9b224f-a403-4e9b-a27a-ed38f46700c5/datanode-3/data/ratis/c4275846-2a44-4f53-b00d-c95a81785df9/sm/snapshot.1_2037 (No such file or directory)
>     at java.io.FileOutputStream.open0(Native Method)
>     at java.io.FileOutputStream.open(FileOutputStream.java:270)
>     at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
>     at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
>     at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.takeSnapshot(ContainerStateMachine.java:296)
>     at org.apache.ratis.server.impl.StateMachineUpdater.takeSnapshot(StateMachineUpdater.java:258)
>     at org.apache.ratis.server.impl.StateMachineUpdater.checkAndTakeSnapshot(StateMachineUpdater.java:250)
>     at org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:169)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
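
The trace above shows the snapshot write failing because the state machine directory no longer exists. As a rough illustration of the missing behaviour, the sketch below reproduces that failure with plain java.io and, when the directory is gone, calls a failure callback instead of only logging. SnapshotSketch, PipelineFailureHandler, and onPipelineFailure are hypothetical names invented for this example. They are not Ozone or Ratis APIs, and this is not the actual fix for this issue.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback standing in for whatever would notify SCM. */
interface PipelineFailureHandler {
  void onPipelineFailure(String groupId, String reason);
}

/** Hypothetical sketch; not the Ozone ContainerStateMachine. */
class SnapshotSketch {
  private final File snapshotDir;
  private final String groupId;
  private final PipelineFailureHandler handler;

  SnapshotSketch(File snapshotDir, String groupId, PipelineFailureHandler handler) {
    this.snapshotDir = snapshotDir;
    this.groupId = groupId;
    this.handler = handler;
  }

  /** Returns true if the snapshot file was written, false otherwise. */
  boolean takeSnapshot(long term, long index) {
    File snapshotFile = new File(snapshotDir, "snapshot." + term + "_" + index);
    try (FileOutputStream out = new FileOutputStream(snapshotFile)) {
      out.write(0); // placeholder payload for the sketch
      return true;
    } catch (IOException e) {
      // FileOutputStream throws FileNotFoundException (an IOException)
      // when the parent directory does not exist.
      if (!snapshotDir.isDirectory()) {
        // Assumption: a missing storage directory should be escalated
        // as a pipeline failure rather than only logged.
        handler.onPipelineFailure(groupId, "storage directory missing: " + snapshotDir);
      }
      return false;
    }
  }
}

class SnapshotSketchDemo {
  public static void main(String[] args) throws IOException {
    File smDir = Files.createTempDirectory("ratis-sm").toFile();
    List<String> reported = new ArrayList<>();
    SnapshotSketch sketch =
        new SnapshotSketch(smDir, "group-C95A81785DF9", (g, r) -> reported.add(g));

    System.out.println("first snapshot ok: " + sketch.takeSnapshot(1, 2037));

    // Simulate the chaos failure: remove the snapshot file, then the directory.
    new File(smDir, "snapshot.1_2037").delete();
    smDir.delete();

    System.out.println("second snapshot ok: " + sketch.takeSnapshot(1, 2037));
    System.out.println("reported failures: " + reported);
  }
}
```

Checking for the directory only after a write fails keeps the normal snapshot path free of extra filesystem calls. Whether a close-pipeline action is the right escalation in the datanode is a design question this sketch does not settle.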