[ https://issues.apache.org/jira/browse/HDDS-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133655#comment-17133655 ]
Nanda kumar commented on HDDS-1765:
-----------------------------------
This causes no problem other than logging an error message. We can log a
single-line error message instead of printing the stack trace.
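
A minimal sketch of what that could look like, not the actual SCM code. The
class DestroyPipelineLogSketch, its in-memory map, and the nested
PipelineNotFoundException are hypothetical stand-ins for illustration. The
only idea taken from this issue is that a "not found" result during a delayed
destroy is expected and can be reported in one line.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the pipeline manager; not the real SCMPipelineManager.
final class DestroyPipelineLogSketch {

    /** Local stand-in for the exception type that appears in the stack trace. */
    static final class PipelineNotFoundException extends Exception {
        PipelineNotFoundException(String message) {
            super(message);
        }
    }

    // Assumed in-memory registry: pipeline id -> short description.
    private final Map<UUID, String> pipelines = new ConcurrentHashMap<>();

    void addPipeline(UUID id, String description) {
        pipelines.put(id, description);
    }

    void removePipeline(UUID id) throws PipelineNotFoundException {
        if (pipelines.remove(id) == null) {
            throw new PipelineNotFoundException("PipelineID=" + id + " not found");
        }
    }

    /** Returns true if this call removed the pipeline, false if it was already gone. */
    boolean destroyPipeline(UUID id) {
        try {
            removePipeline(id);
            return true;
        } catch (PipelineNotFoundException e) {
            // Expected when another handler destroyed the pipeline first:
            // report the message only, without the stack trace.
            // System.err stands in for a real logger call here.
            System.err.println("Destroy pipeline skipped: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        DestroyPipelineLogSketch scm = new DestroyPipelineLogSketch();
        UUID id = UUID.randomUUID();
        scm.addPipeline(id, "Type:RATIS, Factor:ONE");
        scm.destroyPipeline(id); // first destroy removes the pipeline
        scm.destroyPipeline(id); // second destroy prints a single line
    }
}
```

Returning a boolean instead of rethrowing lets the scheduled task finish
normally, so the scheduler wrapper has nothing to log at ERROR level.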
> destroyPipeline scheduled from finalizeAndDestroyPipeline fails for short
> dead node interval
> --------------------------------------------------------------------------------------------
>
> Key: HDDS-1765
> URL: https://issues.apache.org/jira/browse/HDDS-1765
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: SCM
> Reporter: Supratim Deka
> Priority: Major
> Labels: MiniOzoneChaosCluster, Triaged
>
> This happens when OZONE_SCM_PIPELINE_DESTROY_TIMEOUT exceeds the value of
> OZONE_SCM_DEADNODE_INTERVAL, which is the case for start-chaos.sh.
> When a Datanode is shut down, the SCM stale node handler calls
> finalizeAndDestroyPipeline(), which schedules the destroyPipeline() operation
> with a delay of OZONE_SCM_PIPELINE_DESTROY_TIMEOUT. By the time the scheduled
> operation runs, the dead node handler has already destroyed the pipeline.
>
> {code:java}
> 2019-07-05 14:45:16,358 INFO pipeline.SCMPipelineManager
> (SCMPipelineManager.java:finalizeAndDestroyPipeline(307)) - destroying
> pipeline:Pipeline[ Id: ef60537a-0a82-4fea-a574-109c881fa140, Nodes:
> 7947bf32-faaa-4b34-bf1e-2752a929938c{ip: 192.168.1.6, host: 192.168.1.6,
> networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE,
> State:CLOSED]
> 2019-07-05 14:45:16,363 INFO pipeline.PipelineStateManager
> (PipelineStateManager.java:removePipeline(108)) - Pipeline Pipeline[ Id:
> ef60537a-0a82-4fea-a574-109c881fa140, Nodes:
> 7947bf32-faaa-4b34-bf1e-2752a929938c{ip: 192.168.1.6, host: 192.168.1.6,
> networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE,
> State:CLOSED] removed from db
> ...
> 2019-07-05 14:46:12,400 WARN pipeline.RatisPipelineUtils
> (RatisPipelineUtils.java:destroyPipeline(66)) - Pipeline destroy failed for
> pipeline=PipelineID=ef60537a-0a82-4fea-a574-109c881fa140
> dn=7947bf32-faaa-4b34-bf1e-2752a929938c{ip: 192.168.1.6, host: 192.168.1.6,
> networkLocation: /default-rack, certSerialId: null}
> 2019-07-05 14:46:12,401 ERROR pipeline.SCMPipelineManager
> (Scheduler.java:lambda$schedule$1(70)) - Destroy pipeline failed for
> pipeline:Pipeline[ Id: ef60537a-0a82-4fea-a574-109c881fa140, Nodes:
> 7947bf32-faaa-4b34-bf1e-2752a929938c{ip: 192.168.1.6, host: 192.168.1.6,
> networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE,
> State:OPEN]
> org.apache.hadoop.hdds.scm.pipeline.PipelineNotFoundException: PipelineID=ef60537a-0a82-4fea-a574-109c881fa140 not found
>     at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.getPipeline(PipelineStateMap.java:132)
>     at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.removePipeline(PipelineStateMap.java:322)
>     at org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager.removePipeline(PipelineStateManager.java:107)
>     at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.removePipeline(SCMPipelineManager.java:401)
>     at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.destroyPipeline(SCMPipelineManager.java:387)
>     at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.lambda$finalizeAndDestroyPipeline$0(SCMPipelineManager.java:321)
>     at org.apache.hadoop.utils.Scheduler.lambda$schedule$1(Scheduler.java:68)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
>
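
The ordering described in the issue can be sketched as a small standalone
simulation. Everything here is an assumption made for illustration:
PipelineDestroyRaceSketch, the string pipeline id, and the millisecond
parameters are hypothetical, and both timers are started at the same moment
for simplicity, which is not how the stale and dead node handlers are
triggered in SCM.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical simulation of the two competing removals; not SCM code.
final class PipelineDestroyRaceSketch {

    /**
     * Returns true if the delayed destroy still found the pipeline,
     * false if the dead node path had already removed it.
     */
    static boolean run(long destroyTimeoutMs, long deadNodeIntervalMs) {
        Set<String> pipelines = ConcurrentHashMap.newKeySet();
        String id = "pipeline-1";
        pipelines.add(id);

        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);
        try {
            // Stale node path: destroy is scheduled with the destroy timeout as delay.
            ScheduledFuture<Boolean> delayedDestroy = scheduler.schedule(
                () -> pipelines.remove(id), destroyTimeoutMs, TimeUnit.MILLISECONDS);

            // Dead node path: removes the pipeline once the dead node interval elapses.
            ScheduledFuture<Boolean> deadNodeDestroy = scheduler.schedule(
                () -> pipelines.remove(id), deadNodeIntervalMs, TimeUnit.MILLISECONDS);

            deadNodeDestroy.get();
            return delayedDestroy.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("simulation failed", e);
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Destroy timeout longer than the dead node interval: the delayed task finds nothing.
        System.out.println("timeout > interval, found pipeline: " + run(300, 50));
        // Destroy timeout shorter than the dead node interval: the delayed task wins.
        System.out.println("timeout < interval, found pipeline: " + run(50, 300));
    }
}
```

Whichever removal runs second finds no pipeline. In the log above the
equivalent situation surfaces as the PipelineNotFoundException.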
--
This message was sent by Atlassian Jira
(v8.3.4#803005)