Supratim Deka created HDDS-1765:
-----------------------------------

             Summary: destroyPipeline scheduled from finalizeAndDestroyPipeline fails for short dead node interval
                 Key: HDDS-1765
                 URL: https://issues.apache.org/jira/browse/HDDS-1765
             Project: Hadoop Distributed Data Store
          Issue Type: Bug
          Components: SCM
            Reporter: Supratim Deka


This happens when OZONE_SCM_PIPELINE_DESTROY_TIMEOUT exceeds the value of 
OZONE_SCM_DEADNODE_INTERVAL, which is the case for start-chaos.sh.

When a Datanode is shut down, the SCM stale node handler calls 
finalizeAndDestroyPipeline(), which schedules a destroyPipeline() operation 
with a delay of OZONE_SCM_PIPELINE_DESTROY_TIMEOUT. By the time the scheduled 
operation runs, the dead node handler has already destroyed the pipeline, so 
the scheduled destroyPipeline() fails with PipelineNotFoundException (see the 
log below).
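
The ordering problem described above can be reproduced in isolation. The following is a minimal sketch, not the actual SCM code: the class PipelineDestroyRace, the NotFound exception, the onDeadNode() method and the in-memory map are all hypothetical stand-ins, and the dead node path is simplified to an immediate removal. It only illustrates how a strict remove that runs after a delay can fail once another path has already removed the same pipeline.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical model of the race; none of these types are the real Ozone classes. */
class PipelineDestroyRace {

  /** Stand-in for PipelineNotFoundException. */
  static class NotFound extends RuntimeException {
    NotFound(String id) {
      super("PipelineID=" + id + " not found");
    }
  }

  private final Map<String, String> pipelines = new ConcurrentHashMap<>();
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  void addPipeline(String id) {
    pipelines.put(id, "OPEN");
  }

  /** Strict removal: throws if the pipeline has already been removed. */
  void removePipeline(String id) {
    if (pipelines.remove(id) == null) {
      throw new NotFound(id);
    }
  }

  /** Stale node path (simplified): close now, remove after the destroy timeout. */
  ScheduledFuture<?> finalizeAndDestroyPipeline(String id, long destroyTimeoutMs) {
    pipelines.replace(id, "CLOSED");
    return scheduler.schedule(
        () -> removePipeline(id), destroyTimeoutMs, TimeUnit.MILLISECONDS);
  }

  /** Dead node path (simplified assumption): remove the pipeline immediately. */
  void onDeadNode(String id) {
    removePipeline(id);
  }

  /** Returns true if the delayed destroy failed because the pipeline was gone. */
  static boolean delayedDestroyFails(long destroyTimeoutMs) {
    PipelineDestroyRace scm = new PipelineDestroyRace();
    try {
      scm.addPipeline("p1");
      ScheduledFuture<?> pending =
          scm.finalizeAndDestroyPipeline("p1", destroyTimeoutMs);
      // The dead node path runs before the destroy timeout has elapsed.
      scm.onDeadNode("p1");
      // Wait for the delayed task; its exception surfaces as ExecutionException.
      pending.get();
      return false;
    } catch (ExecutionException e) {
      return e.getCause() instanceof NotFound;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      scm.scheduler.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println(
        "delayed destroy hit not-found: " + delayedDestroyFails(200));
  }
}
```

In this model the delayed task is the second caller of removePipeline() for the same id, which mirrors the removePipeline frames in the stack trace of the scheduled task.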

 
{code:java}
2019-07-05 14:45:16,358 INFO  pipeline.SCMPipelineManager 
(SCMPipelineManager.java:finalizeAndDestroyPipeline(307)) - destroying 
pipeline:Pipeline[ Id: ef60537a-0a82-4fea-a574-109c881fa140, Nodes: 
7947bf32-faaa-4b34-bf1e-2752a929938c{ip: 192.168.1.6, host: 192.168.1.6, 
networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE, 
State:CLOSED]

2019-07-05 14:45:16,363 INFO  pipeline.PipelineStateManager 
(PipelineStateManager.java:removePipeline(108)) - Pipeline Pipeline[ Id: 
ef60537a-0a82-4fea-a574-109c881fa140, Nodes: 
7947bf32-faaa-4b34-bf1e-2752a929938c{ip: 192.168.1.6, host: 192.168.1.6, 
networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE, 
State:CLOSED] removed from db

...

2019-07-05 14:46:12,400 WARN  pipeline.RatisPipelineUtils 
(RatisPipelineUtils.java:destroyPipeline(66)) - Pipeline destroy failed for 
pipeline=PipelineID=ef60537a-0a82-4fea-a574-109c881fa140 
dn=7947bf32-faaa-4b34-bf1e-2752a929938c{ip: 192.168.1.6, host: 192.168.1.6, 
networkLocation: /default-rack, certSerialId: null}

2019-07-05 14:46:12,401 ERROR pipeline.SCMPipelineManager 
(Scheduler.java:lambda$schedule$1(70)) - Destroy pipeline failed for 
pipeline:Pipeline[ Id: ef60537a-0a82-4fea-a574-109c881fa140, Nodes: 
7947bf32-faaa-4b34-bf1e-2752a929938c{ip: 192.168.1.6, host: 192.168.1.6, 
networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE, 
State:OPEN]

org.apache.hadoop.hdds.scm.pipeline.PipelineNotFoundException: PipelineID=ef60537a-0a82-4fea-a574-109c881fa140 not found
        at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.getPipeline(PipelineStateMap.java:132)
        at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.removePipeline(PipelineStateMap.java:322)
        at org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager.removePipeline(PipelineStateManager.java:107)
        at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.removePipeline(SCMPipelineManager.java:401)
        at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.destroyPipeline(SCMPipelineManager.java:387)
        at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.lambda$finalizeAndDestroyPipeline$0(SCMPipelineManager.java:321)
        at org.apache.hadoop.utils.Scheduler.lambda$schedule$1(Scheduler.java:68)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

{code}
 



