[jira] [Commented] (HDDS-1454) GC other system pause events can trigger pipeline destroy for all the nodes in the cluster

Siddharth Wagle (JIRA) Mon, 29 Apr 2019 17:38:57 -0700


    [ 
https://issues.apache.org/jira/browse/HDDS-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829858#comment-16829858
 ]


Siddharth Wagle commented on HDDS-1454:
---------------------------------------

HADOOP-9618 only talks about detecting a pause. Is the scope of this Jira to 
take corrective action as well?

Suppose we do have a thread that detects a pause (GC, a kernel issue, 
contention for machine resources, swap, etc.) by tracking its own sleep time. 
Once a pause is detected, can we actually cancel pipeline destroy events? 
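
To make the "thread which tracks its own sleep time" idea concrete, here is a 
rough sketch. It is not the HADOOP-9618 code and not an existing SCM class: 
PauseDetector, observe, pausedWithin, and both thresholds are hypothetical 
names chosen for illustration. The thread sleeps for a fixed interval and 
treats any large overshoot as evidence that the whole process was stalled.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a self-timing pause detector; not an SCM or Hadoop class. */
final class PauseDetector implements Runnable {
    private final long sleepMillis;      // how long each iteration intends to sleep
    private final long thresholdMillis;  // overshoot that we treat as a pause
    private volatile long lastPauseEndNanos = 0L;
    private volatile boolean pauseSeen = false;

    PauseDetector(long sleepMillis, long thresholdMillis) {
        this.sleepMillis = sleepMillis;
        this.thresholdMillis = thresholdMillis;
    }

    /** Milliseconds slept beyond the requested interval; never negative. */
    static long overshootMillis(long startNanos, long endNanos, long sleepMillis) {
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
        return Math.max(0L, elapsedMillis - sleepMillis);
    }

    /** Records a pause if this sleep overshot by at least the threshold. */
    boolean observe(long startNanos, long endNanos) {
        if (overshootMillis(startNanos, endNanos, sleepMillis) >= thresholdMillis) {
            lastPauseEndNanos = endNanos;  // written before the flag so readers see it
            pauseSeen = true;
            return true;
        }
        return false;
    }

    /** True if a recorded pause ended within windowMillis before nowNanos. */
    boolean pausedWithin(long nowNanos, long windowMillis) {
        return pauseSeen
            && nowNanos - lastPauseEndNanos <= TimeUnit.MILLISECONDS.toNanos(windowMillis);
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            long start = System.nanoTime();
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // preserve the interrupt and stop
                return;
            }
            observe(start, System.nanoTime());
        }
    }
}
```

Detection alone does not cancel anything. A caller would still have to check 
something like pausedWithin(System.nanoTime(), window) before acting on a 
stale-node decision, and whether SCM can do that safely is the open question 
here.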

> GC other system pause events can trigger pipeline destroy for all the nodes 
> in the cluster
> ------------------------------------------------------------------------------------------
>
>                 Key: HDDS-1454
>                 URL: https://issues.apache.org/jira/browse/HDDS-1454
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: SCM
>    Affects Versions: 0.3.0
>            Reporter: Mukul Kumar Singh
>            Priority: Major
>              Labels: MiniOzoneChaosCluster
>
> In a MiniOzoneChaosCluster run it was observed that events like GC pauses or 
> any other pauses in SCM can mark all the datanodes as stale in SCM. This will 
> trigger multiple pipeline destroy events and render the system unusable. 



