[ https://issues.apache.org/jira/browse/HDDS-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829858#comment-16829858 ]
Siddharth Wagle commented on HDDS-1454:
---------------------------------------
HADOOP-9618 only covers detecting a pause. Is the scope of this Jira to also
take corrective action?
Suppose we do have a thread that tracks its own sleep time and uses it to
detect a pause (GC, a kernel issue, contention for machine resources, swap,
etc.). Can we then actually cancel pipeline destroy events?
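
To make the "thread which tracks its own sleep time" idea concrete, here is a
minimal sketch in Java. It is not the SCM code or the HADOOP-9618 patch: the
class name PauseDetector, its methods, and the threshold are all hypothetical
and only illustrate the detection step. The thread sleeps for a fixed interval
and measures how much longer the sleep took than requested; a large overshoot
suggests the whole process was paused.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a self-timing pause detector; not actual SCM code. */
final class PauseDetector {
    private final long sleepMillis;      // how long each sleep is asked to last
    private final long thresholdMillis;  // overshoot treated as a pause (assumed value)

    PauseDetector(long sleepMillis, long thresholdMillis) {
        this.sleepMillis = sleepMillis;
        this.thresholdMillis = thresholdMillis;
    }

    /** Time spent beyond the requested sleep, in milliseconds, never negative. */
    static long extraSleepMillis(long startNanos, long endNanos, long requestedMillis) {
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
        return Math.max(0L, elapsedMillis - requestedMillis);
    }

    /** True when the overshoot exceeds the configured threshold. */
    boolean isPause(long extraMillis) {
        return extraMillis > thresholdMillis;
    }

    /** Sleeps once and returns how much longer the sleep took than requested. */
    long sleepOnce() throws InterruptedException {
        long start = System.nanoTime();  // monotonic clock, unaffected by wall-clock changes
        Thread.sleep(sleepMillis);
        return extraSleepMillis(start, System.nanoTime(), sleepMillis);
    }
}
```

A monitoring loop would call sleepOnce() repeatedly and pass the result to
isPause(). What SCM should do with that signal, for example whether it can
skip or cancel stale-node handling, is exactly the open question above.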
> GC other system pause events can trigger pipeline destroy for all the nodes
> in the cluster
> ------------------------------------------------------------------------------------------
>
> Key: HDDS-1454
> URL: https://issues.apache.org/jira/browse/HDDS-1454
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: SCM
> Affects Versions: 0.3.0
> Reporter: Mukul Kumar Singh
> Priority: Major
> Labels: MiniOzoneChaosCluster
>
> In a MiniOzoneChaosCluster run it was observed that events like GC pauses, or
> any other pauses in SCM, can mark all the datanodes as stale in SCM. This
> triggers multiple pipeline destroy events and renders the system unusable.