[ 
https://issues.apache.org/jira/browse/HDDS-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sadanand Shenoy updated HDDS-5708:
----------------------------------
    Issue Type: Improvement  (was: New Feature)

> Skip sending container close command to unhealthy replica
> ---------------------------------------------------------
>
>                 Key: HDDS-5708
>                 URL: https://issues.apache.org/jira/browse/HDDS-5708
>             Project: Apache Ozone
>          Issue Type: Improvement
>            Reporter: Sammi Chen
>            Assignee: Sammi Chen
>            Priority: Major
>              Labels: pull-request-available
>
> An unhealthy replica cannot be closed by the close container command.  The 
> impact is small, except that SCM keeps writing messages like the ones shown 
> below to scm.log, which makes problem investigation less efficient.  This 
> task aims to reduce these useless log entries in scm.log.
> Of course, we need a better way to handle an unhealthy container that is 
> stuck in the CLOSING state.  We will look for a solution once we know how a 
> container ends up with all 3 replicas unhealthy. 
> 2021-09-01 21:19:10,903 [ReplicationMonitor] INFO 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending close 
> container command for container #110490 to datanode 
> 0a16a9a7-1af0-4fbe-9b32-9e67df46b4c7{ip: 11.26.17.139, host: 11.26.17.139, 
> ports: [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9858, RATIS_SERVER=9858, 
> STANDALONE=9859], parent: rack561349, networkLocation: /rack561349, 
> certSerialId: null, persistedOpState: IN_SERVICE, 
> persistedOpStateExpiryEpochSec: 0}.
> 2021-09-01 21:24:11,199 [ReplicationMonitor] INFO 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending close 
> container command for container #110490 to datanode 
> 0a16a9a7-1af0-4fbe-9b32-9e67df46b4c7{ip: 11.26.17.139, host: 11.26.17.139, 
> ports: [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9858, RATIS_SERVER=9858, 
> STANDALONE=9859], parent: rack561349, networkLocation: /rack561349, 
> certSerialId: null, persistedOpState: IN_SERVICE, 
> persistedOpStateExpiryEpochSec: 0}.
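
The idea in the description above can be illustrated with a minimal sketch. It is
written in Python for brevity and is not Ozone's actual Java code: the names
ReplicaState, Replica and close_command_targets are hypothetical, and the set of
states is only an illustrative subset. It assumes the fix amounts to filtering
out UNHEALTHY replicas before a close container command is sent.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ReplicaState(Enum):
    # Illustrative subset of replica states; hypothetical, not the real Ozone enum.
    OPEN = auto()
    CLOSING = auto()
    CLOSED = auto()
    UNHEALTHY = auto()


@dataclass(frozen=True)
class Replica:
    # Hypothetical stand-in for a container replica reported by a datanode.
    datanode_id: str
    state: ReplicaState


def close_command_targets(replicas):
    """Return the datanode ids that should receive a close container command.

    UNHEALTHY replicas are skipped because, per the issue description, they
    cannot be closed by the command, so sending it only produces log noise.
    """
    return [r.datanode_id for r in replicas
            if r.state is not ReplicaState.UNHEALTHY]


if __name__ == "__main__":
    replicas = [
        Replica("dn1", ReplicaState.CLOSING),
        Replica("dn2", ReplicaState.UNHEALTHY),
        Replica("dn3", ReplicaState.OPEN),
    ]
    for datanode_id in close_command_targets(replicas):
        print(f"Sending close container command to datanode {datanode_id}")
```

With a filter like this, a container whose replicas are all UNHEALTHY produces no
close commands and therefore no repeated log lines. The container itself would
still remain in the CLOSING state, which is the follow-up problem the description
mentions.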



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
