[
https://issues.apache.org/jira/browse/HDDS-11648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Krishna Kumar Asawa reassigned HDDS-11648:
------------------------------------------
Assignee: Sumit Agrawal (was: Siddhant Sangwan)
> Ratis scrubber not replacing UNHEALTHY replicas with healthy replicas
> ---------------------------------------------------------------------
>
> Key: HDDS-11648
> URL: https://issues.apache.org/jira/browse/HDDS-11648
> Project: Apache Ozone
> Issue Type: Bug
> Components: SCM
> Reporter: Jyotirmoy Sinha
> Assignee: Sumit Agrawal
> Priority: Major
>
> The Ratis scrubber does not replace UNHEALTHY replicas with healthy replicas.
> Steps:
> # Close a Ratis container
> # Simulate an UNHEALTHY replica on one datanode
> # Wait for the replica to be re-replicated to a healthy datanode
> Expected behaviour - the UNHEALTHY replica is replaced with a healthy replica
> Observed behaviour - the UNHEALTHY replica is not replaced with a healthy replica
> Container under test: 1
> The replica on node2 went into the UNHEALTHY state at _2024-11-04 16:17:34,611_:
> {code:java}
> 2024-11-04 16:17:34,611 | ERROR | ID=1 | Index=0 | BCSID=93 |
> State=UNHEALTHY | INCONSISTENT_CHUNK_LENGTH for file
> /hadoop-ozone/datanode/data735418/hdds/CID-138cba34-5e5e-4b3f-accd-709aef4b2619/current/containerDir0/1/chunks/113750153625600007.block.
> Message: Inconsistent read for chunk=113750153625600007_chunk_1 expected
> length=4194304 actual length=0 for block conID: 1 locID: 113750153625600007
> bcsId: 93 replicaIndex: null |{code}
> The container still has the UNHEALTHY replica at _Tue Nov 5 10:31:06 UTC 2024_:
> {code:java}
> # ozone admin container info 1
> Container id: 1
> Pipeline id: e0b0c4b0-b73f-4e2a-8318-795d61367699
> Container State: CLOSED
> Datanodes: [0c0ac8d1-f176-48c9-82e8-72ea5c002e55/node4,
> 8872a5e3-7bf6-4e3f-85d4-3c931034e58b/node5,
> 49ca091e-216f-4804-bce7-7eb61f4919f6/node2]
> Replicas: [State: CLOSED; ReplicaIndex: 0; Origin:
> 0c0ac8d1-f176-48c9-82e8-72ea5c002e55; Location:
> 0c0ac8d1-f176-48c9-82e8-72ea5c002e55/node4,
> State: CLOSED; ReplicaIndex: 0; Origin: 8872a5e3-7bf6-4e3f-85d4-3c931034e58b;
> Location: 8872a5e3-7bf6-4e3f-85d4-3c931034e58b/node5,
> State: UNHEALTHY; ReplicaIndex: 0; Origin:
> 49ca091e-216f-4804-bce7-7eb61f4919f6; Location:
> 49ca091e-216f-4804-bce7-7eb61f4919f6/node2] {code}
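
The check in step 3 can be automated rather than read by eye. The following is a minimal sketch, not part of the original report: it parses text shaped like the `ozone admin container info` output quoted above and lists the hosts that still hold an UNHEALTHY replica. The function names (`parse_replicas`, `unhealthy_hosts`) and the regular expression are hypothetical and assume the exact field layout shown in this report, which may differ between Ozone versions. The sample text is the output from the report with the e-mail quote markers removed.

```python
import re

# Assumed layout of one replica entry, taken from the output quoted in this
# report; other Ozone versions may print the fields differently.
REPLICA_RE = re.compile(
    r"State:\s*(?P<state>\w+);\s*ReplicaIndex:\s*(?P<index>\d+);\s*"
    r"Origin:\s*(?P<origin>[0-9a-f-]+);\s*"
    r"Location:\s*(?P<uuid>[0-9a-f-]+)/(?P<host>[^,\s\]]+)"
)


def parse_replicas(info_text):
    """Return one dict per replica entry found in the container info text."""
    # The pasted output wraps in the middle of entries, so collapse all
    # whitespace (including newlines) before matching.
    flat = " ".join(info_text.split())
    return [m.groupdict() for m in REPLICA_RE.finditer(flat)]


def unhealthy_hosts(info_text):
    """Return the host names whose replica is reported as UNHEALTHY."""
    return [r["host"] for r in parse_replicas(info_text)
            if r["state"] == "UNHEALTHY"]


# Output copied from the report, without the "> " quote prefixes.
SAMPLE = """\
Container id: 1
Pipeline id: e0b0c4b0-b73f-4e2a-8318-795d61367699
Container State: CLOSED
Datanodes: [0c0ac8d1-f176-48c9-82e8-72ea5c002e55/node4,
8872a5e3-7bf6-4e3f-85d4-3c931034e58b/node5,
49ca091e-216f-4804-bce7-7eb61f4919f6/node2]
Replicas: [State: CLOSED; ReplicaIndex: 0; Origin:
0c0ac8d1-f176-48c9-82e8-72ea5c002e55; Location:
0c0ac8d1-f176-48c9-82e8-72ea5c002e55/node4,
State: CLOSED; ReplicaIndex: 0; Origin: 8872a5e3-7bf6-4e3f-85d4-3c931034e58b;
Location: 8872a5e3-7bf6-4e3f-85d4-3c931034e58b/node5,
State: UNHEALTHY; ReplicaIndex: 0; Origin:
49ca091e-216f-4804-bce7-7eb61f4919f6; Location:
49ca091e-216f-4804-bce7-7eb61f4919f6/node2]
"""

if __name__ == "__main__":
    print(unhealthy_hosts(SAMPLE))
```

Polling this check until the list is empty would give a simple pass/fail signal for the expected behaviour. In the state captured above it still reports node2.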