sodonnel commented on PR #7542:
URL: https://github.com/apache/ozone/pull/7542#issuecomment-2578079971
Have all the unhealthy replicas used up all available DNs on the cluster, so
that there are no other free hosts to create a healthy copy?
What error did you see when the reconstruction failed? Several bugs around
reconstruction have been fixed over time, any of which could have prevented
a container from being recovered.
If the container is not under replicated (i.e. all the replicas are there and
healthy), then the unhealthy ones should be cleaned up by the
ClosedWithUnhealthyReplicasHandler, which should mark the container over
replicated so that the unhealthy replicas get deleted. From the javadoc:
```
* Handles a closed EC container with unhealthy replicas. Note that if we
* reach here, there is no over or under replication. This handler
* will just send commands to delete the unhealthy replicas.
*
* <p>
* Consider the following set of replicas for a closed EC 3-2 container:
* Replica Index 1: Closed
* Replica Index 2: Closed
* Replica Index 3: Closed replica, Unhealthy replica (2 replicas)
* Replica Index 4: Closed
* Replica Index 5: Closed
*
* In this case, the unhealthy replica of index 3 should be deleted. The
* container will be marked over replicated as the unhealthy replicas need
* to be removed.
* </p>
```
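To make the rule in that javadoc concrete, here is a rough sketch of the decision it describes. This is not the actual ClosedWithUnhealthyReplicasHandler code: the class, enum, and method names below are hypothetical, and it assumes the only check needed is that every replica index already has a closed copy.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch only; none of these names come from the Ozone code base.
final class UnhealthyReplicaSketch {

  enum State { CLOSED, UNHEALTHY }

  static final class Replica {
    final int index;        // EC replica index, 1-based
    final String datanode;  // host holding this replica
    final State state;

    Replica(int index, String datanode, State state) {
      this.index = index;
      this.datanode = datanode;
      this.state = state;
    }
  }

  /**
   * Returns the unhealthy replicas that could be deleted, or an empty list
   * if any index in 1..(data + parity) has no closed replica. In that case
   * the container is under replicated and needs reconstruction first.
   */
  static List<Replica> unhealthyReplicasToDelete(
      List<Replica> replicas, int data, int parity) {
    // Collect every index that has at least one closed (healthy) copy.
    Set<Integer> closedIndexes = new HashSet<>();
    for (Replica r : replicas) {
      if (r.state == State.CLOSED) {
        closedIndexes.add(r.index);
      }
    }
    // Assumption: deletion is only safe when all indexes are covered.
    for (int i = 1; i <= data + parity; i++) {
      if (!closedIndexes.contains(i)) {
        return new ArrayList<>();
      }
    }
    // Every index is covered, so all unhealthy copies are surplus.
    List<Replica> toDelete = new ArrayList<>();
    for (Replica r : replicas) {
      if (r.state == State.UNHEALTHY) {
        toDelete.add(r);
      }
    }
    return toDelete;
  }
}
```

For the EC 3-2 example in the javadoc (indexes 1 to 5 closed, plus an extra unhealthy copy of index 3), this sketch would return only the unhealthy index 3 replica. If the closed copy of index 3 were missing, it would return nothing, which corresponds to the case where reconstruction has to succeed first.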
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]