[
https://issues.apache.org/jira/browse/HDDS-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Siddhant Sangwan updated HDDS-9595:
-----------------------------------
Description:
We've seen some QUASI_CLOSED containers that have only one replica, and that
replica is UNHEALTHY and also empty. We need to investigate how the system ends
up with such containers and what to do about them. The solution will likely
depend on whether SCM knows the container to have zero keys. More generally,
this ties into the larger problem that SCM cannot tell which containers that
appear empty are actually missing data, because it does not know whether any
keys are mapped to a given container.
Currently, this is our logic for calling a container empty:
{code}
private boolean isContainerEmpty(final ContainerInfo container,
    final Set<ContainerReplica> replicas) {
  return container.getState() == LifeCycleState.CLOSED &&
      container.getNumberOfKeys() == 0 && replicas.stream().allMatch(
          r -> r.getState() == State.CLOSED && r.getKeyCount() == 0);
}
{code}
was:
We've seen some QUASI_CLOSED containers that have only 1 replica, and that
UNHEALTHY replica is also empty. We need to investigate how the system ends up
having such containers, and what to do about them. The solution will likely
depend on whether the container is known by SCM to have zero keys. In general,
this ties up into the larger problem of the SCM not knowing which containers
that appear empty are actually not empty but have missing data, because it
doesn't know if there are keys mapped to this container.
Currently, this is our logic for calling a container empty:
{code}
private boolean isContainerEmpty(final ContainerInfo container,
    final Set<ContainerReplica> replicas) {
  return container.getState() == LifeCycleState.CLOSED &&
      container.getNumberOfKeys() == 0 && replicas.stream().allMatch(
          r -> r.getState() == State.CLOSED && r.getKeyCount() == 0);
}
{code}
> Investigate QUASI_CLOSED containers with only one UNHEALTHY and empty replica
> -----------------------------------------------------------------------------
>
> Key: HDDS-9595
> URL: https://issues.apache.org/jira/browse/HDDS-9595
> Project: Apache Ozone
> Issue Type: Sub-task
> Components: SCM
> Reporter: Siddhant Sangwan
> Priority: Major
>
> We've seen some QUASI_CLOSED containers that have only one replica, and that
> replica is UNHEALTHY and also empty. We need to investigate how the system
> ends up with such containers and what to do about them. The solution will
> likely depend on whether SCM knows the container to have zero keys. More
> generally, this ties into the larger problem that SCM cannot tell which
> containers that appear empty are actually missing data, because it does not
> know whether any keys are mapped to a given container.
> Currently, this is our logic for calling a container empty:
> {code}
> private boolean isContainerEmpty(final ContainerInfo container,
>     final Set<ContainerReplica> replicas) {
>   return container.getState() == LifeCycleState.CLOSED &&
>       container.getNumberOfKeys() == 0 && replicas.stream().allMatch(
>           r -> r.getState() == State.CLOSED && r.getKeyCount() == 0);
> }
> {code}
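
To illustrate why the containers described above fall outside the current emptiness check, here is a minimal, self-contained Java sketch. The enums and the two small classes are simplified stand-ins written for this example (assumptions, not the real Ozone types); only the body of isContainerEmpty mirrors the logic quoted above.

```java
import java.util.Set;

public class EmptyContainerCheck {
    // Hypothetical stand-ins for the Ozone types; only the fields the
    // check reads are modelled here.
    enum LifeCycleState { OPEN, CLOSING, QUASI_CLOSED, CLOSED }
    enum State { OPEN, CLOSING, QUASI_CLOSED, CLOSED, UNHEALTHY }

    static final class ContainerInfo {
        private final LifeCycleState state;
        private final long numberOfKeys;

        ContainerInfo(LifeCycleState state, long numberOfKeys) {
            this.state = state;
            this.numberOfKeys = numberOfKeys;
        }

        LifeCycleState getState() { return state; }
        long getNumberOfKeys() { return numberOfKeys; }
    }

    static final class ContainerReplica {
        private final State state;
        private final long keyCount;

        ContainerReplica(State state, long keyCount) {
            this.state = state;
            this.keyCount = keyCount;
        }

        State getState() { return state; }
        long getKeyCount() { return keyCount; }
    }

    // Same condition as the snippet quoted in the issue description.
    static boolean isContainerEmpty(ContainerInfo container,
                                    Set<ContainerReplica> replicas) {
        return container.getState() == LifeCycleState.CLOSED
            && container.getNumberOfKeys() == 0
            && replicas.stream().allMatch(
                r -> r.getState() == State.CLOSED && r.getKeyCount() == 0);
    }

    public static void main(String[] args) {
        // The case from this issue: QUASI_CLOSED container whose only
        // replica is UNHEALTHY and reports zero keys.
        ContainerInfo quasiClosed =
            new ContainerInfo(LifeCycleState.QUASI_CLOSED, 0);
        Set<ContainerReplica> loneUnhealthy =
            Set.of(new ContainerReplica(State.UNHEALTHY, 0));
        System.out.println(isContainerEmpty(quasiClosed, loneUnhealthy));

        // A CLOSED container with a CLOSED, zero-key replica, for contrast.
        ContainerInfo closed = new ContainerInfo(LifeCycleState.CLOSED, 0);
        Set<ContainerReplica> closedEmpty =
            Set.of(new ContainerReplica(State.CLOSED, 0));
        System.out.println(isContainerEmpty(closed, closedEmpty));
    }
}
```

Under these assumptions the first call returns false and the second returns true: the check rejects the container on its QUASI_CLOSED state alone, and again on the UNHEALTHY replica state, so such containers are never classified as empty regardless of their key counts.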