[ 
https://issues.apache.org/jira/browse/HDDS-7915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddhant Sangwan updated HDDS-7915:
-----------------------------------
    Description: 
LegacyReplicationManager#closeReplicasIfPossible tries to close replicas. 

{code}
  private int closeReplicasIfPossible(ContainerInfo container,
                                      List<ContainerReplica> replicas) {
    // This method should not be used on open containers.
    if (container.getState() == LifeCycleState.OPEN) {
      return 0;
    }

    int numCloseCmdsSent = 0;
    Iterator<ContainerReplica> iterator = replicas.iterator();
    while (iterator.hasNext()) {
      final ContainerReplica replica = iterator.next();
      final State state = replica.getState();
      if (state == State.OPEN || state == State.CLOSING) {
        sendCloseCommand(container, replica.getDatanodeDetails(), false);
        numCloseCmdsSent++;
        iterator.remove();
      } else if (state == State.QUASI_CLOSED) {
        // Send force close command if the BCSID matches
        if (container.getSequenceId() == replica.getSequenceId()) {
          sendCloseCommand(container, replica.getDatanodeDetails(), true);
          numCloseCmdsSent++;
          iterator.remove();
        }
      }
    }

    return numCloseCmdsSent;
  }
{code}

In one case, this method is called from 
LegacyReplicationManager#handleUnderReplicatedUnhealthy, where the replica's 
state matches the container state. If that state is QUASI_CLOSED and the 
sequence IDs match, a force close command is sent, so we end up closing the 
replica and eventually the container, even though the replica doesn't have 
all the data. This should not happen, and this Jira is meant to fix it. Force 
closing a QUASI_CLOSED replica of a QUASI_CLOSED container is handled 
separately, earlier in the processContainer method.
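
The intended rule from the issue title can be sketched in isolation. The following is only a minimal illustration, not the actual patch: the class CloseReplicasSketch, its enums, and the decide method are hypothetical stand-ins for the real Ozone types, and the sketch assumes that the only change needed is an extra check that the container itself is CLOSED before a force close is sent.

```java
// Hypothetical, self-contained sketch. None of these types are the real
// Ozone classes; they only mirror the states named in the description.
final class CloseReplicasSketch {

  enum ContainerState { OPEN, CLOSING, QUASI_CLOSED, CLOSED }

  enum ReplicaState { OPEN, CLOSING, QUASI_CLOSED, CLOSED, UNHEALTHY }

  enum CloseAction { NONE, CLOSE, FORCE_CLOSE }

  private CloseReplicasSketch() {
  }

  // Decides which close command, if any, would be sent for one replica.
  static CloseAction decide(ContainerState containerState,
                            long containerSequenceId,
                            ReplicaState replicaState,
                            long replicaSequenceId) {
    // As in the quoted method, open containers are never handled here.
    if (containerState == ContainerState.OPEN) {
      return CloseAction.NONE;
    }
    if (replicaState == ReplicaState.OPEN
        || replicaState == ReplicaState.CLOSING) {
      return CloseAction.CLOSE;
    }
    // Assumed fix: force close a QUASI_CLOSED replica only when the
    // container is CLOSED and the sequence IDs (BCSID) match.
    if (replicaState == ReplicaState.QUASI_CLOSED
        && containerState == ContainerState.CLOSED
        && containerSequenceId == replicaSequenceId) {
      return CloseAction.FORCE_CLOSE;
    }
    return CloseAction.NONE;
  }

  public static void main(String[] args) {
    // The problematic case from the description: container and replica
    // are both QUASI_CLOSED with matching sequence IDs.
    System.out.println(decide(ContainerState.QUASI_CLOSED, 10L,
        ReplicaState.QUASI_CLOSED, 10L));
    // A CLOSED container with a matching QUASI_CLOSED replica.
    System.out.println(decide(ContainerState.CLOSED, 10L,
        ReplicaState.QUASI_CLOSED, 10L));
  }
}
```

Under this rule, a QUASI_CLOSED replica of a QUASI_CLOSED container is left alone by this method, which leaves that case to the separate handling in processContainer.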

  was:
LegacyReplicationManager#closeReplicasIfPossible tries to close replicas. 

{code}
  private int closeReplicasIfPossible(ContainerInfo container,
                                      List<ContainerReplica> replicas) {
    // This method should not be used on open containers.
    if (container.getState() == LifeCycleState.OPEN) {
      return 0;
    }

    int numCloseCmdsSent = 0;
    Iterator<ContainerReplica> iterator = replicas.iterator();
    while (iterator.hasNext()) {
      final ContainerReplica replica = iterator.next();
      final State state = replica.getState();
      if (state == State.OPEN || state == State.CLOSING) {
        sendCloseCommand(container, replica.getDatanodeDetails(), false);
        numCloseCmdsSent++;
        iterator.remove();
      } else if (state == State.QUASI_CLOSED) {
        // Send force close command if the BCSID matches
        if (container.getSequenceId() == replica.getSequenceId()) {
          sendCloseCommand(container, replica.getDatanodeDetails(), true);
          numCloseCmdsSent++;
          iterator.remove();
        }
      }
    }

    return numCloseCmdsSent;
  }
{code}

In one case, this method is called from 
LegacyReplicationManager#handleUnderReplicatedUnhealthy, where the replica's 
state matches the container state. If the state is QUASI_CLOSED, we will end up 
closing the replica and eventually the container, even though the replica 
doesn't have all the data. This should not happen and is what this jira is 
meant to fix. Force closing is handled separately earlier in the 
processContainer method.


> Force close QUASI_CLOSED replicas only when the container is CLOSED in Legacy 
> RM
> --------------------------------------------------------------------------------
>
>                 Key: HDDS-7915
>                 URL: https://issues.apache.org/jira/browse/HDDS-7915
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: SCM
>            Reporter: Siddhant Sangwan
>            Assignee: Siddhant Sangwan
>            Priority: Major
>             Fix For: 1.4.0
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
