lokeshj1703 commented on a change in pull request #2349:
URL: https://github.com/apache/ozone/pull/2349#discussion_r663665876



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
##########
@@ -471,6 +549,127 @@ private void updateInflightAction(final ContainerInfo container,
     }
   }
 
+  /**
+   * add a move action for a given container.
+   *
+   * @param cid Container to move
+   * @param srcDn datanode to move from
+   * @param targetDn datanode to move to
+   */
+  public Optional<CompletableFuture<MoveResult>> move(ContainerID cid,
+      DatanodeDetails srcDn, DatanodeDetails targetDn)
+      throws ContainerNotFoundException, NodeNotFoundException {
+    LOG.info("received a move request for container {}, from {} to {}",
+        cid, srcDn.getUuid(), targetDn.getUuid());
+    Optional<CompletableFuture<MoveResult>> ret = Optional.empty();
+    if (!isRunning()) {
+      LOG.info("Replication Manager is not running. Please start it first");
+      return ret;
+    }
+
+    /*
+     * make sure the following conditions are met:
+     *  1 the given two datanodes are in healthy state
+     *  2 the given container exists on the given source datanode
+     *  3 the given container does not exist on the given target datanode
+     *  4 the given container is in closed state
+     *  5 the given container is not taking any inflight action
+     *  6 the given two datanodes are in IN_SERVICE state
+     *
+     * move is a combination of two steps: replication and deletion.
+     * if the conditions above are all met, then we take a conservative
+     * strategy here: replication can always be executed, but the execution
+     * of deletion always depends on the placement policy
+     */
+
+    NodeStatus currentNodeStat = nodeManager.getNodeStatus(srcDn);
+    NodeState healthStat = currentNodeStat.getHealth();
+    NodeOperationalState operationalState =
+        currentNodeStat.getOperationalState();
+    if (healthStat != NodeState.HEALTHY) {
+      LOG.info("given source datanode is in {} state, " +
+          "not in HEALTHY state", healthStat);
+      return ret;
+    }
+    if (operationalState != NodeOperationalState.IN_SERVICE) {
+      LOG.info("given source datanode is in {} state, " +
+          "not in IN_SERVICE state", operationalState);
+      return ret;
+    }
+
+    currentNodeStat = nodeManager.getNodeStatus(targetDn);
+    healthStat = currentNodeStat.getHealth();
+    operationalState = currentNodeStat.getOperationalState();
+    if (healthStat != NodeState.HEALTHY) {
+      LOG.info("given target datanode is in {} state, " +
+          "not in HEALTHY state", healthStat);
+      return ret;
+    }
+    if (operationalState != NodeOperationalState.IN_SERVICE) {
+      LOG.info("given target datanode is in {} state, " +
+          "not in IN_SERVICE state", operationalState);
+      return ret;
+    }
+
+    // we need to synchronize on ContainerInfo, since it is
+    // shared by ICR/FCR handler and this.processContainer
+    // TODO: use a Read lock after introducing a RW lock into ContainerInfo
+    ContainerInfo cif = containerManager.getContainer(cid);
+    synchronized (cif) {
+      final Set<DatanodeDetails> replicas = containerManager
+            .getContainerReplicas(cid).stream()
+            .map(ContainerReplica::getDatanodeDetails)
+            .collect(Collectors.toSet());
+      if (replicas.contains(targetDn)) {
+        LOG.info("given container exists in the target Datanode");
+        return ret;
+      }
+      if (!replicas.contains(srcDn)) {
+        LOG.info("given container does not exist in the source Datanode");
+        return ret;
+      }
+
+      /*
+      * the given container must not be taking any inflight action:
+      * if it is being replicated or deleted, the number of its replicas
+      * is not deterministic, so a move operation issued by the balancer
+      * may produce a nondeterministic result. in that case we drop
+      * the move request for this time.
+      */
+
+      if (inflightReplication.containsKey(cid)) {
+        LOG.info("given container is in inflight replication");
+        return ret;
+      }
+      if (inflightDeletion.containsKey(cid)) {
+        LOG.info("given container is in inflight deletion");
+        return ret;
+      }
+
+      /*
+      * here, there is no need to check whether cid is in inflightMove,
+      * because these three maps are all synchronized on ContainerInfo.
+      * if cid is in inflightMove, it must currently be being replicated
+      * or deleted, so it must be in inflightReplication or in
+      * inflightDeletion. thus, if we cannot find cid in either of them,
+      * this cid cannot be in inflightMove.
+      */
+

Review comment:
       I see. But consider a case where d1, d2 and d3 hold the replicas 
in racks R1, R2 and R3 respectively. If we want to replace d1 with d4, we 
must make sure that d4 is also in R1. If d4 is in R2, for example, then after 
replication either d2 or d4 gets deleted, which does not really help the 
move.
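
To make the rack scenario above concrete, here is a minimal, self-contained sketch of the kind of check I have in mind. It is only a toy model: the class `MoveRackCheck`, its methods, and the plain-string datanode and rack names are all hypothetical and are not part of ReplicationManager or of any placement policy in Ozone. It only compares the number of distinct racks before and after swapping the source for the target.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model, not Ozone code: datanodes and racks are plain strings.
public class MoveRackCheck {

  /** Counts the distinct racks covered by the given replica datanodes. */
  public static int rackCount(Set<String> replicas,
      Map<String, String> rackOf) {
    Set<String> racks = new HashSet<>();
    for (String dn : replicas) {
      racks.add(rackOf.get(dn));
    }
    return racks.size();
  }

  /**
   * Returns true if replacing src with target leaves the replicas spread
   * over at least as many racks as before the move.
   */
  public static boolean moveKeepsRackSpread(Set<String> replicas,
      String src, String target, Map<String, String> rackOf) {
    Set<String> after = new HashSet<>(replicas);
    after.remove(src);
    after.add(target);
    return rackCount(after, rackOf) >= rackCount(replicas, rackOf);
  }

  /** Builds the example topology; only the rack of d4 varies. */
  public static Map<String, String> topology(String d4Rack) {
    Map<String, String> rackOf = new HashMap<>();
    rackOf.put("d1", "R1");
    rackOf.put("d2", "R2");
    rackOf.put("d3", "R3");
    rackOf.put("d4", d4Rack);
    return rackOf;
  }

  public static void main(String[] args) {
    Set<String> replicas = new HashSet<>();
    replicas.add("d1");
    replicas.add("d2");
    replicas.add("d3");

    // d4 in R1: racks stay {R1, R2, R3}, so the move is useful.
    System.out.println(
        moveKeepsRackSpread(replicas, "d1", "d4", topology("R1"))); // true
    // d4 in R2: racks shrink to {R2, R3}, so d2 or d4 would be deleted.
    System.out.println(
        moveKeepsRackSpread(replicas, "d1", "d4", topology("R2"))); // false
  }
}
```

A check along these lines before accepting the move would reject a target like the second one, instead of replicating first and then leaving the deletion decision to the placement policy.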




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


