sodonnel commented on a change in pull request #3147:
URL: https://github.com/apache/ozone/pull/3147#discussion_r818174137



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/AbstractContainerReportHandler.java
##########
@@ -121,30 +122,77 @@ private void updateContainerStats(final DatanodeDetails datanodeDetails,
         containerInfo.updateSequenceId(
             replicaProto.getBlockCommitSequenceId());
       }
+      if (containerInfo.getReplicationConfig().getReplicationType()
+          == HddsProtos.ReplicationType.EC) {
+        updateECContainerStats(containerInfo, replicaProto, datanodeDetails);
+      } else {
+        updateRatisContainerStats(containerInfo, replicaProto, datanodeDetails);
+      }
+    }
+  }
+
+  private void updateRatisContainerStats(ContainerInfo containerInfo,
+      ContainerReplicaProto newReplica, DatanodeDetails newSource)
+      throws ContainerNotFoundException {
+    List<ContainerReplica> otherReplicas =
+        getOtherReplicas(containerInfo.containerID(), newSource);
+    long usedBytes = newReplica.getUsed();
+    long keyCount = newReplica.getKeyCount();
+    for (ContainerReplica r : otherReplicas) {
+      usedBytes = calculateUsage(containerInfo, usedBytes, r.getBytesUsed());
+      keyCount = calculateUsage(containerInfo, keyCount, r.getKeyCount());
+    }
+    updateContainerUsedAndKeys(containerInfo, usedBytes, keyCount);
+  }
+
+  private void updateECContainerStats(ContainerInfo containerInfo,
+      ContainerReplicaProto newReplica, DatanodeDetails newSource)
+      throws ContainerNotFoundException {
+    int dataNum =
+        ((ECReplicationConfig)containerInfo.getReplicationConfig()).getData();
+    // The first EC index and the parity indexes must all be the same size
+    // while the other data indexes may be smaller due to partial stripes.
+    // When calculating the stats, we only use the first data and parity and
+    // ignore the others. We only need to run the check if we are processing

Review comment:
       Yes, as far as I can tell, the used bytes value is only consulted when 
you list a container to get information about it via an 
`ozone admin container info` command, and when we pick an existing container 
to be used for a new block. However, there could be future uses for the used 
bytes, e.g. merging containers, the balancer, etc.
   
   However, as I am typing this answer, I am not sure that what I have done 
here is correct. What should the size of an EC container be? Probably it should 
be the size of the data in the container group, and hence the size should 
really be the sum of the bytes used in each data container. The key count 
should be taken from the first data container or the parities.
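
To make the proposal concrete, here is a rough sketch of that aggregation. It is not code from this PR: `EcGroupStats` and both methods are hypothetical names, and it assumes one replica per EC index, with 1-based indexes where `1..dataNum` are data and anything above `dataNum` is parity.

```java
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Ozone. */
final class EcGroupStats {
  private EcGroupStats() { }

  /**
   * Sums the used bytes reported for the data indexes (1..dataNum).
   * Parity indexes are skipped so the result reflects data size only.
   */
  static long groupUsedBytes(Map<Integer, Long> usedBytesByIndex, int dataNum) {
    long total = 0;
    for (Map.Entry<Integer, Long> e : usedBytesByIndex.entrySet()) {
      int index = e.getKey();
      if (index >= 1 && index <= dataNum) {
        total += e.getValue();
      }
    }
    return total;
  }

  /**
   * Takes the key count from the first data index or the parity indexes,
   * ignoring the other data indexes. Returns the largest value seen.
   */
  static long groupKeyCount(Map<Integer, Long> keyCountByIndex, int dataNum) {
    long count = 0;
    for (Map.Entry<Integer, Long> e : keyCountByIndex.entrySet()) {
      int index = e.getKey();
      if (index == 1 || index > dataNum) {
        count = Math.max(count, e.getValue());
      }
    }
    return count;
  }
}
```

Taking the maximum across the first data index and the parities is just one possible choice for reconciling reports that arrive at different times.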
   
   Ratis container sizes do not include the replicated size, just the data 
size. They reach a limit of 5GB per container by default. I think each 
data-container in an EC container group should be able to reach 5GB, so the 
total container group will have approximately dataNum * 5GB as the limit.
   
   Then, when deciding whether a container is full, with Ratis we would use 
5GB, but with EC we should use 5GB * dataNum. With this, the first 
data-container could get larger than 5GB due to partial stripes, but the DN 
will trigger a close when any of the containers in the EC group reaches 5GB, so 
that will prevent it from overshooting the limit. As with Ratis, the DNs should 
ideally trigger the closure of the group, and not SCM.
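
As a sketch of the limit check described above (hypothetical names, not the existing SCM code): the per-container limit is passed in rather than read from configuration, and the 5GB default is treated here as 5 GiB purely for illustration.

```java
/** Hypothetical helper for illustration only; not part of Ozone. */
final class ContainerSizeLimit {
  /** Assumed stand-in for the default per-container limit (5 GiB). */
  static final long DEFAULT_LIMIT_BYTES = 5L * 1024 * 1024 * 1024;

  private ContainerSizeLimit() { }

  /**
   * Ratis: the limit is the per-container limit.
   * EC: the limit covers the whole group, i.e. dataNum * per-container limit.
   */
  static long sizeLimitBytes(boolean isEc, int dataNum, long perContainerLimit) {
    if (!isEc) {
      return perContainerLimit;
    }
    // multiplyExact throws ArithmeticException instead of silently overflowing.
    return Math.multiplyExact(perContainerLimit, (long) dataNum);
  }

  /** A container (or EC group) counts as full once it reaches its limit. */
  static boolean isFull(long usedBytes, long limitBytes) {
    return usedBytes >= limitBytes;
  }
}
```

For an EC config with three data blocks, `sizeLimitBytes(true, 3, DEFAULT_LIMIT_BYTES)` gives three times the per-container limit, while the Ratis case returns the per-container limit unchanged.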




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


