sodonnel commented on code in PR #7402:
URL: https://github.com/apache/ozone/pull/7402#discussion_r1844138325


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java:
##########
@@ -85,22 +98,41 @@ public void setRecoveringTimeout(long recoveringTimeout) {
     this.recoveringTimeout = recoveringTimeout;
   }
 
+  public boolean addContainer(Container<?> container) throws 
StorageContainerException {
+    return addContainer(container, false);
+  }
+
   /**
    * Add Container to container map.
    * @param container container to be added
    * @return If container is added to containerMap returns true, otherwise
    * false
    */
-  public boolean addContainer(Container<?> container) throws
+  public boolean addContainer(Container<?> container, boolean 
overwriteMissingContainers) throws
       StorageContainerException {
     Preconditions.checkNotNull(container, "container cannot be null");
 
     long containerId = container.getContainerData().getContainerID();
+    State containerState = container.getContainerData().getState();
+    if (!overwriteMissingContainers && 
missingContainerSet.contains(containerId)) {

Review Comment:
   Prior to this change, the missingContainerSet variable in the ContainerSet 
was not used for anything within that class. Now, we are potentially adding to 
it from the containerList - what impact does that have more widely? What was 
the missingContainerSet used for before?
   
   We also have to open the new containerList table - it would be more clear if 
we just loaded that into memory and use the new "seenContainerList" for 
previously existing checks.
   
   However backing up a bit - how can containers get created?
   
   One way, is via the write path - that is the problem we are trying to solve.
   
   Another is via EC Reconstruction - there is already logic there to deal with 
a duplicate container I think, however it may not cover volume failure. However 
EC reconstruction effectively uses the same write path as normal writes, so it 
its basically the same as the write path.
   
   Then there is replication, which is a very different path into the system.
   
   Maybe this problem is better not solved in ContainerSet, but at the higher 
level in the dispatcher where the writes are being handled?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to