swamirishi commented on PR #7402:
URL: https://github.com/apache/ozone/pull/7402#issuecomment-2487026394

   
[MissingContainerSet](https://github.com/apache/ozone/blob/4e603aa93c8479349776c1597ef54504453cb512/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java#L62)
 in the containerSet was added as part of solving 
[HDDS-935](https://issues.apache.org/jira/browse/HDDS-935), which is trying to 
solve the same problem (**Avoid creating an already created container on a 
datanode in case of disk removal followed by datanode restart**) we are trying 
to tackle here. But unfortunately the solution only accounted for ratis 
container which was because it was only container type back then. The solution 
looks into adding all the containers that are present in the [ratis snapshot's 
containerBCSIdMap](https://github.com/apache/ozone/blob/2139367b52d68773f25d79abce0439323dde9671/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L313-L323)
 but were not found on disk 
 after scanning through all the data volumes in the node. 
   [HDDS-8448](https://issues.apache.org/jira/browse/HDDS-8448) caused a 
regression resulting removing the containers on a failed volume instead of 
marking it as missing.
   This patch is trying to do something along the same lines by extending the 
working for EC containers and is trying to fix the issues in the implementation 
for handling volume failures in the system. I believe this particular 
implementation was working fine for ratis container before HDDS-8448 caused a 
regression. Creating a duplicate data structure performing the same 
functionalities is just redundant and could lead to future bugs because we 
would have to update 2 structures in place of two. I also don't see any value 
in implementing something all over again which would also involve thorough 
testing of that flow. IMO we should be reusing the existing frameworks in the 
system to achieve something rather than add more flows complicating the already 
existing system further. 
   If you still strongly feel we should move this logic out of the ContainerSet 
class then my opinion would be to get rid of the missingContainerSet entirely 
from all of our code flows and have only one of the 2 but not both. @sodonnel 
@kerneltime @adoroszlai what do you think we should do here?
   I have addressed the other review comments on the patch (Getting rid of the 
Proto2EnumCodec, Removing ReferenceCountedDB for the 
WitnessedContainerMetadataStore)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to