fapifta commented on pull request #881:
URL: https://github.com/apache/hadoop-ozone/pull/881#issuecomment-622181460


   Hi @sodonnel, thank you for this work. As far as I can tell from reading
through the changes, the code itself seems to be correct; I haven't seen a
logical issue so far.
   
   After the first overview, I went through some of the related code, and the 
current implementation, and then the changes again, and the following evolved 
in my mind during the process:
   At first I really liked the idea of having a default implementation for
container placement status in an immutable class, but thinking it through, I
would drop the whole ContainerPlacementStatus interface and the related code.
   My reasoning is a bit long and affects large portions of the solution.
   As I understand the intent, rack awareness, when configured, needs to be
satisfied; but in the code as it stands, ReplicationManager goes far beyond
its responsibilities and contains concrete rack-awareness logic. I think the
placement policy itself should encapsulate the fact that there are racks,
because if it does not, a node-group-based policy will pollute
ReplicationManager with node-group code, and so on for any other specific
implementation.
   
   As I look at the changeset from this angle, the changes in
ReplicationManager that mention racks should live inside the placement policy
implementation.
   What I would suggest is to change the PlacementPolicy interface and add the
following methods (sketched just below):
   boolean isValidPlacement(Set<ContainerReplica> replicas, int
replicationFactor)
   void removeExcessReplicas(Set<ContainerReplica> currentReplicas, int
replicationFactor, BiConsumer<ContainerInfo, DatanodeDetails> purger)
   void addMissingReplicas(Set<ContainerReplica> currentReplicas, int
replicationFactor, BiConsumer<ContainerInfo, DatanodeDetails> replicator)
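
   Roughly, the interface could look like this (only a sketch of the shape I
have in mind; the Javadoc wording is mine, not from the PR):

      import java.util.Set;
      import java.util.function.BiConsumer;

      import org.apache.hadoop.hdds.protocol.DatanodeDetails;
      import org.apache.hadoop.hdds.scm.container.ContainerInfo;
      import org.apache.hadoop.hdds.scm.container.ContainerReplica;

      public interface PlacementPolicy {
        // ... the existing method(s) for choosing datanodes stay as-is

        /** True if the given replicas satisfy this policy
            (rack spread, node-group spread, etc.). */
        boolean isValidPlacement(Set<ContainerReplica> replicas,
            int replicationFactor);

        /** The policy picks the excess replicas to drop and hands each
            one to the purger as (container, datanode). */
        void removeExcessReplicas(Set<ContainerReplica> currentReplicas,
            int replicationFactor,
            BiConsumer<ContainerInfo, DatanodeDetails> purger);

        /** The policy picks target nodes for the missing replicas and
            hands each one to the replicator as (container, datanode). */
        void addMissingReplicas(Set<ContainerReplica> currentReplicas,
            int replicationFactor,
            BiConsumer<ContainerInfo, DatanodeDetails> replicator);
      }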
   
   With this, the placement validity can be checked by the isValidPlacement
method where needed, and then the other two methods can be called if the
placement is invalid.
   The BiConsumers accept the ContainerInfo and the DatanodeDetails;
removeExcessReplicas can take a BiConsumer like (container, datanode) ->
sendDeleteCommand(container, datanode, true), while addMissingReplicas can
take one like (container, datanode) -> sendReplicateCommand(container,
datanode, sources).
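
   Put together, the call site in ReplicationManager could be as simple as
this (a sketch only; processContainer and the surrounding plumbing are
illustrative, only the lambdas come from above):

      // Sketch: all rack/node-group knowledge stays behind the policy;
      // the method name and parameters here are illustrative.
      private void processContainer(ContainerInfo container,
          Set<ContainerReplica> replicas, List<DatanodeDetails> sources) {
        int replicationFactor =
            container.getReplicationFactor().getNumber();
        if (placementPolicy.isValidPlacement(replicas, replicationFactor)) {
          return;  // placement already satisfies the policy
        }
        // Let the policy decide what to purge and what to replicate;
        // either call may be a no-op for the given state.
        placementPolicy.removeExcessReplicas(replicas, replicationFactor,
            (c, datanode) -> sendDeleteCommand(c, datanode, true));
        placementPolicy.addMissingReplicas(replicas, replicationFactor,
            (c, datanode) -> sendReplicateCommand(c, datanode, sources));
      }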
   
   With this approach I believe all the rack-related code can be part of the
ContainerPlacementRackAware and PipelinePlacementPolicy classes. I also
suggest using Set<ContainerReplica> as the parameter type everywhere in
PlacementPolicy, as that is what comes in; DatanodeDetails and other things
can be calculated from it only when they are needed, for example:
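
   (Assuming ContainerReplica exposes its datanode, as it does today via
getDatanodeDetails():)

      // Derive the datanodes from the replicas only where needed, so
      // the policy's public surface deals only in ContainerReplica.
      List<DatanodeDetails> usedNodes = replicas.stream()
          .map(ContainerReplica::getDatanodeDetails)
          .collect(Collectors.toList());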
   
   After this change in approach, the policies themselves have to handle over-
and under-replication via these methods, and those methods can be tested much
as the new TestContainerPlacementRackAware class does now. I would also
suggest changing how we test ReplicationManager, to make that part simpler as
well: provide a mock ContainerManager that returns whatever container IDs and
replica information you wish, run the manager, and check that the proper
events arrived on the queue, ideally by spying on the EventQueue and verifying
the events posted. I am suggesting this because we purely want to check
whether the events are posted to the queue; we do not want to validate the
queue or anything else. So we provide containers to the ReplicationManager and
then check whether it posts the expected commands given the policy and the
container replicas we provided.
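
   Something along these lines (a Mockito sketch only; getContainers(),
getContainerReplicas(), processContainersNow() and SCMEvents.DATANODE_COMMAND
are taken from the current code as I remember it, so please double-check the
exact signatures):

      // Mock the ContainerManager to serve whatever containers and
      // replicas the test needs, and spy on the EventQueue.
      ContainerManager containerManager =
          Mockito.mock(ContainerManager.class);
      EventQueue eventQueue = Mockito.spy(new EventQueue());

      // One container reported with too few replicas; 'container' and
      // 'underReplicatedReplicas' are set up earlier by the test.
      Mockito.when(containerManager.getContainers())
          .thenReturn(Collections.singletonList(container));
      Mockito.when(containerManager.getContainerReplicas(
              container.containerID()))
          .thenReturn(underReplicatedReplicas);

      // Build the ReplicationManager with these mocks (constructor
      // arguments elided here) and trigger a processing run.
      replicationManager.processContainersNow();

      // Only verify that the expected command event hit the queue.
      Mockito.verify(eventQueue, Mockito.atLeastOnce())
          .fireEvent(Mockito.eq(SCMEvents.DATANODE_COMMAND), Mockito.any());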
   
   I know that this is a fundamental change, but at the moment I am -1 on the
implementation, because in its current state it would make the
ReplicationManager rack-aware as well, which I think we would like to avoid.

