sodonnel commented on pull request #881:
URL: https://github.com/apache/hadoop-ozone/pull/881#issuecomment-623578614


   I am still not convinced putting the actual remove or add logic into the 
placement policies is the correct way to go here. That would require fairly 
fundamental changes to the existing replication manager, which I would prefer 
to leave alone as much as possible. It would also change the intent of the 
policies, which are read only "selectors" right now.
   
   > However I still think this whole logic of 
handle(Over|Under)ReplicatedContainers has to be part of the placement policy, 
as that should be the only point that handles the placement of containers, and 
if we leave this logic in ReplicationManager, then it will explode further and 
will be way harder to maintain as soon as a new placement policy is added, 
moreover it makes it impossible to really plug in a new placement policy by 
just a config change, as it is planned now.
   
   I have removed any reference to rack from replication manager (there were 
only variable names and not concrete logic) - now it deals with under over 
replicated and mis-replication. I believe we could plug a new policy in that 
handles 3 racks, or node groups with zero changes to replication manager, and 
just by extending the relevant placement policy classes. Considering what we 
have in HDFS, its hard to imagine a placement policy that does not fit into 
this model.
   
   We already have an API into the placement policies to pick new nodes to add 
(chooseDataNodes(...)) and this is used by the replication manager and pipeline 
manager, and perhaps elsewhere. Replication manager uses this to handle the 
under and mis-replication.
   
   What is missing is, perhaps, is an API called `chooseNodesToRemove()` in the 
placement policies. That would allow us to decide to do things, such as remove 
a replica from a more used node in some policies, but for now we have just 
"remove any healthy node that does not violate the policy". With only that 
current requirement, the logic added in this PR, which simply tries removing 
each replica in turn is probably OK.
   
   Remember that RM also has to keep track of the inflight replications, and 
consider them when running the calculations on the containers.
   
   I also feel that some of your suggestions are wider refactors, and ideally 
we should tackle them later rather than combine them with this change:
   
    * Change how we test replication manager
    * Use Set<ContainerReplica> everywhere in the placement policies


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to