sodonnel opened a new pull request, #4274:
URL: https://github.com/apache/ozone/pull/4274

   ## What changes were proposed in this pull request?
   
   To allow the balancer to be split from the Legacy ReplicationManager, this 
PR adds a MoveManager class. The Balancer will contain a reference to an 
instance of this MoveManager and use it to schedule its moves rather than 
calling the replication manager directly.
   
   As things stand, this MoveManager is not yet used. This PR simply adds it 
along with some tests; a future PR is planned to make the balancer start 
using it.
   
   The MoveManager will keep track of the scheduled moves, and when the "add" 
part of a move completes, it will be notified by the ContainerReplicaPendingOps 
service, allowing it to schedule the delete part of the move if still 
appropriate.
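
   For reviewers, the tracking flow described above can be sketched roughly 
as follows. This is an illustration only, not the code in this PR: the class, 
method and field names are hypothetical, commands are modelled as plain 
strings, and the "still appropriate" check is reduced to a boolean supplied 
by the caller.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; all names here are hypothetical. */
final class MoveTrackingSketch {

  /** One scheduled move of a container from a source to a target node. */
  static final class PendingMove {
    final String source;
    final String target;

    PendingMove(String source, String target) {
      this.source = source;
      this.target = target;
    }
  }

  // Moves whose "add" part has been sent but not yet confirmed.
  private final Map<Long, PendingMove> pendingMoves = new HashMap<>();
  // Stand-in for commands sent to datanodes, recorded as strings.
  private final List<String> sentCommands = new ArrayList<>();

  /** Starts a move by sending the "add" part; rejects a duplicate move. */
  boolean scheduleMove(long containerId, String source, String target) {
    if (pendingMoves.containsKey(containerId)) {
      return false;
    }
    pendingMoves.put(containerId, new PendingMove(source, target));
    sentCommands.add("REPLICATE " + containerId + " " + source + "->" + target);
    return true;
  }

  /**
   * Callback for a completed "add". Schedules the delete part only if the
   * add belongs to a tracked move and the delete is still appropriate.
   */
  void onAddCompleted(long containerId, String target, boolean deleteStillOk) {
    PendingMove move = pendingMoves.get(containerId);
    if (move == null || !move.target.equals(target)) {
      return; // Not an add this sketch is tracking.
    }
    pendingMoves.remove(containerId);
    if (deleteStillOk) {
      sentCommands.add("DELETE " + containerId + " on " + move.source);
    }
  }

  List<String> sentCommands() {
    return sentCommands;
  }

  int pendingCount() {
    return pendingMoves.size();
  }

  public static void main(String[] args) {
    MoveTrackingSketch manager = new MoveTrackingSketch();
    manager.scheduleMove(1L, "dn1", "dn2");
    manager.onAddCompleted(1L, "dn2", true);
    System.out.println(manager.sentCommands());
  }
}
```

   The point of the sketch is that the delete is only ever issued from the 
completion callback, so a move whose add never completes leaves the source 
replica untouched.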
   
   In order to play nicely with the command deadlines introduced into the new 
RM, and also the load limiting which is yet to be added to the new RM, the 
MoveManager schedules commands with "low" priority on the datanodes and also 
with a longer deadline. This allows the balancer to schedule large batches of 
work on the DNs with long deadlines, while not blocking replication commands 
related to decommission or under replication. Replication commands for under 
replication are scheduled with Normal priority, meaning they are processed 
before balancer commands.
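
   To illustrate the intended ordering, here is a minimal sketch of a queue 
that serves Normal priority commands before Low priority ones and drops 
commands whose deadline has passed. It is not the datanode implementation; 
the class, enum and method names are hypothetical, and deadlines are plain 
millisecond values chosen for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative sketch only; not the datanode's actual command queue. */
final class CommandQueueSketch {

  // Declaration order matters: NORMAL sorts ahead of LOW.
  enum Priority { NORMAL, LOW }

  static final class Command {
    final String name;
    final Priority priority;
    final long deadlineMs;
    final long seq; // Arrival order, used to break ties within a priority.

    Command(String name, Priority priority, long deadlineMs, long seq) {
      this.name = name;
      this.priority = priority;
      this.deadlineMs = deadlineMs;
      this.seq = seq;
    }
  }

  private final PriorityQueue<Command> queue = new PriorityQueue<>(
      Comparator.comparing((Command c) -> c.priority)
          .thenComparingLong(c -> c.seq));
  private long nextSeq = 0;

  void add(String name, Priority priority, long deadlineMs) {
    queue.add(new Command(name, priority, deadlineMs, nextSeq++));
  }

  /** Returns command names in processing order, skipping expired ones. */
  List<String> drain(long nowMs) {
    List<String> processed = new ArrayList<>();
    Command c;
    while ((c = queue.poll()) != null) {
      if (c.deadlineMs > nowMs) {
        processed.add(c.name);
      }
    }
    return processed;
  }

  public static void main(String[] args) {
    CommandQueueSketch q = new CommandQueueSketch();
    q.add("balancer-move", Priority.LOW, 60_000L);
    q.add("under-replication", Priority.NORMAL, 10_000L);
    System.out.println(q.drain(1_000L));
  }
}
```

   In this sketch a large batch of Low priority balancer commands can sit in 
the queue under a long deadline without delaying Normal priority replication 
commands that arrive later.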
   
   Recent changes leading up to this one mean the Balancer / Move Manager no 
longer needs to replicate its work across SCMs in the case of failover:
   
   1. Commands are more robustly expired on the DNs in the event of an SCM Term 
(leader) change.
   2. Replicate and Delete commands now have deadlines so they are expired on 
the DNs if they take too long to process.
   
   We also plan to ensure the balancer / move manager waits for a short delay 
after a failover before it starts running, similar to how Replication Manager 
works.
   
   ## What is the link to the Apache JIRA?
   
   https://issues.apache.org/jira/browse/HDDS-6572
   
   ## How was this patch tested?
   
   Additional tests still need to be added.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

