sodonnel opened a new pull request, #3743:
URL: https://github.com/apache/ozone/pull/3743

   ## What changes were proposed in this pull request?
   
   
   One of the goals I had in mind when developing the new Replication Manager 
was to make the code cleaner, easier to test and easier to follow. The Legacy 
Replication Manager has a lot of logic in a single class, to handle all sorts 
of container states and conditions and that makes it hard to test and difficult 
to see what is even tested.
   
   If we define “business logic” as the rules and conditions which indicate if 
a container is healthy or not, and also the rules and logic to fix any 
conditions the container may have, such as bad replica states, over / under 
replication etc.
   
   Then the Replication Manager “infrastructure” or “plumbing” is what stitches 
the business logic together, provides access to queued containers, iterates 
over the containers etc. This logic is relatively simple in general.
   
   We have also started using the ReplicationManager class as a kind of proxy 
object, to
   
   My goal is to try to separate all the business logic into separate logical 
classes that can each be tested in isolation. Then the Replication Manager 
class itself can focus on iterating over the container containers and applying 
the business logic, dispatching commands and queuing containers for remediation 
(under / over replication processing).
   
   Already I can see some areas where we are starting to copy the legacy 
replication manager and putting some business logic items into the Replication 
Manager processing flow, so I would like to stop and have a discussion on the 
best way forward.
   
   Looking at the logic we have so far, the replication manager checks look 
like:
   
    * Open Container Healthy check - If the container is open and should be 
closed for some reason. Currently in ReplicationManager class
    * Unhealthy Replica Handing - If the container is closed and some replicas 
are not in the correct closed state. Currently in Replication Manager class. 
    * Empty Container Handling - If the container is empty, remove its 
replicas. Not yet implemented, but there is a PR adding this into the RM class 
- https://github.com/apache/ozone/pull/3660 
    * ECHealthCheck - under / over / mis replication (ECContainerHealthCheck)
    * RatisHealthCheck - under / over/ mis replication (not yet implemented, 
but will be in RatisContainerHealthCheck)
       The legacy Manager also has logic for CLOSING and QUASI_CLOSED 
containers we need to get in somewhere.
   
   What I would like to propose is that we should make each of these above 
checks into their own class. Some of the classes will be very small, and others 
will be medium sized.
   
   We can then use a variation of the "Chain of Responsibility" pattern to 
bring all the checks together into a chain.
   
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-7192
   
   ## How was this patch tested?
   
   New and existing tests
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to