adoroszlai opened a new pull request, #4810:
URL: https://github.com/apache/ozone/pull/4810

   ## What changes were proposed in this pull request?
   
   EC offline reconstruction has two behaviors that can conflict:
    * it tries to exclude from the targets any datanodes that are currently overloaded with replication commands
    * it attempts partial reconstruction if full reconstruction is not possible because too few targets are available
   
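   When excluding overloaded nodes leaves too few targets, SCM falls back to partial reconstruction. This can repeat, so a container may go through several successive partial reconstructions, wasting resources.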
   
   This change lets SCM defer reconstruction if:
    * only partial reconstruction is currently possible AND
    * full reconstruction would be possible using overloaded nodes AND
    * the container can lose at least one more replica before becoming unrecoverable (i.e. recovery is not yet critical)
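
The three conditions above can be sketched as a single predicate. This is only an illustration, not the code from the patch: the class `DeferralSketch`, the method `shouldDefer`, and all of its parameters are hypothetical names. It assumes that "partial" means at least one but fewer non-overloaded targets than missing replicas, and that a container stays recoverable as long as the number of missing replicas does not exceed the parity count.

```java
/** Hypothetical sketch of the deferral rule; not the actual SCM implementation. */
final class DeferralSketch {
  private DeferralSketch() { }

  /**
   * @param parity            parity replicas in the EC scheme, e.g. 3 for EC(6,3)
   * @param missing           replicas currently missing
   * @param availableTargets  usable targets, overloaded nodes excluded
   * @param overloadedTargets nodes skipped only because they are overloaded
   * @return true if reconstruction should be deferred rather than done partially
   */
  static boolean shouldDefer(int parity, int missing,
      int availableTargets, int overloadedTargets) {
    // Condition 1: some replicas could be rebuilt now, but not all of them.
    boolean onlyPartialPossible =
        availableTargets > 0 && availableTargets < missing;
    // Condition 2: counting overloaded nodes, every missing replica has a target.
    boolean fullPossibleWithOverloaded =
        availableTargets + overloadedTargets >= missing;
    // Condition 3: one more loss would still leave the container recoverable.
    boolean notYetCritical = missing < parity;
    return onlyPartialPossible && fullPossibleWithOverloaded && notYetCritical;
  }
}
```

For example, under these assumptions `shouldDefer(3, 2, 1, 1)` (EC(6,3), two replicas missing, one free target, one overloaded node) returns `true`, while `shouldDefer(3, 3, 2, 1)` returns `false` because recovery is already critical.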
   
   The improvement applies only to EC(6,3) and EC(10,4). With EC(3,2), a recoverable container can be missing at most 2 replicas: with 1 missing replica, recovery is never partial, and with 2 missing replicas, recovery is already "critical".
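
The arithmetic behind this restriction can be checked with a small sketch. It is an illustration under stated assumptions, not code from the patch: `DeferrableRange` and `deferrableMissingCounts` are hypothetical names. It assumes partial reconstruction needs at least 2 missing replicas (one rebuilt, at least one left over) and that recovery is "critical" once the number of missing replicas equals the parity count.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; lists the missing-replica counts where deferral can apply. */
final class DeferrableRange {
  private DeferrableRange() { }

  /**
   * @param parity parity replicas in the EC scheme, e.g. 2 for EC(3,2)
   * @return counts of missing replicas that are both partial-capable and non-critical
   */
  static List<Integer> deferrableMissingCounts(int parity) {
    List<Integer> counts = new ArrayList<>();
    // Lower bound 2: with a single missing replica, recovery is never partial.
    // Upper bound parity - 1: at parity missing replicas, recovery is critical.
    for (int missing = 2; missing < parity; missing++) {
      counts.add(missing);
    }
    return counts;
  }
}
```

With these assumptions the list is empty for parity 2 (EC(3,2)), contains only 2 for parity 3 (EC(6,3)), and contains 2 and 3 for parity 4 (EC(10,4)).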
   
   https://issues.apache.org/jira/browse/HDDS-8727
   
   ## How was this patch tested?
   
   Added unit test.
   
   https://github.com/adoroszlai/hadoop-ozone/actions/runs/5134479384



