sky76093016 opened a new pull request #2757:
URL: https://github.com/apache/ozone/pull/2757


   ## What changes were proposed in this pull request?
   
   **Reason for the intermittent failure:**
   
   In [HDDS-4710](https://github.com/apache/ozone/pull/1821), healthyNodes are 
first sorted by **load** from low to high, but the node is then picked with 
`DatanodeDetails selectedNode = healthyNodes.get(getRand().nextInt(healthyNodes.size()))`. 
Because the index is random over the whole list, the sort at 
[HDDS-4710_L168](https://github.com/apache/ozone/blob/996518a47f97abdfe29d80aa0982ae51c84095b5/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/PipelinePlacementPolicy.java#L168)
 has no effect on which node is chosen.
   
   The intermittent failure occurs in a state such as the following, where each 
number is the number of times that healthy node has been allocated to a pipeline:
   
   `5 5 5 5 5 5 3 3`
   
   The last round needs three healthyNodes, but only two are still available. 
This meets the condition at 
[HDDS-4710_L172](https://github.com/apache/ozone/blob/996518a47f97abdfe29d80aa0982ae51c84095b5/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/PipelinePlacementPolicy.java#L172),
 so the intermittent failure reappears.
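
   A minimal sketch of the condition above, for illustration only. It assumes 
a per-node limit of 5 pipelines (inferred from the example, not stated in the 
patch) and three nodes per pipeline; `PlacementCapacityCheck` and 
`availableNodes` are hypothetical names, not Ozone APIs.

```java
import java.util.Arrays;

public class PlacementCapacityCheck {

    // Hypothetical helper: counts the nodes whose pipeline count is still
    // below the assumed per-node limit.
    static long availableNodes(int[] pipelineCounts, int limitPerNode) {
        return Arrays.stream(pipelineCounts)
                .filter(count -> count < limitPerNode)
                .count();
    }

    public static void main(String[] args) {
        int[] pipelineCounts = {5, 5, 5, 5, 5, 5, 3, 3};
        int limitPerNode = 5;   // assumption: inferred from the example above
        int nodesRequired = 3;  // the last round needs three healthy nodes

        long available = availableNodes(pipelineCounts, limitPerNode);
        // Only two nodes are below the limit, so the round cannot be served.
        System.out.println("available=" + available
                + ", required=" + nodesRequired);
    }
}
```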
   
   **Solution:**
   
   Shuffle the healthyNodes before building the healthyList, so that the 
initial order of the datanodes in the healthyList differs from run to run, and 
then sort the list by **load** from low to high.
   selectNode() then picks the datanode at the first position in the 
healthyList, which keeps the choice random among equally loaded datanodes.
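
   The shuffle-then-sort idea can be sketched as follows. This is not the code 
in the patch: `ShuffleThenSortSketch`, `Node` and this `selectNode` signature 
are hypothetical stand-ins, and the sketch relies on `List.sort` being stable 
so that nodes with equal load keep their shuffled order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

public class ShuffleThenSortSketch {

    // Hypothetical stand-in for a datanode and its current pipeline count.
    static final class Node {
        final String name;
        final int load;

        Node(String name, int load) {
            this.name = name;
            this.load = load;
        }
    }

    // Shuffle first, then sort by load. Because List.sort is stable, nodes
    // with the same load stay in their shuffled order, so the first element
    // is a random pick among the least-loaded nodes.
    static Node selectNode(List<Node> healthyNodes, Random rand) {
        if (healthyNodes.isEmpty()) {
            throw new IllegalArgumentException("no healthy nodes");
        }
        List<Node> candidates = new ArrayList<>(healthyNodes);
        Collections.shuffle(candidates, rand);
        candidates.sort(Comparator.comparingInt((Node n) -> n.load));
        return candidates.get(0);
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node("dn1", 5));
        nodes.add(new Node("dn2", 3));
        nodes.add(new Node("dn3", 3));
        nodes.add(new Node("dn4", 4));

        Node selected = selectNode(nodes, new Random());
        // Always one of the two nodes with load 3; which one varies per run.
        System.out.println(selected.name + " load=" + selected.load);
    }
}
```

   Compared with picking a random index after sorting, this keeps the sort 
meaningful: a more heavily loaded node is never chosen while a less loaded one 
is in the list.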
   
   **Note:**
   Shuffle: an alternative way to meet the randomness requirement of 
[HDDS-4710](https://github.com/apache/ozone/pull/1821) while the list stays 
sorted by **load**.
   load: the number of pipelines the datanode currently has.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/projects/HDDS/issues/HDDS-5820
   
   ## How was this patch tested?
   
   The unit test (UT) was run about 4000 times.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


