GlenGeng commented on pull request #1700: URL: https://github.com/apache/ozone/pull/1700#issuecomment-745745415
Thanks @sodonnel for the insights, they really help!

The first solution, `Collections.sort(eligibleReplicas);`, sorts the replicas by the UUIDs of the DNs on which they are located, which will cause imbalanced storage usage across DNs. Say one pipeline spans three DNs, DN1, DN2 and DN3, whose UUIDs are in increasing order. For every container created on this pipeline, the replica on DN1 will always be removed before the one on DN3, and storage usage drifts out of balance.

Instead, containers of the same pipeline should each get a different ordering of DN1, DN2 and DN3. Let’s add a salt to the sort and order the replicas by `hash(containerID, UUID of DN)`. In this case, the replica to remove is effectively picked at random among the 3 DNs.

However, this only handles in-flight deletes; it is not a thorough solution covering both in-flight adds and in-flight deletes. We (you, nanda and me) can schedule a talk this week or next week to discuss a proper solution. Until then, we would like to merge this one-line fix into HDDS-2823 as protection against the potential data loss issue. Once the proper solution is merged, we can decide whether to revert this quick fix.
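
To make the salted ordering concrete, here is a minimal standalone sketch of sorting by `hash(containerID, UUID of DN)`. It is not the code in this PR: the class `SaltedReplicaOrder`, the methods `saltedKey` and `deletionOrder`, and the choice of a SplitMix64-style mixer as the hash are all assumptions made for illustration, and replicas are reduced to plain DN UUIDs.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;

// Hypothetical illustration only; none of these names come from Ozone.
final class SaltedReplicaOrder {

    // SplitMix64-style finalizer, used here as an assumed stand-in for "hash".
    static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    // Sort key that depends on both the container ID (the salt) and the DN UUID.
    static long saltedKey(long containerId, UUID datanode) {
        long h = mix(containerId);
        h = mix(h ^ datanode.getMostSignificantBits());
        h = mix(h ^ datanode.getLeastSignificantBits());
        return h;
    }

    // Returns the DNs in the order their replicas would be considered for
    // removal. The order is deterministic for a given container, but differs
    // from container to container, so no single DN is always first.
    static List<UUID> deletionOrder(long containerId, List<UUID> datanodes) {
        List<UUID> sorted = new ArrayList<>(datanodes);
        sorted.sort(Comparator
            .comparingLong((UUID dn) -> saltedKey(containerId, dn))
            // Tie-break on the UUID itself so the order stays total.
            .thenComparing(Comparator.naturalOrder()));
        return sorted;
    }
}
```

Calling `SaltedReplicaOrder.deletionOrder(containerId, dns)` for different container IDs on the same three DNs yields different orderings, while repeated calls for the same container always agree. That determinism is what distinguishes the salted sort from a plain shuffle.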
