Maksim Myskov created HDDS-6225:
-----------------------------------

             Summary: Pipelines distribution across datanodes is uneven
                 Key: HDDS-6225
                 URL: https://issues.apache.org/jira/browse/HDDS-6225
             Project: Apache Ozone
          Issue Type: Bug
            Reporter: Maksim Myskov


Although HDDS-5441 was done to improve pipeline distribution, there is 
still an issue.

I ran several tests:

Configuration1. 6 datanodes, ozone.scm.datanode.disallow.same.peers = true, 3 
pipelines
||Datanode||Pipelines||
|8cc97cc8-31dd-463d-8a89-7c76d95d709f|2|
|ab3dc657-736e-49a0-a5c9-e0e4962773d9|2|
|c3f91923-305f-48b1-b38b-2db1e4b92162|2|
|cd16db96-92cb-4056-86f8-563e1ac1bfce|2|
|292584d1-6de5-4249-a601-25a813df997e|2|
|4bf6cae2-1757-4b67-aba4-272d3148c838|2|

 

Configuration2. 6 datanodes, ozone.scm.datanode.disallow.same.peers = false, 8 
pipelines
||Datanode||Pipelines||
|cc131824-ccbe-4e8a-9ec4-e7889275e90b|5|
|164b4515-7735-4fb5-9836-d995c9fafb88|5|
|1a217fc9-9b45-4161-b1e8-fda71a3c38c6|5|
|2f0bcc4e-407a-4e88-8465-d9e5e6ba8bdc|3|
|5df86a41-3c17-4dc1-9e57-410fe5ad7842|1|
|71db1b20-5d89-4919-8566-683c3f24ad5e|5|

 

Configuration3. 7 datanodes, ozone.scm.datanode.disallow.same.peers = true, 7 
pipelines
||Datanode||Pipelines||
|a630e56a-3e1c-47c3-8d49-fb6636a81f2f|3|
|b99a61c9-d907-4536-ac65-e1df254b6190|3|
|005f3c5a-5001-46ef-9016-44114e7b4e32|3|
|213d0cfd-afbb-462e-9f70-2b959a2dbe91|3|
|6531b78c-2ba0-44c2-a7f7-7271f47bd381|3|
|706625cb-15ab-4f5e-a23a-f2de72946490|3|
|77ed9147-6a86-43e1-81fe-ed2378f8f518|3|

 

Configuration4. 7 datanodes, ozone.scm.datanode.disallow.same.peers = false, 10 
pipelines
||Datanode||Pipelines||
|9a019bd1-fa44-4e67-950f-d2e15338b0cd|5|
|c948417a-784e-4dd2-8ca3-715d67ac9891|5|
|d49c9c19-1a60-407e-975b-9fe00a8ac44c|5|
|e2792111-0506-49ed-8096-6bed67675fc4|5|
|0fb9ddcd-5bda-4a8b-bada-39f6f2ec7ec2|5|
|2011f197-6248-4a6f-9c7b-e3611e67d3d0|5|
|782d9313-345c-4297-ab6b-d57b7eb4910f|0|

 

Configuration5. 8 datanodes, ozone.scm.datanode.disallow.same.peers = true, 7 
pipelines
||Datanode||Pipelines||
|9b1146e2-7ff0-4e25-b42d-7e7217bd01f3|3|
|a593ecab-5237-4c0e-bcd0-411839bcd1ef|3|
|02f2c58a-c172-4e10-84e9-5cae850327f9|3|
|0f18f35a-56a2-42ea-8f0e-7d2a84b16c6c|3|
|3a11728e-637f-4337-a3cf-7b9a274e3614|3|
|446a32f2-fdb3-4386-813c-0e5f1c9f8c31|0|
|56522197-646a-4964-8e94-d9c8f7e8687b|3|
|7f21c5a2-fcd0-4b40-8b0c-cc647c50e6b1|3|

 

Configuration6. 8 datanodes, ozone.scm.datanode.disallow.same.peers = false, 12 
pipelines
||Datanode||Pipelines||
|9eba6024-9c1e-4a63-b0ea-c71593d0efeb|5|
|ad6ff8f0-336f-4ba5-b203-787870d1b7ca|5|
|030f3e80-6ac1-4c87-ab25-6538b50602cd|5|
|09e83e1d-8e9b-429a-8f89-81cba1523487|5|
|226b0722-42f1-47e4-8b90-083e7174a920|5|
|459a8649-4238-407d-9d3e-4d0577acf496|5|
|4680966d-fd24-4f24-831d-084382c93dfe|5|
|4aacc584-138c-4ce3-bb61-64bca8384571|1|

 

Configuration7. 9 datanodes, ozone.scm.datanode.disallow.same.peers = true, 8 
pipelines
||Datanode||Pipelines||
|a22a1248-0627-4616-a896-3d434687b4c9|1|
|b9cc0695-4817-4f7d-ab70-b2257bfb3fc4|3|
|bfd99d04-897b-4695-a328-31195bed7a46|4|
|fefa8454-a79e-4137-95fa-90f1942b5b1c|1|
|0c4a3312-7313-4b23-a5fa-f6bc2f7f1593|3|
|0e8f830e-40b0-46fb-b0f8-8bb07447b816|3|
|490cd956-16d6-401b-af54-9f2570dc044c|3|
|517346bc-4f3a-4c65-9141-0fb9b580bf74|3|
|5ec043f7-b69f-4780-8383-e085569ddc96|3|

 

Configuration8. 9 datanodes, ozone.scm.datanode.disallow.same.peers = false, 15 
pipelines
||Datanode||Pipelines||
|a31576cf-1ff7-4931-b4f1-561e26deefff|5|
|c8ecddeb-1c6e-4f35-b4c7-a535ed95c53f|5|
|fae64881-b815-4fb1-ac8a-6932f7792d69|5|
|13fdcaa1-6032-4fc9-946b-7e92577cbd32|5|
|2024dce3-2de3-416b-bf62-2b4dff54cd9f|5|
|38e6915a-90e7-444f-a6f3-72b0b03de29d|5|
|60206eff-6630-4640-b68b-f3986bf3d299|5|
|68040769-eab6-49bc-a5b4-ebe179d0df09|5|
|7f755ac1-2fcd-48f6-8c18-b82914955f58|5|

 

Configuration9. 10 datanodes, ozone.scm.datanode.disallow.same.peers = true, 11 
pipelines
||Datanode||Pipelines||
|8edb6712-e5c2-4f19-a9d2-be80cb444ddd|4|
|9676338d-125a-4798-98ad-1dd25fbbf150|2|
|d4607619-925a-4d26-85b1-7fb29fdd276a|4|
|d72313a7-4d95-44f8-b020-716ea44d7a24|3|
|daa01019-1595-4951-a404-c7ef88377280|4|
|f26d824a-40e1-4fef-be74-13021ec3383c|4|
|122d4373-16df-406e-b937-07a136031c98|3|
|26b84595-9bb3-40b5-8b19-f678de2f7ed3|4|
|28f8a99d-52e5-429e-b5a1-6cb3f180b2fa|2|
|44d41c19-8f30-48b4-a466-8d8413f21705|3|

 

Configuration10. 10 datanodes, ozone.scm.datanode.disallow.same.peers = false, 16 
pipelines
||Datanode||Pipelines||
|a5b0d0a8-41f0-4110-aef1-73f7a63defa2|5|
|a9600399-19c6-4a4d-bd98-73853807266|5|
|fcefa984-e74f-4370-8bb7-e13fddd549fc|5|
|14a2cd7e-1f10-45f5-8ba0-d6515f6031b|5|
|1fc1c312-c9b7-403f-aa42-db44d07d4812|5|
|27d5af80-0bfe-423e-9b9d-882c380c5b28|3|
|423c5624-683b-4269-9444-b01bc69093a|5|
|5de9099e-0d3d-47a3-a9d2-344f99efaf7c|5|
|5edd32e1-ea56-4485-94cd-42ada48211da|5|
|64bc438e-a00d-4b62-9a62-2e5496058468|5|
 
Take a look at configuration5 and configuration6. In this case the property 
introduced in HDDS-5441 makes the distribution worse.
Sometimes it helps (configuration4), sometimes it does not.
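
To make the comparison easier to repeat, here is a minimal Python sketch that summarizes the per-datanode counts from the tables for configuration5 and configuration6. It is only an illustration: the `summarize` helper and its field names are made up for this example, and it assumes every pipeline has three members, so the pipeline count is the sum of the per-node counts divided by three.

```python
# Pipeline counts per datanode, copied from the tables above.
REPORTED = {
    "configuration5": [3, 3, 3, 3, 3, 0, 3, 3],
    "configuration6": [5, 5, 5, 5, 5, 5, 5, 1],
}


def summarize(counts, replication=3):
    """Summarize how evenly pipeline memberships are spread over datanodes.

    Assumption: every pipeline has `replication` members, so the number of
    pipelines equals sum(counts) / replication.
    """
    return {
        "pipelines": sum(counts) // replication,
        # Memberships each node would hold if the spread were perfectly even.
        "ideal_per_node": sum(counts) / len(counts),
        "min": min(counts),
        "max": max(counts),
        # Datanodes that take part in no pipeline at all.
        "idle_nodes": counts.count(0),
    }


for name, counts in REPORTED.items():
    print(name, summarize(counts))
```

Under that assumption the derived pipeline counts (7 and 12) match the values stated for these two configurations, and configuration5 is the only one of the two that leaves a datanode without any pipeline.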
 
The root cause seems to be the lack of a pipeline rebalancing mechanism. 
If the pipeline distribution in a cluster is uneven, I can rebalance it 
manually by closing existing pipelines (via the ozone admin tool); SCM then 
creates new pipelines to replace the closed ones. It would be great if SCM 
rebalanced pipelines automatically (including when a new node is added to a 
cluster). Does such a rebalancing mechanism fit the Ozone architecture? Or is 
there another way to avoid uneven pipeline distribution?
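
As a rough illustration of the manual procedure, the sketch below picks which pipeline to close first. It is a hypothetical heuristic, not SCM's placement logic and not an existing Ozone API: the `pick_pipeline_to_close` function, the pipeline IDs and the datanode names are all invented for this example. It chooses the pipeline whose members carry the highest combined load, on the assumption that closing it frees the busiest nodes. It cannot guarantee that SCM places the replacement pipeline on the underused nodes.

```python
from collections import Counter


def pick_pipeline_to_close(pipelines):
    """Return the ID of the pipeline whose members are the most loaded overall.

    `pipelines` maps a pipeline ID to the datanode IDs that belong to it.
    """
    # Number of pipelines each datanode currently participates in.
    load = Counter(node for members in pipelines.values() for node in members)
    # Rank by combined member load; break ties by pipeline ID for determinism.
    return max(
        pipelines,
        key=lambda pid: (sum(load[node] for node in pipelines[pid]), pid),
    )


# Hypothetical cluster: dn-a, dn-b and dn-c are in two pipelines each,
# while dn-d, dn-e and dn-f are in one each.
example = {
    "p1": ("dn-a", "dn-b", "dn-c"),
    "p2": ("dn-a", "dn-b", "dn-d"),
    "p3": ("dn-c", "dn-e", "dn-f"),
}
print(pick_pipeline_to_close(example))  # prints p1
```

An automatic rebalancer would have to repeat such a step until the spread is acceptable, and would also have to avoid closing pipelines that are actively serving writes, which this sketch ignores.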



