[ 
https://issues.apache.org/jira/browse/HDDS-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duong updated HDDS-10610:
-------------------------
    Description: 
Before HDDS-8674, the default number of EC pipelines was hardcoded as 5, which 
hardly utilizes the cluster's write capacity.

After HDDS-8674, the default number of EC pipelines is dynamic, calculated as 
healthy_disks/number_of_required_nodes_per_EC_replication_config. For example, 
in a cluster with 90 disks, there will be 18 pipelines for EC rs-3-2-1024k. 
While this is a good move, the limited number of pipelines still underutilizes 
the disks in the cluster. This is because when a client writes data to a 
pipeline, it writes to one node (one disk) at a time. So, at any given time, 
only 1/5 (for rs-3-2) of the disks are busy.
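
To make the arithmetic above concrete, here is a minimal sketch of the formula as quoted in this description. The function name and the use of floor division are assumptions for illustration only; this is not the actual SCM code.

```python
def default_ec_pipelines(healthy_disks: int, data: int, parity: int) -> int:
    """Estimate the default EC pipeline count from the formula in this issue.

    Hypothetical helper: rounding down is an assumption, and the real
    calculation in SCM may differ in detail.
    """
    # An EC rs-<data>-<parity> pipeline needs data + parity nodes.
    required_nodes = data + parity
    if required_nodes <= 0:
        raise ValueError("data + parity must be positive")
    return healthy_disks // required_nodes


# The 90-disk example from the description, for rs-3-2.
print(default_ec_pipelines(90, 3, 2))
# The 30-disk test cluster described below, for rs-3-2.
print(default_ec_pipelines(30, 3, 2))
```

For rs-3-2 this gives 18 pipelines on 90 disks and 6 pipelines on 30 disks, matching the numbers quoted in this issue.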

I ran an EC write (rs-3-2) test on a small cluster with 30 disks on 10 nodes. 
With the default number of pipelines (6), the result is chaotic: the 
nodes/disks appear to take turns receiving the write load.

!ECWrite-6Pipelines.png|width=846,height=402!

When the number of pipelines is forced to 30 using 
ozone.scm.ec.pipeline.minimum, the write load is distributed more evenly over 
the nodes, and the total bandwidth increases significantly.

!ECWrite-30Pipelines.png|width=850,height=400!
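
A rough way to reason about the two runs is a simplified model, assuming (as stated above) that each pipeline writes to one disk at a time and that every pipeline is actively written. The helper below is hypothetical and gives only an upper bound on the share of busy disks; it is not a measurement from the test.

```python
def max_busy_disk_fraction(pipelines: int, disks: int) -> float:
    """Upper bound on the fraction of disks being written at any instant.

    Simplified model (an assumption): each pipeline keeps exactly one disk
    busy at a time, and no two pipelines hit the same disk.
    """
    if disks <= 0:
        raise ValueError("disks must be positive")
    return min(pipelines, disks) / disks


# Compare the default (6) and forced (30) pipeline counts on 30 disks.
for pipelines in (6, 30):
    fraction = max_busy_disk_fraction(pipelines, 30)
    print(f"{pipelines} pipelines: at most {fraction:.0%} of disks busy")
```

Under this model, 6 pipelines can keep at most 1/5 of the 30 disks busy, while 30 pipelines can in principle keep all of them busy, which is consistent with the more even load seen in the second chart.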

  was:
Before HDDS-8674, the default number of EC pipelines is hardcoded as 5. This 
hardly utilizes the ability to write

After HDDS-8674, the default number of EC pipelines is fixed to be a dynamic 
number, calculated by 
healthy_disks/number_of_required_nodes_per_EC_replication_config. For example, 
in a cluster with 90 disks, there'll be 18 pipelines for EC 3-2-1024k. While 
this is a bright move, the number of pipelines still 


> Reconsider the default number of EC pipelines
> ---------------------------------------------
>
>                 Key: HDDS-10610
>                 URL: https://issues.apache.org/jira/browse/HDDS-10610
>             Project: Apache Ozone
>          Issue Type: Improvement
>            Reporter: Duong
>            Priority: Major
>         Attachments: ECWrite-30Pipelines.png, ECWrite-6Pipelines.png
>
>


