[ 
https://issues.apache.org/jira/browse/HDDS-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Gui updated HDDS-5703:
---------------------------
    Attachment: 20210830-183345.svg

> SCM HA performance degradation upon one peer down.
> --------------------------------------------------
>
>                 Key: HDDS-5703
>                 URL: https://issues.apache.org/jira/browse/HDDS-5703
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Mark Gui
>            Priority: Major
>         Attachments: 20210830-183345.svg, 20210830-184833.svg
>
>
> When we used the SCM benchmark tool 
> (https://issues.apache.org/jira/browse/HDDS-5702) to test SCM throughput for 
> AllocateContainer, we found a dramatic drop in throughput when one 
> SCM peer (a follower) was down.
> Here are some statistics.
> Normal Case:
> {code:java}
> ***************************************
> Total allocated containers: 500000
> Total failed containers: 0
> Execution Time: 02:36:50,151
> Throughput: 53.000000 (ops)
> ***************************************
> {code}
> One SCM follower down:
> {code:java}
> ***************************************
> Total allocated containers: 50000
> Total failed containers: 0
> Execution Time: 02:22:00,245
> Throughput: 5.000000 (ops)
> ***************************************
> {code}
> The overall throughput drops to roughly 1/10 of the original (from 53 to 5 ops).
>  
> We dug into this problem.
> Here are flame graphs captured with an open source tool (arthas):
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
