[
https://issues.apache.org/jira/browse/HDDS-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Gui updated HDDS-5703:
---------------------------
Attachment: 20210830-183345.svg
> SCM HA performance degradation upon one peer down.
> --------------------------------------------------
>
> Key: HDDS-5703
> URL: https://issues.apache.org/jira/browse/HDDS-5703
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Mark Gui
> Priority: Major
> Attachments: 20210830-183345.svg, 20210830-184833.svg
>
>
> When we used the SCM benchmark tool
> (https://issues.apache.org/jira/browse/HDDS-5702) to test SCM throughput for
> AllocateContainer, we found a dramatic drop in throughput when one
> SCM peer (a follower) is down.
> Here are some statistics.
> Normal Case:
> {code:java}
> ***************************************
> Total allocated containers: 500000
> Total failed containers: 0
> Execution Time: 02:36:50,151
> Throughput: 53.000000 (ops)
> ***************************************
> {code}
> One SCM follower down:
> {code:java}
> ***************************************
> Total allocated containers: 50000
> Total failed containers: 0
> Execution Time: 02:22:00,245
> Throughput: 5.000000 (ops)
> ***************************************
> {code}
> The overall throughput drops to about 1/10 of the normal case (from 53 to 5 ops).
>
> We dug into this problem.
> Here are flame graphs captured with an open source tool (arthas):
>
>
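The throughput figures quoted above can be cross-checked from the totals and execution times in the two reports. The sketch below is not part of the benchmark tool. It assumes that the execution time is formatted as HH:MM:SS,mmm and that "ops" means operations per second. Both are assumptions, and the helper names are made up for illustration. Under those assumptions the rates come out at roughly 53.1 and 5.9 ops per second, which suggests the tool reports whole numbers, and the ratio is about 0.11, in line with the "1/10" summary.

```python
def parse_duration_seconds(text: str) -> float:
    """Parse a duration such as '02:36:50,151' into seconds.

    Assumes the format HH:MM:SS,mmm (hypothetical reading of the
    benchmark output; the report itself does not document it).
    """
    clock, _, millis = text.partition(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return hours * 3600 + minutes * 60 + seconds + int(millis or 0) / 1000


def throughput(total_ops: int, duration: str) -> float:
    """Return operations per second for a run."""
    return total_ops / parse_duration_seconds(duration)


# Figures copied from the two benchmark reports in the issue description.
normal = throughput(500000, "02:36:50,151")
degraded = throughput(50000, "02:22:00,245")

print(f"normal:   {normal:.2f} ops/s")
print(f"degraded: {degraded:.2f} ops/s")
print(f"ratio:    {degraded / normal:.3f}")
```

Note that the degraded run allocated one tenth as many containers in nearly the same wall-clock time, so the drop comes from the per-operation cost, not from a shorter run.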
--
This message was sent by Atlassian Jira
(v8.3.4#803005)