[ 
https://issues.apache.org/jira/browse/HDDS-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated HDDS-3745:
-----------------------------
    Description: 
What's the problem?
I started an Ozone cluster with 1000 datanodes and 10 S3 gateways (s3g), ran it 
for two weeks under heavy workload, and then profiled OM and SCM with perf.
1. In the OM profile, getServiceList accounts for 63.75% of CPU.
 !screenshot-1.png! 
2. In the SCM profile, queryNode, invoked from OM's getServiceList, accounts 
for 33.20% of CPU.
 !screenshot-2.png! 

What's the reason?
Currently s3g creates a new client for each request. Every time an RpcClient 
is created, s3g calls ServiceInfoEx serviceInfoEx = 
ozoneManagerClient.getServiceInfo(), and getServiceInfo in turn calls 
getServiceList. As a result, OM and SCM are kept busy serving getServiceList.
However, s3g never uses the List<ServiceInfo> obtained from getServiceList.
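
The call pattern described above can be reproduced with a small toy model. 
This is only a sketch: FakeOzoneManager and FakeRpcClient are hypothetical 
stand-ins, not the real Ozone classes, and the only behavior taken from the 
report is that constructing a client triggers getServiceInfo, which triggers 
getServiceList. Reusing one client is shown as just one possible way to avoid 
the repeated call; this message does not say how the issue was actually fixed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only: hypothetical stand-ins, not Ozone's real OM or RpcClient.
class ServiceListCallDemo {

  /** Stand-in for OM that counts how often the service list is computed. */
  static class FakeOzoneManager {
    final AtomicInteger serviceListCalls = new AtomicInteger();

    List<String> getServiceList() {
      serviceListCalls.incrementAndGet();
      // In the report, the real call also reaches SCM (queryNode).
      List<String> services = new ArrayList<>();
      services.add("om");
      services.add("scm");
      return services;
    }

    List<String> getServiceInfo() {
      // Mirrors the report: getServiceInfo calls getServiceList.
      return getServiceList();
    }
  }

  /** Stand-in for a client that fetches service info when constructed. */
  static class FakeRpcClient {
    private final List<String> serviceInfo;

    FakeRpcClient(FakeOzoneManager om) {
      this.serviceInfo = om.getServiceInfo(); // one OM call per construction
    }

    String handle(String request) {
      // As with s3g in the report, the request path never reads serviceInfo.
      return "handled:" + request;
    }
  }

  /** Pattern from the report: a new client is built for every request. */
  static int callsWithClientPerRequest(int requests) {
    FakeOzoneManager om = new FakeOzoneManager();
    for (int i = 0; i < requests; i++) {
      new FakeRpcClient(om).handle("req-" + i);
    }
    return om.serviceListCalls.get();
  }

  /** Assumed alternative: one client is built once and reused. */
  static int callsWithSharedClient(int requests) {
    FakeOzoneManager om = new FakeOzoneManager();
    FakeRpcClient client = new FakeRpcClient(om);
    for (int i = 0; i < requests; i++) {
      client.handle("req-" + i);
    }
    return om.serviceListCalls.get();
  }

  public static void main(String[] args) {
    System.out.println("client per request: " + callsWithClientPerRequest(1000));
    System.out.println("shared client: " + callsWithSharedClient(1000));
  }
}
```

In this model, 1000 requests cause 1000 getServiceList calls when a client is 
created per request, and 1 call when a single client is shared. The real cost 
per call depends on cluster size and is not modeled here.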

 

  was:
I start a ozone cluster with 1000 datanodes and 10 s3gateway, and run two weeks 
with heavy workload, and perf om and scm.
1. From om perf, getServiceList cost 63.75% cpu.
 !screenshot-1.png! 
2. From scm perf, queryNode come from om::getServiceList cost 33.20% cpu
 !screenshot-2.png! 
Now s3g create a client for each request. when create each RpcClient, s3g will 
call ServiceInfoEx serviceInfoEx = ozoneManagerClient.getServiceInfo(), 
getServiceInfo will call getServiceList. Then om and scm are busy with 
getServiceList.

 


> Improve OM and SCM performance by 50% by avoiding calls to getServiceList
> -------------------------------------------------------------------------
>
>                 Key: HDDS-3745
>                 URL: https://issues.apache.org/jira/browse/HDDS-3745
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: runzhiwang
>            Assignee: runzhiwang
>            Priority: Major
>         Attachments: screenshot-1.png, screenshot-2.png
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
