[
https://issues.apache.org/jira/browse/HDDS-3900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rakesh Radhakrishnan updated HDDS-3900:
---------------------------------------
Summary: Update default value of 'ozone.om.ratis.segment.size' and
'preallocated.size' to improve OM write perf (was: Update default value for
'ozone.om.ratis.segment.size' and 'preallocated.size' to improve OM write perf)
> Update default value of 'ozone.om.ratis.segment.size' and 'preallocated.size'
> to improve OM write perf
> ------------------------------------------------------------------------------------------------------
>
> Key: HDDS-3900
> URL: https://issues.apache.org/jira/browse/HDDS-3900
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Reporter: Rakesh Radhakrishnan
> Assignee: Rakesh Radhakrishnan
> Priority: Major
> Labels: om-perf, performance
>
> Based on *OM* performance tests on HDDs with a write-heavy workload
> ({{Synthetic NNLoadGenerator}}) in single-node HA, the default 16KB Ratis
> segment size becomes a bottleneck for OM performance.
> Below is the IOSTAT output with 16KB segment.size and 16KB
> segment.preallocated.size. It shows high {{w_await}} times, and very minimal
> batching (<5) occurred in OM for most of the run.
> {code:java}
> sdd: RATIS DISK
> Device:  rrqm/s  wrqm/s    r/s     w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz    await  r_await  w_await  svctm   %util
> sdd        0.00    0.00   0.00  138.00   0.00   1.27     18.88     21.99    65.25     0.00    65.25   6.88   94.90
> sdd        0.00    0.00   0.00  103.00   0.00   1.07     21.23     40.36   918.25     0.00   918.25   9.72  100.10
> sdd        0.00    0.00   0.00  104.00   0.00   1.04     20.55     30.08  1388.23     0.00  1388.23   9.62  100.10
> sdd        0.00    0.00   0.00  396.00   0.00   1.55      8.00    136.50   285.30     0.00   285.30   2.40   94.90
> {code}
>
> Below is the IOSTAT output with 16MB segment.size and 16MB
> segment.preallocated.size, which minimizes the {{w_await}} time. This gives a
> good performance improvement on traditional HDDs by doing more sync batching.
> {code:java}
> sdd: RATIS DISK
> Device:  rrqm/s  wrqm/s    r/s     w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz    await  r_await  w_await  svctm   %util
> sdd        0.00    0.00   0.00  125.74   0.00  19.85    323.34      3.05    24.28     0.00    24.28   7.17   90.10
> sdd        0.00    0.00   0.00  128.00   0.00  19.76    316.12      3.31    25.91     0.00    25.91   7.14   91.40
> sdd        0.00    0.00   0.00  115.00   0.00   4.59     81.81      0.93     8.10     0.00     8.10   8.04   92.50
> sdd        0.00    0.00   0.00  111.00   0.00   4.53     83.57      0.90     8.12     0.00     8.12   8.14   90.30
> sdd        0.00    0.00   0.00  115.00   0.00   4.64     82.64      0.93     8.08     0.00     8.08   8.10   93.20
> {code}
>
> Below is the IOSTAT output with 4MB segment.size and 4MB
> segment.preallocated.size, which also minimizes the {{w_await}} time.
> {code:java}
> Device:  rrqm/s  wrqm/s    r/s     w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz    await  r_await  w_await  svctm   %util
> sdd        0.00    0.00   0.00  115.00   0.00   6.08    108.34      0.99     8.57     0.00     8.57   8.10   93.20
> sdd        0.00    0.00   0.00  122.00   0.00   7.81    131.13      1.48    12.15     0.00    12.15   7.80   95.10
> sdd        0.00    0.00   0.00  115.00   0.00   7.81    139.04      1.05     9.09     0.00     9.09   8.10   93.20
> sdd        0.00    0.00   0.00  115.00   0.00   7.85    139.78      1.04     8.95     0.00     8.95   8.03   92.30
> sdd        0.00    0.00   0.00  114.00   0.00   5.83    104.70      0.97     8.57     0.00     8.57   7.97   90.90
> sdd        0.00    0.00   0.00  115.00   0.00   7.80    138.92      1.05     9.10     0.00     9.10   8.11   93.30
> sdd        0.00    0.00   0.00  119.00   0.00   7.93    136.47      1.72    14.41     0.00    14.41   7.81   92.90
> {code}
>
> The recommended config could be a value in MBs, probably a value *higher than
> 2MB or 4MB*.
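
To make the comparison between the three runs easier to read, here is a minimal Python sketch that averages the iostat samples quoted above. The names SAMPLES, SECTOR_BYTES and summarize are hypothetical and not part of Ozone or Ratis. It assumes avgrq-sz is reported in 512-byte sectors, which is how sysstat documents that column and which is consistent with w/s x avgrq-sz matching wMB/s in the samples.

```python
from statistics import mean

# Assumption: iostat reports avgrq-sz in 512-byte sectors.
SECTOR_BYTES = 512

# Each tuple is (w/s, wMB/s, avgrq-sz in sectors, w_await in ms), copied from
# the iostat samples for the Ratis disk (sdd) quoted in the issue description.
SAMPLES = {
    "16KB": [
        (138.00, 1.27, 18.88, 65.25),
        (103.00, 1.07, 21.23, 918.25),
        (104.00, 1.04, 20.55, 1388.23),
        (396.00, 1.55, 8.00, 285.30),
    ],
    "16MB": [
        (125.74, 19.85, 323.34, 24.28),
        (128.00, 19.76, 316.12, 25.91),
        (115.00, 4.59, 81.81, 8.10),
        (111.00, 4.53, 83.57, 8.12),
        (115.00, 4.64, 82.64, 8.08),
    ],
    "4MB": [
        (115.00, 6.08, 108.34, 8.57),
        (122.00, 7.81, 131.13, 12.15),
        (115.00, 7.81, 139.04, 9.09),
        (115.00, 7.85, 139.78, 8.95),
        (114.00, 5.83, 104.70, 8.57),
        (115.00, 7.80, 138.92, 9.10),
        (119.00, 7.93, 136.47, 14.41),
    ],
}


def summarize(samples):
    """Return per-configuration means of w_await, request size and write rate."""
    summary = {}
    for label, rows in samples.items():
        summary[label] = {
            "mean_w_await_ms": mean(r[3] for r in rows),
            # Convert the mean request size from sectors to KiB.
            "mean_request_kib": mean(r[2] for r in rows) * SECTOR_BYTES / 1024,
            "mean_write_mb_s": mean(r[1] for r in rows),
        }
    return summary


if __name__ == "__main__":
    for label, stats in summarize(SAMPLES).items():
        print(
            f"{label:>5}: w_await={stats['mean_w_await_ms']:.2f} ms, "
            f"request={stats['mean_request_kib']:.1f} KiB, "
            f"write={stats['mean_write_mb_s']:.2f} MB/s"
        )
```

Over these samples, the mean w_await for the 16KB run is in the hundreds of milliseconds, versus roughly 10 to 15 ms for the 4MB and 16MB runs, while the mean request size grows by about an order of magnitude. The sample counts are small, so the averages are only indicative.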