[ 
https://issues.apache.org/jira/browse/HDDS-10465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-10465:
------------------------------
    Description: 
When using TestDFSIO to compare the random read performance of HDFS and Ozone, 
Ozone is much slower than HDFS. Here are the data from a test on a YCloud cluster.

Test Suite: TestDFSIO

Number of files: 64

File Size: 1024MB

 
||Random read (execution time)||Round1(s)||Round2(s)||
|HDFS|47.06|49.5|
|Ozone|147.31|149.47|

For Ozone itself, sequential read is much faster than random read:
||Ozone||Round1(s)||Round2(s)||Round3(s)||
|read execution time|66.62|58.78|68.98|
|random read execution time|147.31|149.47|147.09|

For HDFS, by contrast, there is not much of a gap between its sequential read 
and random read execution times:
||HDFS||Round1(s)||Round2(s)||
|read execution time|51.53|44.88|
|random read execution time|47.06|49.5|
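
The gaps in the tables above can be expressed as ratios of mean execution time. The following is a minimal sketch, not part of the original report: the helper name `slowdown` is hypothetical, and the numbers are copied from the tables.

```python
from statistics import mean

# Execution times in seconds, copied from the tables above.
ozone_seq = [66.62, 58.78, 68.98]
ozone_rand = [147.31, 149.47, 147.09]
hdfs_seq = [51.53, 44.88]
hdfs_rand = [47.06, 49.5]


def slowdown(slow, fast):
    """Ratio of mean execution times (hypothetical helper for illustration)."""
    return mean(slow) / mean(fast)


# Random read versus sequential read within each system.
ozone_rand_vs_seq = slowdown(ozone_rand, ozone_seq)
hdfs_rand_vs_seq = slowdown(hdfs_rand, hdfs_seq)

# Ozone versus HDFS on random read; only rounds 1 and 2 exist for HDFS,
# so the third Ozone round is left out to compare like with like.
ozone_vs_hdfs_rand = slowdown(ozone_rand[:2], hdfs_rand)

print(f"Ozone random/sequential: {ozone_rand_vs_seq:.2f}x")
print(f"HDFS random/sequential:  {hdfs_rand_vs_seq:.2f}x")
print(f"Ozone/HDFS random read:  {ozone_vs_hdfs_rand:.2f}x")
```

With these figures, Ozone random read takes a little over twice as long as its sequential read and roughly three times as long as HDFS random read, while HDFS shows essentially no gap.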

After some investigation, it was found that the total bytes read in the TestDFSIO 
random read test is almost double the data size. The total data to read is 
64 * 1024MB = 64GB, while the aggregated DN bytesReadChunk metric value 
increased by 128GB after one test run.  The root cause is that when the client 
reads data, it will align the requested data size with 
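
The read amplification implied by these figures can be checked with a few lines of arithmetic. This is only a sketch of the calculation: the variable names are illustrative, and 128GB is the approximate increase reported in the description, not an exact measurement.

```python
MB = 1024 * 1024
GB = 1024 * MB

num_files = 64
file_size = 1024 * MB

# Data the test is expected to read: 64 * 1024MB = 64GB.
expected_bytes = num_files * file_size

# Approximate increase of the aggregated DN bytesReadChunk metric
# after one test run, as reported in the description.
observed_bytes = 128 * GB

amplification = observed_bytes / expected_bytes
print(f"expected {expected_bytes // GB}GB, observed {observed_bytes // GB}GB, "
      f"amplification {amplification:.1f}x")
```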

  was:
When using TestDFSIO to compare the random read performance of HDFS and Ozone, 
Ozone is much slower than HDFS. Here are the data:

 
||Random read||Round1||Round2||
|HDFS|----- TestDFSIO ----- : random read
Number of files: 64
Total MBytes processed: 65562.04
Throughput mb/sec: 101.72
Average IO rate mb/sec: 111.12
IO rate std deviation: 36.39
Test exec time sec: 47.06|----- TestDFSIO ----- : random read
Number of files: 64
Total MBytes processed: 65561.07
Throughput mb/sec: 72.52
Average IO rate mb/sec: 89.14
IO rate std deviation: 46.83
Test exec time sec: 49.5|
|Ozone|----- TestDFSIO ----- : random read
Number of files: 64
Total MBytes processed: 65561.35
Throughput mb/sec: 19.59
Average IO rate mb/sec: 27.69
IO rate std deviation: 18.8
Test exec time sec: 147.31|----- TestDFSIO ----- : random read
Number of files: 64
Total MBytes processed: 65560.12
Throughput mb/sec: 22
Average IO rate mb/sec: 29.39
IO rate std deviation: 14.41
Test exec time sec: 149.47|


> Change ozone.client.bytes.per.checksum default to 16KB
> ------------------------------------------------------
>
>                 Key: HDDS-10465
>                 URL: https://issues.apache.org/jira/browse/HDDS-10465
>             Project: Apache Ozone
>          Issue Type: Improvement
>            Reporter: Sammi Chen
>            Assignee: Sammi Chen
>            Priority: Major


