cndaimin commented on pull request #3842:
URL: https://github.com/apache/hadoop/pull/3842#issuecomment-1012833910
@jojochuang @fapifta Thanks for the review.
We use `fio` with 60 threads to do random reads on files under the test
directory and measure performance by read IOPS. The test steps are as follows:
- Prepare the test files first, to speed up the following random read run.
- Drop the page cache on both the client and the datanode servers.
- Run the random read test with 60 threads.
The test scripts:
```
# Prepare test files to speed up the next random reads.
fio -iodepth=32 -rw=write -ioengine=libaio -bs=4096k -size=1G -direct=0 \
    -runtime=600 -directory=/mnt/dfs/iotest -numjobs=60 -thread -group_reporting \
    -name=i
# Drop page cache of both client and datanode servers.
pssh -h ~/hosts -t 0 -i "sync && echo 1 > /proc/sys/vm/drop_caches"
# Do the random read test.
fio -iodepth=1 -rw=randread -ioengine=libaio -bs=512 -size=1G -direct=0 \
    -runtime=120 -directory=/mnt/dfs/iotest -numjobs=60 -thread -group_reporting \
    -name=i
```
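Before each run it can help to confirm which `max_background` value every FUSE connection currently has. The following is only a minimal sketch, assuming the FUSE control filesystem is mounted at `/sys/fs/fuse/connections` and that it is run as root; `list_max_background` is a hypothetical helper name used for illustration, not part of fuse_dfs.

```shell
# Print "<connection-id> <max_background>" for every FUSE connection.
# The optional first argument overrides the control directory
# (an assumption added here so the helper can be tried on a scratch directory).
list_max_background() {
  lmb_root="${1:-/sys/fs/fuse/connections}"
  for lmb_file in "$lmb_root"/*/max_background; do
    # Skip the unexpanded glob and files we are not allowed to read.
    [ -r "$lmb_file" ] || continue
    printf '%s %s\n' "$(basename "$(dirname "$lmb_file")")" "$(cat "$lmb_file")"
  done
}
```

Calling `list_max_background` with no argument reads the real control directory; passing a path makes it read from there instead.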
And the test results:
- With default `max_background`, which is 12
```
# cat /sys/fs/fuse/connections/55/max_background
12
# fio -iodepth=1 -rw=randread -ioengine=libaio -bs=512 -size=1G -direct=0 \
    -runtime=120 -directory=/mnt/dfs/iotest -numjobs=60 -thread -group_reporting \
    -name=i
i: (g=0): rw=randread, bs=(R) 512B-512B, (W) 512B-512B, (T) 512B-512B, ioengine=libaio, iodepth=1
...
fio-3.7
Starting 60 threads
Jobs: 60 (f=60): [r(60)][100.0%][r=722KiB/s,w=0KiB/s][r=1444,w=0 IOPS][eta 00m:00s]]
i: (groupid=0, jobs=60): err= 0: pid=13143: Fri Jan 14 14:34:35 2022
read: IOPS=1365, BW=683KiB/s (699kB/s)(80.0MiB/120043msec)
slat (nsec): min=1868, max=331615k, avg=43862093.94, stdev=10900971.93
clat (nsec): min=615, max=234495, avg=2124.79, stdev=1111.62
lat (usec): min=2, max=331618, avg=43865.27, stdev=10901.06
clat percentiles (nsec):
| 1.00th=[ 1176], 5.00th=[ 1416], 10.00th=[ 1528], 20.00th=[ 1704],
| 30.00th=[ 1864], 40.00th=[ 1992], 50.00th=[ 2064], 60.00th=[ 2160],
| 70.00th=[ 2256], 80.00th=[ 2384], 90.00th=[ 2576], 95.00th=[ 2768],
| 99.00th=[ 3408], 99.50th=[ 7968], 99.90th=[16064], 99.95th=[19584],
| 99.99th=[27008]
bw ( KiB/s): min= 0, max= 16, per=1.66%, avg=11.34, stdev= 1.25, samples=14398
iops : min= 1, max= 32, avg=22.72, stdev= 2.48, samples=14398
lat (nsec) : 750=0.34%, 1000=0.10%
lat (usec) : 2=40.90%, 4=57.85%, 10=0.51%, 20=0.25%, 50=0.05%
lat (usec) : 100=0.01%, 250=0.01%
cpu : usr=0.02%, sys=0.04%, ctx=324069, majf=0, minf=60
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=163917,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=683KiB/s (699kB/s), 683KiB/s-683KiB/s (699kB/s-699kB/s), io=80.0MiB (83.9MB), run=120043-120043msec
```
- With `-omax_background=100`
```
# cat /sys/fs/fuse/connections/55/max_background
100
# fio -iodepth=1 -rw=randread -ioengine=libaio -bs=512 -size=1G -direct=0 \
    -runtime=120 -directory=/mnt/dfs/iotest -numjobs=60 -thread -group_reporting \
    -name=i
i: (g=0): rw=randread, bs=(R) 512B-512B, (W) 512B-512B, (T) 512B-512B, ioengine=libaio, iodepth=1
...
fio-3.7
Starting 60 threads
Jobs: 60 (f=60): [r(60)][100.0%][r=1768KiB/s,w=0KiB/s][r=3536,w=0 IOPS][eta 00m:00s]]
i: (groupid=0, jobs=60): err= 0: pid=12582: Fri Jan 14 14:31:25 2022
read: IOPS=3569, BW=1785KiB/s (1828kB/s)(209MiB/120037msec)
slat (nsec): min=1576, max=708718k, avg=16797865.36, stdev=16920653.78
clat (nsec): min=603, max=343984, avg=2023.56, stdev=1829.69
lat (usec): min=2, max=708721, avg=16800.82, stdev=16920.78
clat percentiles (nsec):
| 1.00th=[ 748], 5.00th=[ 1224], 10.00th=[ 1400], 20.00th=[ 1592],
| 30.00th=[ 1736], 40.00th=[ 1848], 50.00th=[ 1928], 60.00th=[ 2008],
| 70.00th=[ 2096], 80.00th=[ 2192], 90.00th=[ 2352], 95.00th=[ 2512],
| 99.00th=[ 7648], 99.50th=[11968], 99.90th=[21632], 99.95th=[25984],
| 99.99th=[39168]
bw ( KiB/s): min= 6, max= 54, per=1.67%, avg=29.71, stdev= 6.19, samples=14397
iops : min= 12, max= 108, avg=59.46, stdev=12.37, samples=14397
lat (nsec) : 750=0.99%, 1000=0.80%
lat (usec) : 2=57.08%, 4=39.87%, 10=0.60%, 20=0.53%, 50=0.13%
lat (usec) : 100=0.01%, 250=0.01%, 500=0.01%
cpu : usr=0.04%, sys=0.08%, ctx=423468, majf=0, minf=60
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=428483,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=1785KiB/s (1828kB/s), 1785KiB/s-1785KiB/s (1828kB/s-1828kB/s), io=209MiB (219MB), run=120037-120037msec
```
In our test, setting `max_background` to 100 improved read IOPS from 1365 to
3569, roughly 2.6x. When resources such as CPU and memory are sufficient (which
is generally the case), there seem to be no side effects from setting a larger
value. We have been running with `-omax_background=100` in our production
environment for months without problems. @fapifta
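As a rough sanity check on these numbers (not part of the test itself): with 60 threads each keeping one request in flight (`iodepth=1`), the expected IOPS is approximately the thread count divided by the average total latency. The sketch below applies that to the `lat` averages reported by fio above; the function names `speedup` and `expected_iops` are made up for illustration.

```shell
# Ratio of two IOPS figures, printed with two decimals.
speedup() {
  awk -v before="$1" -v after="$2" 'BEGIN { printf "%.2f\n", after / before }'
}

# Estimated IOPS for N threads with one outstanding request each,
# given the average total latency in microseconds (the fio "lat (usec)" avg).
expected_iops() {
  awk -v threads="$1" -v lat_usec="$2" \
    'BEGIN { printf "%.0f\n", threads * 1000000 / lat_usec }'
}

speedup 1365 3569            # measured IOPS before and after
expected_iops 60 43865.27    # max_background=12,  lat avg=43865.27 usec
expected_iops 60 16800.82    # max_background=100, lat avg=16800.82 usec
```

The two estimates come out at about 1368 and 3571, within a few IOPS of the measured 1365 and 3569. This is consistent with the gain coming mainly from lower submission latency: the `slat` average dropped from about 43.9 ms to about 16.8 ms, while `clat` stayed near 2 usec in both runs.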
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]