Hi all,
I ran some benchmarks with fio using a 4K block size, and I think I am
hitting a performance problem: I can hardly imagine that the IOPS
should be this low...
My setup:
- 4 servers, HP DL360 G7:
- E5606 / 2.13 GHz / 4 cores
- 6GB RAM
- root fs: HP 72GB 15K SAS RAID 1
- Controller in JBOD mode with writeback enabled
- 3 OSDs per server = 11 in total
- 3 MONs
- OSD disk: 600GB 10K SAS on XFS mounted with the following options:
* rw,noexec,nodev,noatime,nodiratime,barrier=0
- Ubuntu 12.04.1 LTS
- Ceph 0.48.2
- journals stored on an SSD (journal as a file, not on a block
device), with over-provisioning; the SSD is an OCZ Vertex 4
- pg num 450 for each pool
- replica count of 2
- network:
- 1 GbE
- separate network for client and replication
- no network bottleneck; verified with iperf (invocation sketched below)
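For reference, the network was checked roughly like this with iperf
(a sketch; <server-ip> is a placeholder, not the exact invocation used):
# iperf -s                      (on the receiving node)
# iperf -c <server-ip> -t 30    (on the sending node, for each network)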
Relevant ceph.conf settings (collected into a config sketch after the list):
- auth supported = none
- osd journal size = 2048
- osd op threads = 24
- osd disk threads = 24
- filestore op threads = 6
- filestore queue max ops = 24
- filestore_flusher = false
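For clarity, here is roughly how those settings sit in ceph.conf (a
sketch; the section placement is my assumption, the values are the ones
listed above):
[global]
        auth supported = none
[osd]
        osd journal size = 2048
        osd op threads = 24
        osd disk threads = 24
        filestore op threads = 6
        filestore queue max ops = 24
        filestore flusher = false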
RADOS bench (writes) with default options (4 MB objects, 16 concurrent ops):
2012-10-31 22:46:54.133042 min lat: 0.088034 max lat: 2.64786 avg lat: 0.425305
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
100 16 3767 3751 150.016 152 0.19291 0.425305
Total time run: 100.326526
Total writes made: 3767
Write size: 4194304
Bandwidth (MB/sec): 150.190
Stddev Bandwidth: 15.8426
Max bandwidth (MB/sec): 200
Min bandwidth (MB/sec): 108
Average Latency: 0.425902
Stddev Latency: 0.322846
Max latency: 2.64786
Min latency: 0.088034
For reference, a dd with a 1G block size shows 110 MB/s with direct
I/O. It is neither very relevant nor a real-life scenario, but it is a
data point...
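The dd run was roughly the following (a sketch; <target> stands in for
the actual device/file, which I have not reproduced verbatim here):
# dd if=/dev/zero of=<target> bs=1G count=1 oflag=direct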
RADOS bench with a 4K write size:
# rados -p bench bench 300 write -b 4096 -t 32 --no-cleanup
2012-11-13 09:38:44.485547 min lat: 0.001807 max lat: 2.77526 avg lat: 0.0423748
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
300 31 226546 226515 2.94867 6.35156 0.003276 0.0423748
Total time run: 300.108349
Total writes made: 226546
Write size: 4096
Bandwidth (MB/sec): 2.949
Stddev Bandwidth: 1.93903
Max bandwidth (MB/sec): 12.2188
Min bandwidth (MB/sec): 0.015625
Average Latency: 0.0423857
Stddev Latency: 0.130588
Max latency: 2.77526
Min latency: 0.001807
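To put that bandwidth into IOPS: 2.949 MB/s at a 4 KB write size is
about 2.949 * 1024 / 4 ~= 755 client write ops per second, and with a
replica count of 2 roughly 1500 backend object writes per second across
the OSDs in total.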
Then the seq run (reading back the objects written above):
# rados -p bench bench 300 seq -b 4096 -t 32
2012-11-13 09:40:09.465216 min lat: 0.000678 max lat: 0.226029 avg lat: 0.00714306
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
40 31 179179 179148 17.4924 32.3945 0.00188 0.00714306
41 31 188937 188906 17.9953 38.1172 0.001151 0.0069414
42 32 196223 196191 18.2443 28.457 0.001257 0.00684898
43 32 205245 205213 18.638 35.2422 0.001429 0.00670422
44 31 214193 214162 19.0088 34.957 0.001485 0.00657379
45 31 223028 222997 19.3532 34.5117 0.001287 0.00645649
Total time run: 45.368758
Total reads made: 226546
Read size: 4096
Bandwidth (MB/sec): 19.506
Average Latency: 0.00640665
Max latency: 0.226029
Min latency: 0.000672
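Same conversion for the read side: 19.506 MB/s at 4 KB is about
19.506 * 1024 / 4 ~= 4990 read ops per second.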
fio template used to benchmark the RBD device:
[global]
ioengine=libaio
iodepth=100
size=1g
direct=1
runtime=60
filename=/dev/rbd2
[seq-read]
rw=read
bs=4M
stonewall
[rand-read]
rw=randread
bs=4k
stonewall
[seq-write]
rw=write
bs=4M
stonewall
[rand-write]
rw=randwrite
bs=4K
stonewall
Results:
fio rbd-bench.fio
seq-read: (g=0): rw=read, bs=4M-4M/4M-4M, ioengine=libaio, iodepth=100
rand-read: (g=1): rw=randread, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=100
seq-write: (g=2): rw=write, bs=4M-4M/4M-4M, ioengine=libaio, iodepth=100
rand-write: (g=3): rw=randwrite, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=100
fio 1.59
Starting 4 processes
Jobs: 1 (f=1): [___w] [75.0% done] [0K/0K /s] [0 /0 iops] [eta 00m:33s]
seq-read: (groupid=0, jobs=1): err= 0: pid=6302
read : io=1024.0MB, bw=104879KB/s, iops=25 , runt= 9998msec
slat (usec): min=298 , max=409745 , avg=36384.06, stdev=68708.48
clat (msec): min=681 , max=5488 , avg=3383.33, stdev=1108.83
lat (msec): min=682 , max=5637 , avg=3419.71, stdev=1109.07
bw (KB/s) : min= 0, max=114975, per=8.97%, avg=9410.35, stdev=29174.91
cpu : usr=0.00%, sys=2.28%, ctx=1644, majf=0, minf=102423
IO depths : 1=0.4%, 2=0.8%, 4=1.6%, 8=3.1%, 16=6.2%, 32=12.5%, >=64=75.4%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=99.4%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.6%
issued r/w/d: total=256/0/0, short=0/0/0
lat (msec): 750=1.56%, 1000=3.12%, 2000=9.38%, >=2000=85.94%
rand-read: (groupid=1, jobs=1): err= 0: pid=6547
read : io=1024.0MB, bw=65263KB/s, iops=16315 , runt= 16067msec
slat (usec): min=11 , max=244 , avg=24.82, stdev= 6.13
clat (usec): min=487 , max=44231 , avg=6100.85, stdev=6567.64
lat (usec): min=518 , max=44254 , avg=6125.99, stdev=6567.76
bw (KB/s) : min=29232, max=98960, per=100.23%, avg=65413.25, stdev=29122.39
cpu : usr=5.83%, sys=32.89%, ctx=346477, majf=0, minf=122
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued r/w/d: total=262144/0/0, short=0/0/0
lat (usec): 500=0.01%, 750=2.55%, 1000=5.85%
lat (msec): 2=25.49%, 4=27.47%, 10=14.39%, 20=19.13%, 50=5.12%
seq-write: (groupid=2, jobs=1): err= 0: pid=6845
write: io=1024.0MB, bw=114386KB/s, iops=27 , runt= 9167msec
slat (usec): min=449 , max=187559 , avg=33082.90, stdev=59961.36
clat (msec): min=695 , max=5848 , avg=3062.17, stdev=948.73
lat (msec): min=696 , max=5848 , avg=3095.26, stdev=948.11
bw (KB/s) : min= 0, max=134945, per=8.89%, avg=10166.26, stdev=32757.63
cpu : usr=1.09%, sys=0.61%, ctx=195, majf=0, minf=21
IO depths : 1=0.4%, 2=0.8%, 4=1.6%, 8=3.1%, 16=6.2%, 32=12.5%, >=64=75.4%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=99.4%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.6%
issued r/w/d: total=0/256/0, short=0/0/0
lat (msec): 750=1.56%, 1000=3.12%, 2000=10.94%, >=2000=84.38%
rand-write: (groupid=3, jobs=1): err= 0: pid=7054
write: io=189480KB, bw=3053.3KB/s, iops=763 , runt= 62063msec
slat (usec): min=11 , max=250 , avg=50.78, stdev=11.57
clat (msec): min=1 , max=4592 , avg=130.92, stdev=388.57
lat (msec): min=1 , max=4592 , avg=130.97, stdev=388.57
bw (KB/s) : min= 0, max=10408, per=58.69%, avg=1791.67, stdev=2133.80
cpu : usr=0.49%, sys=2.54%, ctx=80620, majf=0, minf=19
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=99.9%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued r/w/d: total=0/47370/0, short=0/0/0
lat (msec): 2=22.08%, 4=43.68%, 10=2.50%, 20=1.72%, 50=5.01%
lat (msec): 100=6.45%, 250=6.44%, 500=3.89%, 750=2.70%, 1000=1.76%
lat (msec): 2000=2.83%, >=2000=0.93%
Run status group 0 (all jobs):
READ: io=1024.0MB, aggrb=104878KB/s, minb=107395KB/s, maxb=107395KB/s, mint=9998msec, maxt=9998msec
Run status group 1 (all jobs):
READ: io=1024.0MB, aggrb=65262KB/s, minb=66829KB/s, maxb=66829KB/s, mint=16067msec, maxt=16067msec
Run status group 2 (all jobs):
WRITE: io=1024.0MB, aggrb=114385KB/s, minb=117131KB/s, maxb=117131KB/s, mint=9167msec, maxt=9167msec
Run status group 3 (all jobs):
WRITE: io=189480KB, aggrb=3053KB/s, minb=3126KB/s, maxb=3126KB/s, mint=62063msec, maxt=62063msec
Disk stats (read/write):
rbd2: ios=264358/49408, merge=0/0, ticks=2983236/7311972, in_queue=10315204, util=98.99%
The RBD device was mapped on a client machine connected to the Ceph
cluster via the public network.
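For completeness, the image was created and mapped roughly like this
(a sketch; pool name, image name and size are placeholders, not the
ones actually used):
# rbd create <image> --size <size-in-MB> --pool <pool>
# rbd map <pool>/<image>
# ls /dev/rbd*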
If you need more information, please ask.
Thanks in advance. Performance Gurus, it's all yours :)
Cheers!