Hi everyone,
We have a deployment of 90 OSDs at the moment, all SSD, that in my opinion
isn't hitting the performance it should. A `rados bench` run gives numbers
along these lines:
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304
for up to 10 seconds or 0 objects
Object prefix: benchmark_data_bench.vexxhost._30340
  sec  Cur ops  started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    0        0        0         0         0         0            -          0
    1       16      158       142   568.513       568    0.0965336  0.0939971
    2       16      287       271   542.191       516    0.0291494   0.107503
    3       16      375       359    478.75       352    0.0892724   0.118463
    4       16      477       461   461.042       408    0.0243493   0.126649
    5       16      540       524   419.216       252     0.239123   0.132195
    6       16      644       628    418.67       416     0.347606   0.146832
    7       16      734       718   410.281       360    0.0534447   0.147413
    8       16      811       795   397.487       308    0.0311927    0.15004
    9       16      879       863   383.537       272    0.0894534   0.158513
   10       16      980       964   385.578       404    0.0969865   0.162121
   11        3      981       978   355.613        56     0.798949   0.171779
Total time run: 11.063482
Total writes made: 981
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 354.68
Stddev Bandwidth: 137.608
Max bandwidth (MB/sec): 568
Min bandwidth (MB/sec): 56
Average IOPS: 88
Stddev IOPS: 34
Max IOPS: 142
Min IOPS: 14
Average Latency(s): 0.175273
Stddev Latency(s): 0.294736
Max latency(s): 1.97781
Min latency(s): 0.0205769
Cleaning up (deleting benchmark objects)
Clean up completed and total clean up time :3.895293
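As a rough sanity check on those totals, it's worth working out what the bench
result implies per drive. Assuming a replicated pool with size=3 (an assumption
on my part, not stated above; adjust for your pool), the raw write load landing
on each OSD is tiny:

```shell
# Per-OSD write load implied by the rados bench run above.
# ASSUMPTION: replicated pool with size=3; the 354.68 MB/s figure is the
# average client bandwidth reported by rados bench.
client_mbs=354.68
replicas=3
osds=90
awk -v c="$client_mbs" -v r="$replicas" -v n="$osds" \
    'BEGIN { printf "per-OSD write load: %.1f MB/s\n", c * r / n }'
```

At roughly 12 MB/s of raw writes per SSD, the drives are nowhere near their
throughput limit, which suggests the bottleneck is latency somewhere in the
write path rather than raw bandwidth.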
We've verified the network by running `iperf` across both the replication and
public networks, and both hit 9.8 Gb/s (10G links for each). The machine
running the benchmark doesn't even saturate its port. The SSDs are S3520 960GB
drives which we've benchmarked with fio and similar tools, and they can handle
the load. At this point I'm not really sure where to look next; is anyone
running an all-SSD cluster who might be able to share their experience?
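One thing worth double-checking: Ceph's journal writes are synchronous
(O_DSYNC), and some SSDs that look fast under ordinary buffered fio runs slow
down dramatically under that pattern. A fio job along these lines exercises it
(the device path is a placeholder; this job WRITES to the target, so point it
at a scratch device only):

```ini
; Sketch of a sync-write fio job, the pattern Ceph journaling stresses,
; rather than ordinary buffered throughput. /dev/sdX is a placeholder.
[global]
ioengine=libaio
direct=1
sync=1
runtime=60
time_based

[journal-style-write]
filename=/dev/sdX
rw=write
bs=4k
iodepth=1
numjobs=1
```

If the per-drive sync-write numbers from a job like this are much lower than
the buffered figures, that gap would go a long way toward explaining the
cluster-wide result.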
Thanks,
Mohammed
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com