[ceph-users] Ceph performance expectations

Sergio A. de Carvalho Jr. Thu, 07 Apr 2016 05:01:51 -0700

Hi all,

I've set up a testing/development Ceph cluster consisting of 5 Dell
PowerEdge R720xd servers (256GB RAM, 2x 8-core Xeon E5-2650 @ 2.60 GHz,
dual-port 10Gb Ethernet, 2x 900GB + 12x 4TB disks) running CentOS 6.5 and
Ceph Hammer 0.94.6. All servers use one 900GB disk for the root partition
and the other 13 disks are assigned to OSDs, so we have 5 x 13 = 65 OSDs in
total. We also run 1 monitor on every host. Journals are 5GB partitions on
each disk (this is something we obviously will need to revisit later). The
purpose of this cluster will be to serve as backend storage for Cinder
volumes and Glance images in an OpenStack cloud.
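
As a rough sanity check on what this hardware could deliver, here is a
back-of-the-envelope sketch in Python. Only the OSD count comes from the
setup above. The replication factor of 3 (the usual default pool size in
Hammer), the 2x write penalty for FileStore journals sharing a disk with
their data, and the 100 MB/s per-disk figure are all assumptions to be
replaced with measured values. The function name is made up for
illustration.

```python
def write_ceiling_mb_s(num_osds, disk_mb_s, replicas, journal_factor):
    """Very rough upper bound on aggregate client write throughput.

    Every client byte is written `replicas` times, and each replica hits
    its disk `journal_factor` times (journal write plus data write).
    """
    return num_osds * disk_mb_s / (replicas * journal_factor)

NUM_OSDS = 65          # from the setup described above
DISK_MB_S = 100.0      # hypothetical sequential rate of one 4TB disk
REPLICAS = 3           # assumed pool size, not stated in this post
JOURNAL_FACTOR = 2     # assumed: journal and data share the same disk

ceiling = write_ceiling_mb_s(NUM_OSDS, DISK_MB_S, REPLICAS, JOURNAL_FACTOR)
print(f"Rough aggregate write ceiling: {ceiling:.0f} MB/s")
```

This ignores network limits, seek overhead, and uneven data placement, so
it should be read as an optimistic bound rather than a prediction.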


With this setup, I'm getting what I'd consider "okay" performance:

# rados -p images bench 5 write
 Maintaining 16 concurrent writes of 4194304 bytes for up to 5 seconds or 0 objects

Total writes made:      394
Write size:             4194304
Bandwidth (MB/sec):     299.968

Stddev Bandwidth:       127.334
Max bandwidth (MB/sec): 348
Min bandwidth (MB/sec): 0
Average Latency:        0.212524
Stddev Latency:         0.13317
Max latency:            0.828946
Min latency:            0.0707341

Does that look acceptable? How much more can I expect to achieve by
fine-tuning and perhaps using a more efficient setup?

I do understand the bandwidth above is a product of running 16 concurrent
writes with rather small objects (4MB). Bandwidth drops significantly
with 64MB objects and 1 thread:

# rados -p images bench 5 write -b 67108864 -t 1
 Maintaining 1 concurrent writes of 67108864 bytes for up to 5 seconds or 0 objects

Total writes made:      7
Write size:             67108864
Bandwidth (MB/sec):     71.520

Stddev Bandwidth:       24.1897
Max bandwidth (MB/sec): 64
Min bandwidth (MB/sec): 0
Average Latency:        0.894792
Stddev Latency:         0.0547502
Max latency:            0.99311
Min latency:            0.832765

Is such a drop expected?
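
One way to read the two runs above: with a fixed number of writes in
flight, throughput should be roughly concurrency x object size / average
latency (Little's law). The Python sketch below plugs in only the figures
reported by rados bench above. It assumes the tool counts 1 MB as 2**20
bytes, and the function name is made up for illustration.

```python
def expected_bandwidth_mb_s(concurrency, object_bytes, avg_latency_s):
    """Estimate throughput in MB/s, taking 1 MB = 2**20 bytes (assumed)."""
    object_mb = object_bytes / 2**20
    return concurrency * object_mb / avg_latency_s

# 16 concurrent writes of 4 MB objects, average latency 0.212524 s
run_16x4mb = expected_bandwidth_mb_s(16, 4194304, 0.212524)
# 1 write in flight, 64 MB objects, average latency 0.894792 s
run_1x64mb = expected_bandwidth_mb_s(1, 67108864, 0.894792)

print(f"16 x 4 MB : {run_16x4mb:.1f} MB/s (bench reported 299.968)")
print(f" 1 x 64 MB: {run_1x64mb:.1f} MB/s (bench reported 71.520)")
```

Both estimates land within about 1% of the reported bandwidth, which
suggests the single-threaded run is limited by per-operation latency
rather than by the aggregate bandwidth of the cluster.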

Now, what I'm really concerned about is upload times. Uploading a
randomly-generated 1GB file takes a bit too long:

# time rados -p images put random_1GB /tmp/random_1GB

real 0m35.328s
user 0m0.560s
sys 0m3.665s

Is this normal? If so, if I set up this cluster as a backend for Glance,
does that mean uploading a 1GB image will require 35 seconds (plus whatever
time Glance requires to do its own thing)?
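
To put the upload time next to the benchmark figures, the Python sketch
below converts the elapsed time into an effective rate and shows how long
the same file would take at each rate rados bench reported. It assumes
"1GB" means 2**30 bytes and 1 MB means 2**20 bytes. The bench rates are
only reference points, and a single upload stream may not be able to
reach them.

```python
FILE_BYTES = 2**30       # assumption: the 1GB test file is exactly 1 GiB
PUT_ELAPSED_S = 35.328   # "real" time reported for the rados put above

def throughput_mb_s(size_bytes, elapsed_s):
    """Effective transfer rate in MB/s, taking 1 MB = 2**20 bytes."""
    return size_bytes / 2**20 / elapsed_s

def seconds_at_rate(size_bytes, rate_mb_s):
    """Time needed to move size_bytes at a given rate in MB/s."""
    return size_bytes / 2**20 / rate_mb_s

put_rate = throughput_mb_s(FILE_BYTES, PUT_ELAPSED_S)
print(f"rados put effective rate: {put_rate:.1f} MB/s")

# Rates taken from the two rados bench runs earlier in this post
bench_rates = {"16 x 4 MB": 299.968, "1 x 64 MB": 71.520}
for label, rate in bench_rates.items():
    secs = seconds_at_rate(FILE_BYTES, rate)
    print(f"at the {label} bench rate: {secs:.1f} s")
```

Under these assumptions the upload runs at under half the rate of even
the single-threaded 64MB benchmark.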

Thanks,

Sergio
