Sorry about the repost from the cbt list, but it was suggested I post here as
well:
I am attempting to track down some performance issues in a recently deployed
Ceph cluster. Our configuration is as follows:
3 storage nodes, each with:
- 8 Cores
- 64GB of RAM
- 2x 1TB 7200 RPM spindles
- 1x 120GB Intel SSD
- 2x 10GBit NICs (In LACP Port-channel)
The OSD pool min_size is set to “1” and “size” is set to “3”. When creating a
new pool and running RADOS benchmarks, performance isn’t bad; it is about what
I would expect from this hardware configuration:
WRITES:
Total writes made: 207
Write size: 4194304
Bandwidth (MB/sec): 80.017
Stddev Bandwidth: 34.9212
Max bandwidth (MB/sec): 120
Min bandwidth (MB/sec): 0
Average Latency: 0.797667
Stddev Latency: 0.313188
Max latency: 1.72237
Min latency: 0.253286
RAND READS:
Total time run: 10.127990
Total reads made: 1263
Read size: 4194304
Bandwidth (MB/sec): 498.816
Average Latency: 0.127821
Max latency: 0.464181
Min latency: 0.0220425
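For reference, the numbers above came from plain rados bench runs against a
test pool, along the lines of the following (the pool name and flags shown
here are illustrative):
# "testpool" is a placeholder pool name
rados bench -p testpool 10 write --no-cleanup
rados bench -p testpool 10 rand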
This all looks fine until we try to use the cluster for its intended purpose,
which is to house images for qemu-kvm, accessed using librbd. I/O inside the
VMs has excessive I/O wait times (in the hundreds of milliseconds at times,
making some operating systems, like Windows, unusable), and throughput
struggles to exceed 10 MB/s. Looking at ceph health, we see very low op/s and
throughput numbers, and the number of blocked requests seems very high. Any
ideas as to what to look at here?
health HEALTH_WARN
8 requests are blocked > 32 sec
monmap e3: 3 mons at
{storage-1=10.0.0.1:6789/0,storage-2=10.0.0.2:6789/0,storage-3=10.0.0.3:6789/0}
election epoch 128, quorum 0,1,2 storage-1,storage-2,storage-3
osdmap e69615: 6 osds: 6 up, 6 in
pgmap v3148541: 224 pgs, 1 pools, 819 GB data, 227 kobjects
2726 GB used, 2844 GB / 5571 GB avail
224 active+clean
client io 3957 B/s rd, 3494 kB/s wr, 30 op/s
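If it helps, I can pull more detail on the blocked requests with commands
along these lines (osd.0 is just an example):
ceph health detail
ceph osd perf
# osd.0 is an example; run against whichever OSD is reporting slow requests
ceph daemon osd.0 dump_historic_ops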
Of note, on the other list, I was asked to provide the following:
- ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)
- The SSD is split into 8GB partitions, which are used as journal
devices, specified in /etc/ceph/ceph.conf. For example:
[osd.0]
host = storage-1
osd journal = /dev/mapper/INTEL_SSDSC2BB120G4_CVWL4363006R120LGNp1
- rbd_cache is enabled and qemu cache is set to “writeback” (a sketch of
the drive definition is below this list)
- rbd_concurrent_management_ops is unset, so it appears the default is
“10”
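To illustrate the caching setup mentioned above, a guest disk attached
through qemu's rbd driver with writeback caching looks roughly like this
(the pool, image, and client id are placeholders):
# rbd/vm-disk-1 and id=admin are placeholders
-drive format=raw,file=rbd:rbd/vm-disk-1:id=admin,cache=writeback,if=virtio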
Thanks,
--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC