What Ceph version are you using?
It seems the clients are not sending enough traffic to the cluster.
Could you try with rbd_cache=false or true and see whether the behavior changes?
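For reference, a minimal client-side ceph.conf fragment for toggling the librbd cache might look like the sketch below; the option names are the standard librbd settings, but the sizes shown are only illustrative (they mirror the usual defaults) and should be tuned for your workload:

```ini
[client]
; librbd client-side cache; flip to false to compare behavior
rbd cache = true
; illustrative sizes -- roughly the usual defaults (32 MiB cache, 24 MiB max dirty)
rbd cache size = 33554432
rbd cache max dirty = 25165824
; stay in writethrough until the guest issues its first flush
rbd cache writethrough until flush = true
```

Note that for a KVM guest the QEMU drive cache mode also matters: cache=writeback is generally needed for the RBD cache to be effective.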
What is the client-side CPU utilization?
Performance also depends on the queue depth (QD) you are driving with.
I would suggest running fio inside the VM with a similar workload at both high
and low QD to isolate the issue.
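As a starting point, a fio job file along these lines could be run inside the VM; the filename, block size, and runtime are assumptions to be adapted to the actual snapshot-receive workload, and the point is simply to compare the same job at QD 1 versus QD 32:

```ini
; illustrative fio job -- run inside the VM against a scratch file
[global]
ioengine=libaio
direct=1
rw=randwrite
bs=128k
size=4g
runtime=60
time_based=1
filename=/mnt/fio-testfile

[low-qd]
iodepth=1
stonewall

[high-qd]
iodepth=32
stonewall
```

If high QD scales throughput well but QD 1 does not, the bottleneck is per-request latency rather than cluster capacity.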

Thanks & Regards
Somnath

-----Original Message-----
From: ceph-users [mailto:[email protected]] On Behalf Of J David
Sent: Wednesday, April 22, 2015 11:42 AM
To: [email protected]
Subject: [ceph-users] Having trouble getting good performance

A very small 3-node Ceph cluster with this OSD tree:

http://pastebin.com/mUhayBk9

has some performance issues.  All 27 OSDs are 5TB SATA drives, it keeps two 
copies of everything, and it's really only intended for nearline backups of 
large data objects.

All of the OSDs look OK in terms of utilization.  iostat reports them around 
5-10 %util with occasional spikes up to 20%, typically doing 5-20 IOPS 
each.  (Which, if we figure 100 IOPS per drive, matches up nicely with the 
%util.)

The Ceph nodes also look healthy.  CPU is 75-90% idle.  Plenty of RAM (the 
smaller node has 32GiB, the larger ones have 64GiB).  They have dual 10GBase-T 
NICs on an unloaded, dedicated storage switch.

The pool has 4096 placement groups.  Currently we have noscrub & nodeep-scrub 
set to eliminate scrubbing as a source of performance problems.

When we increased placement groups from the default to 4096, a ton of data 
moved, and quickly; the cluster is definitely capable of pushing data around at 
multi-gigabit speeds when it wants to.

Yet this cluster has (currently) one client, a KVM machine backed by a 32TB RBD 
image.  It is a Linux VM running ZFS that receives ZFS snapshots from other 
machines to back them up.  However, the performance is pretty bad.  When it is 
receiving even one snapshot, iostat -x on the client reports it is choking on 
I/O to the Ceph RBD image: 99.88 %util while doing 50-75 IOPS and 5-20 MB/sec 
of throughput.

Is there anything we can do to improve the performance of this configuration, 
or at least figure out why it is so bad?  These are large SATA drives with no 
SSD OSDs, so we don't expect miracles.  But it sure would be nice to see 
client I/O better than 50 IOPS and 20MB/sec.

Thanks in advance for any help or guidance on this!
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
