Hi all,
I'm trying to identify the performance bottlenecks in my experimental Ceph
cluster. A little background on my setup:
10 storage servers, each configured with:
-(2) dual-core Opterons
-8 GB of RAM
-(6) 750GB disks (1 OSD per disk, 7200 RPM SATA, probably 4-5 years
old), JBOD w/ BTRFS
-1GbE
-CentOS 6.4, custom kernel 3.7.8
1 dedicated mds/mon server
-same specs as the OSD nodes
(2 more dedicated mons are waiting in the wings; Ceph was recently reinstalled on them)
1 front-facing node mounting CephFS, with a 10GbE connection into the
switch stack housing the storage machines
-CentOS 6.4, custom kernel 3.7.8
Some Ceph settings:
[osd]
osd journal size = 1000
filestore xattr use omap = true
When I transfer files in/out via CephFS (from the 10GbE host), I'm seeing
only about 230 MB/s at peak. First, is this what I should expect? Given 60 OSDs
spread across 10 servers, I would have thought I'd get something closer to
400-500 MB/s or more. I tried upping the number of placement groups to 3000 for
my 'data' pool (following the formula here:
http://ceph.com/docs/master/rados/operations/placement-groups/) with no
increase in performance. I also saw no performance difference between XFS and
BTRFS.
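For reference, here's how I arrived at the PG count (a quick Python sketch of the formula from that doc page; the replica size of 2 is an assumption I haven't re-checked against my actual pools):

```python
import math

def recommended_pgs(num_osds, replicas, pgs_per_osd=100):
    # (OSDs * ~100 PGs per OSD) / replica count, rounded up to the
    # nearest power of two, per the placement-groups doc linked above.
    raw = num_osds * pgs_per_osd / replicas
    return 2 ** math.ceil(math.log2(raw))

print(recommended_pgs(60, 2))  # 60*100/2 = 3000, rounds up to 4096
```

Interestingly, rounding up to a power of two would give 4096 rather than the 3000 I actually set, though I doubt that gap explains the throughput.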
I also see a lot of messages like this in the log:
10.1.6.4:6815/30138 3518 : [WRN] slow request 30.874441 seconds old, received
at 2013-07-31 10:52:49.721518: osd_op(client.7763.1:67060 100000003ba.000013d4
[write 0~4194304] 0.102b9365 RETRY=-1 snapc 1=[] e1454) currently waiting for
subops from [1]
Does anyone have any thoughts as to what the bottleneck may be, if there is
one? Or, any idea what I should try to measure to determine the bottleneck?
Perhaps my disks are just that bad? :)
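For what it's worth, here's the back-of-envelope arithmetic I've been doing (Python sketch; the 2x replication and the journal-on-the-same-disk write doubling are assumptions about my setup that I haven't verified):

```python
def per_disk_write(client_mb_s, replicas, journal_factor, num_osds):
    # Each client byte is written once per replica, and the filestore
    # journal doubles that again if it shares the OSD's disk.
    return client_mb_s * replicas * journal_factor / num_osds

# 230 MB/s observed client throughput, assumed 2x replication,
# 2x journal amplification, 60 OSDs
print(per_disk_write(230, 2, 2, 60))  # ~15.3 MB/s per disk
```

Even old 7200 RPM SATA disks should manage far more than ~15 MB/s sequentially, which makes me suspect either that the writes aren't landing sequentially or that a few slow OSDs (like the one behind the "waiting for subops from [1]" warning) are dragging everything down.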
Cheers,
Lincoln

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com