If you're running a single client to drive these tests, that's your bottleneck. Try running multiple clients and aggregating their numbers. -Greg
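For instance (a rough sketch, not from the thread — the hostnames and per-client numbers below are hypothetical), you would run the same 4k random-write job from several client hosts at once, sum the per-client results, and compare against a naive linear extrapolation from the 14-OSD baseline:

```python
# Hypothetical per-client 4k randwrite IOPS, e.g. parsed from fio output
# on four separate client hosts driving the cluster concurrently.
per_client_iops = {
    "client1": 23500,
    "client2": 22800,
    "client3": 24100,
    "client4": 23900,
}

def aggregate_iops(results):
    """Cluster-wide IOPS is the sum over all concurrent clients."""
    return sum(results.values())

# Naive linear extrapolation from the thread's 14-OSD baseline (44k
# aggregate IOPS); ignores replication and network limits, so it is an
# upper bound rather than a prediction.
baseline_iops, baseline_osds, target_osds = 44_000, 14, 30
expected = baseline_iops * target_osds / baseline_osds

print(f"measured aggregate: {aggregate_iops(per_client_iops)} IOPS")
print(f"linear-scaling expectation for {target_osds} OSDs: {expected:.0f} IOPS")
```

If the aggregate across clients keeps climbing as you add clients while per-client numbers drop, the cluster wasn't saturated and the single client was the limit.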
On Thursday, October 16, 2014, Mark Wu <[email protected]> wrote:
> Hi list,
>
> During my test, I found Ceph doesn't scale as I expected on a 30-OSD
> cluster. The following is the information of my setup:
>
> HW configuration:
> 15 Dell R720 servers, each with:
>   Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz, 20 cores, hyper-threading
>   enabled
>   128GB memory
>   two Intel 3500 SSD disks, connected to a MegaRAID SAS 2208 controller;
>   each disk is configured as a separate RAID0
>   two 10GbE NICs in a bond, used for both the public network and the
>   cluster network
>
> SW configuration:
> OS CentOS 6.5, kernel 3.17, Ceph 0.86
> XFS as the file system for data.
> Each SSD disk has two partitions: one is the OSD data, the other the OSD
> journal.
> The pool has 2048 PGs, 2 replicas.
> 5 monitors running on 5 of the 15 servers.
> Ceph configuration (in-memory debugging options are disabled):
>
> [osd]
> osd data = /var/lib/ceph/osd/$cluster-$id
> osd journal = /var/lib/ceph/osd/$cluster-$id/journal
> osd mkfs type = xfs
> osd mkfs options xfs = -f -i size=2048
> osd mount options xfs = rw,noatime,logbsize=256k,delaylog
> osd journal size = 20480
> osd mon heartbeat interval = 30
> # Performance tuning
> osd_max_backfills = 10
> osd_recovery_max_active = 15
> filestore merge threshold = 40
> filestore split multiple = 8
> filestore fd cache size = 1024
> osd op threads = 64
> # Recovery tuning
> osd recovery max active = 1
> osd max backfills = 1
> osd recovery op priority = 1
> throttler perf counter = false
> osd enable op tracker = false
> filestore_queue_max_ops = 5000
> filestore_queue_committing_max_ops = 5000
> journal_max_write_entries = 1000
> journal_queue_max_ops = 5000
> objecter_inflight_ops = 8192
>
> When I test with 7 servers (14 OSDs), the maximum 4k random write IOPS I
> saw is 17k on a single volume and 44k on the whole cluster.
> I expected the number for a 30-OSD cluster to approach 90k.
> But unfortunately, I found that with 30 OSDs it provides almost the same
> performance as 14 OSDs, sometimes even worse. I checked the iostat output
> on all the nodes, which show similar numbers. The load is well
> distributed, but disk utilization is low.
> In the test with 14 OSDs, I can see higher disk utilization (80%~90%).
> So do you have any tuning suggestions to improve the performance with 30
> OSDs?
> Any feedback is appreciated.
>
> iostat output:
>
> Device:  rrqm/s   wrqm/s   r/s     w/s      rsec/s   wsec/s     avgrq-sz  avgqu-sz  await  svctm  %util
> sda      0.00     0.00     0.00    0.00     0.00     0.00       0.00      0.00      0.00   0.00   0.00
> sdb      0.00     88.50    0.00    5188.00  0.00     93397.00   18.00     0.90      0.17   0.09   47.85
> sdc      0.00     443.50   0.00    5561.50  0.00     97324.00   17.50     4.06      0.73   0.09   47.90
> dm-0     0.00     0.00     0.00    0.00     0.00     0.00       0.00      0.00      0.00   0.00   0.00
> dm-1     0.00     0.00     0.00    0.00     0.00     0.00       0.00      0.00      0.00   0.00   0.00
>
> Device:  rrqm/s   wrqm/s   r/s     w/s      rsec/s   wsec/s     avgrq-sz  avgqu-sz  await  svctm  %util
> sda      0.00     17.50    0.00    28.00    0.00     3948.00    141.00    0.01      0.29   0.05   0.15
> sdb      0.00     69.50    0.00    4932.00  0.00     87067.50   17.65     2.27      0.46   0.09   43.45
> sdc      0.00     69.00    0.00    4855.50  0.00     105771.50  21.78     0.95      0.20   0.10   46.40
> dm-0     0.00     0.00     0.00    0.00     0.00     0.00       0.00      0.00      0.00   0.00   0.00
> dm-1     0.00     0.00     0.00    42.50    0.00     3948.00    92.89     0.01      0.19   0.04   0.15
>
> Device:  rrqm/s   wrqm/s   r/s     w/s      rsec/s   wsec/s     avgrq-sz  avgqu-sz  await  svctm  %util
> sda      0.00     12.00    0.00    8.00     0.00     568.00     71.00     0.00      0.12   0.12   0.10
> sdb      0.00     72.50    0.00    5046.50  0.00     113198.50  22.43     1.09      0.22   0.10   51.40
> sdc      0.00     72.50    0.00    4912.00  0.00     91204.50   18.57     2.25      0.46   0.09   43.60
> dm-0     0.00     0.00     0.00    0.00     0.00     0.00       0.00      0.00      0.00   0.00   0.00
> dm-1     0.00     0.00     0.00    18.00    0.00     568.00     31.56     0.00      0.17   0.06   0.10
>
> Regards,
> Mark Wu

-- 
Software Engineer #42 @ http://inktank.com | http://ceph.com
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
