Hello, I've been trying to nail down a nasty performance issue related to scrubbing. I am mostly using radosgw, with a handful of buckets containing millions of variously sized objects. When Ceph scrubs (both regular and deep), radosgw blocks on external requests, and my cluster reports a bunch of requests that have been blocked for > 32 seconds. OSDs are frequently marked down as well.
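To confirm that scrubbing really is the trigger, the first thing I plan to try is pausing scrubs for a while and watching whether the blocked requests clear up. Assuming I have the standard cluster flags right, that would look something like:

    # stop new scrubs from being scheduled (any in-flight scrub still finishes)
    ceph osd set noscrub
    ceph osd set nodeep-scrub

    # watch whether the slow/blocked requests drain away
    ceph health detail | grep -i blocked

    # re-enable scrubbing afterwards
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub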
According to atop, the OSDs being deep scrubbed are reading at only 5 MB/s to 8 MB/s, and a scrub of a 6.4 GB placement group takes 10-20 minutes. Here's a screenshot of atop from a node: https://s3.amazonaws.com/rwgps/screenshots/DgSSRyeF.png

First question: is this a reasonable speed for scrubbing, given a very lightly used cluster?

Here are some cluster details:

deploy@drexler:~$ ceph --version
ceph version 0.94.1-5-g85a68f9 (85a68f9a8237f7e74f44a1d1fbbd6cb4ac50f8e8)

2x Xeon E5-2630 per node, 64 GB of RAM per node.

deploy@drexler:~$ ceph status
    cluster 234c6825-0e2b-4256-a710-71d29f4f023e
     health HEALTH_WARN
            118 requests are blocked > 32 sec
     monmap e1: 3 mons at {drexler=10.0.0.36:6789/0,lucy=10.0.0.38:6789/0,paley=10.0.0.34:6789/0}
            election epoch 296, quorum 0,1,2 paley,drexler,lucy
     mdsmap e19989: 1/1/1 up {0=lucy=up:active}, 1 up:standby
     osdmap e1115: 12 osds: 12 up, 12 in
      pgmap v21748062: 1424 pgs, 17 pools, 3185 GB data, 20493 kobjects
            10060 GB used, 34629 GB / 44690 GB avail
                1422 active+clean
                   1 active+clean+scrubbing+deep
                   1 active+clean+scrubbing
  client io 721 kB/s rd, 33398 B/s wr, 53 op/s

deploy@drexler:~$ ceph osd tree
ID WEIGHT   TYPE NAME        UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 43.67999 root default
-2 14.56000     host paley
 0  3.64000         osd.0         up  1.00000          1.00000
 3  3.64000         osd.3         up  1.00000          1.00000
 6  3.64000         osd.6         up  1.00000          1.00000
 9  3.64000         osd.9         up  1.00000          1.00000
-3 14.56000     host lucy
 1  3.64000         osd.1         up  1.00000          1.00000
 4  3.64000         osd.4         up  1.00000          1.00000
 7  3.64000         osd.7         up  1.00000          1.00000
11  3.64000         osd.11        up  1.00000          1.00000
-4 14.56000     host drexler
 2  3.64000         osd.2         up  1.00000          1.00000
 5  3.64000         osd.5         up  1.00000          1.00000
 8  3.64000         osd.8         up  1.00000          1.00000
10  3.64000         osd.10        up  1.00000          1.00000

My OSDs are 4 TB 7200 RPM Hitachi DeskStars, using XFS, with Samsung 850 Pro journals (very slow; I've ordered S3700 replacements, but as far as I understand things the journals shouldn't pose a problem for reads). MONs are co-located with the OSD nodes, but the nodes are fairly beefy and have very low load. Drives are on an expander backplane with an LSI SAS3008 controller. I have a fairly standard config as well: https://gist.github.com/kingcu/aae7373eb62ceb7579da

I know that I don't have a ton of OSDs, but I'd expect a little better performance than this. Check out munin graphs of my three nodes:

http://munin.ridewithgps.com/ridewithgps.com/drexler.ridewithgps.com/index.html#disk
http://munin.ridewithgps.com/ridewithgps.com/paley.ridewithgps.com/index.html#disk
http://munin.ridewithgps.com/ridewithgps.com/lucy.ridewithgps.com/index.html#disk

Any input would be appreciated before I start trying to micro-optimize config params (I've sketched what I was planning to poke at below), as well as upgrading to Infernalis.

Cheers,
Cullen
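P.S. For reference, these are the scrub-throttling knobs I was planning to experiment with, applied live via injectargs. The option names are what I find in the Hammer docs; the values are guesses on my part rather than recommendations, and the ioprio settings only take effect if the data disks use the CFQ I/O scheduler:

    # slow scrub down and shrink the amount of work done per chunk (values are guesses)
    ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
    ceph tell osd.* injectargs '--osd_scrub_chunk_max 5'
    ceph tell osd.* injectargs '--osd_deep_scrub_stride 1048576'

    # run the OSD disk thread (which does scrubbing) at idle I/O priority;
    # only has an effect with the CFQ scheduler on the OSD data disks
    ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle'
    ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 7'

If the same values work out, I'd persist them in the [osd] section of ceph.conf afterwards.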
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
