Thanks, this makes sense, but I just wanted to sanity-check my assumption against reality.

In my specific case, 24 of the OSDs are HDD and 30 are SSD, in different roots/pools, so deep scrubs on the other 23 spinning disks could in theory eat IOPS on a disk currently backfilling to the other OSD. Either way, it makes sense, and thanks for the insight. And don't worry Wido, they aren't SMR drives!
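For reference, while the backfill finishes I've just been leaning on the cluster-wide flags to keep new scrubs out of the way (that's where the nodeep-scrub flag in the status output quoted below came from). Roughly what I'm doing, as a sketch rather than a recommendation:

$ ceph pg dump pgs_brief | egrep 'scrub|backfill'   # check whether scrubbing PGs map to the same OSDs as backfilling ones
$ ceph osd set nodeep-scrub                         # stop scheduling new deep scrubs cluster-wide
$ ceph osd unset nodeep-scrub                       # drop the flag once the backfill settles

As with osd_scrub_during_recovery, this only stops new scrubs from being scheduled; anything already running finishes on its own.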
Thanks,

Reed

> On May 30, 2017, at 11:03 AM, Wido den Hollander <w...@42on.com> wrote:
>
>> On 30 May 2017 at 17:37, Reed Dier <reed.d...@focusvq.com> wrote:
>>
>> Lost an OSD and having to rebuild it.
>>
>> 8TB drive, so it has to backfill a ton of data.
>> It has been taking a while, so I looked at ceph -s and noticed that deep scrubs were running, even though I'm running the newest Jewel (10.2.7) and the OSDs have osd_scrub_during_recovery set to false.
>>
>>> $ cat /etc/ceph/ceph.conf | grep scrub | grep recovery
>>> osd_scrub_during_recovery = false
>>
>>> $ sudo ceph daemon osd.0 config show | grep scrub | grep recovery
>>> "osd_scrub_during_recovery": "false",
>>
>>> $ ceph --version
>>> ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185)
>>
>>> cluster edeb727e-c6d3-4347-bfbb-b9ce7f60514b
>>>  health HEALTH_WARN
>>>         133 pgs backfill_wait
>>>         10 pgs backfilling
>>>         143 pgs degraded
>>>         143 pgs stuck degraded
>>>         143 pgs stuck unclean
>>>         143 pgs stuck undersized
>>>         143 pgs undersized
>>>         recovery 22081436/1672287847 objects degraded (1.320%)
>>>         recovery 20054800/1672287847 objects misplaced (1.199%)
>>>         noout flag(s) set
>>>  monmap e1: 3 mons at {core=10.0.1.249:6789/0,db=10.0.1.251:6789/0,dev=10.0.1.250:6789/0}
>>>         election epoch 4234, quorum 0,1,2 core,dev,db
>>>  fsmap e5013: 1/1/1 up {0=core=up:active}, 1 up:standby
>>>  osdmap e27892: 54 osds: 54 up, 54 in; 143 remapped pgs
>>>         flags noout,nodeep-scrub,sortbitwise,require_jewel_osds
>>>  pgmap v13840713: 4292 pgs, 6 pools, 59004 GB data, 564 Mobjects
>>>         159 TB used, 69000 GB / 226 TB avail
>>>         22081436/1672287847 objects degraded (1.320%)
>>>         20054800/1672287847 objects misplaced (1.199%)
>>>         4143 active+clean
>>>          133 active+undersized+degraded+remapped+wait_backfill
>>>           10 active+undersized+degraded+remapped+backfilling
>>>            6 active+clean+scrubbing+deep
>>>  recovery io 21855 kB/s, 346 objects/s
>>>  client io 30021 kB/s rd, 1275 kB/s wr, 291 op/s rd, 62 op/s wr
>>
>> Looking at the ceph documentation for 'master':
>>
>>> osd scrub during recovery
>>>
>>> Description: Allow scrub during recovery. Setting this to false will disable scheduling new scrub (and deep-scrub) while there is active recovery. Already running scrubs will be continued. This might be useful to reduce load on busy clusters.
>>> Type: Boolean
>>> Default: true
>>
>> Are backfills not treated as recovery operations? Is it only preventing scrubs on the OSDs that are actively recovering/backfilling?
>>
>> Just curious as to why the feature did not seem to kick in as expected.
>
> It is per OSD. So only on that OSD will new (deep-)scrubs not be started, as long as a recovery/backfill operation is active there.
>
> So other OSDs which have nothing to do with it will still perform scrubs.
>
> Wido
>
>> Thanks,
>>
>> Reed
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com