Any comments on this one? I'm interested in what to do in this situation.
On Wed, Jul 5, 2017 at 10:51 PM, Adrian Saul <[email protected]>
wrote:
>
>
> During a recent snafu with a production cluster I disabled scrubbing and
> deep scrubbing to reduce load on the cluster while things backfilled and
> settled down. The PTSD caused by the incident meant I was not keen to
> re-enable it until I was confident we had fixed the root cause of the
> issues (driver issues with a new NIC type, introduced with new hardware,
> that did not show up until production load hit them). My cluster is
> running Jewel 10.2.1 and is a mix of SSD and SATA across 20 hosts, 352
> OSDs in total.
>
>
>
> Fast forward a few weeks and I was ready to re-enable it. After some
> reading I was concerned the cluster might kick off excessive scrubbing
> once I unset the flags, so I tried increasing the deep-scrub interval
> from 7 days to 60 days. With most of the last deep scrubs dating from
> over a month before, I was hoping that would distribute them over the
> next 30 days. Having unset the flag and watched the cluster carefully,
> it seems to have just run a steady catch-up without significant impact.
> What I am noticing, though, is that the scrubbing seems to simply run
> through the full set of PGs: it did some 2280 PGs last night over 6
> hours, and so far today, in 12 hours, another 4000-odd. With 13408 PGs
> in total, I am guessing all this will stop some time early tomorrow.
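For reference, a 60-day interval corresponds to a ceph.conf fragment along these lines (a sketch, assuming the Jewel option name osd_deep_scrub_interval; the same value can be injected at runtime with `ceph tell osd.* injectargs '--osd-deep-scrub-interval 5184000'`, though injected values do not survive an OSD restart, so the config file should carry it too):

```ini
[osd]
# 60 days expressed in seconds (60 * 86400); the default is 7 days (604800)
osd deep scrub interval = 5184000
```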
>
>
>
> ceph-glb-fec-01[/var/log]$ sudo ceph pg dump | awk '{print $20}' | grep 2017 | sort | uniq -c
>
> dumped all in format plain
>
>     5 2017-05-23
>    18 2017-05-24
>    33 2017-05-25
>    52 2017-05-26
>    89 2017-05-27
>   114 2017-05-28
>   144 2017-05-29
>   172 2017-05-30
>   256 2017-05-31
>   191 2017-06-01
>   230 2017-06-02
>   369 2017-06-03
>   606 2017-06-04
>   680 2017-06-05
>   919 2017-06-06
>  1261 2017-06-07
>  1876 2017-06-08
>    15 2017-06-09
>  2280 2017-07-05
>  4098 2017-07-06
>
>
> My concern is whether I am now set up to have all 13408 PGs deep scrub
> again in 60 days, serially, compressed into about 3 days. I would much
> rather they were distributed over the whole period.
>
>
>
> Will the OSDs spread the scrubs out themselves now that they have caught
> up, or do I need to, say, create a script that triggers batches of PGs to
> deep scrub over time, to push out the distribution again?
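The batching idea in the question can be sketched like this: read the `pg dump` output, order PGs by their last deep-scrub stamp, and emit paced `ceph pg deep-scrub` commands. This assumes the Jewel plain `pg dump` layout used in the awk one-liner above (pgid in column 1, last deep-scrub date in column 20); the batch size and pause values are illustrative.

```shell
# spread_scrubs BATCH PAUSE: read `ceph pg dump` plain output on stdin and
# print one "ceph pg deep-scrub <pgid>" line per PG, oldest last deep scrub
# first, with a "sleep PAUSE" after every BATCH commands.
spread_scrubs() {
    batch=${1:-10}    # PGs to kick off per batch (illustrative)
    pause=${2:-600}   # seconds to wait between batches (illustrative)
    # keep only pgid rows (e.g. "1.0 ..."), emit "<deep-scrub-date> <pgid>",
    # sort so the stalest PGs come first, then turn them into commands
    awk '/^[0-9]+\./ {print $20, $1}' | sort |
    awk -v batch="$batch" -v pause="$pause" '
        { print "ceph pg deep-scrub " $2 }
        NR % batch == 0 { print "sleep " pause }'
}
# Review the generated commands first, then run them, e.g.:
#   sudo ceph pg dump | spread_scrubs 50 600 | sudo sh
```

Emitting a command stream rather than running the scrubs directly keeps the pacing visible and easy to abort: inspect the output, then pipe it through sh, and Ctrl-C stops the remaining batches.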
>
>
>
>
>
> *Adrian Saul* | Infrastructure Projects Team Lead
> IT
> T 02 9009 9041 | M +61 402 075 760
> 30 Ross St, Glebe NSW 2037
> [email protected] | www.tpg.com.au
>
> *TPG Telecom (ASX: TPM)*
>
>
> This email and any attachments are confidential and may be subject to
> copyright, legal or some other professional privilege. They are intended
> solely for the attention and use of the named addressee(s). They may only
> be copied, distributed or disclosed with the consent of the copyright
> owner. If you have received this email by mistake or by breach of the
> confidentiality clause, please notify the sender immediately by return
> email and delete or destroy all copies of the email. Any confidentiality,
> privilege or copyright is not waived or lost because this email has been
> sent to you by mistake.
>
>
>
>
>
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
--
*Alejandro Comisario*
*CTO | NUBELIU*
E-mail: [email protected] | +54911 3770 1857