Hi Stefan,

After you wrote that you issue hundreds of deep-scrub commands per day, I was 
already suspecting something like

> [...] deep_scrub daemon requests a deep-scrub [...]

It's not a Minion you hired who types these commands by hand every so many 
seconds and hopes for the best. My guess is rather that you are actually 
running a cluster specifically configured for manual scrub scheduling, as was 
discussed in a thread some time ago, to solve the problem of "not deep scrubbed 
in time" messages caused by the built-in scrub scheduler not using the 
last-scrubbed timestamp for prioritization (among other things).

On such a system I would not be surprised that these commands have their 
desired effect. To know why it works for you, it would be helpful to disclose 
the whole story: for example, which ceph config parameters are active in the 
context of the daemon executing the deep-scrub instructions, and which other 
ceph commands surround it, in the same way that the pg repair is surrounded by 
injectargs instructions in the script I posted.
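
To make the comparison concrete: in that script the repair is wrapped roughly 
like this (a sketch from memory, not the verbatim script; the PG id and the 
value 2 are placeholders):

    # Sketch: temporarily allow one extra scrub on the PG's OSDs so the
    # repair (which runs as a deep scrub) can be scheduled right away.
    PG=5.48a                                          # placeholder PG id
    OSDS=$(ceph pg map "$PG" -f json | jq -r '.acting[]')

    for osd in $OSDS; do
        ceph tell "osd.$osd" injectargs '--osd_max_scrubs=2'
    done

    ceph pg repair "$PG"

    # ... wait for the repair to be picked up, then revert ...
    for osd in $OSDS; do
        ceph tell "osd.$osd" injectargs '--osd_max_scrubs=1'
    done

Without such a wrapper, the repair request competes with the same scrub 
reservations as everything else.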

It doesn't work like that on a ceph cluster with default config. For example, 
on our cluster there is a very high likelihood that at least one OSD of any 
given PG is already part of a scrub at any time. In that case the PG is not 
eligible for scrubbing, because one of its OSDs already has osd_max_scrubs 
(default: 1) scrubs running, and the reservation has no observable effect.
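
You can check whether this is what happens on a given cluster with something 
along these lines (just a sketch; the PG id is a placeholder):

    # Which PGs are scrubbing right now?
    ceph pg dump pgs_brief 2>/dev/null | grep scrubbing

    # The acting set of the PG you want scrubbed, and the per-OSD limit:
    ceph pg map 5.48a -f json | jq -r '.acting[]'
    ceph config get osd osd_max_scrubs     # default: 1

If every OSD in the acting set already shows up in a scrubbing PG, a manual 
deep-scrub request will simply not get a reservation.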

Some time ago I started a ceph-users thread discussing exactly that: I wanted 
to increase the number of concurrent scrubs without increasing osd_max_scrubs. 
The default scheduler seems to be very poor at ordering scrubs such that a 
maximum number of PGs is scrubbed at any given time. One of the suggestions 
was to run manual scheduling, which seems to be exactly what you are doing.
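
The core of such a manual scheduler is not much more than this (a sketch of 
the idea only, certainly not your daemon):

    # Pick the PG with the oldest last_deep_scrub_stamp and ask for a
    # deep scrub; run this in a loop with whatever pacing/checks you need.
    PG=$(ceph pg dump pgs --format json 2>/dev/null \
         | jq -r '(.pg_stats // .pg_map.pg_stats)
                  | sort_by(.last_deep_scrub_stamp)
                  | .[0].pgid')

    echo "requesting deep-scrub of $PG"
    ceph pg deep-scrub "$PG"

The hard part is everything around it: making sure the request is not silently 
dropped because of concurrent scrubs, which is exactly the information I am 
asking about above.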

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Stefan Kooman <ste...@bit.nl>
Sent: Wednesday, June 28, 2023 2:17 PM
To: Frank Schilder; Alexander E. Patrakov; Niklas Hambüchen
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: 1 pg inconsistent and does not recover

On 6/28/23 10:45, Frank Schilder wrote:
> Hi Stefan,
>
> we run Octopus. The deep-scrub request is (immediately) cancelled if the 
> PG/OSD is already part of another (deep-)scrub or if some peering happens. As 
> far as I understood, the commands osd/pg deep-scrub and pg repair do not 
> create persistent reservations. If you issue this command, when does the PG 
> actually start scrubbing? As soon as another one finishes or when it is its 
> natural turn? Do you monitor the scrub order to confirm it was the manual 
> command that initiated a scrub?

We request a deep-scrub ... a few seconds later it starts
deep-scrubbing. We do not verify in this process if the PG really did
start, but they do. See example from a PG below:

Jun 27 22:59:50 mon1 pg_scrub[2478540]: [27-06-2023 22:59:34] Scrub PG
5.48a (last deep-scrub: 2023-06-16T22:54:58.684038+0200)


^^ deep_scrub daemon requests a deep-scrub, based on latest deep-scrub
timestamp. After a couple of minutes it's deep-scrubbed. See below the
deep-scrub timestamp (info from a PG query of 5.48a):

"last_deep_scrub_stamp": "2023-06-27T23:06:01.823894+0200"

We have been using this in Octopus (actually since Luminous, but in a
different way). Now we are on Pacific.

Gr. Stefan
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

Reply via email to