Hi,

thanks for the hint. We’re definitely running the exact same binaries on all daemons. :)
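
For reference, a quick way to cross-check is Ceph’s cluster-wide version
report, which counts daemons per running version string and should show
exactly one entry per daemon type:

    ceph versions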

> On 5. Sep 2023, at 16:14, Eugen Block <ebl...@nde.ag> wrote:
> 
> Hi,
> 
> it sounds like you have auto-repair enabled (osd_scrub_auto_repair). I guess 
> you could disable that to see what's going on with the PGs and their 
> replicas. And/or you could enable debug logs. Are all daemons running the 
> same ceph (minor) version? I remember a customer case where different ceph 
> minor versions (but overall Octopus) caused damaged PGs; a repair fixed them 
> every time. After they updated all daemons to the same minor version, those 
> errors were gone.
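> 
> For example, something like this (from memory, please double-check) would 
> keep the errors visible and show more detail:
> 
>   # leave scrub errors unrepaired so you can inspect them first
>   ceph config set osd osd_scrub_auto_repair false
>   # temporarily raise OSD debug logging while a few scrubs run
>   ceph tell osd.* injectargs '--debug_osd 10'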
> 
> Regards,
> Eugen
> 
> Quoting Christian Theune <c...@flyingcircus.io>:
> 
>> Hi,
>> 
>> this is a bit older cluster (Nautilus, bluestore only).
>> 
>> We’ve noticed that the cluster is almost continuously repairing PGs. 
>> However, the repairs all finish successfully with “0 fixed”. We cannot see 
>> what triggers Ceph to decide to repair a PG, and it is happening across a 
>> large number of PGs, not any specific individual one.
>> 
>> Deep-scrubs are generally running, though currently a bit behind schedule, 
>> as we had some recoveries in the last week.
>> 
>> Logs look regular aside from the number of repairs. Here are the last few 
>> weeks from the perspective of a single PG. There is one repair at the end, 
>> but the same pattern shows up for all PGs.
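>> (For context: this excerpt is from the log of the PG’s primary OSD; 
>> something like `grep 278.2f3 /var/log/ceph/ceph-osd.*.log` on the 
>> respective host reproduces this per-PG view.)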
>> 
>> 2023-08-06 16:08:17.870 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-06 16:08:18.270 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-07 21:52:22.299 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-07 21:52:22.711 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-09 00:33:42.587 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-09 00:33:43.049 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-10 09:36:00.590 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub starts
>> 2023-08-10 09:36:28.811 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub ok
>> 2023-08-11 12:59:14.219 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-11 12:59:14.567 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-12 13:52:44.073 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-12 13:52:44.483 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-14 01:51:04.774 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub starts
>> 2023-08-14 01:51:33.113 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub ok
>> 2023-08-15 05:18:16.093 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-15 05:18:16.520 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-16 09:47:38.520 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-16 09:47:38.930 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-17 19:25:45.352 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-17 19:25:45.775 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-19 05:40:43.663 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-19 05:40:44.073 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-20 12:06:54.343 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-20 12:06:54.809 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-21 19:23:10.801 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub starts
>> 2023-08-21 19:23:39.936 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub ok
>> 2023-08-23 03:43:21.391 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-23 03:43:21.844 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-24 04:21:17.004 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub starts
>> 2023-08-24 04:21:47.972 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub ok
>> 2023-08-25 06:55:13.588 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-25 06:55:14.087 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-26 09:26:01.174 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-26 09:26:01.561 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-27 11:18:10.828 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-27 11:18:11.264 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-28 19:05:42.104 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-28 19:05:42.693 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-30 07:03:10.327 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-08-30 07:03:10.805 7fc49f1e6640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-08-31 14:43:23.849 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub starts
>> 2023-08-31 14:43:50.723 7fc49b1de640  0 log_channel(cluster) log [DBG] : 278.2f3 deep-scrub ok
>> 2023-09-01 20:53:42.749 7f37ca268640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-09-01 20:53:43.389 7f37c6260640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-09-02 22:57:49.542 7f37ca268640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-09-02 22:57:50.065 7f37c6260640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-09-04 03:16:14.754 7f37ca268640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub starts
>> 2023-09-04 03:16:15.295 7f37ca268640  0 log_channel(cluster) log [DBG] : 278.2f3 scrub ok
>> 2023-09-05 14:50:36.064 7f37ca268640  0 log_channel(cluster) log [DBG] : 278.2f3 repair starts
>> 2023-09-05 14:51:04.407 7f37c6260640  0 log_channel(cluster) log [DBG] : 278.2f3 repair ok, 0 fixed
>> 
>> Googling didn’t help, unfortunately, and the bug tracker doesn’t appear to 
>> have any relevant issues either.
>> 
>> Any ideas?
>> 
>> Kind regards,
>> Christian Theune
>> 
>> --
>> Christian Theune · c...@flyingcircus.io · +49 345 219401 0
>> Flying Circus Internet Operations GmbH · https://flyingcircus.io
>> Leipziger Str. 70/71 · 06108 Halle (Saale) · Germany
>> HR Stendal HRB 21169 · Managing Directors: Christian Theune, Christian Zagrodnick

Kind regards,
Christian Theune

-- 
Christian Theune · c...@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Germany
HR Stendal HRB 21169 · Managing Directors: Christian Theune, Christian Zagrodnick
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
