I'm sorry for the confusion! I pasted the wrong output.
ceph-objectstore-tool --data-path /var/lib/ceph/Id/osd.1 --op list --pgid 11.4 --no-mon-config

OSD.1 log:

2025-07-31T12:06:56.273+0000 7a9c2bf47680  0 set uid:gid to 167:167 (ceph:ceph)
2025-07-31T12:06:56.273+0000 7a9c2bf47680  0 ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable), process ceph-osd, pid 7
2025-07-31T12:06:56.273+0000 7a9c2bf47680  0 pidfile_write: ignore empty --pid-file
2025-07-31T12:06:56.274+0000 7a9c2bf47680  1 bdev(0x57bd64210e00 /var/lib/ceph/osd/ceph-1/block) open path /var/lib/ceph/osd/ceph-1/block
2025-07-31T12:06:56.274+0000 7a9c2bf47680 -1 bdev(0x57bd64210e00 /var/lib/ceph/osd/ceph-1/block) open open got: (13) Permission denied
2025-07-31T12:06:56.274+0000 7a9c2bf47680 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-1: (2) No such file or directory

----------------------

I retried on OSD.2 with PG 2.1, to see whether disabling OSD.2 (instead of just stopping it) before the objectstore-tool operation would change anything, but the same error occurred.

________________________________
From: Eugen Block <ebl...@nde.ag>
Sent: Thursday, July 31, 2025 13:27:51
To: GLE, Vivien
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: Pgs troubleshooting

Why did you look at OSD.2? According to the query output you provided, I would have looked at OSD.1 (acting set). And you pasted the output of PG 11.4, now you're trying to list PG 2.1, that is quite confusing.

Quoting "GLE, Vivien" <vivien....@inist.fr>:

> I don't get why it is searching in this path, because there is nothing
> there, and this is the command I used to check bluestore:
>
> ceph-objectstore-tool --data-path /var/lib/ceph/"ID"/osd.2 --op list
> --pgid 2.1 --no-mon-config
>
> ________________________________
> From: GLE, Vivien
> Sent: Thursday, July 31, 2025 09:38:25
> To: Eugen Block
> Cc: ceph-users@ceph.io
> Subject: RE: [ceph-users] Re: Pgs troubleshooting
>
> Hi,
>
>> Or could reducing min_size to 1 help here (Thanks, Anthony)? I'm not
>> entirely sure and am on vacation. 😅 It could be worth a try. But don't
>> forget to reset min_size back to 2 afterwards.
>
> Did it, but nothing really changed. How long should I wait to see if it
> does something?
>
>> No, you use the ceph-objectstore-tool to export the PG from the intact
>> OSD (you need to stop it though, set noout flag), make sure you have
>> enough disk space.
>
> I stopped my OSD and set noout to check whether my PG is stored in
> bluestore (it is not), but when I tried to restart my OSD, the OSD
> superblock was gone:
>
> 2025-07-31T08:33:14.696+0000 7f0c7c889680  1 bdev(0x60945520ae00
> /var/lib/ceph/osd/ceph-2/block) open path
> /var/lib/ceph/osd/ceph-2/block
> 2025-07-31T08:33:14.697+0000 7f0c7c889680 -1 bdev(0x60945520ae00
> /var/lib/ceph/osd/ceph-2/block) open open got: (13) Permission denied
> 2025-07-31T08:33:14.697+0000 7f0c7c889680 -1 ** ERROR: unable to
> open OSD superblock on /var/lib/ceph/osd/ceph-2: (2) No such file or
> directory
>
> Did I miss something?
>
> Thanks
> Vivien
>
> ________________________________
> From: Eugen Block <ebl...@nde.ag>
> Sent: Wednesday, July 30, 2025 16:56:50
> To: GLE, Vivien
> Cc: ceph-users@ceph.io
> Subject: [ceph-users] Re: Pgs troubleshooting
>
> Or could reducing min_size to 1 help here (Thanks, Anthony)? I'm not
> entirely sure and am on vacation. 😅 It could be worth a try. But don't
> forget to reset min_size back to 2 afterwards.
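
A note on the error above: the log shows ceph-objectstore-tool dropping privileges to ceph:ceph (167:167) and then hitting "Permission denied" on the block device, which usually means the OSD data directory or the device behind the block symlink is not readable by the ceph user. The following is only a minimal sketch of what could be checked before retrying; the systemd unit name and the legacy /var/lib/ceph/osd/ceph-1 data path are assumptions that depend on how the OSD was deployed (on a cephadm cluster the tool is normally run inside the daemon's container, e.g. via "cephadm shell --name osd.1"):

ceph osd set noout                       # keep data from rebalancing while the OSD is down
systemctl stop ceph-osd@1                # cephadm deployments use ceph-<fsid>@osd.1 instead

ls -ln  /var/lib/ceph/osd/ceph-1/        # data dir should be owned by uid/gid 167 (ceph:ceph)
ls -lnL /var/lib/ceph/osd/ceph-1/block   # so should the device the block symlink points to

# if ownership is wrong, fix it and retry the listing
chown -R ceph:ceph /var/lib/ceph/osd/ceph-1
chown ceph:ceph "$(readlink -f /var/lib/ceph/osd/ceph-1/block)"
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 --op list --pgid 11.4 --no-mon-config
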
>
> Quoting "GLE, Vivien" <vivien....@inist.fr>:
>
>> Hi,
>>
>>> did the two replaced OSDs fail at the same time (before they were
>>> completely drained)? This would most likely mean that both those
>>> failed OSDs contained the other two replicas of this PG
>>
>> Unfortunately yes
>>
>>> This would most likely mean that both those
>>> failed OSDs contained the other two replicas of this PG. A pg query
>>> should show which OSDs are missing.
>>
>> If I understand correctly, I need to move my PG onto OSD 1?
>>
>> ceph -w
>>
>> osd.1 [ERR] 11.4 has 2 objects unfound and apparently lost
>>
>> ceph pg query 11.4
>>
>> "up": [
>>     1,
>>     4,
>>     5
>> ],
>> "acting": [
>>     1,
>>     4,
>>     5
>> ],
>> "avail_no_missing": [],
>> "object_location_counts": [
>>     {
>>         "shards": "3,4,5",
>>         "objects": 2
>>     }
>> ],
>> "blocked_by": [],
>> "up_primary": 1,
>> "acting_primary": 1,
>> "purged_snaps": []
>> },
>>
>> Thanks
>>
>> Vivien
>>
>> ________________________________
>> From: Eugen Block <ebl...@nde.ag>
>> Sent: Tuesday, July 29, 2025 16:48:41
>> To: ceph-users@ceph.io
>> Subject: [ceph-users] Re: Pgs troubleshooting
>>
>> Hi,
>>
>> did the two replaced OSDs fail at the same time (before they were
>> completely drained)? This would most likely mean that both those
>> failed OSDs contained the other two replicas of this PG. A pg query
>> should show which OSDs are missing.
>> You could try with objectstore-tool to export the PG from the
>> remaining OSD and import it on different OSDs. Or you mark the data as
>> lost if you don't care about the data and want a healthy state quickly.
>>
>> Regards,
>> Eugen
>>
>> Quoting "GLE, Vivien" <vivien....@inist.fr>:
>>
>>> Thanks for your help! This is my new pg stat, with no more peering
>>> PGs (after rebooting some OSDs):
>>>
>>> ceph pg stat ->
>>>
>>> 498 pgs: 1 active+recovery_unfound+degraded, 3
>>> recovery_unfound+undersized+degraded+remapped+peered, 14
>>> active+clean+scrubbing+deep, 480 active+clean;
>>>
>>> 36 GiB data, 169 GiB used, 6.2 TiB / 6.4 TiB avail; 8.8 KiB/s rd, 0
>>> B/s wr, 12 op/s; 715/41838 objects degraded (1.709%); 5/13946
>>> objects unfound (0.036%)
>>>
>>> ceph pg ls recovery_unfound -> shows that the PGs are replica 3; I
>>> tried to repair but nothing happened
>>>
>>> ceph -w ->
>>>
>>> osd.1 [ERR] 11.4 has 2 objects unfound and apparently lost
>>>
>>> ________________________________
>>> From: Frédéric Nass <frederic.n...@clyso.com>
>>> Sent: Tuesday, July 29, 2025 14:03:37
>>> To: GLE, Vivien
>>> Cc: ceph-users@ceph.io
>>> Subject: Re: [ceph-users] Pgs troubleshooting
>>>
>>> Hi Vivien,
>>>
>>> Unless you ran the 'ceph pg stat' command while peering was occurring,
>>> the 37 peering PGs might indicate a temporary peering issue with one or
>>> more OSDs. If that's the case then restarting the associated OSDs could
>>> help with the peering. You could list those PGs and
>>> associated OSDs with 'ceph pg ls peering' and trigger peering by
>>> either restarting one common OSD or by using 'ceph pg repeer <pg_id>'.
>>>
>>> Regarding the unfound object and its associated backfill_unfound PG,
>>> you could identify this PG with 'ceph pg ls backfill_unfound' and
>>> investigate this PG with 'ceph pg <pg_id> query'. Depending on the
>>> output, you could try running a 'ceph pg repair <pg_id>'. Could you
>>> confirm that this PG is not part of a size=2 pool?
>>>
>>> Best regards,
>>> Frédéric.
>>>
>>> --
>>> Frédéric Nass
>>> Ceph Ambassador France | Senior Ceph Engineer @ CLYSO
>>> Try our Ceph Analyzer -- https://analyzer.clyso.com/
>>> https://clyso.com | frederic.n...@clyso.com
>>>
>>> On Tue, Jul 29, 2025 at 14:19, GLE, Vivien <vivien....@inist.fr> wrote:
>>>
>>> Hi,
>>>
>>> After replacing 2 OSDs (data corruption), these are the stats of my
>>> testing Ceph cluster:
>>>
>>> ceph pg stat
>>>
>>> 498 pgs: 37 peering, 1 active+remapped+backfilling, 1
>>> active+clean+remapped, 1 active+recovery_wait+undersized+remapped, 1
>>> backfill_unfound+undersized+degraded+remapped+peered, 1
>>> remapped+peering, 12 active+clean+scrubbing+deep, 1
>>> active+undersized, 442 active+clean, 1
>>> active+recovering+undersized+remapped
>>>
>>> 34 GiB data, 175 GiB used, 6.2 TiB / 6.4 TiB avail; 1.7 KiB/s rd, 1
>>> op/s; 31/39768 objects degraded (0.078%); 6/39768 objects misplaced
>>> (0.015%); 1/13256 objects unfound (0.008%)
>>>
>>> ceph osd stat
>>> 7 osds: 7 up (since 20h), 7 in (since 20h); epoch: e427538; 4 remapped pgs
>>>
>>> Does anyone have an idea of where to start to get a healthy cluster?
>>>
>>> Thanks!
>>>
>>> Vivien

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
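
For reference, the recovery options suggested earlier in this thread (exporting and importing the PG with ceph-objectstore-tool, temporarily lowering min_size, re-peering, and as a last resort marking the unfound objects as lost) map roughly onto the commands below. This is only a sketch: the export file path, the pool name "mypool", and osd.4 as import target are made-up placeholders, the export must be taken from the OSD that still holds the data, and both OSDs involved in the export/import must be stopped with noout set.

# export PG 11.4 from the stopped OSD that still has it, then import it on another stopped OSD
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 --op export --pgid 11.4 --file /tmp/pg11.4.export
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-4 --op import --file /tmp/pg11.4.export

# temporarily allow I/O with a single replica, and reset it afterwards
ceph osd pool set mypool min_size 1
ceph osd pool set mypool min_size 2

# re-trigger peering for a stuck PG
ceph pg repeer 11.4

# last resort, loses the unfound objects: revert them to an older version, or delete them
ceph pg 11.4 mark_unfound_lost revert
ceph pg 11.4 mark_unfound_lost delete
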