That’s why I mentioned this two days ago: cephadm shell -- ceph-objectstore-tool --op list …
That’s how you can execute commands directly with cephadm shell; this is useful for batch operations such as a for loop. Of course, entering the shell first and then executing the commands works just as well.
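As a rough sketch of such a batch run (not a tested procedure): the loop below lists the objects of several PGs on one OSD through cephadm shell. The OSD id, the PG ids, and the in-container data path /var/lib/ceph/osd/ceph-<id> are assumptions for illustration, and the OSD has to be stopped before the tool opens its store. The run_or_print helper is hypothetical; it only prints each command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: list objects of several PGs on one stopped OSD via cephadm shell.
set -eu

OSD_ID=1            # assumption: this OSD lives on the local host and is stopped
PGIDS="11.4 2.1"    # illustrative PG ids, taken from the thread

# Hypothetical helper: print the command by default, execute it when RUN=1.
run_or_print() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

for pgid in $PGIDS; do
  # Assumption: inside the container the OSD data directory is mounted at
  # /var/lib/ceph/osd/ceph-<id>, which is where the OSD log above looks.
  run_or_print cephadm shell --name "osd.${OSD_ID}" -- \
    ceph-objectstore-tool \
    --data-path "/var/lib/ceph/osd/ceph-${OSD_ID}" \
    --op list --pgid "$pgid"
done
```

Without RUN=1 the script only prints the two commands, so they can be checked before anything touches the OSD.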
Quoting "GLE, Vivien" <vivien....@inist.fr>:
I was using ceph-objectstore-tool the wrong way, by running it on the host instead of inside the container via cephadm shell --name osd.x

________________________________
From: GLE, Vivien <vivien....@inist.fr>
Sent: Friday, 1 August 2025 09:02:59
To: Eugen Block
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: Pgs troubleshooting

Hi,

What is the right way to use the objectstore tool?

My OSDs are up! I purged ceph-* on my host following this thread: https://www.reddit.com/r/ceph/comments/1me3kvd/containerized_ceph_base_os_experience/

"Make sure that the base OS does not have any ceph packages installed, with Ubuntu in the past had issues with ceph-common being installed on the host OS and it trying to take ownership of the containerized ceph deployment. If you run into any issues check the base OS for ceph-* packages and uninstall."

I believe the only good way to use ceph commands is in cephadm.

Thanks for your help!

________________________________
From: Eugen Block <ebl...@nde.ag>
Sent: Thursday, 31 July 2025 19:42:21
To: GLE, Vivien
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: Pgs troubleshooting

To use the objectstore tool within the container you don’t have to specify the cluster’s FSID because it’s mapped into the container. By using the objectstore tool you might have changed the ownership of the directory; change it back to the previous state. Other OSDs will show you which uid/user and/or gid/group that is.

Quoting "GLE, Vivien" <vivien....@inist.fr>:

I'm sorry for the confusion! I pasted the wrong output.
ceph-objectstore-tool --data-path /var/lib/ceph/Id/osd.1 --op list --pgid 11.4 --no-mon-config

OSD.1 log

2025-07-31T12:06:56.273+0000 7a9c2bf47680 0 set uid:gid to 167:167 (ceph:ceph)
2025-07-31T12:06:56.273+0000 7a9c2bf47680 0 ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable), process ceph-osd, pid 7
2025-07-31T12:06:56.273+0000 7a9c2bf47680 0 pidfile_write: ignore empty --pid-file
2025-07-31T12:06:56.274+0000 7a9c2bf47680 1 bdev(0x57bd64210e00 /var/lib/ceph/osd/ceph-1/block) open path /var/lib/ceph/osd/ceph-1/block
2025-07-31T12:06:56.274+0000 7a9c2bf47680 -1 bdev(0x57bd64210e00 /var/lib/ceph/osd/ceph-1/block) open open got: (13) Permission denied
2025-07-31T12:06:56.274+0000 7a9c2bf47680 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-1: (2) No such file or directory

----------------------

I retried on OSD.2 with PG 2.1 to see whether disabling OSD.2, instead of just stopping it, before the objectstore-tool operation would change anything, but the same error occurred.

________________________________
From: Eugen Block <ebl...@nde.ag>
Sent: Thursday, 31 July 2025 13:27:51
To: GLE, Vivien
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: Pgs troubleshooting

Why did you look at OSD.2? According to the query output you provided I would have looked at OSD.1 (acting set). And you pasted the output of PG 11.4, now you’re trying to list PG 2.1, that is quite confusing.

Quoting "GLE, Vivien" <vivien....@inist.fr>:

I don't get why it is searching in this path, because there is nothing there. This is the command I used to check bluestore:

ceph-objectstore-tool --data-path /var/lib/ceph/"ID"/osd.2 --op list --pgid 2.1 --no-mon-config

________________________________
From: GLE, Vivien
Sent: Thursday, 31 July 2025 09:38:25
To: Eugen Block
Cc: ceph-users@ceph.io
Subject: RE: [ceph-users] Re: Pgs troubleshooting

Hi,

> Or could reducing min_size to 1 help here (Thanks, Anthony)? I’m not entirely sure and am on vacation. 😅 it could be worth a try.
> But don’t forget to reset min_size back to 2 afterwards.

Did it, but nothing really changed. How long should I wait to see if it does something?

> No, you use the ceph-objectstore-tool to export the PG from the intact OSD (you need to stop it though, set noout flag), make sure you have enough disk space.

I stopped my OSD and set noout to check whether my PG is stored in bluestore (it is not), but when I tried to restart my OSD, the OSD superblock was gone:

2025-07-31T08:33:14.696+0000 7f0c7c889680 1 bdev(0x60945520ae00 /var/lib/ceph/osd/ceph-2/block) open path /var/lib/ceph/osd/ceph-2/block
2025-07-31T08:33:14.697+0000 7f0c7c889680 -1 bdev(0x60945520ae00 /var/lib/ceph/osd/ceph-2/block) open open got: (13) Permission denied
2025-07-31T08:33:14.697+0000 7f0c7c889680 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-2: (2) No such file or directory

Did I miss something?

Thanks
Vivien

________________________________
From: Eugen Block <ebl...@nde.ag>
Sent: Wednesday, 30 July 2025 16:56:50
To: GLE, Vivien
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: Pgs troubleshooting

Or could reducing min_size to 1 help here (Thanks, Anthony)? I’m not entirely sure and am on vacation. 😅 it could be worth a try. But don’t forget to reset min_size back to 2 afterwards.

Quoting "GLE, Vivien" <vivien....@inist.fr>:

Hi,

> did the two replaced OSDs fail at the same time (before they were completely drained)? This would most likely mean that both those failed OSDs contained the other two replicas of this PG

Unfortunately yes

> This would most likely mean that both those failed OSDs contained the other two replicas of this PG. A pg query should show which OSDs are missing.

If I understand correctly, I need to move my PG onto OSD 1?
ceph -w
osd.1 [ERR] 11.4 has 2 objects unfound and apparently lost

ceph pg query 11.4
"up": [ 1, 4, 5 ],
"acting": [ 1, 4, 5 ],
"avail_no_missing": [],
"object_location_counts": [ { "shards": "3,4,5", "objects": 2 } ],
"blocked_by": [],
"up_primary": 1,
"acting_primary": 1,
"purged_snaps": []
},

Thanks
Vivien

________________________________
From: Eugen Block <ebl...@nde.ag>
Sent: Tuesday, 29 July 2025 16:48:41
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Pgs troubleshooting

Hi,

did the two replaced OSDs fail at the same time (before they were completely drained)? This would most likely mean that both those failed OSDs contained the other two replicas of this PG. A pg query should show which OSDs are missing. You could try with objectstore-tool to export the PG from the remaining OSD and import it on different OSDs. Or you mark the data as lost if you don't care about the data and want a healthy state quickly.

Regards,
Eugen

Quoting "GLE, Vivien" <vivien....@inist.fr>:

Thanks for your help! This is my new pg stat with no more peering pgs (after rebooting some OSDs)

ceph pg stat ->
498 pgs: 1 active+recovery_unfound+degraded, 3 recovery_unfound+undersized+degraded+remapped+peered, 14 active+clean+scrubbing+deep, 480 active+clean; 36 GiB data, 169 GiB used, 6.2 TiB / 6.4 TiB avail; 8.8 KiB/s rd, 0 B/s wr, 12 op/s; 715/41838 objects degraded (1.709%); 5/13946 objects unfound (0.036%)

ceph pg ls recovery_unfound -> shows that the PGs are replica 3; tried to repair but nothing happened

ceph -w -> osd.1 [ERR] 11.4 has 2 objects unfound and apparently lost

________________________________
From: Frédéric Nass <frederic.n...@clyso.com>
Sent: Tuesday, 29 July 2025 14:03:37
To: GLE, Vivien
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Pgs troubleshooting

Hi Vivien,

Unless you ran the 'ceph pg stat' command while peering was occurring, the 37 peering PGs might indicate a temporary peering issue with one or more OSDs.
If that's the case then restarting associated OSDs could help with the peering or ceph pg. You could list those PGs and associated OSDs with 'ceph pg ls peering' and trigger peering by either restarting one common OSD or by using 'ceph pg repeer <pg_id>'.

Regarding the unfound object and its associated backfill_unfound PG, you could identify this PG with 'ceph pg ls backfill_unfound' and investigate this PG with 'ceph pg <pg_id> query'. Depending on the output, you could try running a 'ceph pg repair <pg_id>'. Could you confirm that this PG is not part of a size=2 pool?

Best regards,
Frédéric.

--
Frédéric Nass
Ceph Ambassador France | Senior Ceph Engineer @ CLYSO
Try our Ceph Analyzer -- https://analyzer.clyso.com/
https://clyso.com | frederic.n...@clyso.com

On Tue, 29 Jul 2025 at 14:19, GLE, Vivien <vivien....@inist.fr> wrote:

Hi,

After replacing 2 OSDs (data corruption), these are the stats of my test Ceph cluster:

ceph pg stat
498 pgs: 37 peering, 1 active+remapped+backfilling, 1 active+clean+remapped, 1 active+recovery_wait+undersized+remapped, 1 backfill_unfound+undersized+degraded+remapped+peered, 1 remapped+peering, 12 active+clean+scrubbing+deep, 1 active+undersized, 442 active+clean, 1 active+recovering+undersized+remapped
34 GiB data, 175 GiB used, 6.2 TiB / 6.4 TiB avail; 1.7 KiB/s rd, 1 op/s; 31/39768 objects degraded (0.078%); 6/39768 objects misplaced (0.015%); 1/13256 objects unfound (0.008%)

ceph osd stat
7 osds: 7 up (since 20h), 7 in (since 20h); epoch: e427538; 4 remapped pgs

Does anyone have an idea of where to start to get a healthy cluster?

Thanks!
Vivien
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io