After the bright idea of pausing a 10.2.2 Ceph cluster for a minute to see whether it would speed up backfill, I managed to corrupt my MDS journal (should that happen after a cluster pause/unpause, or is it some sort of bug?). I got "Overall journal integrity: DAMAGED", etc.

I was following http://docs.ceph.com/docs/jewel/cephfs/disaster-recovery/ and have some questions/feedback:

* It would be great to have some info on when the 'snap' or 'inode' tables should be reset
* It is not clear at which point an MDS start should be attempted
* Can scan_extents/scan_inodes be run while the MDS is running?
* "online MDS scrub" is mentioned in the docs. Is that scan_extents/scan_inodes or some other command?
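For reference, here is the sequence I pieced together from the jewel disaster-recovery page. This is a sketch, not gospel: the pool name `cephfs_data` is a placeholder for your actual data pool, and it prints the commands instead of running them unless you override `RUN` (these are destructive operations, so do take the journal export first).

```shell
#!/bin/sh
# Sketch of the jewel CephFS disaster-recovery sequence
# (http://docs.ceph.com/docs/jewel/cephfs/disaster-recovery/).
# Dry-run by default: commands are echoed, not executed.
# "cephfs_data" is an assumed pool name; substitute your own.
RUN=${RUN:-echo}

# 1. Back up the journal before touching anything
$RUN cephfs-journal-tool journal export backup.bin

# 2. Salvage what dentries we can from the damaged journal, then reset it
$RUN cephfs-journal-tool event recover_dentries summary
$RUN cephfs-journal-tool journal reset

# 3. Reset the session table (snap/inode resets presumably only if
#    those tables are themselves reported damaged -- unclear in the docs)
$RUN cephfs-table-tool all reset session

# 4. With the MDS still stopped, rebuild metadata from the data pool
$RUN cephfs-data-scan scan_extents cephfs_data
$RUN cephfs-data-scan scan_inodes cephfs_data
```

Run it as-is to review the plan; run with `RUN=` (empty) to actually execute.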

Now CephFS seems to be working (I still have "mds0: Metadata damage detected", but scan_extents is currently running); let's see what happens once scan_extents/scan_inodes finish.

Will these actions clean up any orphaned objects left in the pools? What else should I look into?

ceph-users mailing list
