[ceph-users] Re: MDS "newly corrupt dentry" after patch version upgrade

2023-05-12 Thread Janek Bevendorff
If it is thrown while decoding the file name, then somebody probably managed to store files with non-UTF-8 characters in the name, although I don't really know how this can happen. Perhaps some OS quirk. On 10/05/2023 22:33, Patrick Donnelly wrote: Hi Janek, All this indicates is that you have s
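One way such names can arise, offered as an assumption rather than anything confirmed in this thread: POSIX file systems accept any byte sequence in a file name except `/` and NUL, so a client using a legacy encoding such as Latin-1 can create names that are not valid UTF-8. The sketch below uses a made-up file name and a hypothetical helper `is_valid_utf8` to show the difference.

```python
def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw file name bytes decode cleanly as UTF-8."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical file name, encoded the way two differently configured
# clients might write it to disk.
latin1_name = "Müller.txt".encode("latin-1")  # b'M\xfcller.txt'
utf8_name = "Müller.txt".encode("utf-8")      # b'M\xc3\xbcller.txt'

print(is_valid_utf8(latin1_name))
print(is_valid_utf8(utf8_name))
```

The same check could be applied to raw directory entry names to spot candidates, though whether that matches the names stored in this cluster is not known from the thread.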

[ceph-users] Re: MDS "newly corrupt dentry" after patch version upgrade

2023-05-10 Thread Patrick Donnelly
Hi Janek, All this indicates is that you have some files with binary keys that cannot be decoded as utf-8. Unfortunately, the rados python library assumes that omap keys can be decoded this way. I have a ticket here: https://tracker.ceph.com/issues/59716 I hope to have a fix soon. On Thu, May
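The decoding failure described here can be reproduced without a cluster. The following is a minimal sketch in plain Python, not the rados binding's actual code and not the fix tracked in the ticket: the key bytes and the helpers `decode_key` and `encode_key` are hypothetical. It shows a strict UTF-8 decode failing on a binary key, and one tolerant alternative (`surrogateescape`) that keeps the original bytes recoverable.

```python
def decode_key(raw: bytes) -> str:
    """Decode a key, mapping undecodable bytes to lone surrogates."""
    return raw.decode("utf-8", errors="surrogateescape")


def encode_key(text: str) -> bytes:
    """Inverse of decode_key: restore the exact original bytes."""
    return text.encode("utf-8", errors="surrogateescape")


# Hypothetical key: a Latin-1 'e acute' (0xE9) inside an otherwise ASCII name.
bad_key = b"caf\xe9_head"

# Strict decoding, which is what the thread says the library assumes, fails.
try:
    bad_key.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Tolerant decoding succeeds and round-trips to the same bytes.
tolerant = decode_key(bad_key)
round_trip = encode_key(tolerant)

print(strict_ok, round_trip == bad_key)
```

Treating keys as `bytes` end to end would avoid the question entirely; `surrogateescape` is only one option when a `str` is required.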

[ceph-users] Re: MDS "newly corrupt dentry" after patch version upgrade

2023-05-04 Thread Janek Bevendorff
After running the tool for 11 hours straight, it exited with the following exception:
Traceback (most recent call last):
  File "/home/webis/first-damage.py", line 156, in <module>
    traverse(f, ioctx)
  File "/home/webis/first-damage.py", line 84, in traverse
    for (dnk, val) in it:
  File "rados.p

[ceph-users] Re: MDS "newly corrupt dentry" after patch version upgrade

2023-05-03 Thread Patrick Donnelly
On Wed, May 3, 2023 at 4:33 AM Janek Bevendorff wrote: > > Hi Patrick, > > > I'll try that tomorrow and let you know, thanks! > > I was unable to reproduce the crash today. Even with > mds_abort_on_newly_corrupt_dentry set to true, all MDS booted up > correctly (though they took forever to rejoin

[ceph-users] Re: MDS "newly corrupt dentry" after patch version upgrade

2023-05-03 Thread Janek Bevendorff
Hi Patrick, I'll try that tomorrow and let you know, thanks! I was unable to reproduce the crash today. Even with mds_abort_on_newly_corrupt_dentry set to true, all MDS booted up correctly (though they took forever to rejoin with logs set to 20). To me it looks like the issue has resolved

[ceph-users] Re: MDS "newly corrupt dentry" after patch version upgrade

2023-05-02 Thread Janek Bevendorff
Hi Patrick,
> Please be careful resetting the journal. It was not necessary. You can try to recover the missing inode using cephfs-data-scan [2].
Yes. I did that very reluctantly after trying everything else as a last resort. But since it only gave me another error, I restored the previous sta

[ceph-users] Re: MDS "newly corrupt dentry" after patch version upgrade

2023-05-02 Thread Patrick Donnelly
On Tue, May 2, 2023 at 10:31 AM Janek Bevendorff wrote: > > Hi, > > After a patch version upgrade from 16.2.10 to 16.2.12, our rank 0 MDS > fails to start. After replaying the journal, it just crashes with > > [ERR] : MDS abort because newly corrupt dentry to be committed: [dentry > #0x1/storag

[ceph-users] Re: MDS "newly corrupt dentry" after patch version upgrade

2023-05-02 Thread Janek Bevendorff
Thanks! I tried downgrading to 16.2.10 and was able to get it running again, but after a reboot, got a warning that two of the OSDs on that host had broken Bluestore compression. Restarting the two OSDs again got rid of it, but that's still a bit concerning. On 02/05/2023 16:48, Dan van der

[ceph-users] Re: MDS "newly corrupt dentry" after patch version upgrade

2023-05-02 Thread Dan van der Ster
Hi Janek, That assert is part of a new corruption check added in 16.2.12 -- see https://github.com/ceph/ceph/commit/1771aae8e79b577acde749a292d9965264f20202
The abort is controlled by a new option:
+Option("mds_abort_on_newly_corrupt_dentry", Option::TYPE_BOOL, Option::LEVEL_ADVANCED)
+.