If an exception is thrown while decoding the file name, then somebody probably
managed to store files with non-UTF-8 characters in the name. Although I
don't really know how this can happen. Perhaps some OS quirk.
On 10/05/2023 22:33, Patrick Donnelly wrote:
Hi Janek,
All this indicates is that you have some files with binary keys that
cannot be decoded as utf-8. Unfortunately, the rados python library
assumes that omap keys can be decoded this way. I have a ticket here:
https://tracker.ceph.com/issues/59716
I hope to have a fix soon.
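
As a rough illustration of the failure mode described above (not the fix tracked in the ticket), the sketch below shows how a key containing non-UTF-8 bytes fails a strict decode and how a tolerant decode can keep the raw bytes visible. It uses plain Python only; `decode_key` and the sample keys are hypothetical and are not part of the rados library or of first-damage.py.

```python
# Hypothetical helper for illustration only; not part of rados or first-damage.py.
def decode_key(raw: bytes) -> str:
    """Decode a key as UTF-8, falling back to \\xNN escapes for invalid bytes."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # backslashreplace never raises: each undecodable byte becomes
        # a literal \xNN sequence in the resulting string.
        return raw.decode("utf-8", errors="backslashreplace")


# Made-up sample keys: the second contains a Latin-1 encoded character
# (byte 0xE9), which is not valid UTF-8.
good_key = b"file_head"
bad_key = b"caf\xe9_head"

print(decode_key(good_key))
print(decode_key(bad_key))
```

A strict `bad_key.decode("utf-8")` would raise `UnicodeDecodeError`, which is the kind of exception a long-running traversal would abort on unless it is caught.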
On Thu, May
After running the tool for 11 hours straight, it exited with the
following exception:
Traceback (most recent call last):
File "/home/webis/first-damage.py", line 156, in <module>
traverse(f, ioctx)
File "/home/webis/first-damage.py", line 84, in traverse
for (dnk, val) in it:
File "rados.p
On Wed, May 3, 2023 at 4:33 AM Janek Bevendorff
wrote:
>
> Hi Patrick,
>
> > I'll try that tomorrow and let you know, thanks!
>
> I was unable to reproduce the crash today. Even with
> mds_abort_on_newly_corrupt_dentry set to true, all MDS booted up
> correctly (though they took forever to rejoin
Hi Patrick,
I'll try that tomorrow and let you know, thanks!
I was unable to reproduce the crash today. Even with
mds_abort_on_newly_corrupt_dentry set to true, all MDS booted up
correctly (though they took forever to rejoin with logs set to 20).
To me it looks like the issue has resolved
Hi Patrick,
Please be careful resetting the journal. It was not necessary. You can
try to recover the missing inode using cephfs-data-scan [2].
Yes. I did that very reluctantly after trying everything else as a last
resort. But since it only gave me another error, I restored the previous
sta
On Tue, May 2, 2023 at 10:31 AM Janek Bevendorff
wrote:
>
> Hi,
>
> After a patch version upgrade from 16.2.10 to 16.2.12, our rank 0 MDS
> fails to start. After replaying the journal, it just crashes with
>
> [ERR] : MDS abort because newly corrupt dentry to be committed: [dentry
> #0x1/storag
Thanks!
I tried downgrading to 16.2.10 and was able to get it running again, but
after a reboot, got a warning that two of the OSDs on that host had
broken Bluestore compression. Restarting the two OSDs again got rid of
it, but that's still a bit concerning.
On 02/05/2023 16:48, Dan van der
Hi Janek,
That assert is part of a new corruption check added in 16.2.12 -- see
https://github.com/ceph/ceph/commit/1771aae8e79b577acde749a292d9965264f20202
The abort is controlled by a new option:
+Option("mds_abort_on_newly_corrupt_dentry", Option::TYPE_BOOL,
Option::LEVEL_ADVANCED)
+.