On Fri, Mar 05, 2021 at 09:48:50PM -0500, Theodore Ts'o wrote:
> I can't reproduce the problem given your file system image.  Given
> your description, this is almost certainly operator error.

OK, I was finally able to reproduce the problem, but not using your
reproduction instructions.  I reproduced it via:

1.  e2image -r rbd13.e2i.qcow2 /tmp/rbd13
2.  truncate -s 10T /tmp/rbd13
3.  e2fsck -fy /tmp/rbd13
4.  resize2fs /tmp/rbd13
5.  e2fsck -fy /tmp/rbd13   <note fs corruption>

The bug report was also incorrect in saying that resizing the file
system to 4T was sufficient; that is not true.  It can only be
reproduced when a file system is resized to be sufficiently large that
there is no longer enough room to grow the block group descriptors
without moving the allocation bitmaps and/or inode table out of the
way in order to create room for the block group descriptors.  (As in
the above reproduction recipe.)

% e2image -r rbd13.e2i.qcow2 /tmp/rbd13
e2image 1.46.2 (28-Feb-2021)
% truncate -s 4T /tmp/rbd13
% resize2fs /tmp/rbd13
resize2fs 1.46.2 (28-Feb-2021)
Please run 'e2fsck -f /tmp/rbd13' first.

% e2fsck -f /tmp/rbd13
e2fsck 1.46.2 (28-Feb-2021)
e2fsck: MMP: e2fsck being run while checking MMP block
MMP check failed: If you are sure the filesystem is not in use on any node, run:
'tune2fs -f -E clear_mmp /tmp/rbd13'

MMP_block:
    mmp_magic: 0x4d4d50
    mmp_check_interval: 10
    mmp_sequence: e24d4d50
    mmp_update_date: Sat Mar  6 22:47:25 2021
    mmp_update_time: 1615088845
    mmp_node_name: cwcc
    mmp_device_name: /tmp/rbd13

/tmp/rbd13: ********** WARNING: Filesystem still has errors **********

This is a separate bug.  The issue here is that resize2fs is exiting
after printing the "Please run 'e2fsck -f /tmp/rbd13' first." message
without cleanly stopping (resetting) the MMP protection.

And this doesn't lead to file system corruption, as we can see here:

% tune2fs -f -E clear_mmp /tmp/rbd13
tune2fs 1.46.2 (28-Feb-2021)
% e2fsck -fy /tmp/rbd13
e2fsck 1.46.2 (28-Feb-2021)
Clearing orphaned inode 45617124 (uid=107, gid=115, mode=0100600, size=16777216)
Clearing orphaned inode 15073744 (uid=0, gid=0, mode=0100644, size=593696)
Clearing orphaned inode 15073743 (uid=0, gid=0, mode=0100644, size=3031904)
Clearing orphaned inode 50331709 (uid=0, gid=0, mode=0100644, size=149704)
Clearing orphaned inode 50332495 (uid=0, gid=0, mode=0100755, size=231560)
Clearing orphaned inode 50332319 (uid=0, gid=0, mode=0100644, size=2670992)
Clearing orphaned inode 50332271 (uid=0, gid=0, mode=0100644, size=651472)
Clearing orphaned inode 50332251 (uid=0, gid=0, mode=0100644, size=282752)
Clearing orphaned inode 13 (uid=0, gid=0, mode=0100600, size=0)
Pass 1: Checking inodes, blocks, and sizes
Inode 46530577 extent tree (at level 1) could be shorter.  Optimize? yes

Inode 46530714 extent tree (at level 1) could be shorter.  Optimize? yes

Pass 1E: Optimizing extent trees
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

/tmp/rbd13: ***** FILE SYSTEM WAS MODIFIED *****
/tmp/rbd13: 225913/67108864 files (8.6% non-contiguous), 185961407/268435456 blocks

The workaround to the first bug is to clear the MMP feature, do the
offline resize, and then enable the MMP feature again.  Of course,
while the MMP feature is disabled you won't be protected against
another node trying to modify the file system.  But it will allow you
to grow the file system.

Another workaround is to simply do an online resize (that is, on the
node where the file system is mounted, run resize2fs on the mounted
file system).  However, depending on the kernel version, it won't
allow you to resize past the limit of the block group descriptors
reserved by the resize inode.

                                        - Ted
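
For reference, the first workaround (clear MMP, resize offline, enable
MMP again) might be sketched as below.  This is only an illustration
under stated assumptions: the file system is unmounted and not in use
on any node, the function name mmp_offline_resize and the RUN/IMG
variables are hypothetical, and it assumes tune2fs's "-O ^mmp" and
"-O mmp" are how the feature is turned off and back on.  By default it
only prints the commands; set RUN to the empty string to execute them.

```shell
#!/bin/sh
# Sketch only: disable MMP, resize offline, then re-enable MMP.
# Assumption: the file system is unmounted and unused on every node.
# RUN defaults to "echo" (dry run); use RUN= to really run the commands.
IMG=${IMG:-/tmp/rbd13}
RUN=${RUN-echo}

mmp_offline_resize() {
    # Turn the MMP feature off; no multi-node protection from here on.
    $RUN tune2fs -O ^mmp "$1" || return 1

    # resize2fs asks for a forced check before an offline resize.
    # e2fsck exits 0 (clean) or 1 (errors corrected) on success.
    $RUN e2fsck -f "$1"
    [ $? -le 1 ] || return 1

    # Grow the file system to fill the device or image.
    $RUN resize2fs "$1" || return 1

    # Turn the MMP feature back on.
    $RUN tune2fs -O mmp "$1"
}

mmp_offline_resize "$IMG"
```

If an earlier tool left stale MMP state behind, as in the transcript
above, 'tune2fs -f -E clear_mmp' may be needed before the first step.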