Bug#1057843: linux: ext4 data corruption in 6.1.64-1
Online check of ext4: If your filesystem is located on a logical volume (LVM), then I assume you can make a snapshot and do a check of that.

Make SS (snapshot):
    lvcreate --snapshot --size 1G --name lv_root_SS --chunksize 4k /dev/VG1/lv_root
EXT4 check:
    e2fsck -f /dev/dm-3
Remove SS:
    lvremove --yes VG1/lv_root_SS
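The three steps above can be sketched as a small helper that only builds the command lines, so they can be reviewed before anything is run as root. This is a minimal sketch, not a tested procedure: the function name is hypothetical, and it assumes the snapshot is reachable as /dev/<VG>/<LV>_SS rather than the /dev/dm-3 node seen on the poster's system.

```python
import shlex


def snapshot_check_commands(vg="VG1", lv="lv_root", size="1G", chunksize="4k"):
    """Build (but do not run) the snapshot, check, and cleanup commands.

    Assumption: the snapshot device appears as /dev/<vg>/<lv>_SS.
    """
    snap = f"{lv}_SS"
    return [
        # 1. Create a snapshot of the logical volume.
        ["lvcreate", "--snapshot", "--size", size, "--name", snap,
         "--chunksize", chunksize, f"/dev/{vg}/{lv}"],
        # 2. Force a filesystem check on the snapshot, not the live volume.
        ["e2fsck", "-f", f"/dev/{vg}/{snap}"],
        # 3. Remove the snapshot again.
        ["lvremove", "--yes", f"{vg}/{snap}"],
    ]


# Print the commands for review instead of executing them.
for argv in snapshot_check_commands():
    print(shlex.join(argv))
```

Keeping the commands as argument lists means they could later be passed to subprocess.run without shell quoting issues, if one chooses to automate the steps.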
Bug#1057843: linux: ext4 data corruption in 6.1.64-1
Will a file system check detect the corruptions? Can it be done online? Thank you.
Bug#1057843: linux: ext4 data corruption in 6.1.64-1
As there were some questions along in this thread, let me summarize some points:

The issue affects fs/ext4 code, so no other filesystems are affected (e.g. btrfs).

The issue affects all kernels which have the commit 91562895f803 ("ext4: properly sync file size update after O_SYNC direct IO") from 6.7-rc1 (which is present in 6.6.3, 6.5.13 and 6.1.64) but in which commit 936e114a245b ("iomap: update ki_pos a little later in iomap_dio_complete") from 6.5-rc1 is missing (it was additionally backported to 5.15.142 and 6.1.66).

The only upstream releases in which the first commit was present and the second one missing were 6.1.64 and 6.1.65.

Debian is affected as of the 6.1.64-1 upload, which was the kernel aimed for the 12.3 point release.

The issue causes file corruption when direct IO writes are involved. O_DIRECT writes did not properly update the current file position after the write, so data and files were getting mangled.

While this does not affect every write that ever happened on an ext4 filesystem under a broken kernel, O_DIRECT writes might be quite common in programs trying to get high performance. It might be argued that it is not that common, but it is not nonexistent.

To the best of my knowledge, such file corruptions cannot be easily detected. Candidates to check are all files modified since booting the broken kernel 6.1.64-1.

People who have not yet booted into 6.1.66-1 are urged to do so.

Regards,
Salvatore
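The two practical points in the summary (which versions are affected, and which files are candidates for review) can be sketched as follows. This is only an illustration under stated assumptions: all function names are hypothetical, the version is assumed to appear in `uname -v` output as "Debian X.Y.Z-N", and a file's modification time is used as a rough stand-in for "written since booting the broken kernel". A file appearing in the list is a candidate for review, not proof of corruption.

```python
import os
import re
import stat

# Upstream stable releases named in the summary as having the bad commit combination.
AFFECTED_UPSTREAM = {"6.1.64", "6.1.65"}


def debian_kernel_version(uname_v):
    """Extract e.g. '6.1.66-1' from a `uname -v` string; None if not found."""
    m = re.search(r"Debian (\d+\.\d+\.\d+-\S+)", uname_v)
    return m.group(1) if m else None


def is_affected(package_version):
    """True if the upstream part of a Debian version is 6.1.64 or 6.1.65."""
    upstream = package_version.split("-", 1)[0]
    return upstream in AFFECTED_UPSTREAM


def modified_since(root, since_epoch):
    """Yield regular files under root whose mtime is at or after since_epoch."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_mtime >= since_epoch:
                yield path
```

For example, `is_affected(debian_kernel_version(os.uname().version) or "")` reports whether the running kernel matches, and `modified_since("/home", boot_time)` lists candidates once `boot_time` (seconds since the epoch at which the affected kernel was first booted) has been determined by other means.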
Bug#1057843: linux: ext4 data corruption in 6.1.64-1
On Mon, 11 Dec 2023 10:38:40 +0100 helios.sola...@gmx.ch wrote:
> I have been running debian 12.3 with kernel 6.1.64-1 for a few hours,
> how can I find out whether the file system has been corrupted?

Yes, I would also appreciate an explanation of who could be affected, how to diagnose the problem, and what needs to be done.

Please note that not all users of Debian stable are kernel hackers who will be able to look at the filesystem code and understand the full extent of the problem.

thanks,
Dennis
Bug#1057843: linux: ext4 data corruption in 6.1.64-1
I have been running debian 12.3 with kernel 6.1.64-1 for a few hours, how can I find out whether the file system has been corrupted?
Bug#1057843: linux: ext4 data corruption in 6.1.64-1
Hi,

On Sat, Dec 09, 2023 at 03:07:37PM +0100, Salvatore Bonaccorso wrote:
> Source: linux
> Version: 6.1.64-1
> Severity: grave
> Tags: upstream
> Justification: causes non-serious data loss
> X-Debbugs-Cc: debian-rele...@lists.debian.org, car...@debian.org,
>  a...@debian.org
>
> Hi
>
> I'm filing this for visibility.
>
> There might be an ext4 data corruption issue with the kernel released
> in the 12.3 bookworm point release (which is addressed in 6.1.66
> upstream already).
>
> The report about the regression and some details:
>
> https://lore.kernel.org/stable/20231205122122.dfhhoaswsfscuhc3@quack3/

6.1.66 upstream fixes the issue:

# uname -a
Linux bookworm-amd64 6.1.0-15-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.66-1 (2023-12-06) x86_64 GNU/Linux
# LTP_SINGLE_FS_TYPE=ext4 LTP_DEV_FS_TYPE=ext4 ./preadv03_64
tst_device.c:96: TINFO: Found free device 0 '/dev/loop0'
tst_test.c:1690: TINFO: LTP version: 20230929-194-g5c096b2cf
tst_test.c:1574: TINFO: Timeout per run is 0h 00m 30s
tst_supported_fs_types.c:149: TINFO: WARNING: testing only ext4
tst_supported_fs_types.c:90: TINFO: Kernel supports ext4
tst_supported_fs_types.c:55: TINFO: mkfs.ext4 does exist
tst_test.c:1650: TINFO: === Testing on ext4 ===
tst_test.c:1105: TINFO: Formatting /dev/loop0 with ext4 opts='' extra opts=''
mke2fs 1.47.0 (5-Feb-2023)
tst_test.c:1119: TINFO: Mounting /dev/loop0 to /tmp/LTP_preGGYjTj/mntpoint fstyp=ext4 flags=0
preadv03.c:102: TINFO: Using block size 512
preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly
preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly
preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'b' expectedly

Summary:
passed   3
failed   0
broken   0
skipped  0
warnings 0

Regards,
Salvatore
Bug#1057843: linux: ext4 data corruption in 6.1.64-1
Running the single test with ext4:

# LTP_SINGLE_FS_TYPE=ext4 LTP_DEV_FS_TYPE=ext4 ./preadv03_64
tst_device.c:96: TINFO: Found free device 0 '/dev/loop0'
tst_test.c:1690: TINFO: LTP version: 20230929-194-g5c096b2cf
tst_test.c:1574: TINFO: Timeout per run is 0h 00m 30s
tst_supported_fs_types.c:149: TINFO: WARNING: testing only ext4
tst_supported_fs_types.c:90: TINFO: Kernel supports ext4
tst_supported_fs_types.c:55: TINFO: mkfs.ext4 does exist
tst_test.c:1650: TINFO: === Testing on ext4 ===
tst_test.c:1105: TINFO: Formatting /dev/loop0 with ext4 opts='' extra opts=''
mke2fs 1.47.0 (5-Feb-2023)
tst_test.c:1119: TINFO: Mounting /dev/loop0 to /tmp/LTP_preWBHd7l/mntpoint fstyp=ext4 flags=0
preadv03.c:102: TINFO: Using block size 512
preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61
preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61
preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512

Summary:
passed   0
failed   3
broken   0
skipped  0
warnings 0
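The failure pattern in this log (byte 0x62, i.e. 'b', found where 0x61, i.e. 'a', was expected, and a 0-byte read where 512 bytes were expected) is what one would see if the file position were not advanced after a write. The toy model below illustrates that reading. It is not kernel or LTP code: the class and function names are hypothetical, and it assumes the test writes one block of 'a' followed by one block of 'b' before reading them back, which is what the contents reported by a passing run suggest.

```python
class ToyFile:
    """In-memory stand-in for a file with a current position (not kernel code)."""

    def __init__(self, advance_pos):
        self.data = bytearray()
        self.pos = 0
        self.advance_pos = advance_pos  # False models the regression

    def write(self, buf):
        """Write buf at the current position, growing the file if needed."""
        end = self.pos + len(buf)
        if len(self.data) < end:
            self.data.extend(b"\0" * (end - len(self.data)))
        self.data[self.pos:end] = buf
        if self.advance_pos:
            self.pos = end  # correct behaviour: position moves past the write
        return len(buf)

    def pread(self, size, offset):
        """Read up to size bytes at offset without touching the position."""
        return bytes(self.data[offset:offset + size])


def run(advance_pos, block=512):
    """Write a block of 'a' then a block of 'b'; return both blocks as read back."""
    f = ToyFile(advance_pos)
    f.write(b"a" * block)
    f.write(b"b" * block)
    return f.pread(block, 0), f.pread(block, block)
```

With `run(True)` the two blocks come back as written. With `run(False)` the second write lands on top of the first, so offset 0 holds 'b' and the read at offset 512 returns nothing, mirroring the three TFAIL lines.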
Bug#1057843: linux: ext4 data corruption in 6.1.64-1
Source: linux
Version: 6.1.64-1
Severity: grave
Tags: upstream
Justification: causes non-serious data loss
X-Debbugs-Cc: debian-rele...@lists.debian.org, car...@debian.org, a...@debian.org

Hi

I'm filing this for visibility.

There might be an ext4 data corruption issue with the kernel released in the 12.3 bookworm point release (which is addressed in 6.1.66 upstream already).

The report about the regression and some details:

https://lore.kernel.org/stable/20231205122122.dfhhoaswsfscuhc3@quack3/

Regards,
Salvatore