On 2025/7/9 01:01, Jan Kiszka wrote:
On 08.07.25 18:39, Jan Kiszka wrote:
On 08.07.25 17:57, Gao Xiang wrote:
On 2025/7/8 23:36, Gao Xiang wrote:
On 2025/7/8 23:32, Gao Xiang wrote:
On 2025/7/8 23:22, Jan Kiszka wrote:
On 08.07.25 17:12, Gao Xiang wrote:
Hi Jan,
On 2025/7/8 20:43, Jan Kiszka wrote:
On 08.07.25 14:41, Jan Kiszka wrote:
Hi all,
for some days I've been trying to understand whether we have an integration
issue with erofs or rather some upstream bug. After playing with various
parameters, it looks like the latter:
$ ls -l erofs-dir/
total 132
-rwxr-xr-x 1 1000 users 132868 Jul 8 10:50 dash
(from Debian bookworm)
$ mkfs.erofs -z lz4hc erofs.img erofs-dir/
mkfs.erofs 1.8.6 (trixie version, but the same happens with bookworm 1.5)
Build completed.
------
Filesystem UUID: aae0b2f0-4ee4-4850-af49-3c1aad7fa30c
Filesystem total blocks: 17 (of 4096-byte blocks)
Filesystem total inodes: 2
Filesystem total metadata blocks: 1
Filesystem total deduplicated bytes (of source files): 0
Now I have 6.15-rc5 and a close-to-defconfig configuration for the 32-bit ARM
target BeagleBone Black. When booting into init=/bin/sh and then running
# mount -t erofs /dev/mmcblk0p1 /mnt
erofs (device mmcblk0p1): mounted with root inode @ nid 36.
# /mnt/dash
Segmentation fault
Two extra quick questions:
- If the segfault happens, does the segfault still happen when you run
  /mnt/dash again?
- If the /mnt/dash segfault happens, does running the following still
  segfault?
      cat /mnt/dash > /dev/null
      /mnt/dash
Oh, sorry, I didn't read the full hints. Could you check whether the
following patch resolves the issue (whitespace-damaged)?
diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index 6a329c329f43..701490b3ef7d 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -245,6 +245,7 @@ void erofs_onlinefolio_end(struct folio *folio, int err)
 	if (v & ~EROFS_ONLINEFOLIO_EIO)
 		return;
 	folio->private = 0;
+	flush_dcache_folio(folio);
 	folio_end_read(folio, !(v & EROFS_ONLINEFOLIO_EIO));
 }
Yeah, indeed that seems to have helped with the minimal test. Will do the
full scenario test (complete rootfs) next.
And that looks good as well! Thanks a lot for that quick fix - hoping that is
the real solution already.
BTW, that change does not look very specific to the armhf arch; it rather
looks like we were lucky that it didn't hit elsewhere, right?
I may submit a formal patch tomorrow.
This issue doesn't impact x86 and arm64. For example, on arm64
PG_dcache_clean is already clear for a new page cache folio.
But it seems that on the arm platform flush_dcache_folio() does more
to handle D-cache aliasing, so some caching setups may be impacted.
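(For context, a minimal, hypothetical sketch of the general pattern being
discussed, not the actual erofs code path; the helper name and the data
source below are made up. A filesystem fills a page cache folio through the
kernel mapping and then calls flush_dcache_folio() before publishing it, so
that user-space aliases on aliasing D-caches, e.g. 32-bit arm, see the new
contents:

#include <linux/highmem.h>
#include <linux/cacheflush.h>
#include <linux/pagemap.h>
#include <linux/string.h>

/* Hypothetical example, not erofs code: fill a folio via the kernel
 * alias, then make the data visible before marking it uptodate. */
static void example_fill_folio(struct folio *folio, const void *src,
			       size_t len)
{
	void *dst = kmap_local_folio(folio, 0);

	memcpy(dst, src, len);		/* write through the kernel alias */
	kunmap_local(dst);

	flush_dcache_folio(folio);	/* no-op on x86; handles D-cache
					 * aliasing on arm */
	folio_mark_uptodate(folio);
}

If that flush is skipped, a user mapping on an aliasing cache can read stale
lines even though the kernel alias already holds the decompressed data, which
would match a segfault that disappears after a second read.)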
Thanks,
Gao Xiang
Jan