On 08.07.25 19:09, Gao Xiang wrote:
>
>
> On 2025/7/9 01:01, Jan Kiszka wrote:
>> On 08.07.25 18:39, Jan Kiszka wrote:
>>> On 08.07.25 17:57, Gao Xiang wrote:
>>>>
>>>>
>>>> On 2025/7/8 23:36, Gao Xiang wrote:
>>>>>
>>>>>
>>>>> On 2025/7/8 23:32, Gao Xiang wrote:
>>>>>>
>>>>>>
>>>>>> On 2025/7/8 23:22, Jan Kiszka wrote:
>>>>>>> On 08.07.25 17:12, Gao Xiang wrote:
>>>>>>>> Hi Jan,
>>>>>>>>
>>>>>>>> On 2025/7/8 20:43, Jan Kiszka wrote:
>>>>>>>>> On 08.07.25 14:41, Jan Kiszka wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> For some days, I have been trying to understand whether we have an
>>>>>>>>>> integration issue with erofs or rather some upstream bug. After
>>>>>>>>>> playing with various parameters, it rather looks like the latter:
>>>>>>>>>>
>>>>>>>>>> $ ls -l erofs-dir/
>>>>>>>>>> total 132
>>>>>>>>>> -rwxr-xr-x 1 1000 users 132868 Jul 8 10:50 dash
>>>>>>>>>> (from Debian bookworm)
>>>>>>>>>> $ mkfs.erofs -z lz4hc erofs.img erofs-dir/
>>>>>>>>>> mkfs.erofs 1.8.6 (trixie version, but same happens with bookworm 1.5)
>>>>>>>>>> Build completed.
>>>>>>>>>> ------
>>>>>>>>>> Filesystem UUID: aae0b2f0-4ee4-4850-af49-3c1aad7fa30c
>>>>>>>>>> Filesystem total blocks: 17 (of 4096-byte blocks)
>>>>>>>>>> Filesystem total inodes: 2
>>>>>>>>>> Filesystem total metadata blocks: 1
>>>>>>>>>> Filesystem total deduplicated bytes (of source files): 0
>>>>>>>>>>
>>>>>>>>>> Now I have 6.15-rc5 and a defconfig-close setting for the 32-bit ARM
>>>>>>>>>> target BeagleBone Black. When booting into init=/bin/sh, then running
>>>>>>>>>>
>>>>>>>>>> # mount -t erofs /dev/mmcblk0p1 /mnt
>>>>>>>>>> erofs (device mmcblk0p1): mounted with root inode @ nid 36.
>>>>>>>>>> # /mnt/dash
>>>>>>>>>> Segmentation fault
>>>>>
>>>>> Two extra quick questions:
>>>>> - If the segfault happens, then if you run /mnt/dash again, does the
>>>>>   segfault still happen?
>>>>>
>>>>> - If the /mnt/dash segfault happens, then if you run
>>>>>     cat /mnt/dash > /dev/null
>>>>>     /mnt/dash
>>>>>   does the segfault still happen?
>>>>
>>>> Oh, sorry, I didn't read the full hints. Could you check whether the
>>>> following patch resolves the issue (space-damaged)?
>>>>
>>>> diff --git a/fs/erofs/data.c b/fs/erofs/data.c
>>>> index 6a329c329f43..701490b3ef7d 100644
>>>> --- a/fs/erofs/data.c
>>>> +++ b/fs/erofs/data.c
>>>> @@ -245,6 +245,7 @@ void erofs_onlinefolio_end(struct folio *folio, int err)
>>>>          if (v & ~EROFS_ONLINEFOLIO_EIO)
>>>>                  return;
>>>>          folio->private = 0;
>>>> +        flush_dcache_folio(folio);
>>>>          folio_end_read(folio, !(v & EROFS_ONLINEFOLIO_EIO));
>>>>  }
>>>>
>>>
>>> Yeah, indeed that seems to have helped with the minimal test. Will do the
>>> full scenario test (complete rootfs) next.
>>>
>>
>> And that looks good as well! Thanks a lot for that quick fix - hoping that
>> is the real solution already.
>>
>> BTW, that change does not look very specific to the armhf arch, rather
>> like we were lucky that it didn't hit elsewhere, right?
>
> I may submit a formal patch tomorrow.
>

Great, thanks. I quickly checked backports: it applies cleanly on 6.12, but
at least 6.1 requires more work to find a home there as well.

> This issue doesn't impact x86 and arm64. For example on arm64,
> PG_dcache_clean is clear when it's a new page cache folio.
>
> But it seems that on the arm platform flush_dcache_folio() does more
> to handle D-cache aliasing, so some caching setups may be impacted.

Yeah, that would explain it. And Stefan (on CC) was on an arm32 as well
back then.

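Just to write down my mental model while looking at the backports - this is
only a sketch, with a made-up demo_read_folio_done() standing in for the
actual erofs completion path, not the real code:

#include <linux/cacheflush.h>
#include <linux/pagemap.h>

/*
 * Sketch of the convention as I understand it: once the kernel has written
 * decoded file data into a page-cache folio through its own mapping, the
 * D-cache has to be flushed before the folio is marked up to date.
 * Otherwise user mappings on CPUs with an aliasing D-cache (typical for
 * older arm32 cores) can still see stale lines - which would match the
 * /mnt/dash segfault above.
 */
static void demo_read_folio_done(struct folio *folio, int err)
{
        /* publish the data written via the kernel mapping */
        flush_dcache_folio(folio);
        /* mark the folio uptodate on success and unlock it */
        folio_end_read(folio, !err);
}

If that model is off, please correct me.
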
Jan

--
Siemens AG, Foundational Technologies
Linux Expert Center