[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

Erhard F. (erhar...@mailbox.org) changed:

           What    |Removed                     |Added
             Status|NEW                         |RESOLVED
         Resolution|---                         |CODE_FIX

--- Comment #22 from Erhard F. (erhar...@mailbox.org) ---
On kernel 5.4-rc1 zram loads & runs fine without KASAN complaining.

As the original issue is fixed now I will close this bug. For any other issues KASAN was complaining about here, I will open new bugs if they still happen (like bug #205099).

-- 
You are receiving this mail because:
You are on the CC list for the bug.
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

Erhard F. (erhar...@mailbox.org) changed:

           What    |Removed                     |Added
 Attachment #284271|0                           |1
        is obsolete|                            |

--- Comment #21 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 284361
  --> https://bugzilla.kernel.org/attachment.cgi?id=284361=edit
kernel .config (5.3-rc4, PowerMac G4 DP)
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

--- Comment #20 from Erhard F. (erhar...@mailbox.org) ---
(In reply to Christophe Leroy from comment #18)
> Two possibilities, either the value in .rodata.cst16 is wrong or the stack
> gets corrupted.
>
> Maybe you could try disabling KASAN in lib/raid6/Makefile for altivec8.o ?
> Or maybe for the entire lib/raid6/ directory, just to see what happens ?

I disabled KASAN with KASAN_SANITIZE := n in lib/raid6/Makefile. As you can see in my latest dmesg, the G4 continues booting without further issues. If btrfs gets loaded it still fails with KASAN (will update bug #204397).

Another funny issue: mounting my nfs share works via:

modprobe nfs
mount /media/distanthome

If I mount it without modprobing nfs beforehand I get:

[...]
[ 66.271748] ==
[ 66.272076] BUG: KASAN: global-out-of-bounds in _copy_to_iter+0x3d4/0x5a8
[ 66.272331] Write of size 4096 at addr f1c27000 by task modprobe/312
[ 66.272598] CPU: 0 PID: 312 Comm: modprobe Tainted: GW 5.3.0-rc4+ #1
[ 66.272883] Call Trace:
[ 66.272964] [e100b848] [c075026c] dump_stack+0xb0/0x10c (unreliable)
[ 66.273211] [e100b878] [c02334a8] print_address_description+0x80/0x45c
[ 66.273456] [e100b908] [c0233128] __kasan_report+0x140/0x188
[ 66.273667] [e100b948] [c0233fbc] check_memory_region+0x28/0x184
[ 66.273889] [e100b958] [c023206c] memcpy+0x48/0x74
[ 66.274061] [e100b978] [c044342c] _copy_to_iter+0x3d4/0x5a8
[ 66.274265] [e100baa8] [c04437a8] copy_page_to_iter+0x90/0x550
[ 66.274482] [e100bb08] [c01b6898] generic_file_read_iter+0x5c8/0x7bc
[ 66.274720] [e100bb78] [c0249034] __vfs_read+0x1b0/0x1f4
[ 66.274912] [e100bca8] [c0249134] vfs_read+0xbc/0x124
[ 66.275094] [e100bcd8] [c02491f0] kernel_read+0x54/0x70
[ 66.275284] [e100bd08] [c02535c8] kernel_read_file+0x240/0x358
[ 66.275499] [e100bdb8] [c02537cc] kernel_read_file_from_fd+0x54/0x74
[ 66.275737] [e100bdf8] [c01068ac] sys_finit_module+0xd8/0x140
[ 66.275949] [e100bf38] [c001a274] ret_from_syscall+0x0/0x34
[ 66.276152] --- interrupt: c01 at 0xa602c4 LR = 0xbe87c4
[ 66.276417] Memory state around the buggy address:
[ 66.276588]  f1c27a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 66.276824]  f1c27a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 66.277060] >f1c27b00: 00 00 00 00 00 00 00 00 05 fa fa fa fa fa fa fa
[ 66.277293]^
[ 66.277453]  f1c27b80: 07 fa fa fa fa fa fa fa 00 03 fa fa fa fa fa fa
[ 66.277688]  f1c27c00: 04 fa fa fa fa fa fa fa 00 06 fa fa fa fa fa fa
[ 66.277920] ==
[ 66.428224] RPC: Registered named UNIX socket transport module.
[ 66.428484] RPC: Registered udp transport module.
[ 66.428647] RPC: Registered tcp transport module.
[ 66.428809] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 66.741275] Key type dns_resolver registered
[ 67.974192] NFS: Registering the id_resolver key type
[ 67.974534] Key type id_resolver registered
[ 67.974681] Key type id_legacy registered

But maybe it's better to not open too many ppc32 KASAN related bugs for now. ;) It probably can wait until your patches are in some later 5.3-rc I guess.
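For readers decoding the "Memory state around the buggy address" rows in the report above, here is a minimal sketch in C. It assumes generic KASAN conventions: one shadow byte describes an 8-byte granule, 00 means the whole granule is accessible, 01 to 07 mean only that many leading bytes are accessible, and larger values such as fa are poison (fa is, to my understanding, the global-variable redzone marker in kernels of this era, which would match the "global-out-of-bounds" title). It also assumes each row label is the memory address of the row's first granule, which the 0x80 stride between rows supports. The helper names are hypothetical and not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define GRANULE_SIZE         8u   /* bytes of memory described by one shadow byte */
#define SHADOW_BYTES_PER_ROW 16u  /* shadow bytes printed per report row */

/* Number of accessible bytes in the granule described by one shadow byte.
 * Assumption: 0 = all 8 bytes, 1..7 = that many leading bytes,
 * anything else is treated as fully poisoned. */
static unsigned accessible_bytes(uint8_t shadow)
{
    if (shadow == 0x00)
        return GRANULE_SIZE;
    if (shadow < GRANULE_SIZE)
        return shadow;
    return 0;
}

/* Memory address of the granule at position `index` within a report row. */
static uint32_t granule_address(uint32_t row_addr, unsigned index)
{
    assert(index < SHADOW_BYTES_PER_ROW);
    return row_addr + index * GRANULE_SIZE;
}

/* First inaccessible byte covered by a row, or 0 if the row is fully valid. */
static uint32_t first_bad_address(uint32_t row_addr,
                                  const uint8_t row[SHADOW_BYTES_PER_ROW])
{
    for (unsigned i = 0; i < SHADOW_BYTES_PER_ROW; i++) {
        unsigned ok = accessible_bytes(row[i]);

        if (ok < GRANULE_SIZE)
            return granule_address(row_addr, i) + ok;
    }
    return 0;
}
```

Fed the row labelled f1c27b00 from the report, this sketch points at the granule holding the 05 byte, i.e. the first place where the 4096-byte write starting at f1c27000 would leave valid memory under the assumptions above.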
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

--- Comment #19 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 284355
  --> https://bugzilla.kernel.org/attachment.cgi?id=284355=edit
dmesg (kernel 5.3-rc4 + shadow patch + parallel patch, PowerMac G4 DP)
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

--- Comment #18 from Christophe Leroy (christophe.le...@c-s.fr) ---
The Oops occurs at 0x3c8:

 3b0:   81 21 00 88     lwz     r9,136(r1)
 3b4:   13 67 dc c4     vxor    v27,v7,v27
 3b8:   7d 11 a8 ce     lvx     v8,r17,r21
 3bc:   11 5f 5b 06     vcmpgtsb v10,v31,v11
 3c0:   11 6b 58 00     vaddubm v11,v11,v11
 3c4:   81 41 00 8c     lwz     r10,140(r1)
>3c8:   7c 00 48 ce     lvx     v0,0,r9

This is because the value in r9 is most likely wrong.

r9 is loaded from the stack at 0x3b0

r9 was calculated and stored in the stack by the below code.

  70:   3d 20 00 00     lis     r9,0
                        72: R_PPC_ADDR16_HA     .rodata.cst16
  74:   3b b3 00 10     addi    r29,r19,16
  78:   39 29 00 00     addi    r9,r9,0
                        7a: R_PPC_ADDR16_LO     .rodata.cst16
  7c:   91 21 00 88     stw     r9,136(r1)

The value comes from .rodata.cst16

Two possibilities, either the value in .rodata.cst16 is wrong or the stack gets corrupted.

Maybe you could try disabling KASAN in lib/raid6/Makefile for altivec8.o ?
Or maybe for the entire lib/raid6/ directory, just to see what happens ?
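To make the lis/addi pair in the disassembly above easier to follow, here is a small C sketch of how a 32-bit address is split by the R_PPC_ADDR16_HA and R_PPC_ADDR16_LO relocations and rebuilt at run time. The +0x8000 adjustment exists because addi sign-extends its 16-bit immediate. The last helper models the fact that lvx ignores the low four bits of its effective address. All function names are hypothetical and the addresses used for checking are illustrative, not values from the bug.

```c
#include <assert.h>
#include <stdint.h>

/* High half as patched into "lis" by R_PPC_ADDR16_HA: adjusted by 0x8000
 * so that the sign-extended low half added later lands on the target. */
static uint16_t ppc_ha(uint32_t addr)
{
    return (uint16_t)((addr + 0x8000u) >> 16);
}

/* Low half as patched into "addi" by R_PPC_ADDR16_LO. */
static uint16_t ppc_lo(uint32_t addr)
{
    return (uint16_t)(addr & 0xffffu);
}

/* Model of "lis rD,ha ; addi rD,rD,lo". */
static uint32_t lis_addi(uint16_t ha, uint16_t lo)
{
    uint32_t r = (uint32_t)ha << 16;        /* lis: load shifted immediate */

    r += (uint32_t)(int32_t)(int16_t)lo;    /* addi: sign-extended immediate */
    return r;
}

/* Model of the effective address used by "lvx vD,rA,rB":
 * the low four bits are dropped, so the load is always 16-byte aligned. */
static uint32_t lvx_effective_address(uint32_t ra_or_zero, uint32_t rb)
{
    return (ra_or_zero + rb) & ~(uint32_t)0xf;
}

/* Sanity helper: the split and the rebuild must round-trip. */
static void check_roundtrip(uint32_t addr)
{
    assert(lis_addi(ppc_ha(addr), ppc_lo(addr)) == addr);
}
```

In the object file both immediates are still zero because the relocations have not been applied yet; the module loader fills them in, and whatever ends up in r9 is what the lvx at 0x3c8 dereferences.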
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

--- Comment #17 from Christophe Leroy (christophe.le...@c-s.fr) ---
Created attachment 284343
  --> https://bugzilla.kernel.org/attachment.cgi?id=284343=edit
Disassembly of lib/raid6/altivec8.o
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

--- Comment #16 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 284309
  --> https://bugzilla.kernel.org/attachment.cgi?id=284309=edit
dmesg (kernel 5.3-rc3 + debug patch + shadow patch + parallel patch, PowerMac G4 DP)

Also tested your powerpc-kasan-fix-parallele-loading-of-modules.diff now which seems to work fine! dmesg from the G4 DP with CONFIG_SMP back on is almost identical to non-smp kernel dmesg.

raid6 pq reliably oopses. Probably the 1st issue revealed by ppc32 KASAN. ;)

Loading the radeon module at boot still freezes the G4. modprobing it later on works, without any special dmesg output, switching display over from Offb to radeonfb.
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

--- Comment #15 from Christophe Leroy (christophe.le...@c-s.fr) ---
As far as I can see in the latest dmesg, the Oops occurs in the raid6 pq module.

And this time it is no longer in kasan register global.
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

--- Comment #14 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 284303
  --> https://bugzilla.kernel.org/attachment.cgi?id=284303=edit
dmesg (kernel 5.3-rc3 + patch + 2nd patch, without CONFIG_SMP, v2, PowerMac G4 DP)

However the radeon module and btrfs (if built as module) still freeze the machine until the 2min reboot timer kicks in. Also some EHCI driver modules oops, but not always.
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

--- Comment #13 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 284301
  --> https://bugzilla.kernel.org/attachment.cgi?id=284301=edit
dmesg (kernel 5.3-rc3 + patch + 2nd patch, without CONFIG_SMP, PowerMac G4 DP)

Definitely an improvement with the latest patch. b43legacy and nfs load now reliably without Oops.
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

--- Comment #12 from Christophe Leroy (christophe.le...@c-s.fr) ---
Patch at https://patchwork.ozlabs.org/patch/1144756/
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

--- Comment #11 from Christophe Leroy (christophe.le...@c-s.fr) ---
Thanks. Then it is not about SMP, although there is a theoretical problem with SMP anyway, which I'll address in another patch.

I think I finally spotted the issue. Let's take the first occurrence in the first log:

Aug 08 23:39:58 T600 kernel: ## module_alloc(4718) = f1065000 [fe20ca00-fe20d2e3]
[...]
Aug 08 23:39:59 T600 kernel: BUG: Unable to handle kernel data access at 0xfe20d040

In kasan_init_region(), the loop starts with k_cur = 0xfe20ca00 to set the pte for the first shadow page at 0xfe20c000.
Then k_cur is increased by PAGE_SIZE so now k_cur = 0xfe20da00. As this is over 0xfe20d2e3, it doesn't set the pte for the second page at 0xfe20d000.

It should be fixed by changing the init value of k_cur in the for() loop of kasan_init_region() to:

for (k_cur = k_start & PAGE_MASK; )

Can you test it ?
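The off-by-one-page effect described above can be reproduced outside the kernel with a small C sketch. It only counts loop iterations and does not touch page tables. The loop condition (k_cur < k_end) and the 4 KiB page size are assumptions, since the comment elides the rest of the for() header, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u                /* assumption: 4 KiB pages */
#define PAGE_MASK (~(PAGE_SIZE - 1u))

/* Shadow pages visited when the loop starts at the unaligned k_start.
 * Note: like the loop it models, this would wrap near the top of the
 * 32-bit address space; the ranges used here stay well below that. */
static unsigned pages_visited_unaligned(uint32_t k_start, uint32_t k_end)
{
    unsigned n = 0;

    assert(k_start < k_end);
    for (uint32_t k_cur = k_start; k_cur < k_end; k_cur += PAGE_SIZE)
        n++;
    return n;
}

/* Shadow pages visited when the start is first rounded down to a page. */
static unsigned pages_visited_aligned(uint32_t k_start, uint32_t k_end)
{
    unsigned n = 0;

    assert(k_start < k_end);
    for (uint32_t k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE)
        n++;
    return n;
}
```

With the shadow range fe20ca00-fe20d2e3 from the log, the unaligned variant visits only the page at 0xfe20c000, while the aligned variant also visits the page at 0xfe20d000, which contains the faulting address 0xfe20d040.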
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

--- Comment #10 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 284297
  --> https://bugzilla.kernel.org/attachment.cgi?id=284297=edit
dmesg (kernel 5.3-rc3 + patch, without CONFIG_SMP, PowerMac G4 DP)

Here's the dmesg with the kernel built without CONFIG_SMP.
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

--- Comment #9 from Christophe Leroy (christophe.le...@c-s.fr) ---
The module loads seem to be nested. It might then be an SMP issue; kasan_init_region() is most likely not SMP safe.

Could you test without CONFIG_SMP or with only one CPU ?
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

--- Comment #8 from Christophe Leroy (christophe.le...@c-s.fr) ---
List of allocated areas with associated kasan shadow area in [ ], together with the addresses when shadow initialisation fails:

Aug 08 23:39:58 T600 kernel: ## module_alloc(c78c) = f147 [fe28e000-fe28f8f1]
Aug 08 23:39:58 T600 kernel: ## module_alloc(36f8) = f147e000 [fe28fc00-fe2902df]
Aug 08 23:39:58 T600 kernel: ## module_alloc(c78c) = f1483000 [fe290600-fe291ef1]
Aug 08 23:39:58 T600 kernel: ## module_alloc(c78c) = f1491000 [fe292200-fe293af1]
Aug 08 23:39:58 T600 kernel: ## module_alloc(36f8) = f1502000 [fe2a0400-fe2a0adf]
Aug 08 23:39:58 T600 kernel: ## module_alloc(1521) = f1013000 [fe202600-fe2028a4]
Aug 08 23:39:58 T600 kernel: ## module_alloc(13bc5) = f103d000 [fe207a00-fe20a178]
Aug 08 23:39:58 T600 kernel: ## module_alloc(1357) = f1027000 [fe204e00-fe20506a]
Aug 08 23:39:58 T600 kernel: ## module_alloc(36f8) = f102a000 [fe205400-fe205adf]
Aug 08 23:39:58 T600 kernel: ## module_alloc(4301) = f102f000 [fe205e00-fe206660]
Aug 08 23:39:58 T600 kernel: ## module_alloc(4718) = f1065000 [fe20ca00-fe20d2e3]
Aug 08 23:39:58 T600 kernel: ## module_alloc(19ac) = f1076000 [fe20ec00-fe20ef35]
Aug 08 23:39:58 T600 kernel: ## module_alloc(4718) = f129d000 [fe253a00-fe2542e3]
Aug 08 23:39:58 T600 kernel: ## module_alloc(16ca) = f102a000 [fe205400-fe2056d9]
Aug 08 23:39:58 T600 kernel: ## module_alloc(1f81) = f1079000 [fe20f200-fe20f5f0]
Aug 08 23:39:58 T600 kernel: ## module_alloc(1f81) = f1027000 [fe204e00-fe2051f0]
Aug 08 23:39:59 T600 kernel: BUG: Unable to handle kernel data access at 0xfe20d040
Aug 08 23:39:59 T600 kernel: ## module_alloc(185ef) = f12d [fe25a000-fe25d0bd]
Aug 08 23:39:59 T600 kernel: ## module_alloc(4035) = f106b000 [fe20d600-fe20de06]
Aug 08 23:39:59 T600 kernel: ## module_alloc(6196) = f12b3000 [fe256600-fe257232]
Aug 08 23:39:59 T600 kernel: ## module_alloc(1d27) = f1071000 [fe20e200-fe20e5a4]
Aug 08 23:39:59 T600 kernel: ## module_alloc(4035) = f102d000 [fe205a00-fe206206]
Aug 08 23:39:59 T600 kernel: ## module_alloc(a11b) = f13ad000 [fe275a00-fe276e23]
Aug 08 23:39:59 T600 kernel: ## module_alloc(4035) = f12b3000 [fe256600-fe256e06]
Aug 08 23:39:59 T600 kernel: ## module_alloc(4035) = f12ea000 [fe25d400-fe25dc06]
Aug 08 23:39:59 T600 kernel: ## module_alloc(1d27) = f1033000 [fe206600-fe2069a4]
Aug 08 23:39:59 T600 kernel: ## module_alloc(4035) = f1397000 [fe272e00-fe273606]
Aug 08 23:39:59 T600 kernel: ## module_alloc(307a) = f12f [fe25e000-fe25e60f]
Aug 08 23:39:59 T600 kernel: ## module_alloc(1d27) = f1062000 [fe20c400-fe20c7a4]
Aug 08 23:39:59 T600 kernel: ## module_alloc(1d27) = f12f7000 [fe25ee00-fe25f1a4]
Aug 08 23:39:59 T600 kernel: ## module_alloc(1d27) = f12fd000 [fe25fa00-fe25fda4]
Aug 08 23:39:59 T600 kernel: ## module_alloc(d102) = f1429000 [fe285200-fe286c20]
Aug 08 23:39:59 T600 kernel: ## module_alloc(2a37) = f1033000 [fe206600-fe206b46]
Aug 08 23:39:59 T600 kernel: ## module_alloc(4718) = f106b000 [fe20d600-fe20dee3]
Aug 08 23:39:59 T600 kernel: ## module_alloc(9a3f2) = f1db8000 [fe3b7000-fe3ca47e]
Aug 08 23:39:59 T600 kernel: ## module_alloc(18571) = f13cd000 [fe279a00-fe27caae]
Aug 08 23:39:59 T600 kernel: ## module_alloc(1f81) = f1071000 [fe20e200-fe20e5f0]
Aug 08 23:39:59 T600 kernel: ## module_alloc(1fdb9) = f1438000 [fe287000-fe28afb7]
Aug 08 23:39:59 T600 kernel: ## module_alloc(56a49) = f1e54000 [fe3ca800-fe3d5549]
Aug 08 23:39:59 T600 kernel: ## module_alloc(56a49) = f1eac000 [fe3d5800-fe3e0549]
Aug 08 23:39:59 T600 kernel: ## module_alloc(56a49) = f1f04000 [fe3e0800-fe3eb549]
Aug 08 23:39:59 T600 kernel: ## module_alloc(7c61) = f12ea000 [fe25d400-fe25e38c]
Aug 08 23:39:59 T600 kernel: ## module_alloc(e011) = f140c000 [fe281800-fe283402]
Aug 08 23:39:59 T600 kernel: ## module_alloc(56a49) = f1f5c000 [fe3eb800-fe3f6549]
Aug 08 23:39:59 T600 kernel: ## module_alloc(56a49) = f1fb4000 [fe3f6800-fe401549]
Aug 08 23:39:59 T600 kernel: ## module_alloc(e011) = f1459000 [fe28b200-fe28ce02]
Aug 08 23:39:59 T600 kernel: ## module_alloc(e011) = f147e000 [fe28fc00-fe291802]
Aug 08 23:39:59 T600 kernel: ## module_alloc(2561) = f1033000 [fe206600-fe206aac]
Aug 08 23:39:59 T600 kernel: ## module_alloc(6ae1) = f12b3000 [fe256600-fe25735c]
Aug 08 23:39:59 T600 kernel: ## module_alloc(e011) = f148e000 [fe291c00-fe293802]
Aug 08 23:39:59 T600 kernel: ## module_alloc(e011) = f200c000 [fe401800-fe403402]
Aug 08 23:40:00 T600 kernel: ## module_alloc(3355) = f1397000 [fe272e00-fe27346a]
Aug 08 23:40:00 T600 kernel: ## module_alloc(1c8f) = f12f7000 [fe25ee00-fe25f191]
Aug 08 23:40:00 T600 kernel: BUG: Unable to handle kernel data access at 0xfe2731a0
Aug 08 23:40:00 T600 kernel: ## module_alloc(1c078) = f13cd000 [fe279a00-fe27d20f]
Aug 08
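The bracketed shadow ranges in the log above follow from the usual KASAN mapping, where one shadow byte covers eight bytes of memory. The C sketch below reproduces that arithmetic. The scale shift of 3 is standard for generic KASAN; the offset 0xe0000000 is an assumption inferred from the log lines themselves (for example f1065000 maps to fe20ca00), not a value quoted from the kernel configuration. The function and type names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SHADOW_SCALE_SHIFT 3            /* one shadow byte per 8 bytes of memory */
#define SHADOW_OFFSET      0xe0000000u  /* assumption: inferred from the log */

/* Shadow address describing the given memory address. */
static uint32_t mem_to_shadow(uint32_t addr)
{
    return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

struct shadow_range {
    uint32_t start;   /* shadow of the first byte of the allocation */
    uint32_t end;     /* shadow of base + size, as printed in the log */
};

/* Shadow range printed for one module_alloc(size) = base log line. */
static struct shadow_range shadow_for_alloc(uint32_t base, uint32_t size)
{
    struct shadow_range r;

    assert(size > 0);
    r.start = mem_to_shadow(base);
    r.end = mem_to_shadow(base + size);
    return r;
}
```

Because module areas are page aligned but the shadow is eight times smaller, the shadow start usually falls in the middle of a shadow page (fe20ca00 rather than fe20c000), and a range can straddle a page boundary even when it is shorter than one page.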
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

Erhard F. (erhar...@mailbox.org) changed:

           What    |Removed                     |Added
 Attachment #284175|0                           |1
        is obsolete|                            |

--- Comment #7 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 284273
  --> https://bugzilla.kernel.org/attachment.cgi?id=284273=edit
dmesg (kernel 5.3-rc3 + patch, PowerMac G4 DP)
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

--- Comment #6 from Erhard F. (erhar...@mailbox.org) ---
(In reply to Christophe Leroy from comment #4)
> We need to identify if the allocation of KASAN shadow area at module
> allocation fails, or if kasan accesses outside of the allocated area.
>
> Could you please run again with the below trace:

The patch did not apply to the mainline kernel with 'patch -p1 < ...', so I inserted the code manually. Please find the new results attached.
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

Erhard F. (erhar...@mailbox.org) changed:

           What    |Removed                     |Added
 Attachment #284177|0                           |1
        is obsolete|                            |

--- Comment #5 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 284271
  --> https://bugzilla.kernel.org/attachment.cgi?id=284271=edit
kernel .config (5.3-rc3, PowerMac G4 DP)
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

--- Comment #4 from Christophe Leroy (christophe.le...@c-s.fr) ---
We need to identify if the allocation of KASAN shadow area at module allocation fails, or if kasan accesses outside of the allocated area.

Could you please run again with the below trace:

diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
index 74f4555a62ba..2bca2bf691a9 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -142,6 +142,9 @@ void *module_alloc(unsigned long size)
 	if (!base)
 		return NULL;

+	pr_err("## module_alloc(%lx) = %px [%px-%px]\n", size, base,
+	       kasan_mem_to_shadow(base), kasan_mem_to_shadow(base + size));
+
 	if (!kasan_init_region(base, size))
 		return base;
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

--- Comment #3 from Erhard F. (erhar...@mailbox.org) ---
Yes, at least one usb driver is also affected. Also radeon.ko sometimes loads ok, sometimes it stalls:

# modprobe -v radeon
insmod /lib/modules/5.3.0-rc2+/kernel/drivers/i2c/algos/i2c-algo-bit.ko
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

Christophe Leroy (christophe.le...@c-s.fr) changed:

           What    |Removed                     |Added
                 CC|                            |christophe.le...@c-s.fr

--- Comment #2 from Christophe Leroy (christophe.le...@c-s.fr) ---
As far as I can tell from the attached dmesg, not only this module but other ones too fail to load.

A powerpc-specific module_alloc() function is defined in arch/powerpc/mm/kasan/kasan_init_32.c to create the shadow area associated with the module being loaded.

This function seems to be OK for several drivers but not for all.
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479

Erhard F. (erhar...@mailbox.org) changed:

           What    |Removed                     |Added
                 CC|                            |linuxppc-dev@lists.ozlabs.org