On Wed, 5 Mar 2025 at 14:24, Salvatore Bonaccorso <car...@debian.org> wrote:
>
> Control: forcemerge 1098698 -1
>
> Hi Norbert,
>
> On Wed, Mar 05, 2025 at 12:15:29PM +0100, Norbert Lange wrote:
> > Package: src:linux
> > Version: 6.12.12-1
> > Severity: important
> > X-Debbugs-Cc: nolang...@gmail.com
> >
> > Dear Maintainer,
> >
> > I experience an immediate kernel crash when copying large files/directories
> > from a mounted Samba share.
> > I can consistently reproduce the crash in QEMU (from which I grabbed the log).
> >
> > My `/etc/fstab` entry:
> >
> > ```
> > //vienas01.andritz.com/HIPASE /run/media/HIPASE_Q smb3 credentials=/tmp/creds.txt,uid=1000,user,vers=3,nofail,noatime,noauto 0 0
> > ```
> >
> > The sequence leading to the crash is a file copy to a local HDD:
> >
> > ``` bash
> > mount /run/media/HIPASE_Q
> > cp -r /run/media/HIPASE_Q/DIR ~/Download
> > ```
> >
> > The output from mount is:
> >
> > ```
> > //vienas01.andritz.com/HIPASE on /run/media/HIPASE_Q type smb3 (rw,nosuid,nodev,relatime,vers=3.1.1,cache=strict,username=XXX,domain=YYY,uid=1000,forceuid,gid=1000,forcegid,addr=172.24.180.161,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,reparse=nfs,rsize=65536,wsize=65536,bsize=1048576,retrans=1,echo_interval=60,actimeo=1,closetimeo=1,user=XXX)
> > ```
> >
> > The crash log is as follows (again, under QEMU with the same kernel):
> >
> > ```
> > Debian GNU/Linux trixie/sid debian-replace ttyS0
> >
> > debian-replace login:
> > [ 222.366764] BUG: kernel NULL pointer dereference, address: 0000000000000068
> > [ 222.367079] #PF: supervisor read access in kernel mode
> > [ 222.367268] #PF: error_code(0x0000) - not-present page
> > [ 222.367465] PGD 0 P4D 0
> > [ 222.367565] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [ 222.367757] CPU: 1 UID: 0 PID: 45 Comm: kworker/1:1 Not tainted 6.12.12-amd64 #1 Debian 6.12.12-1
> > [ 222.368074] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> > [ 222.368456] Workqueue: cifsiod smb2_readv_worker [cifs]
> > [ 222.368715] RIP: 0010:netfs_consume_read_data.isra.0 (fs/netfs/read_collect.c:262) netfs
> > [ 222.368985] Code: 74 24 10 4c 89 fb 49 8b 47 68 48 85 d2 0f 85 ce 01 00 00 48 8b 4c 24 30 49 8b 7f 30 48 83 c1 70 48 39 cf 74 17 4c 8b 5c 24 40 <49> 8b 73 68 49 03 73 60 49 39 77 60 0f 84 b2 04 00 00 48 29 d0 4c
> > All code
> > ========
> > 0: 74 24 je 0x26
> > 2: 10 4c 89 fb adc %cl,-0x5(%rcx,%rcx,4)
> > 6: 49 8b 47 68 mov 0x68(%r15),%rax
> > a: 48 85 d2 test %rdx,%rdx
> > d: 0f 85 ce 01 00 00 jne 0x1e1
> > 13: 48 8b 4c 24 30 mov 0x30(%rsp),%rcx
> > 18: 49 8b 7f 30 mov 0x30(%r15),%rdi
> > 1c: 48 83 c1 70 add $0x70,%rcx
> > 20: 48 39 cf cmp %rcx,%rdi
> > 23: 74 17 je 0x3c
> > 25: 4c 8b 5c 24 40 mov 0x40(%rsp),%r11
> > 2a:* 49 8b 73 68 mov 0x68(%r11),%rsi <-- trapping instruction
> > 2e: 49 03 73 60 add 0x60(%r11),%rsi
> > 32: 49 39 77 60 cmp %rsi,0x60(%r15)
> > 36: 0f 84 b2 04 00 00 je 0x4ee
> > 3c: 48 29 d0 sub %rdx,%rax
> > 3f: 4c rex.WR
> >
> > Code starting with the faulting instruction
> > ===========================================
> > 0: 49 8b 73 68 mov 0x68(%r11),%rsi
> > 4: 49 03 73 60 add 0x60(%r11),%rsi
> > 8: 49 39 77 60 cmp %rsi,0x60(%r15)
> > c: 0f 84 b2 04 00 00 je 0x4c4
> > 12: 48 29 d0 sub %rdx,%rax
> > 15: 4c rex.WR
> > [ 222.369710] RSP: 0018:ffffbdc900177dd0 EFLAGS: 00010283
> > [ 222.369902] RAX: 0000000000010000 RBX: ffff96530204b280 RCX: ffff9653020d7770
> > [ 222.370160] RDX: 0000000000000000 RSI: 0000000000440000 RDI: ffff96530204b168
> > [ 222.370434] RBP: 0000000000000000 R08: 0000000000010000 R09: 0000000000000000
> > [ 222.370711] R10: 0000000000000008 R11: 0000000000000000 R12: ffff9653020d78e8
> > [ 222.371001] R13: 0000000000040000 R14: ffff9653020d78e8 R15: ffff96530204b280
> > [ 222.371268] FS: 0000000000000000(0000) GS:ffff96537bd00000(0000) knlGS:0000000000000000
> > [ 222.371561] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 222.371764] CR2: 0000000000000068 CR3: 0000000106bb4000 CR4: 0000000000752ef0
> > [ 222.372036] PKRU: 55555554
> > [ 222.372147] Call Trace:
> > [ 222.372248] <TASK>
> > [ 222.372327] ? __die_body.cold (arch/x86/kernel/dumpstack.c:478 (discriminator 1) arch/x86/kernel/dumpstack.c:465 (discriminator 1) arch/x86/kernel/dumpstack.c:420 (discriminator 1))
> > [ 222.372492] ? page_fault_oops (arch/x86/mm/fault.c:711 (discriminator 1))
> > [ 222.372658] ? exc_page_fault (arch/x86/include/asm/paravirt.h:693 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
> > [ 222.372808] ? asm_exc_page_fault (arch/x86/include/asm/idtentry.h:623)
> > [ 222.372961] ? netfs_consume_read_data.isra.0 (fs/netfs/read_collect.c:262) netfs
> > [ 222.373176] netfs_read_subreq_terminated (arch/x86/include/asm/bitops.h:94 include/asm-generic/bitops/instrumented-non-atomic.h:45 fs/netfs/read_collect.c:502) netfs
> > [ 222.373380] process_one_work (kernel/workqueue.c:3229)
> > [ 222.373525] worker_thread (kernel/workqueue.c:3304 (discriminator 2) kernel/workqueue.c:3391 (discriminator 2))
> > [ 222.373657] ? __pfx_worker_thread (kernel/workqueue.c:3337)
> > [ 222.373817] kthread (kernel/kthread.c:389)
> > [ 222.373929] ? __pfx_kthread (kernel/kthread.c:342)
> > [ 222.374077] ret_from_fork (arch/x86/kernel/process.c:147)
> > [ 222.374203] ? __pfx_kthread (kernel/kthread.c:342)
> > [ 222.374335] ret_from_fork_asm (arch/x86/entry/entry_64.S:257)
> > [ 222.374472] </TASK>
> > [ 222.374551] Modules linked in: cmac nls_utf8 cifs cifs_arc4 nls_ucs2_utils cifs_md4 dns_resolver netfs uinput snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore rfkill nls_ascii nls_cp437 vfat fat intel_rapl_msr intel_rapl_common binfmt_misc intel_uncore_frequency_common intel_pmc_core intel_vsec pmt_telemetry pmt_class kvm_intel kvm crct10dif_pclmul ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel gf128mul crypto_simd iTCO_wdt intel_pmc_bxt cryptd iTCO_vendor_support watchdog qxl rapl joydev serio_raw evdev pcspkr button vmwgfx drm_ttm_helper ttm drm_kms_helper drm configfs efi_pstore nfnetlink qemu_fw_cfg virtio_console virtio_rng ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic ahci libahci xhci_pci libata xhci_hcd nvme crc32_pclmul crc32c_intel i2c_i801 scsi_mod psmouse virtio_net usbcore net_failover failover i2c_smbus nvme_core scsi_common lpc_ich nvme_auth usb_common
> > [ 222.377319] CR2: 0000000000000068
> > [ 222.377437] ---[ end trace 0000000000000000 ]---
> > [ 222.377596] RIP: 0010:netfs_consume_read_data.isra.0 (fs/netfs/read_collect.c:262) netfs
> > [ 222.377839] Code: 74 24 10 4c 89 fb 49 8b 47 68 48 85 d2 0f 85 ce 01 00 00 48 8b 4c 24 30 49 8b 7f 30 48 83 c1 70 48 39 cf 74 17 4c 8b 5c 24 40 <49> 8b 73 68 49 03 73 60 49 39 77 60 0f 84 b2 04 00 00 48 29 d0 4c
> > All code
> > ========
> > 0: 74 24 je 0x26
> > 2: 10 4c 89 fb adc %cl,-0x5(%rcx,%rcx,4)
> > 6: 49 8b 47 68 mov 0x68(%r15),%rax
> > a: 48 85 d2 test %rdx,%rdx
> > d: 0f 85 ce 01 00 00 jne 0x1e1
> > 13: 48 8b 4c 24 30 mov 0x30(%rsp),%rcx
> > 18: 49 8b 7f 30 mov 0x30(%r15),%rdi
> > 1c: 48 83 c1 70 add $0x70,%rcx
> > 20: 48 39 cf cmp %rcx,%rdi
> > 23: 74 17 je 0x3c
> > 25: 4c 8b 5c 24 40 mov 0x40(%rsp),%r11
> > 2a:* 49 8b 73 68 mov 0x68(%r11),%rsi <-- trapping instruction
> > 2e: 49 03 73 60 add 0x60(%r11),%rsi
> > 32: 49 39 77 60 cmp %rsi,0x60(%r15)
> > 36: 0f 84 b2 04 00 00 je 0x4ee
> > 3c: 48 29 d0 sub %rdx,%rax
> > 3f: 4c rex.WR
> >
> > Code starting with the faulting instruction
> > ===========================================
> > 0: 49 8b 73 68 mov 0x68(%r11),%rsi
> > 4: 49 03 73 60 add 0x60(%r11),%rsi
> > 8: 49 39 77 60 cmp %rsi,0x60(%r15)
> > c: 0f 84 b2 04 00 00 je 0x4c4
> > 12: 48 29 d0 sub %rdx,%rax
> > 15: 4c rex.WR
> > [ 222.378484] RSP: 0018:ffffbdc900177dd0 EFLAGS: 00010283
> > [ 222.378666] RAX: 0000000000010000 RBX: ffff96530204b280 RCX: ffff9653020d7770
> > [ 222.378910] RDX: 0000000000000000 RSI: 0000000000440000 RDI: ffff96530204b168
> > [ 222.379153] RBP: 0000000000000000 R08: 0000000000010000 R09: 0000000000000000
> > [ 222.379396] R10: 0000000000000008 R11: 0000000000000000 R12: ffff9653020d78e8
> > [ 222.379638] R13: 0000000000040000 R14: ffff9653020d78e8 R15: ffff96530204b280
> > [ 222.379880] FS: 0000000000000000(0000) GS:ffff96537bd00000(0000) knlGS:0000000000000000
> > [ 222.380154] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 222.380352] CR2: 0000000000000068 CR3: 0000000106bb4000 CR4: 0000000000752ef0
> > [ 222.380596] PKRU: 55555554
> > [ 222.380692] Kernel panic - not syncing: Fatal exception in interrupt
> > [ 222.381450] Kernel Offset: 0xbc00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > [ 222.381829] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
> > ```
>
> Thanks for your report. I believe this is all related to the same root
> causes as #1098698, so I am merging both reports.
>
> If you have the possibility, please have a look at
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1098698#34
> and report back if that fixes your issue.

That specific patch seems to address the 'kernel BUG at
fs/netfs/read_collect.c:315!' issue, not the NULL pointer dereference
reported here.

> Max Kellermann has pointed out the open issues here:
> https://lore.kernel.org/netfs/CAKPOu+_WAM3RQJnHsKfEh5sG5tBuCPt1EWtoUFVC2ma=orj...@mail.gmail.com/

It's hard to follow which patches have been merged into which upstream
branch/version, and which have been added in Debian.
I'm not sure which patches I should apply.

I tested the readily available versions in Debian:

linux-image-6.12.12-amd64  6.12.12-1      -> this bug report
linux-image-6.12.17-amd64  6.12.17-1      -> identical behavior
linux-image-6.13-amd64     6.13.5-1~exp1  -> 'kernel BUG at fs/netfs/read_collect.c:316!'

6.12 is an LTS kernel; isn't there a repository where all proposed
backports are available?
The situation is rather bad right now, since no workaround is available.

Regards, Norbert.
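
P.S. To make the distinction between the two failures concrete: in the log quoted above, the trapping instruction is `mov 0x68(%r11),%rsi`, R11 is zero and CR2 is 0x68, so this is a read through a NULL pointer at offset 0x68 rather than a BUG() assertion. Below is a minimal Python sketch of that cross-check. The register lines are copied from the oops; the helper name `parse_registers` and the `OOPS_EXCERPT` constant are only illustrative and not part of any existing tool.

```python
import re

# Register lines copied from the quoted oops. The excerpt constant and the
# helper below are illustrative only, not part of any kernel tooling.
OOPS_EXCERPT = """
[ 222.366764] BUG: kernel NULL pointer dereference, address: 0000000000000068
[ 222.370434] RBP: 0000000000000000 R08: 0000000000010000 R09: 0000000000000000
[ 222.370711] R10: 0000000000000008 R11: 0000000000000000 R12: ffff9653020d78e8
[ 222.371764] CR2: 0000000000000068 CR3: 0000000106bb4000 CR4: 0000000000752ef0
"""


def parse_registers(text):
    """Return {name: value} for every 'NAME: <16 hex digits>' pair in text."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,3}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


regs = parse_registers(OOPS_EXCERPT)

# Trapping instruction from the decoded code: mov 0x68(%r11),%rsi
fault_address = regs["R11"] + 0x68

# CR2 holds the address that caused the page fault.
print(hex(fault_address), fault_address == regs["CR2"])
```

With the values from the log this prints `0x68 True`, i.e. the faulting address is exactly the 0x68 displacement applied to a NULL base pointer.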