Bug#1057005: linux-image-6.1.0-13-amd64: Kernel Oops in nfs4_do_reclaim, which exits with irqs disabled
Control: tags -1 + moreinfo

Hi James,

On Mon, Nov 27, 2023 at 08:00:52PM +, James Chapman wrote:
> Package: src:linux
> Version: 6.1.55-1
> Severity: important
> X-Debbugs-Cc: jamescope...@gmail.com
>
> Dear Maintainer,
>
> Hi, I have experienced an issue where my client lost access to an
> NFS server for a period of around 15 minutes, then immediately
> following server recovery, I experienced a kernel oops on the client
> (details below). What made this more severe was the fact that
> nfs4_do_reclaim exited with irqs disabled, which is possibly what
> resulted in a number of "rcu_preempt self-detected stall on CPU"
> errors and a very unstable system, leaving me no choice but to hit
> the reset button.

Are you able to reproduce the issue? Otherwise it is quite hard to
make any assessment for this bug.

Regards,
Salvatore
Bug#1057005: linux-image-6.1.0-13-amd64: Kernel Oops in nfs4_do_reclaim, which exits with irqs disabled
Package: src:linux
Version: 6.1.55-1
Severity: important
X-Debbugs-Cc: jamescope...@gmail.com

Dear Maintainer,

Hi, I have experienced an issue where my client lost access to an
NFS server for a period of around 15 minutes, then immediately
following server recovery, I experienced a kernel oops on the client
(details below). What made this more severe was the fact that
nfs4_do_reclaim exited with irqs disabled, which is possibly what
resulted in a number of "rcu_preempt self-detected stall on CPU"
errors and a very unstable system, leaving me no choice but to hit
the reset button.

BUG: unable to handle page fault for address: fff8
#PF: supervisor read access in kernel mode
#PF: error_code(0x) - not-present page
PGD 37de15067 P4D 37de15067 PUD 37de17067 PMD 0
Oops: [#1] PREEMPT SMP NOPTI
CPU: 9 PID: 4154600 Comm: 192.168.253.7-m Not tainted 6.1.0-13-amd64 #1 Debian 6.1.55-1
Hardware name: System manufacturer System Product Name/PRIME X570-PRO, BIOS 3405 02/01/2021
RIP: 0010:complete+0x38/0x80
Code: 89 fb 4c 89 e7 e8 c8 d8 93 00 48 89 c5 8b 03 83 f8 ff 74 05 83 c0 01 89 03 48 8b 53 10 48 8d 43 10 48 39 c2 74 2e 48 8b 5b 10 <48> 8b 7b f8 e8 df cc fd ff 48 89 df e8 17 bd 43 00 84 c0 74 0e 48
RSP: 0018:b78d1e55bdc0 EFLAGS: 00010013
RAX: 9df24a45f2f8 RBX: RCX:
RDX: RSI: RDI: 0001
RBP: 0247 R08: 9defc1f84040 R09: 9dee419a5910
R10: 0001 R11: 0001 R12: 9df24a45f2f0
R13: c12613a0 R14: 9dee5532d820 R15: 9defc1f84000
FS: () GS:9df54ec4() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: fff8 CR3: 0007eaf16000 CR4: 00350ee0
Call Trace:
 ? __die_body.cold+0x1a/0x1f
 ? page_fault_oops+0xd2/0x2b0
 ? exc_page_fault+0xca/0x170
 ? asm_exc_page_fault+0x22/0x30
 ? complete+0x38/0x80
 nfs4_do_reclaim+0x5b6/0x810 [nfsv4]
 nfs4_run_state_manager+0x882/0xab0 [nfsv4]
 ? __schedule+0x359/0xa20
 ? preempt_count_add+0x6a/0xa0
 ? nfs4_do_reclaim+0x810/0x810 [nfsv4]
 kthread+0xe9/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x22/0x30
Modules linked in: sd_mod uas usb_storage tcp_diag udp_diag inet_diag veth vhost_net vhost vhost_iotlb tun macvtap macvlan tap xt_CHECKSUM ipt_REJECT nf_reject_ipv4 cpufreq_powersave cpufreq_userspace cpufreq_ondemand cpufreq_conservative rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs nft_masq sunrpc bridge nft_chain_nat xt_MASQUERADE xt_nat nf_nat xt_multiport xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_tcpudp nft_compat nf_tables libcrc32c nfnetlink binfmt_misc intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd nouveau kvm irqbypass ghash_clmulni_intel sha512_ssse3 video sha512_generic drm_display_helper asus_ec_sensors cec evdev aesni_intel rc_core crypto_simd pl2303 drm_ttm_helper cryptd ttm usbserial rapl drm_kms_helper sp5100_tco ccp pcspkr wmi_bmof mxm_wmi watchdog k10temp button acpi_cpufreq sg nct6775 nct6775_core hwmon_vid 8021q garp stp mrp llc drm fuse loop dm_mod efi_pstore configfs efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic mlx4_ib ib_uverbs ib_core mlx4_en hid_generic usbhid hid sr_mod cdrom ahci libahci xhci_pci xhci_hcd nvme libata mlx4_core crc32_pclmul nvme_core crc32c_intel usbcore igb scsi_mod t10_pi i2c_piix4 crc64_rocksoft crc64 i2c_algo_bit crc_t10dif scsi_common usb_common crct10dif_generic dca crct10dif_pclmul crct10dif_common wmi
CR2: fff8
---[ end trace ]---
RIP: 0010:complete+0x38/0x80
Code: 89 fb 4c 89 e7 e8 c8 d8 93 00 48 89 c5 8b 03 83 f8 ff 74 05 83 c0 01 89 03 48 8b 53 10 48 8d 43 10 48 39 c2 74 2e 48 8b 5b 10 <48> 8b 7b f8 e8 df cc fd ff 48 89 df e8 17 bd 43 00 84 c0 74 0e 48
RSP: 0018:b78d1e55bdc0 EFLAGS: 00010013
RAX: 9df24a45f2f8 RBX: RCX:
RDX: RSI: RDI: 0001
RBP: 0247 R08: 9defc1f84040 R09: 9dee419a5910
R10: 0001 R11: 0001 R12: 9df24a45f2f0
R13: c12613a0 R14: 9dee5532d820 R15: 9defc1f84000
FS: () GS:9df54ec4() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: fff8 CR3: 0007eaf16000 CR4: 00350ee0
note: 192.168.253.7-m[4154600] exited with irqs disabled
note: 192.168.253.7-m[4154600] exited with preempt_count 2

-- Package-specific info:
** Version:
Linux version 6.1.0-13-amd64 (debian-ker...@lists.debian.org) (gcc-12 (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC Debian 6.1.55-1 (2023-09-29)

** Command line:
BOOT_IMAGE=/vmlinuz-6.1.0-13-amd64 root=UUID=dc88b740-868e-4b8f-9e70-2a5d47104b70 ro ipv6.disable=1 nomodeset clocksource=hpet retbleed=off