Control: tags -1 + moreinfo

On Thu, Dec 11, 2025 at 01:48:35PM +0900, Haruka Ma wrote:
> 
> Package: src:linux
> Version: 6.17.11-1
> Severity: normal
> X-Debbugs-Cc: none
> 
> 
> I'm using ZFS on my storage server and mounting 2 zvols on a remote machine
> via nvmet_rdma. I can't pinpoint the version, but after I once rebooted the
> server and started running on a new kernel, after running the server for a
> while, there will be a null pointer dereference apparently caused by
> nvmet_rdma (see kernel log below). It's very hard to recover from this
> situation as the transport is supposed to be reliable and any packet loss
> would cause kernel lock-ups on the initiator side.
> 
> It seems to me that this is likely a kernel issue. I haven't tried if
> disabling cgroup io / blkio would avoid this. As the issue only appears
> after hours or days of using the block device, it's currently hard for me to
> bisect kernel versions, as I need to at least reboot one machine to recover
> and it's kinda disruptive.
> 
> The taints are because of loading ZFS, unfortunately due to how the storage
> system is designed, it's impossible for me to reproduce this without kernel
> taints.
> 
> reportbug output below (redacted network information):
> 
> -- Package-specific info:
> ** Version:
> Linux version 6.17.11+deb14-amd64 ([email protected])
> (x86_64-linux-gnu-gcc-15 (Debian 15.2.0-10) 15.2.0, GNU ld (GNU Binutils for
> Debian) 2.45.50.20251201) #1 SMP PREEMPT_DYNAMIC Debian 6.17.11-1
> (2025-12-07)
> 
> ** Command line:
> BOOT_IMAGE=/boot@/vmlinuz-6.17.11+deb14-amd64 root=ZFS=root/root ro
> root=ZFS=root/root mitigations=off
> 
> ** Tainted: PDWOE (12929)
>  * proprietary module was loaded
>  * kernel died recently, i.e. there was an OOPS or BUG
>  * kernel issued warning
>  * externally-built ("out-of-tree") module was loaded
>  * unsigned module was loaded
> 
> ** Kernel log:
> [  558.014398] TARGET_CORE[iSCSI]: Expected Transfer Length: 4096 does not
> match SCSI CDB Length: 255 for SAM Opcode: 0x12
> [  558.018382] TARGET_CORE[iSCSI]: Expected Transfer Length: 4096 does not
> match SCSI CDB Length: 255 for SAM Opcode: 0x12
> [  558.022377] TARGET_CORE[iSCSI]: Expected Transfer Length: 4096 does not
> match SCSI CDB Length: 255 for SAM Opcode: 0x12
> [  558.030373] TARGET_CORE[iSCSI]: Expected Transfer Length: 4096 does not
> match SCSI CDB Length: 255 for SAM Opcode: 0x12
> [ 8592.007814] perf: interrupt took too long (2523 > 2500), lowering
> kernel.perf_event_max_sample_rate to 79250
> [ 9319.094555] usb 3-12.1: USB disconnect, device number 4
> [10455.213639] perf: interrupt took too long (3185 > 3153), lowering
> kernel.perf_event_max_sample_rate to 62750
> [11960.618904] perf: interrupt took too long (3998 > 3981), lowering
> kernel.perf_event_max_sample_rate to 50000
> [13141.438762] perf: interrupt took too long (4998 > 4997), lowering
> kernel.perf_event_max_sample_rate to 40000
> [21012.774195] perf: interrupt took too long (6261 > 6247), lowering
> kernel.perf_event_max_sample_rate to 31750
> [80158.748360] usb 3-6: USB disconnect, device number 2
> [87562.612495] perf: interrupt took too long (7831 > 7826), lowering
> kernel.perf_event_max_sample_rate to 25500
> [166023.147960] BUG: kernel NULL pointer dereference, address:
> 0000000000000028
> [166023.149093] #PF: supervisor read access in kernel mode
> [166023.150016] #PF: error_code(0x0000) - not-present page
> [166023.150923] PGD 0 P4D 0 [166023.151829] Oops: Oops: 0000 [#1] SMP NOPTI
> [166023.152736] CPU: 8 UID: 0 PID: 586 Comm: kworker/8:1H Tainted: P
> OE       6.17.11+deb14-amd64 #1 PREEMPT(lazy)  Debian 6.17.11-1
> [166023.153661] Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE,
> [E]=UNSIGNED_MODULE
> [166023.154629] Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.11.0
> 12/23/2019
> [166023.155561] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
> [166023.156581] RIP: 0010:blk_cgroup_bio_start+0x10/0x230
> [166023.157531] Code: 00 00 00 00 45 31 c0 eb da 90 90 90 90 90 90 90 90 90
> 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 8b 57 10 48 8b 47 48 <4c>
> 8b 40 28 89 d0 83 e0 01 80 fa 03 ba 02 00 00 00 48 0f 44 c2 0f
> [166023.159494] RSP: 0018:ffffcd998ef87c68 EFLAGS: 00010282
> [166023.160495] RAX: 0000000000000000 RBX: ffff8c00ae800168 RCX:
> 0000000000000000
> [166023.161536] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
> ffff8c00ae800168
> [166023.162554] RBP: ffffcd998ef87cb0 R08: 0000000000001000 R09:
> 0000000000000028
> [166023.163578] R10: 0000001400000000 R11: 0000000000000000 R12:
> 0000000000000000
> [166023.164594] R13: 0000000000000000 R14: 0000000000000000 R15:
> 00000002a54c93a8
> [166023.165614] FS:  0000000000000000(0000) GS:ffff8c1fb8d08000(0000)
> knlGS:0000000000000000
> [166023.166633] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [166023.167656] CR2: 0000000000000028 CR3: 0000002cc1c2c006 CR4:
> 00000000003726f0
> [166023.168684] Call Trace:
> [166023.169691]  <TASK>
> [166023.170680]  submit_bio_noacct_nocheck+0x30/0x350
> [166023.171685]  ? bio_associate_blkg+0x3d/0x80
> [166023.172672]  nvmet_bdev_execute_rw+0x29a/0x3d0 [nvmet]
> [166023.173676]  nvmet_rdma_execute_command+0x52/0x120 [nvmet_rdma]
> [166023.174649]  nvmet_rdma_handle_command+0xf7/0x2c0 [nvmet_rdma]
> [166023.175608]  __ib_process_cq+0x7f/0x180 [ib_core]
> [166023.176603]  ib_cq_poll_work+0x2a/0x80 [ib_core]
> [166023.177588]  process_one_work+0x18f/0x350
> [166023.178519]  worker_thread+0x25a/0x3a0
> [166023.179431]  ? __pfx_worker_thread+0x10/0x10
> [166023.180339]  kthread+0xf9/0x240
> [166023.181231]  ? __pfx_kthread+0x10/0x10
> [166023.182109]  ? __pfx_kthread+0x10/0x10
> [166023.182977]  ret_from_fork+0x194/0x1c0
> [166023.183837]  ? __pfx_kthread+0x10/0x10
> [166023.184676]  ret_from_fork_asm+0x1a/0x30
> [166023.185518]  </TASK>
> [166023.186333] Modules linked in: nvmet_rdma rdma_cm iw_cm ib_umad
> nvmet_tcp nvmet nls_utf8 nft_fib_ipv4 nft_fib nft_ct wireguard
> libchacha20poly1305 curve25519_x86_64 libcurve25519_generic chacha_x86_64
> libchacha libpoly1305 poly1305_x86_64 ip6_udp_tunnel udp_tunnel tcp_diag
> inet_diag target_core_user uio target_core_pscsi target_core_file
> target_core_iblock iscsi_target_mod target_core_mod ip6_tunnel tunnel6
> ib_ipoib ib_cm bridge 8021q garp stp llc mrp ext4 sunrpc crc16 mbcache jbd2
> crc32c_cryptoapi binfmt_misc nls_ascii nls_cp437 vfat fat ipmi_ssif
> intel_rapl_msr intel_rapl_common intel_uncore_frequency
> intel_uncore_frequency_common sb_edac x86_pkg_temp_thermal intel_powerclamp
> platform_profile kvm_intel kvm dell_smbios dell_wmi_descriptor irqbypass
> battery ghash_clmulni_intel rfkill aesni_intel video rapl mgag200
> intel_cstate drm_client_lib dcdbas drm_shmem_helper intel_uncore pcspkr
> drm_kms_helper mxm_wmi acpi_power_meter ses acpi_ipmi ipmi_si enclosure
> evdev joydev scsi_transport_sas ipmi_devintf mei_me
> [166023.186460]  ipmi_msghandler mei sg button nft_reject_inet
> nf_reject_ipv4 nf_reject_ipv6 nft_reject tcp_bbr nft_numgen nft_masq nft_nat
> nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables
> coretemp ksmbd gcm cifs_arc4 nls_ucs2_utils efi_pstore drm configfs
> nfnetlink efivarfs autofs4 zfs(POE) spl(OE) mlx5_ib hid_generic usbhid
> ib_uverbs hid sr_mod cdrom qla2xxx iTCO_wdt ib_core ahci sd_mod
> intel_pmc_bxt xhci_pci nvme_fc mlx5_core nvme nvme_fabrics ehci_pci libahci
> xhci_hcd iTCO_vendor_support bnx2x ehci_hcd libata megaraid_sas watchdog
> nvme_core scsi_transport_fc igb usbcore scsi_mod ioatdma mdio nvme_keyring
> mlxfw i2c_algo_bit pci_hyperv_intf nvme_auth wmi lpc_ich dca usb_common
> scsi_common
> [166023.195602] CR2: 0000000000000028
> [166023.196573] ---[ end trace 0000000000000000 ]---
> [166023.262742] RIP: 0010:blk_cgroup_bio_start+0x10/0x230
> [166023.263791] Code: 00 00 00 00 45 31 c0 eb da 90 90 90 90 90 90 90 90 90
> 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 8b 57 10 48 8b 47 48 <4c>
> 8b 40 28 89 d0 83 e0 01 80 fa 03 ba 02 00 00 00 48 0f 44 c2 0f
> [166023.265421] RSP: 0018:ffffcd998ef87c68 EFLAGS: 00010282
> [166023.266214] RAX: 0000000000000000 RBX: ffff8c00ae800168 RCX:
> 0000000000000000
> [166023.266996] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
> ffff8c00ae800168
> [166023.267779] RBP: ffffcd998ef87cb0 R08: 0000000000001000 R09:
> 0000000000000028
> [166023.268568] R10: 0000001400000000 R11: 0000000000000000 R12:
> 0000000000000000
> [166023.269360] R13: 0000000000000000 R14: 0000000000000000 R15:
> 00000002a54c93a8
> [166023.270149] FS:  0000000000000000(0000) GS:ffff8c1fb8d08000(0000)
> knlGS:0000000000000000
> [166023.270937] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [166023.271732] CR2: 0000000000000028 CR3: 0000002cc1c2c006 CR4:
> 00000000003726f0
> [166023.272550] note: kworker/8:1H[586] exited with irqs disabled
> [166023.282910] ------------[ cut here ]------------
> [166023.283487] WARNING: CPU: 8 PID: 586 at kernel/exit.c:898
> do_exit+0x7e4/0xa60
> [166023.284042] Modules linked in: nvmet_rdma rdma_cm iw_cm ib_umad
> nvmet_tcp nvmet nls_utf8 nft_fib_ipv4 nft_fib nft_ct wireguard
> libchacha20poly1305 curve25519_x86_64 libcurve25519_generic chacha_x86_64
> libchacha libpoly1305 poly1305_x86_64 ip6_udp_tunnel udp_tunnel tcp_diag
> inet_diag target_core_user uio target_core_pscsi target_core_file
> target_core_iblock iscsi_target_mod target_core_mod ip6_tunnel tunnel6
> ib_ipoib ib_cm bridge 8021q garp stp llc mrp ext4 sunrpc crc16 mbcache jbd2
> crc32c_cryptoapi binfmt_misc nls_ascii nls_cp437 vfat fat ipmi_ssif
> intel_rapl_msr intel_rapl_common intel_uncore_frequency
> intel_uncore_frequency_common sb_edac x86_pkg_temp_thermal intel_powerclamp
> platform_profile kvm_intel kvm dell_smbios dell_wmi_descriptor irqbypass
> battery ghash_clmulni_intel rfkill aesni_intel video rapl mgag200
> intel_cstate drm_client_lib dcdbas drm_shmem_helper intel_uncore pcspkr
> drm_kms_helper mxm_wmi acpi_power_meter ses acpi_ipmi ipmi_si enclosure
> evdev joydev scsi_transport_sas ipmi_devintf mei_me
> [166023.284121]  ipmi_msghandler mei sg button nft_reject_inet
> nf_reject_ipv4 nf_reject_ipv6 nft_reject tcp_bbr nft_numgen nft_masq nft_nat
> nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables
> coretemp ksmbd gcm cifs_arc4 nls_ucs2_utils efi_pstore drm configfs
> nfnetlink efivarfs autofs4 zfs(POE) spl(OE) mlx5_ib hid_generic usbhid
> ib_uverbs hid sr_mod cdrom qla2xxx iTCO_wdt ib_core ahci sd_mod
> intel_pmc_bxt xhci_pci nvme_fc mlx5_core nvme nvme_fabrics ehci_pci libahci
> xhci_hcd iTCO_vendor_support bnx2x ehci_hcd libata megaraid_sas watchdog
> nvme_core scsi_transport_fc igb usbcore scsi_mod ioatdma mdio nvme_keyring
> mlxfw i2c_algo_bit pci_hyperv_intf nvme_auth wmi lpc_ich dca usb_common
> scsi_common
> [166023.290179] CPU: 8 UID: 0 PID: 586 Comm: kworker/8:1H Tainted: P  D
> OE       6.17.11+deb14-amd64 #1 PREEMPT(lazy)  Debian 6.17.11-1
> [166023.290821] Tainted: [P]=PROPRIETARY_MODULE, [D]=DIE, [O]=OOT_MODULE,
> [E]=UNSIGNED_MODULE
> [166023.291483] Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.11.0
> 12/23/2019
> [166023.292125] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
> [166023.292816] RIP: 0010:do_exit+0x7e4/0xa60
> [166023.293447] Code: 89 a3 f0 0a 00 00 48 89 6c 24 10 48 8b 83 10 0d 00 00
> e9 04 fe ff ff 48 8b bb d0 0a 00 00 31 f6 e8 51 e0 ff ff e9 b7 fd ff ff <0f>
> 0b e9 61 f8 ff ff 4c 89 e6 bf 05 06 00 00 e8 78 26 01 00 e9 98
> [166023.294749] RSP: 0018:ffffcd998ef87ee8 EFLAGS: 00010282
> [166023.295398] RAX: 0000000000000296 RBX: ffff8c00491a5040 RCX:
> 000000000000270f
> [166023.296049] RDX: 0000000000000000 RSI: 0000000000002710 RDI:
> 0000000000000009
> [166023.296675] RBP: 0000000000000000 R08: 0000000000000009 R09:
> ffff8c00491a5040
> [166023.297305] R10: ffff8c3f3f7fffa8 R11: 00000000ffffbfff R12:
> 0000000000000009
> [166023.297932] R13: 0000000000000000 R14: ffff8c00491a5040 R15:
> 0000000000000000
> [166023.298558] FS:  0000000000000000(0000) GS:ffff8c1fb8d08000(0000)
> knlGS:0000000000000000
> [166023.299193] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [166023.299823] CR2: 0000000000000028 CR3: 0000002cc1c2c006 CR4:
> 00000000003726f0
> [166023.300463] Call Trace:
> [166023.301093]  <TASK>
> [166023.301719]  make_task_dead+0x8d/0x90
> [166023.302344]  rewind_stack_and_make_dead+0x16/0x20
> [166023.302979] RIP: 0000:0x0
> [166023.303649] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> [166023.304277] RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX:
> 0000000000000000
> [166023.304906] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
> 0000000000000000
> [166023.305534] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
> 0000000000000000
> [166023.306175] RBP: 0000000000000000 R08: 0000000000000000 R09:
> 0000000000000000
> [166023.306799] R10: 0000000000000000 R11: 0000000000000000 R12:
> 0000000000000000
> [166023.307440] R13: 0000000000000000 R14: 0000000000000000 R15:
> 0000000000000000
> [166023.308061]  </TASK>
> [166023.308679] ---[ end trace 0000000000000000 ]---

Thanks for the report. For an upstream report we would ideally want to
see this on an untainted kernel (it might otherwise be rejected
upstream, but I understand this might not be possible here).

Before doing so, please check whether the issue is still present in
both 6.17.13-1 (the last of the series uploaded to unstable) and
6.18.2-1~exp1.

If it fails with 6.18.2-1~exp1, can you please provide the full kernel
log up to triggering the problem?

Please also decode the stack trace from the fresh results on the more
recent kernels with scripts/decode_stacktrace.sh.
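
For the decoding step, something along these lines might work. It is
only a sketch: the helper name decode_oops and every path passed to it
are placeholders, and it assumes that an unstripped vmlinux and module
debug files matching the running kernel are available locally (for
example from the corresponding debug symbols package) and that the oops
was saved unwrapped to a file.

```shell
#!/bin/sh
# Sketch only: decode_oops is a hypothetical helper, not part of the kernel
# tree. All four paths are assumptions and must match the kernel that
# produced the oops.

decode_oops() {
    script=$1    # path to scripts/decode_stacktrace.sh from a kernel tree
    vmlinux=$2   # unstripped vmlinux with debug info for that kernel
    modules=$3   # directory holding the module debug files
    oops=$4      # file with the raw kernel log (lines not wrapped by mail)

    # Refuse to run if any input is missing, so an empty result is never
    # mistaken for a decoded trace.
    for f in "$script" "$vmlinux" "$modules" "$oops"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f" >&2
            return 1
        fi
    done

    # decode_stacktrace.sh reads the log on stdin and writes the decoded
    # trace to stdout. "auto" asks it to work out the build base path.
    "$script" "$vmlinux" auto "$modules" < "$oops"
}

# Example invocation (placeholder paths):
#   decode_oops ./scripts/decode_stacktrace.sh /path/to/vmlinux \
#       /path/to/module-debug-dir oops.txt > oops-decoded.txt
```

The module directory matters here because most of the frames in the
trace above come from modules (nvmet, nvmet_rdma, ib_core).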

Regards,
Salvatore
