Hi Salvatore,
On 1/21/26 08:01, Salvatore Bonaccorso wrote:
Control: severity -1 important
Control: tags -1 + moreinfo
Hi Damir,
On Tue, Jan 20, 2026 at 03:12:12PM +0300, Damir Mansurov wrote:
Thanks for the report. A couple of questions:
6.17.13-1 is not the recent kernel (it is in backports) but still
please test with the current 6.18.5-1 from unstable to confirm if the
problem persists (please do so with both the rt and the normal kernel).
If the problem also triggers in 6.18.y from unstable, then it would be
ideal to confirm that the issue is present in mainline as well, so
please also test the kernel from experimental (6.19~rc5-1~exp1).
Is this a regression from an earlier version? If so, from which
version?
The last "good" kernel for me is 6.15. I see the issue starting from the
6.17 kernels (RT and non-RT), and also on the latest 6.19-rt-amd64
(6.19~rc5-1~exp1). On non-RT kernels the call trace is a bit different:
```
[ 364.028512] INFO: task ethtool:1442 blocked for more than 120 seconds.
[ 364.035075] Tainted: G W 6.17.13+deb14-amd64 #1 Debian 6.17.13-1
[ 364.043097] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 364.050958] task:ethtool state:D stack:0 pid:1442 tgid:1442 ppid:1387 task_flags:0x400000 flags:0x00004002
[ 364.062069] Call Trace:
[ 364.064535] <TASK>
[ 364.066645] __schedule+0x45c/0xd20
[ 364.071084] schedule+0x27/0xd0
[ 364.074397] schedule_preempt_disabled+0x15/0x30
[ 364.079145] __mutex_lock.constprop.0+0x530/0xa40
[ 364.083868] efx_mcdi_rx_pull_rss_config+0x28/0x60 [sfc]
[ 364.089241] ? rss_get_data_alloc+0x63/0xa0
[ 364.093498] efx_ethtool_get_rxfh+0x3b/0xd0 [sfc]
[ 364.098243] rss_prepare.isra.0+0x1c9/0x330
[ 364.103014] ethnl_default_doit+0x147/0x3f0
[ 364.107339] genl_family_rcv_msg_doit+0xff/0x160
[ 364.111987] genl_rcv_msg+0x1aa/0x2b0
[ 364.115754] ? __pfx_ethnl_default_doit+0x10/0x10
[ 364.120481] ? __pfx_genl_rcv_msg+0x10/0x10
[ 364.125291] netlink_rcv_skb+0x5c/0x110
[ 364.129733] genl_rcv+0x28/0x40
[ 364.133010] netlink_unicast+0x288/0x3c0
[ 364.136952] ? __alloc_skb+0xdb/0x1a0
[ 364.141215] netlink_sendmsg+0x20d/0x430
[ 364.145277] __sys_sendto+0x1f5/0x200
[ 364.148952] __x64_sys_sendto+0x24/0x30
[ 364.152842] do_syscall_64+0x82/0x320
[ 364.156635] ? mod_memcg_lruvec_state+0xe7/0x2e0
[ 364.161274] ? folio_wait_bit_common+0x2a7/0x320
[ 364.166482] ? __lruvec_stat_mod_folio+0x85/0xd0
[ 364.171242] ? xas_load+0x11/0x100
[ 364.174663] ? xas_find+0x83/0x1b0
[ 364.178666] ? next_uptodate_folio+0xa0/0x350
[ 364.183171] ? percpu_counter_add_batch+0x4a/0xb0
[ 364.187900] ? filemap_map_pages+0x53e/0x690
[ 364.192495] ? do_fault+0x34c/0x5a0
[ 364.196113] ? __handle_mm_fault+0x8db/0xef0
[ 364.201555] ? mt_find+0x21f/0x590
[ 364.205094] ? count_memcg_events+0xd6/0x220
[ 364.209382] ? handle_mm_fault+0x1d6/0x2d0
[ 364.214079] ? do_user_addr_fault+0x21a/0x690
[ 364.218847] ? exc_page_fault+0x74/0x180
[ 364.223178] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 364.228608] RIP: 0033:0x7feb25e91eb2
[ 364.232605] RSP: 002b:00007ffe6f478f50 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
[ 364.240556] RAX: ffffffffffffffda RBX: 00007ffe6f479080 RCX: 00007feb25e91eb2
[ 364.248105] RDX: 0000000000000028 RSI: 000055a078bd4420 RDI: 0000000000000003
[ 364.255663] RBP: 000055a078bd4310 R08: 00007feb26107000 R09: 000000000000000c
[ 364.263283] R10: 0000000000000000 R11: 0000000000000202 R12: 000055a078bd43c0
[ 364.270864] R13: 000055a078bd43b0 R14: 0000000000000000 R15: 000055a0559a41dd
[ 364.278392] </TASK>
```
Furthermore, the kernel is tainted with an OOT module; afaics this might be
sfc_resource. That is, you would still need to check whether the problem is
triggerable without loading the OOT module.
Oops, I removed the OOT modules and still see this problem:
```
[ 262.974262] ------------[ cut here ]------------
[ 262.974265] rtmutex deadlock detected
[ 262.974270] WARNING: CPU: 5 PID: 1522 at kernel/locking/rtmutex.c:1674 __rt_mutex_slowlock_locked.constprop.0+0x1e8/0x220
[ 262.974278] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs lockd grace netfs nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge nf_tables 8021q garp stp llc mrp binfmt_misc intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm evdev irqbypass ghash_clmulni_intel aesni_intel rapl ipmi_ssif intel_cstate mgag200 vga16fb intel_uncore vgastate drm_client_lib pcspkr drm_shmem_helper acpi_cpufreq mei_me drm_kms_helper mei button sg acpi_ipmi ipmi_si ipmi_watchdog ipmi_devintf ipmi_msghandler drm efi_pstore auth_rpcgss configfs sunrpc nfnetlink autofs4 ext4 crc16 mbcache jbd2 crc32c_cryptoapi sd_mod iTCO_wdt intel_pmc_bxt iTCO_vendor_support isci watchdog libsas ahci psmouse libahci scsi_transport_sas serio_raw igb libata ehci_pci sfc ehci_hcd i2c_algo_bit wmi scsi_mod usbcore i2c_i801 scsi_common mtd i2c_smbus ioatdma usb_common lpc_ich dca
[ 262.974338] CPU: 5 UID: 1100 PID: 1522 Comm: ethtool Tainted: G W 6.17.13+deb14-rt-amd64 #1 PREEMPT_{RT,(full)} Debian 6.17.13-1
[ 262.974341] Tainted: [W]=WARN
[ 262.974342] Hardware name: Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 3.0a 01/03/2014
[ 262.974344] RIP: 0010:__rt_mutex_slowlock_locked.constprop.0+0x1e8/0x220
[ 262.974347] Code: 00 4c 89 e6 48 89 ef e8 b6 35 ca 00 41 83 fd dd 0f 85 d7 fe ff ff 48 89 ef e8 f4 9c ca 00 48 c7 c7 6c 51 7d b1 e8 a8 00 f6 ff <0f> 0b 66 90 b8 01 00 00 00 87 43 18 e8 c7 3e fb ff eb ef bf 01 00
[ 262.974349] RSP: 0018:ffffd19302e2f870 EFLAGS: 00010246
[ 262.974351] RAX: 0000000000000000 RBX: ffff8c5387910000 RCX: 0000000000000027
[ 262.974352] RDX: ffff8c56ffd5ce88 RSI: 0000000000000001 RDI: ffff8c56ffd5ce80
[ 262.974354] RBP: ffff8c53810e9b10 R08: 0000000000000000 R09: ffffd19302e2f648
[ 262.974355] R10: ffffffffb20e3e48 R11: 00000000ffffefff R12: ffffd19302e2f870
[ 262.974356] R13: 00000000ffffffdd R14: ffffd19302e2f918 R15: ffffffffb1368880
[ 262.974358] FS: 00007f4403a9fb80(0000) GS:ffff8c574d141000(0000) knlGS:0000000000000000
[ 262.974360] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 262.974361] CR2: 000055c79a144700 CR3: 0000000140494006 CR4: 00000000000626f0
[ 262.974363] Call Trace:
[ 262.974365] <TASK>
[ 262.974369] rt_mutex_slowlock.constprop.0+0x4d/0xc0
[ 262.974374] efx_mcdi_rx_pull_rss_config+0x28/0x60 [sfc]
[ 262.974410] efx_ethtool_get_rxfh+0x3b/0xd0 [sfc]
[ 262.974439] rss_prepare.isra.0+0x1c9/0x330
[ 262.974443] ethnl_default_doit+0x147/0x3f0
[ 262.974446] genl_family_rcv_msg_doit+0xff/0x160
[ 262.974452] genl_rcv_msg+0x1aa/0x2b0
[ 262.974455] ? __pfx_ethnl_default_doit+0x10/0x10
[ 262.974457] ? __pfx_genl_rcv_msg+0x10/0x10
[ 262.974460] netlink_rcv_skb+0x5c/0x110
[ 262.974465] genl_rcv+0x28/0x40
[ 262.974467] netlink_unicast+0x28f/0x3c0
[ 262.974470] ? __alloc_skb+0xdb/0x1a0
[ 262.974474] netlink_sendmsg+0x20d/0x440
[ 262.974478] __sys_sendto+0x1f5/0x200
[ 262.974483] __x64_sys_sendto+0x24/0x30
[ 262.974485] do_syscall_64+0x82/0x320
[ 262.974488] ? count_memcg_events+0xd6/0x220
[ 262.974492] ? handle_mm_fault+0x1d6/0x2d0
[ 262.974495] ? do_user_addr_fault+0x21a/0x690
[ 262.974499] ? exc_page_fault+0x74/0x180
[ 262.974503] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 262.974505] RIP: 0033:0x7f4403b32eb2
[ 262.974516] Code: 18 41 8b 93 08 03 00 00 59 5e 48 83 f8 fc 75 1a 83 e2 39 83 fa 08 75 12 e8 2b ff ff ff 0f 1f 00 49 89 ca 48 8b 44 24 20 0f 05 <48> 83 c4 18 c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 10 ff 74 24 18
[ 262.974518] RSP: 002b:00007ffef9cea580 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
[ 262.974520] RAX: ffffffffffffffda RBX: 00007ffef9cea6b0 RCX: 00007f4403b32eb2
[ 262.974521] RDX: 0000000000000028 RSI: 000055c7b2901420 RDI: 0000000000000003
[ 262.974522] RBP: 000055c7b2901310 R08: 00007f4403cc7000 R09: 000000000000000c
[ 262.974523] R10: 0000000000000000 R11: 0000000000000202 R12: 000055c7b29013c0
[ 262.974525] R13: 000055c7b29013b0 R14: 0000000000000000 R15: 000055c79a1491dd
[ 262.974527] </TASK>
[ 262.974528] ---[ end trace 0000000000000000 ]---
```
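In case it helps when re-checking that no OOT module is loaded before reproducing, the taint mask can be decoded directly. This is only a minimal sketch in Python: `decode_taint` and `read_taint` are hypothetical helper names, not existing tools, and the table covers only a subset of the bits described in the kernel's tainted-kernels documentation.

```python
from __future__ import annotations

# Subset of kernel taint bits (bit number -> flag letter).
# Assumption: bit positions as listed in the kernel's tainted-kernels docs.
TAINT_FLAGS = {
    0: "P",   # proprietary module was loaded
    9: "W",   # a kernel warning (WARN) was issued
    12: "O",  # out-of-tree module was loaded
    13: "E",  # unsigned module was loaded
}


def decode_taint(mask: int) -> list[str]:
    """Return the letters of the known taint flags that are set in mask."""
    return [flag for bit, flag in sorted(TAINT_FLAGS.items()) if mask & (1 << bit)]


def read_taint(path: str = "/proc/sys/kernel/tainted") -> int:
    """Read the current taint mask of the running kernel (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())
```

On the affected machine, `decode_taint(read_taint())` can be run before and after triggering the issue; a result without `"O"` would be consistent with the `Tainted: [W]=WARN` line in the trace above.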
Regards,
Salvatore
--
Damir Mansurov
[email protected]