On 2026/1/30 19:59, Lukas Wunner wrote:
> 
> On Wed, Jan 28, 2026 at 06:07:51PM +0800, LeoLiu-oc wrote:
>> Without this patch, PCIE devices may not be usable and could even cause
>> more serious errors, such as a kernel panic. For example, the following
>> log is encountered in hardware testing:
>>
>> list_del corruption, ffff8881418b79e8->next is LIST_POISON1
>> (dead000000000100)
>> ------------[ cut here ]------------
>> kernel BUG at lib/list_debug.c:56!
>> invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
>> ...
>> Kernel panic - not syncing: Fatal exception
> 
> This should not happen.  Which kernel version are you using?

The kernel version I am using is 6.18.6. This patch has already been
included.

The complete log of the kernel panic is as follows:

[  100.304077][  T843] list_del corruption, ffff8881418b79e8->next is LIST_POISON1 (dead000000000100)
[  100.312989][  T843] ------------[ cut here ]------------
[  100.318268][  T843] kernel BUG at lib/list_debug.c:56!
[  100.323380][  T843] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[  100.329250][  T843] CPU: 7 PID: 843 Comm: irq/27-pciehp Tainted: P    W  OE     ------- ----  6.6.0-32.7.v2505.ky11.x86_64 #1
[  100.340793][  T843] Source Version: 71d5b964051132b7772acd935972fca11462bbfe
[  100.359228][  T843] RIP: 0010:__list_del_entry_valid_or_report+0x7f/0xc0
[  100.365877][  T843] Code: 66 4b a6 e8 c3 43 a9 ff 0f 0b 48 89 fe 48 c7 c7 10 67 4b a6 e8 b2 43 a9 ff 0f 0b 48 89 fe 48 c7 c7 40 67 4b a6 e8 a1 43 a9 ff <0f> 0b 48 89 fe 48 89 ca 48 c7 c7 78 67 4b a6 e8 8d 43 a9 ff 0f 0b
[  100.385158][  T843] RSP: 0018:ffffc9000f70fc08 EFLAGS: 00010246
[  100.391024][  T843] RAX: 000000000000004e RBX: ffff8881418b79e8 RCX: 0000000000000000
[  100.398781][  T843] RDX: 0000000000000000 RSI: ffff8897df5a32c0 RDI: ffff8897df5a32c0
[  100.406538][  T843] RBP: ffff8881257f9608 R08: 0000000000000000 R09: 0000000000000003
[  100.414294][  T843] R10: ffffc9000f70fa90 R11: ffffffffa6fee508 R12: 0000000000000000
[  100.422050][  T843] R13: ffff8881257f9608 R14: ffff888116507c28 R15: ffff888116507c28
[  100.429807][  T843] FS:  0000000000000000(0000) GS:ffff8897df580000(0000) knlGS:0000000000000000
[  100.438511][  T843] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  100.444891][  T843] CR2: 00007f9563bac1c0 CR3: 0000000c4be26004 CR4: 0000000000570ee0
[  100.452647][  T843] PKRU: 55555554
[  100.456017][  T843] Call Trace:
[  100.459129][  T843]  <TASK>
[  100.461898][  T843]  ice_flow_rem_entry_sync.constprop.0+0x1c/0x90 [ice]
[  100.468663][  T843]  ice_flow_rem_entry+0x3d/0x60 [ice]
[  100.473925][  T843]  ice_fdir_erase_flow_from_hw.constprop.0+0x9b/0x100 [ice]
[  100.481078][  T843]  ice_fdir_rem_flow.constprop.0+0x32/0xb0 [ice]
[  100.487284][  T843]  ice_vsi_manage_fdir+0x7b/0xb0 [ice]
[  100.492629][  T843]  ice_deinit_features.part.0+0x46/0xc0 [ice]
[  100.498571][  T843]  ice_remove+0xcf/0x220 [ice]
[  100.503222][  T843]  pci_device_remove+0x3f/0xb0
[  100.507798][  T843]  device_release_driver_internal+0x19d/0x220
[  100.513667][  T843]  pci_stop_bus_device+0x6c/0x90
[  100.518417][  T843]  pci_stop_and_remove_bus_device+0x12/0x20
[  100.524110][  T843]  pciehp_unconfigure_device+0x9f/0x160
[  100.529463][  T843]  pciehp_disable_slot+0x69/0x130
[  100.534296][  T843]  pciehp_handle_presence_or_link_change+0xfc/0x210
[  100.540678][  T843]  pciehp_ist+0x204/0x230
[  100.544824][  T843]  ? __pfx_irq_thread_fn+0x10/0x10
[  100.549747][  T843]  irq_thread_fn+0x20/0x60
[  100.553978][  T843]  irq_thread+0xfb/0x1c0
[  100.558038][  T843]  ? __pfx_irq_thread_dtor+0x10/0x10
[  100.563130][  T843]  ? __pfx_irq_thread+0x10/0x10
[  100.567791][  T843]  kthread+0xe5/0x120
[  100.571594][  T843]  ? __pfx_kthread+0x10/0x10
[  100.575997][  T843]  ret_from_fork+0x17a/0x1a0
[  100.580403][  T843]  ? __pfx_kthread+0x10/0x10
[  100.584805][  T843]  ret_from_fork_asm+0x1a/0x30
[  100.589384][  T843]  </TASK>
[  100.592237][  T843] Modules linked in: zxmem(OE) einj amdgpu amdxcp gpu_sched drm_exec drm_buddy nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 zhaoxin_cputemp nf_defrag_ipv4 zhaoxin_rng snd_hda_codec_hdmi radeon rfkill snd_hda_intel snd_intel_dspcfg irdma i2c_algo_bit snd_intel_sdw_acpi ip_set i40e drm_suballoc_helper nf_tables drm_ttm_helper pcicfg(POE) snd_hda_codec ib_uverbs sunrpc ttm ib_core snd_hda_core drm_display_helper snd_hwdep kvm_intel snd_pcm cec vfat fat drm_kms_helper snd_timer kvm video ice snd psmouse soundcore wmi acpi_cpufreq pcspkr i2c_zhaoxin sg sch_fq_codel drm fuse backlight nfnetlink xfs sd_mod t10_pi sm2_zhaoxin_gmi crct10dif_pclmul crc32_pclmul ahci crc32c_intel libahci r8169 ghash_clmulni_intel libata sha512_ssse3 serio_raw realtek dm_mirror dm_region_hash dm_log dm_multipath dm_mod i2c_dev autofs4
[  100.674508][  T843] ---[ end trace 0000000000000000 ]---
[  100.709547][  T843] RIP: 0010:__list_del_entry_valid_or_report+0x7f/0xc0
[  100.716197][  T843] Code: 66 4b a6 e8 c3 43 a9 ff 0f 0b 48 89 fe 48 c7 c7 10 67 4b a6 e8 b2 43 a9 ff 0f 0b 48 89 fe 48 c7 c7 40 67 4b a6 e8 a1 43 a9 ff <0f> 0b 48 89 fe 48 89 ca 48 c7 c7 78 67 4b a6 e8 8d 43 a9 ff 0f 0b
[  100.735491][  T843] RSP: 0018:ffffc9000f70fc08 EFLAGS: 00010246
[  100.741367][  T843] RAX: 000000000000004e RBX: ffff8881418b79e8 RCX: 0000000000000000
[  100.749137][  T843] RDX: 0000000000000000 RSI: ffff8897df5a32c0 RDI: ffff8897df5a32c0
[  100.756909][  T843] RBP: ffff8881257f9608 R08: 0000000000000000 R09: 0000000000000003
[  100.764678][  T843] R10: ffffc9000f70fa90 R11: ffffffffa6fee508 R12: 0000000000000000
[  100.772448][  T843] R13: ffff8881257f9608 R14: ffff888116507c28 R15: ffff888116507c28
[  100.780218][  T843] FS:  0000000000000000(0000) GS:ffff8897df580000(0000) knlGS:0000000000000000
[  100.788934][  T843] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  100.795329][  T843] CR2: 00007f9563bac1c0 CR3: 0000000c4be26004 CR4: 0000000000570ee0
[  100.803099][  T843] PKRU: 55555554
[  100.806483][  T843] Kernel panic - not syncing: Fatal exception
[  100.812794][  T843] Kernel Offset: disabled
[  100.821613][  T843] pstore: backend (erst) writing error (-28)
[  100.827481][  T843] ---[ end Kernel panic - not syncing: Fatal exception ]---

The kernel panic occurs because the ice driver's ice_pci_err_detected()
callback runs longer than the maximum time pciehp is willing to wait.
Once that wait expires, pciehp_ist() goes on to call the ice driver's
ice_remove(). By then ice_pci_err_detected() has already deleted the
list entries, so ice_remove() tries to delete entries that are no longer
on the list, which triggers the list_del corruption BUG shown above.
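
To illustrate why the second deletion is fatal, here is a minimal
userspace sketch. It is not the ice driver code: list_del_checked() and
struct flow_entry are hypothetical names, and the poison constants are
assumed to be the x86_64 defaults, which match the dead000000000100
value in the log. It only shows that deleting an entry poisons its
pointers, so a second delete of the same entry is detected.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's list poison values (assumed x86_64). */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0xdead000000000100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000000122ULL)

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical stand-in for an object the driver keeps on a list. */
struct flow_entry {
	struct list_head node;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/*
 * Delete an entry and poison its pointers. Returns false if the entry
 * was already deleted; the real kernel check would BUG() at that point.
 */
static bool list_del_checked(struct list_head *entry)
{
	if (entry->next == LIST_POISON1 || entry->prev == LIST_POISON2)
		return false;

	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
	return true;
}
```

In this sketch the first list_del_checked() call plays the role of the
error-handling path and the second call plays the role of the remove
path: the first succeeds, the second finds LIST_POISON1 in ->next and
fails.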

> There used to be a use-after-free on concurrent DPC and hot-removal.
> It was fixed by 11a1f4bc4736, which went into v6.11 and was subsequently
> ported all the way back to v5.10-stable.
> 
> I suspect you may be using a kernel which lacks that fix.
> 

From the above analysis, it is clear that this is not the same issue.

Yours sincerely,
LeoLiu-oc

> Thanks,
> 
> Lukas

