On the CalDigit Thunderbolt Station 3 Plus, we've encountered an issue when the USB downstream display connection state changes. The problematic sequence observed is: ``` igb_io_error_detected igb_down igb_io_error_detected igb_down ```
The second igb_down call blocks at napi_synchronize. Simply avoiding redundant igb_down calls makes the Ethernet of the thunderbolt dock unusable. If Intel can identify when an Ethernet device is within a Thunderbolt tunnel, the patch can be more specific. On Thu, Jun 6, 2024 at 8:04 AM Stefan Schaeckeler <[email protected]> wrote: > > Hi all, > > This commit introduced a regression. What does not work with this commit is > an AER without a surprise removal event. > > > On 5/18/23 00:26, Ying Hsu wrote: > > In a setup where a Thunderbolt hub connects to Ethernet and a display > > through USB Type-C, users may experience a hung task timeout when they > > remove the cable between the PC and the Thunderbolt hub. > > This is because the igb_down function is called multiple times when > > the Thunderbolt hub is unplugged. For example, the igb_io_error_detected > > triggers the first call, and the igb_remove triggers the second call. > > The second call to igb_down will block at napi_synchronize. > > Here's the call trace: > > __schedule+0x3b0/0xddb > > ? __mod_timer+0x164/0x5d3 > > schedule+0x44/0xa8 > > schedule_timeout+0xb2/0x2a4 > > ? run_local_timers+0x4e/0x4e > > msleep+0x31/0x38 > > igb_down+0x12c/0x22a [igb 6615058754948bfde0bf01429257eb59f13030d4] > > __igb_close+0x6f/0x9c [igb 6615058754948bfde0bf01429257eb59f13030d4] > > igb_close+0x23/0x2b [igb 6615058754948bfde0bf01429257eb59f13030d4] > > __dev_close_many+0x95/0xec > > dev_close_many+0x6e/0x103 > > unregister_netdevice_many+0x105/0x5b1 > > unregister_netdevice_queue+0xc2/0x10d > > unregister_netdev+0x1c/0x23 > > igb_remove+0xa7/0x11c [igb 6615058754948bfde0bf01429257eb59f13030d4] > > pci_device_remove+0x3f/0x9c > > device_release_driver_internal+0xfe/0x1b4 > > pci_stop_bus_device+0x5b/0x7f > > pci_stop_bus_device+0x30/0x7f > > pci_stop_bus_device+0x30/0x7f > > pci_stop_and_remove_bus_device+0x12/0x19 > > pciehp_unconfigure_device+0x76/0xe9 > > pciehp_disable_slot+0x6e/0x131 > > pciehp_handle_presence_or_link_change+0x7a/0x3f7 > > pciehp_ist+0xbe/0x194 > > irq_thread_fn+0x22/0x4d > > ? irq_thread+0x1fd/0x1fd > > irq_thread+0x17b/0x1fd > > ? irq_forced_thread_fn+0x5f/0x5f > > kthread+0x142/0x153 > > ? __irq_get_irqchip_state+0x46/0x46 > > ? kthread_associate_blkcg+0x71/0x71 > > ret_from_fork+0x1f/0x30 > > > > In this case, igb_io_error_detected detaches the network interface > > and requests a PCIE slot reset, however, the PCIE reset callback is > > not being invoked and thus the Ethernet connection breaks down. > > As the PCIE error in this case is a non-fatal one, requesting a > > slot reset can be avoided. > > This patch fixes the task hung issue and preserves Ethernet > > connection by ignoring non-fatal PCIE errors. > > > > Signed-off-by: Ying Hsu <[email protected]> > > --- > > This commit has been tested on a HP Elite Dragonfly Chromebook and > > a Caldigit TS3+ Thunderbolt hub. The Ethernet driver for the hub > > is igb. Non-fatal PCIE errors happen when users hot-plug the cables > > connected to the chromebook or to the external display. > > > > drivers/net/ethernet/intel/igb/igb_main.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/drivers/net/ethernet/intel/igb/igb_main.c > > b/drivers/net/ethernet/intel/igb/igb_main.c > > index 58872a4c2540..a8b217368ca1 100644 > > --- a/drivers/net/ethernet/intel/igb/igb_main.c > > +++ b/drivers/net/ethernet/intel/igb/igb_main.c > > @@ -9581,6 +9581,11 @@ static pci_ers_result_t igb_io_error_detected(struct > > pci_dev *pdev, > > struct net_device *netdev = pci_get_drvdata(pdev); > > struct igb_adapter *adapter = netdev_priv(netdev); > > > > + if (state == pci_channel_io_normal) { > > + dev_warn(&pdev->dev, "Non-correctable non-fatal error > > reported.\n"); > > + return PCI_ERS_RESULT_CAN_RECOVER; > > + } > > + > > netif_device_detach(netdev); > > > > if (state == pci_channel_io_perm_failure) > > We are currently stuck with the 5.4 kernel - our embedded system can't easily > boot arbitrary kernel versions. The igb driver code has not changed and I'm > quite positive the issue still persist in the latest upstream kernel. > > This issue is reproducible with aer_inject. 09:00.0 is our i210 which is > directly connected to the cpu root port 00:01.1: > > > - - - snip - - - > [node0_RP0_CPU0:~]$cat aer.log > AER > PCI_ID 09:00.0 > UNCOR_STATUS COMP_TIME > HEADER_LOG 0 1 2 3 > - - - snip - - - > > - - - snip - - - > [node0_RP0_CPU0:~]$aer-inject aer.log > [369145.900845] pcieport 0000:00:01.1: aer_inject: Injecting errors > 00000000/00004000 into device 0000:09:00.0 > [369145.912726] pcieport 0000:00:01.1: AER: Uncorrected (Non-Fatal) error > received: 0000:09:00.0 > [369145.923124] igb 0000:09:00.0: AER: PCIe Bus Error: severity=Uncorrected > (Non-Fatal), type=Transaction Layer, (Requester ID) > [369145.936791] igb 0000:09:00.0: AER: device [8086:1537] error > status/mask=00004000/00000000 > [369145.947068] igb 0000:09:00.0: AER: [14] CmpltTO > [369145.954602] igb 0000:09:00.0: Non-correctable non-fatal error reported. > [369145.984564] ------------[ cut here ]------------ > [369145.990285] kernel BUG at include/linux/netdevice.h:529! > [369145.996860] invalid opcode: 0000 [#1] SMP PTI > [369146.002267] CPU: 3 PID: 142 Comm: irq/26-aerdrv Kdump: loaded Tainted: G > O 5.4.251-yocto-standard #1 > [369146.015073] Hardware name: Cisco System Inc. SF-D8/Type2 - Board Product > Name1, BIOS 1-29-g46d9e72a-s 05/03/2019 > [369146.027570] RIP: 0010:igb_up+0x51/0x160 [igb] > [369146.032974] Code: d2 eb 16 f0 80 60 60 fe f0 80 60 60 f7 48 83 c2 01 39 > 93 14 02 00 00 76 13 48 8b 84 d3 08 0f 00 00 48 8b 48 60 83 e1 01 75 d9 <0f> > 0b f6 83 11 02 00 00 20 0f 85 c0 00 00 00 48 8b bb 08 0f 00 00 > [369146.055938] RSP: 0018:ffffb29b0045bcf0 EFLAGS: 00010246 > [369146.062399] RAX: ffff8d938d99e400 RBX: ffff8d9398a08740 RCX: > 0000000000000000 > [369146.071186] RDX: 0000000000000000 RSI: 0000000000000004 RDI: > ffff8d938d99e2c0 > [369146.079974] RBP: ffffb29b0045bd00 R08: 00000000000002ec R09: > ffffb29b002fa000 > [369146.088761] R10: 0000000000006f80 R11: 000000000090ef2c R12: > ffff8d9398a08000 > [369146.097547] R13: ffff8d9398a08740 R14: ffff8d939ae92c00 R15: > ffff8d939ae92c28 > [369146.106335] FS: 0000000000000000(0000) GS:ffff8d939fcc0000(0000) > knlGS:0000000000000000 > [369146.116286] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [369146.123381] CR2: 000056318eb418e8 CR3: 000000084e744002 CR4: > 00000000003606e0 > [369146.132170] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 > [369146.140960] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: > 0000000000000400 > [369146.149747] Call Trace: > [369146.152826] ? show_regs.cold+0x1a/0x1f > [369146.157592] ? __die+0x90/0xd9 > [369146.161408] ? die+0x30/0x50 > [369146.165002] ? do_trap+0x85/0xf0 > [369146.169029] ? do_error_trap+0x7c/0xb0 > [369146.173695] ? igb_up+0x51/0x160 [igb] > [369146.178355] ? do_invalid_op+0x3c/0x50 > [369146.183019] ? igb_up+0x51/0x160 [igb] > [369146.187681] ? invalid_op+0x7c/0x90 > [369146.192027] ? igb_up+0x51/0x160 [igb] > [369146.196691] ? hrtimer_try_to_cancel+0x2c/0x110 > [369146.202305] ? schedule_hrtimeout_range_clock+0xa0/0x110 > [369146.208871] ? hrtimer_init_sleeper+0x90/0x90 > [369146.214277] ? igb_rx_fifo_flush_82575+0x32/0x270 [igb] > [369146.220738] ? igb_configure+0x417/0x650 [igb] > [369146.226247] ? igb_up+0x51/0x160 [igb] > [369146.230913] igb_io_resume+0x31/0x50 [igb] > [369146.235998] report_resume+0x5c/0x80 > [369146.240449] ? pcie_portdrv_probe+0x70/0x70 > [369146.245631] pci_walk_bus+0x75/0x90 > [369146.249966] pcie_do_recovery+0x163/0x280 > [369146.254947] aer_process_err_devices+0xa2/0xd1 > [369146.260455] aer_isr.cold+0x52/0xa1 > [369146.264799] ? __schedule+0x2bf/0x680 > [369146.269349] ? irq_finalize_oneshot+0xf0/0xf0 > [369146.274754] irq_thread_fn+0x28/0x50 > [369146.279205] irq_thread+0xf8/0x180 > [369146.283445] ? wake_threads_waitq+0x30/0x30 > [369146.288638] kthread+0x104/0x140 > [369146.292666] ? irq_thread_check_affinity+0x80/0x80 > [369146.298597] ? __kthread_cancel_work+0x40/0x40 > [369146.304106] ret_from_fork+0x35/0x40 > - - - snip - - - > > BUG_ON() comes from > > - - - snip - - - > static inline void napi_enable(struct napi_struct *n) > { > BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state)); > smp_mb__before_atomic(); > clear_bit(NAPI_STATE_SCHED, &n->state); > clear_bit(NAPI_STATE_NPSVC, &n->state); > } > - - - snip - - - > > The stack-trace shows that the AER handler starts here and goes into igb_up(): > > - - - snip - - - > static void igb_io_resume(struct pci_dev *pdev) > { > struct net_device *netdev = pci_get_drvdata(pdev); > struct igb_adapter *adapter = netdev_priv(netdev); > > if (netif_running(netdev)) { > if (igb_up(adapter)) { > dev_err(&pdev->dev, "igb_up failed after reset\n"); > return; > } > } > > ... > } > - - - snip - - - > Three functions come into the picture: > > igb_io_error_detected() { // runs upon AER detection > igb_down(); // before this commit > noop; // with this commit > } > > igb_io_resume() { // runs upon AER handling > igb_up() > } > > igb_remove() { // runs upon rmmod igb, or surprise down removal (as shown in > the commit log) > igb_down(); > } > > Before this commit, the flow for an AER on 09:00.0 was > igb_down; // from igb_io_error_detect() - this sets the NAPI_STATE_SCHED bit > igb_up; // from igb_io_resume() - that works as it finds a set > NAPI_STATE_SCHED bit (and then clears it) > > Now with this commit, the flow for an AER on 09:00.0 is > noop; // from igb_io_error_detect() - the NAPI_STATE_SCHED bit is not > touched, e.g. it remains cleared > igb_up; // from igb_io_resume() - BUG_ON() is triggered as it finds > a cleared NAPI_STATE_SCHED bit > > > I don't have a means to reproduce this i210 on thunderbolt issue and don't > completely understand its flow. Before this commit, the "expected" flow for > i210 on thunderbolt was probably > > igb_down; // from igb_io_error_detect() - this sets the NAPI_STATE_SCHED bit > igb_up; // from igb_io_resume() - that works as it finds a set > NAPI_STATE_SCHED bit and cleans it. > igb_down; // from igb_remove() - that works b/c of the previous > igb_up() > > The bug report shows that this was not the case. I don't completely > understand the reported problem: what happens to and in igb_io_resume() and > its call to igb_up()? In the commit log backtrace, I see igb_remove() > eventually calling __dev_close_many() which clears the __LINK__STATE_START > bit. igb_io_resume() does check this bit via netif_running() and calls only > then igb_up(). My guess is igb_io_resume() executes after igb_remove() and > skips therefore the execution of igb_up(). As we see from the commit log > backtrace, that makes igb_remove() starve now in > igb_down()->napi_synchronize(). > > How to fix that, I don't know. > > Stefan > >
