> -----Original Message-----
> From: Intel-wired-lan <[email protected]> On Behalf Of
> Wander Lairson Costa
> Sent: Wednesday, December 4, 2024 12:42 PM
> To: Nguyen, Anthony L <[email protected]>; Kitszel, Przemyslaw
> <[email protected]>; Andrew Lunn <[email protected]>;
> David S. Miller <[email protected]>; Eric Dumazet
> <[email protected]>; Jakub Kicinski <[email protected]>; Paolo Abeni
> <[email protected]>; Sebastian Andrzej Siewior <[email protected]>;
> Clark Williams <[email protected]>; Steven Rostedt <[email protected]>;
> Auke Kok <[email protected]>; Jeff Garzik <[email protected]>;
> moderated list:INTEL ETHERNET DRIVERS <[email protected]>;
> open list:NETWORKING DRIVERS <[email protected]>;
> open list <linux-[email protected]>;
> open list:Real-time Linux (PREEMPT_RT):Keyword:PREEMPT_RT <[email protected]>
> Cc: Wander Lairson Costa <[email protected]>; Yuying Ma
> <[email protected]>
> Subject: [Intel-wired-lan] [PATCH iwl-net 4/4] igb: fix igb_msix_other()
> handling for PREEMPT_RT
>
> During testing of SR-IOV, Red Hat QE encountered an issue where the ip link up
> command intermittently fails for the igbvf interfaces when using the
> PREEMPT_RT variant. Investigation revealed that e1000_write_posted_mbx returns
> an error due to the lack of an ACK from e1000_poll_for_ack.
>
> The underlying issue arises from the fact that IRQs are threaded by default
> under PREEMPT_RT. While the exact hardware details are not available, it
> appears that the IRQ handled by igb_msix_other must be processed before
> e1000_poll_for_ack times out. However, e1000_write_posted_mbx is called with
> preemption disabled, leading to a scenario where the IRQ is serviced only
> after the failure of e1000_write_posted_mbx.
>
> Commit 338c4d3902fe ("igb: Disable threaded IRQ for igb_msix_other") forced
> the ISR to run in a non-threaded context. However, Sebastian observed that
> some functions called within the ISR acquire locks that may sleep.
>
> In the previous two patches, we managed to make igb_msg_mask() safe to call
> from an interrupt context.
>
> In this commit, we move most of the ISR handling to an interrupt context,
> leaving non IRQ safe code to be called from the thread context under
> PREEMPT_RT.
>
> Reproducer:
>
> ipaddr_vlan=3
> nic_test=ens14f0
> vf=${nic_test}v0
>
> # The main testing steps:
> while true; do
>     ip link set ${nic_test} mtu 1500
>     ip link set ${vf} mtu 1500
>     ip link set $vf up
>     # 3. set vlan and ip for VF
>     ip link set ${nic_test} vf 0 vlan ${ipaddr_vlan}
>     ip addr add 172.30.${ipaddr_vlan}.1/24 dev ${vf}
>     ip addr add 2021:db8:${ipaddr_vlan}::1/64 dev ${vf}
>     # 4. check the link state for VF and PF
>     ip link show ${nic_test}
>     if ! ip link show $vf | grep 'state UP'; then
>         echo 'Error found'
>         break
>     fi
>     ip link set $vf down
> done
>
> You can also reproduce it more reliably by setting nr_cpus=1 in the kernel
> command line.
>
> Fixes: 9d5c824399de ("igb: PCI-Express 82575 Gigabit Ethernet driver")
> Signed-off-by: Wander Lairson Costa <[email protected]>
> Reported-by: Yuying Ma <[email protected]>
> ---
>  drivers/net/ethernet/intel/igb/igb_main.c | 35 ++++++++++++++++-------
>  1 file changed, 24 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index 5828831fd29c2..b2894cebe2c9e 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -131,6 +131,7 @@ static void igb_set_uta(struct igb_adapter *adapter, bool

Tested-by: Rafal Romanowski <[email protected]>
