> -----Original Message-----
> From: Maciej Fijalkowski <[email protected]>
> Sent: Friday, January 9, 2026 1:15 AM
> To: Behera, Vivek (DI FA DSP ICC PRC1) <[email protected]>
> Cc: [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; intel-wired-
> [email protected]
> Subject: Re: [PATCH iwl-net v4] igb: Fix trigger of incorrect irq in
> igb_xsk_wakeup
> 
> On Mon, Dec 22, 2025 at 12:57:47PM +0100, Vivek Behera wrote:
> > The current implementation in the igb_xsk_wakeup expects the Rx and Tx
> > queues to share the same irq. This would lead to triggering of
> > incorrect irq in split irq configuration.
> > This patch addresses this issue which could impact environments with 2
> > active cpu cores or when the number of queues is reduced to 2 or less
> >
> > cat /proc/interrupts | grep eno2
> >  167:          0          0          0          0 IR-PCI-MSIX-0000:08:00.0
> >  0-edge      eno2
> >  168:          0          0          0          0 IR-PCI-MSIX-0000:08:00.0
> >  1-edge      eno2-rx-0
> >  169:          0          0          0          0 IR-PCI-MSIX-0000:08:00.0
> >  2-edge      eno2-rx-1
> >  170:          0          0          0          0 IR-PCI-MSIX-0000:08:00.0
> >  3-edge      eno2-tx-0
> >  171:          0          0          0          0 IR-PCI-MSIX-0000:08:00.0
> >  4-edge      eno2-tx-1
> >
> > Furthermore it uses the flags input argument to trigger either rx, tx
> > or both rx and tx irqs as specified in the ndo_xsk_wakeup api
> > documentation
> >
> > Fixes: 80f6ccf9f116 ("igb: Introduce XSK data structures and helpers")
> > Signed-off-by: Vivek Behera <[email protected]>
> > Reviewed-by: Aleksandr Loktionov <[email protected]>
> > ---
> > v1:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore
> > .kernel.org%2Fintel-wired-lan%2F20251212131454.124116-1-vivek.behera%4
> >
> 0siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens.com%7C2e7cb
> 8169b
> >
> f4457c291908de4f1428a2%7C38ae3bcd95794fd4addab42e1495d55a%7C1%7C0%
> 7C63
> >
> 9035145171065657%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRyd
> WUsIlYiO
> >
> iIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0
> %7C
> >
> %7C%7C&sdata=yCLwTDeoPNeub3tvzXTA7vX5p8I%2BmqcArOk0BzogiA0%3D&
> reserved
> > =0
> > v2:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore
> > .kernel.org%2Fintel-wired-lan%2F20251215115416.410619-1-vivek.behera%4
> >
> 0siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens.com%7C2e7cb
> 8169b
> >
> f4457c291908de4f1428a2%7C38ae3bcd95794fd4addab42e1495d55a%7C1%7C0%
> 7C63
> >
> 9035145171089890%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRyd
> WUsIlYiO
> >
> iIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0
> %7C
> >
> %7C%7C&sdata=wm2ylxc0sATU4XL5LFP026DwMxUWe7EWAVjCQ3yyPGM%3
> D&reserved=0
> > v3:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore
> > .kernel.org%2Fintel-wired-lan%2F20251220114936.140473-1-vivek.behera%4
> >
> 0siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens.com%7C2e7cb
> 8169b
> >
> f4457c291908de4f1428a2%7C38ae3bcd95794fd4addab42e1495d55a%7C1%7C0%
> 7C63
> >
> 9035145171106278%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRyd
> WUsIlYiO
> >
> iIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0
> %7C
> >
> %7C%7C&sdata=q7VkYNY8YzNn5rsbunTF04mt0gkBe1jeF6DrSbvw5s4%3D&res
> erved=0
> >
> > changelog:
> > v1
> > - Inital description of the Bug and fixes made in the patch
> >
> > v1 -> v2
> > - Handling of RX and TX Wakeup in igc_xsk_wakeup for a split IRQ
> > configuration
> > - Review suggestions by Aleksander: Modified sequence to complete all
> >   error checks for rx and tx before updating napi states and
> > triggering irqs
> > - Corrected trigger of TX and RX interrupts over E1000_ICS (non msix
> > use case)
> > - Added define for Tx interrupt trigger bit mask for E1000_ICS
> >
> > v2 -> v3
> > - Included applicable feedback and suggestions from igc patch
> > - Fixed logic in updating eics value when  both TX and RX need wakeup
> >
> > v3 -> v4
> > - Added comments to explain trigerring of both TX and RX with active
> > queue pairs
> > - Fixed check of xsk pools in if statement
> > ---
> >  .../net/ethernet/intel/igb/e1000_defines.h    |  1 +
> >  drivers/net/ethernet/intel/igb/igb_xsk.c      | 90 +++++++++++++++++--
> >  2 files changed, 83 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h
> > b/drivers/net/ethernet/intel/igb/e1000_defines.h
> > index fa028928482f..9357564a2d58 100644
> > --- a/drivers/net/ethernet/intel/igb/e1000_defines.h
> > +++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
> > @@ -443,6 +443,7 @@
> >  #define E1000_ICS_LSC       E1000_ICR_LSC       /* Link Status Change */
> >  #define E1000_ICS_RXDMT0    E1000_ICR_RXDMT0    /* rx desc min.
> threshold */
> >  #define E1000_ICS_DRSTA     E1000_ICR_DRSTA     /* Device Reset Aserted
> */
> > +#define E1000_ICS_TXDW      E1000_ICR_TXDW /* Transmit desc written
> back */
> >
> >  /* Extended Interrupt Cause Set */
> >  /* E1000_EITR_CNT_IGNR is only for 82576 and newer */ diff --git
> > a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > index 30ce5fbb5b77..1d21674c0f33 100644
> > --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > @@ -529,6 +529,7 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid,
> u32 flags)
> >     struct igb_adapter *adapter = netdev_priv(dev);
> >     struct e1000_hw *hw = &adapter->hw;
> >     struct igb_ring *ring;
> > +   struct igb_q_vector *q_vector;
> >     u32 eics = 0;
> >
> >     if (test_bit(__IGB_DOWN, &adapter->state)) @@ -536,14 +537,82 @@
> int
> > igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
> >
> >     if (!igb_xdp_is_enabled(adapter))
> >             return -EINVAL;
> > -
> > -   if (qid >= adapter->num_tx_queues)
> > +   /* Check if queue_id is valid. Tx and Rx queue numbers are always same
> */
> > +   if (qid >= adapter->num_rx_queues)
> >             return -EINVAL;
> >
> > -   ring = adapter->tx_ring[qid];
> > -
> > -   if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> > -           return -ENETDOWN;
> > +   if ((flags & XDP_WAKEUP_RX) && (flags & XDP_WAKEUP_TX)) {
> > +           /* If both TX and RX need to be woken up check if queue pairs 
> > are
> active */
> > +           if (adapter->flags & IGB_FLAG_QUEUE_PAIRS) {
> > +                   /* In queue-pair mode, rx_ring and tx_ring share the 
> > same
> q_vector,
> > +                    * so a single IRQ trigger will wake both RX and TX
> processing
> > +                    */
> > +                   ring = adapter->rx_ring[qid];
> > +           } else {
> > +                   /* Two irqs for Rx AND Tx need to be triggered */
> > +                   struct napi_struct *rx_napi;
> > +                   struct napi_struct *tx_napi;
> > +                   bool trigger_irq_tx = false;
> > +                   bool trigger_irq_rx = false;
> > +                   u32 eics_tx = 0;
> > +                   u32 eics_rx = 0;
> > +                   /* IRQ trigger preparation for Rx */
> > +                   ring = adapter->rx_ring[qid];
> > +                   if (!READ_ONCE(ring->xsk_pool))
> > +                           return -ENXIO;
> > +                   q_vector = ring->q_vector;
> > +                   rx_napi  = &q_vector->napi;
> > +                   /* Extend the BIT mask for eics */
> > +                   eics_rx = ring->q_vector->eims_value;
> > +
> > +                   /* IRQ trigger preparation for Tx */
> > +                   ring = adapter->tx_ring[qid];
> > +                   if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring-
> >flags))
> > +                           return -ENETDOWN;
> > +
> > +                   if (!READ_ONCE(ring->xsk_pool))
> > +                           return -ENXIO;
> > +                   q_vector = ring->q_vector;
> > +                   tx_napi  = &q_vector->napi;
> > +                   /* Extend the BIT mask for eics */
> > +                   eics_tx = ring->q_vector->eims_value;
> > +
> > +                   /* Check and update napi states for rx and tx */
> > +                   if (!napi_if_scheduled_mark_missed(rx_napi)) {
> > +                           trigger_irq_rx = true;
> > +                           eics |= eics_rx;
> > +                   }
> > +                   if (!napi_if_scheduled_mark_missed(tx_napi)) {
> > +                           trigger_irq_tx = true;
> > +                           eics |= eics_tx;
> > +                   }
> > +                   /* Now we trigger the required irqs for Rx and Tx */
> > +                   if ((trigger_irq_rx) || (trigger_irq_tx)) {
> > +                           if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > +                                   wr32(E1000_EICS, eics);
> > +                           } else {
> > +                                   if ((trigger_irq_rx) && 
> > (trigger_irq_tx))
> > +                                           wr32(E1000_ICS,
> E1000_ICS_RXDMT0 | E1000_ICS_TXDW);
> > +                                   else if (trigger_irq_rx)
> > +                                           wr32(E1000_ICS,
> E1000_ICS_RXDMT0);
> > +                                   else
> > +                                           wr32(E1000_ICS,
> E1000_ICS_TXDW);
> > +                           }
> > +                   }
> > +                   return 0;
> > +           }
> > +   } else if (flags & XDP_WAKEUP_TX) {
> > +           /* Get the ring params from Tx */
> > +           ring = adapter->tx_ring[qid];
> > +           if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> > +                   return -ENETDOWN;
> > +   } else if (flags & XDP_WAKEUP_RX) {
> > +           /* Get the ring params from Rx */
> > +           ring = adapter->rx_ring[qid];
> > +   } else {
> > +           /* Invalid Flags */
> > +           return -EINVAL;
> > +   }
> 
> This is too complicated IMHO. Wouldn't something like this work:
> - if wakeup rx, pick napi from rx ring's q_vector
>   * napi_if_scheduled_mark_missed() logic that exists in igc_xsk_wakeup()
> 
> repeat for tx; if IGB_FLAG_QUEUE_PAIRS then the branch of second
> napi_if_scheduled_mark_missed() call would not be executed as we had
> previously marked the missed bit in napi state;


Okay. I hope I understood the change you are suggesting. Does the code below 
reflect what you have in mind?

int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
{
        struct igb_adapter *adapter = netdev_priv(dev);
        struct e1000_hw *hw = &adapter->hw;
        struct igb_ring *ring;
        struct igb_q_vector *q_vector;
        struct napi_struct *rx_napi;
        struct napi_struct *tx_napi;
        bool trigger_irq_tx = false;
        bool trigger_irq_rx = false;
        u32 eics_tx = 0;
        u32 eics_rx = 0;
        u32 eics = 0;

        if (test_bit(__IGB_DOWN, &adapter->state))
                return -ENETDOWN;

        if (!igb_xdp_is_enabled(adapter))
                return -EINVAL;
        /* Check if queue_id is valid. Tx and Rx queue numbers are always same 
*/
        if (qid >= adapter->num_rx_queues)
                return -EINVAL;
        /* Check if flags are valid */
        if (!(flags & XDP_WAKEUP_RX) && !(flags & XDP_WAKEUP_TX))
                return -EINVAL;
        if (flags & XDP_WAKEUP_RX) {
                /* IRQ trigger preparation for Rx */
                ring = adapter->rx_ring[qid];
                if (!READ_ONCE(ring->xsk_pool))
                        return -ENXIO;
                q_vector = ring->q_vector;
                rx_napi = &q_vector->napi;
                /* Extend the BIT mask for eics */
                eics_rx = ring->q_vector->eims_value;
                trigger_irq_rx = true;
        }
        if (flags & XDP_WAKEUP_TX) {
                if (adapter->flags & IGB_FLAG_QUEUE_PAIRS) {
                        /* In queue-pair mode, rx_ring and tx_ring share the 
same q_vector,
                        * so a single IRQ trigger will wake both RX and TX 
processing
                        */

                } else {
                        /* IRQ trigger preparation for Tx */
                        ring = adapter->tx_ring[qid];
                        if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
                                return -ENETDOWN;

                        if (!READ_ONCE(ring->xsk_pool))
                                return -ENXIO;
                        q_vector = ring->q_vector;
                        tx_napi = &q_vector->napi;
                        /* Extend the BIT mask for eics */
                        eics_tx = ring->q_vector->eims_value;
                        trigger_irq_tx = true;
                }
        }
        /* All error checks are finished. Check and update napi states for rx 
and tx */
        if (trigger_irq_rx) {
                if (!napi_if_scheduled_mark_missed(rx_napi)) {
                        eics |= eics_rx;
                }
        }
        if (trigger_irq_tx) {
                if (!napi_if_scheduled_mark_missed(tx_napi)) {
                        eics |= eics_tx;
                }
        }
        /* Now we trigger the required irqs for Rx and Tx */
        if ((trigger_irq_rx) || (trigger_irq_tx)) {
                if (adapter->flags & IGB_FLAG_HAS_MSIX) {
                        wr32(E1000_EICS, eics);
                } else {
                        if ((trigger_irq_rx) && (trigger_irq_tx))
                                wr32(E1000_ICS,
                                     E1000_ICS_RXDMT0 | E1000_ICS_TXDW);
                        else if (trigger_irq_rx)
                                wr32(E1000_ICS, E1000_ICS_RXDMT0);
                        else
                                wr32(E1000_ICS, E1000_ICS_TXDW);
                }
        }
        return 0;
}
> 
> >
> >     if (!READ_ONCE(ring->xsk_pool))
> >             return -EINVAL;
> > @@ -551,10 +620,15 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid,
> u32 flags)
> >     if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> >             /* Cause software interrupt */
> >             if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > -                   eics |= ring->q_vector->eims_value;
> > +                   eics = ring->q_vector->eims_value;
> >                     wr32(E1000_EICS, eics);
> >             } else {
> > -                   wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > +                   if ((flags & XDP_WAKEUP_RX) && (flags &
> XDP_WAKEUP_TX))
> > +                           wr32(E1000_ICS, E1000_ICS_RXDMT0 |
> E1000_ICS_TXDW);
> > +                   else if (flags & XDP_WAKEUP_RX)
> > +                           wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > +                   else
> > +                           wr32(E1000_ICS, E1000_ICS_TXDW);
> >             }
> >     }
> >
> > --
> > 2.34.1
> >

Reply via email to