> -----Original Message-----
> From: Simon Horman <[email protected]>
> Sent: Saturday, May 2, 2026 8:09 AM
> To: Long Li <[email protected]>
> Cc: 'Simon Horman' <[email protected]>; Konstantin Taranov
> <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; Haiyang Zhang <[email protected]>; KY
> Srinivasan <[email protected]>; [email protected]; Dexuan Cui
> <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: [EXTERNAL] Re: [PATCH net-next v6 2/6] net: mana: Query device
> capabilities and configure MSI-X sharing for EQs
>
> From: 'Simon Horman' <[email protected]>
>
> This is an AI-generated review of your patch. The human sending this email has
> considered the AI review valid, or at least plausible.
> Full review at:
> https://sashiko.d/
> ev%2F&data=05%7C02%7Clongli%40microsoft.com%7C1b8423bb5d15490288be
> 08dea85cb944%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C6391333
> 13365211100%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydWUsIlYi
> OiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0
> %7C%7C%7C&sdata=H6rPcEM62LmbWsXdpSVcVaaec3Wx6KJnTnYS0fBk6vY%3D
> &reserved=0
> ---
> net: mana: Query device capabilities and configure MSI-X sharing for EQs
>
> When querying the device, this adjusts the max number of queues to allow
> dedicated MSI-X vectors for each vPort, clamping to no less than
> MANA_DEF_NUM_QUEUES. MSI-X sharing among vPorts is disabled by default
> and enabled only when there are not enough vectors. It also renames
> mana_query_device_cfg() to mana_gd_query_device_cfg().
>
> > diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
>
> [ ... ]
>
> > + /*
> > + * Adjust gc->max_num_queues returned from the SOC to allow
> dedicated
> > + * MSIx for each vPort. Clamp to no less than
> MANA_DEF_NUM_QUEUES.
> > + */
> > + max_num_queues = (gc->num_msix_usable - 1) / num_ports;
> > + max_num_queues = roundup_pow_of_two(max(max_num_queues, 1U));
>
> Does rounding up here inflate max_num_queues and unnecessarily force MSI-X
> sharing?
>
> For example, if there are 63 usable MSIs and 2 ports, the division yields 31.
> Rounding up gives 32. The subsequent check below:
>
> > + /*
> > + * Use dedicated MSIx for EQs whenever possible, use MSIx sharing for
> > + * Ethernet EQs when (max_num_queues * num_ports >
> num_msix_usable - 1)
> > + */
> > + max_num_queues = min(gc->max_num_queues, max_num_queues);
> > + if (max_num_queues * num_ports > gc->num_msix_usable - 1)
> > + gc->msi_sharing = true;
>
> would then evaluate to true (32 * 2 > 63) and force the driver into MSI-X
> shared
> mode. This seems to contradict the intent to use dedicated MSI-X whenever
> possible.
>
> Would it be better to use rounddown_pow_of_two() instead to ensure the
> calculated queues fit within the available dedicated vectors?
The roundup is intentional. The primary goal is to maximize the number of
queues per vPort for throughput - power-of-2 queue counts align with RSS
indirection table sizes and give the best
distribution. Dedicated MSI-X is preferred when it comes for free, but not at
the cost of halving the queue count.
In your example (63 MSIs, 2 ports), rounddown would give 16 queues per port -
nearly half the 31 that the hardware could support. Roundup gives 32 and
triggers sharing mode, but sharing mode is
designed to work correctly and the queues still get serviced. The driver
trades per-EQ interrupt exclusivity for better parallelism, which is the right
tradeoff for network throughput on these
devices.
The sharing check exists as a fallback for exactly this situation - it is not
a failure mode but a deliberate operating point.
>
> [ ... ]
>
> > @@ -1968,20 +2008,30 @@ static int mana_gd_setup(struct pci_dev *pdev)
> > if (err)
> > goto destroy_hwc;
> >
> > - err = mana_gd_query_max_resources(pdev);
> > + err = mana_gd_detect_devices(pdev);
> > if (err)
> > goto destroy_hwc;
> >
> > - err = mana_gd_setup_remaining_irqs(pdev);
> > - if (err) {
> > - dev_err(gc->dev, "Failed to setup remaining IRQs: %d", err);
> > - goto destroy_hwc;
> > - }
> > -
> > - err = mana_gd_detect_devices(pdev);
> > + err = mana_gd_query_max_resources(pdev);
> > if (err)
> > goto destroy_hwc;
> >
> > + if (!gc->msi_sharing) {
> > + gc->msi_bitmap = bitmap_zalloc(gc->num_msix_usable,
> GFP_KERNEL);
> > + if (!gc->msi_bitmap) {
> > + err = -ENOMEM;
> > + goto destroy_hwc;
> > + }
> > + /* Set bit for HWC */
> > + set_bit(0, gc->msi_bitmap);
> > + } else {
> > + err = mana_gd_setup_remaining_irqs(pdev);
> > + if (err) {
> > + dev_err(gc->dev, "Failed to setup remaining IRQs: %d",
> err);
> > + goto destroy_hwc;
> > + }
> > + }
>
> If msi_sharing is disabled, we allocate the msi_bitmap but skip calling
> mana_gd_setup_remaining_irqs().
>
> Since mana_gd_setup_hwc_irqs() only allocates a single vector for the hardware
> channel when dynamic allocation is supported, does this leave the device
> without
> interrupts for its Ethernet queues?
>
> If so, it seems this could lead to queue creation failures when the driver
> attempts
> to map uninitialized vectors. I notice this is fixed in a later patch in the
> series
> ("net: mana: Allocate interrupt context for each EQ when creating vPort"), but
> does leaving it out here break bisectability?
You're right that with pci_msix_can_alloc_dyn() returning true on MANA, the
non-sharing path is reachable between patches 2 and 5. In that window,
mana_gd_register_irq() would fail with -EINVAL
because irq_contexts for indices 1+ haven't been populated yet.
However, the failure is contained: mana_create_eq() returns an error,
mana_alloc_queues() propagates it, and the interface fails to come up cleanly -
no WARN, no crash, no data corruption. The
driver remains in a consistent state and succeeds once the full series is
applied.
This is a new capability being built up across the series. The dedicated
MSI-X mode did not exist before, so there is no regression from the pre-patch
baseline - the pre-patch code always went
through mana_gd_setup_remaining_irqs() and operated in what is now called
sharing mode. Restructuring the series to make non-sharing mode functional at
each intermediate commit would require
squashing the GIC infrastructure (patches 3-4) into this patch, producing a
single large change that is significantly harder to review.
I'd prefer to keep the logical separation as-is. If you feel strongly about
strict bisectability, I could add a fallback in this patch that forces
msi_sharing = true when the GIC allocator is
not yet available, and have patch 5 remove it - but that adds throwaway code
to an intermediate commit.