> Subject: [EXTERNAL] RE: [PATCH] PCI: hv: Set default NUMA node to 0 for
> devices without affinity info
>
> From: Long Li <[email protected]> Sent: Thursday, March 12, 2026 3:33 PM
> >
> > When a Hyper-V PCI device does not have
> > HV_PCI_DEVICE_FLAG_NUMA_AFFINITY set or has an out-of-range
> > virtual_numa_node, hv_pci_assign_numa_node() leaves the device NUMA
> > node unset. On x86_64, the default NUMA node happens to be 0, but on
> > ARM64 it is NUMA_NO_NODE (-1), leading to inconsistent behavior across
> > architectures.
> >
> > In Azure, when no NUMA information is available from the host, devices
> > perform best when assigned to node 0. Set the device NUMA node to 0
> > unconditionally before the conditional NUMA affinity check, so that
> > devices always get a valid default and behavior is consistent on both
> > x86_64 and ARM64.
>
> I'm wondering if this is the right overall approach to the inconsistency.
> Arguably, the arm64 value of NUMA_NO_NODE is more correct when the Hyper-
> V host has not provided any NUMA information to the guest. Maybe the x86/x64
> side should be changed to default to NUMA_NO_NODE when there's no NUMA
> information provided.
Tests have shown that when Azure doesn't provide NUMA information for a PCI
device, workloads run best when the node defaults to 0. NUMA_NO_NODE results in
performance degradation on ARM64. This affects most high-performance devices,
such as MANA when tested at line rate.
>
> The observed x86/x64 default of NUMA node 0 does not come from x86/x64
> architecture specific PCI code. It's a Hyper-V specific behavior due to how
> hv_pci_probe() allocates the struct hv_pcibus_device, with its embedded struct
> pci_sysdata. That struct pci_sysdata has a "node" field that the x86/x64
> __pcibus_to_node() function accesses when called from pci_device_add().
> If hv_pci_probe() were to initialize that "node" field to NUMA_NO_NODE at the
> same time that it sets the "domain" field, x86/x64 guests on Hyper-V would see
> the PCI device NUMA node default to NUMA_NO_NODE like on arm64. The
> current behavior of letting the sysdata "node" field stay zero as allocated
> might just be an historical oversight that no one noticed.
I agree this was an oversight in the original x86_64 code, in that it sets the
NUMA node to 0 only by accident. But node 0 turns out to be the ideal default
for Azure when affinity information is not available through vPCI (i.e. on
non-isolated VM sizes). As a result, x86_64 currently performs better than
ARM64 on multi-NUMA non-isolated VM sizes.
>
> Are there any observed problems on arm64 with the default being
> NUMA_NO_NODE? If there are such problems, they should be fixed separately
> since that case needs to work for a kernel built with CONFIG_NUMA=n.
> With CONFIG_NUMA=n, pcibus_to_node() will return NUMA_NO_NODE, making
> the default on x86/x64 NUMA_NO_NODE as well.
>
> I've tested setting sysdata->node to NUMA_NO_NODE in hv_pci_probe(), and
> didn't see any obvious problems in an x86/x64 Azure VM with a MANA VF and
> multiple NVMe pass-thru devices. The NUMA node reported in /sys for these PCI
> devices is indeed NUMA_NO_NODE.
> But maybe there's some other issue that I'm not aware of.
Extensive tests have shown that defaulting the NUMA node to 0 preserves the
existing behavior on x86_64 while improving performance on ARM64, especially
for MANA. This has been confirmed by the Hyper-V team, and Windows VMs use the
same default.
Thanks,
Long
>
> Michael
>
> >
> > Fixes: 999dd956d838 ("PCI: hv: Add support for protocol 1.3 and
> > support PCI_BUS_RELATIONS2")
> > Signed-off-by: Long Li <[email protected]>
> > ---
> > drivers/pci/controller/pci-hyperv.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/pci/controller/pci-hyperv.c
> > b/drivers/pci/controller/pci-hyperv.c
> > index 2c7a406b4ba8..5c03b6e4cdab 100644
> > --- a/drivers/pci/controller/pci-hyperv.c
> > +++ b/drivers/pci/controller/pci-hyperv.c
> > @@ -2485,6 +2485,9 @@ static void hv_pci_assign_numa_node(struct hv_pcibus_device *hbus)
> >  		if (!hv_dev)
> >  			continue;
> >
> > +		/* Default to node 0 for consistent behavior across architectures */
> > +		set_dev_node(&dev->dev, 0);
> > +
> >  		if (hv_dev->desc.flags & HV_PCI_DEVICE_FLAG_NUMA_AFFINITY &&
> >  		    hv_dev->desc.virtual_numa_node < num_possible_nodes())
> >  			/*
> > --
> > 2.43.0
> >