When hv_pci_assign_numa_node() processes a device that does not have
HV_PCI_DEVICE_FLAG_NUMA_AFFINITY set or has an out-of-range
virtual_numa_node, the device NUMA node is left unset. On x86_64,
the uninitialized default happens to be 0, but on ARM64 it is
NUMA_NO_NODE (-1).
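For reference, a simplified sketch of the pre-patch flow (the body of
the affinity branch is abbreviated; only the condition is quoted
verbatim from the driver):

    if (hv_dev->desc.flags & HV_PCI_DEVICE_FLAG_NUMA_AFFINITY &&
        hv_dev->desc.virtual_numa_node < num_possible_nodes())
            /* use the NUMA node provided by the host */
            set_dev_node(&dev->dev, ...);
    /*
     * else: dev->dev.numa_node keeps whatever it was initialized to,
     * which is 0 on x86_64 but NUMA_NO_NODE (-1) on ARM64.
     */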
Tests show that when no NUMA information is available from the Hyper-V
host, devices perform best when assigned to node 0. With NUMA_NO_NODE
the kernel may spread work across NUMA nodes, which degrades
performance on Hyper-V, particularly for high-throughput devices like
MANA.
Always set the device NUMA node to 0 before the conditional NUMA
affinity check, so that devices get a performant default when the host
provides no NUMA information, and behavior is consistent on both
x86_64 and ARM64.
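With the change applied, such a device reports node 0 in sysfs on both
architectures, e.g. (the device address is only a placeholder):

    $ cat /sys/bus/pci/devices/<BDF>/numa_node
    0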
Fixes: 999dd956d838 ("PCI: hv: Add support for protocol 1.3 and support PCI_BUS_RELATIONS2")
Signed-off-by: Long Li <[email protected]>
---
Changes in v2:
- Rewrite commit message to focus on performance as the primary
motivation: NUMA_NO_NODE causes the kernel to spread work across
NUMA nodes, degrading performance on Hyper-V
drivers/pci/controller/pci-hyperv.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 2c7a406b4ba8..38a790f642a1 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -2485,6 +2485,14 @@ static void hv_pci_assign_numa_node(struct hv_pcibus_device *hbus)
 		if (!hv_dev)
 			continue;
 
+		/*
+		 * If the Hyper-V host doesn't provide a NUMA node for the
+		 * device, default to node 0. With NUMA_NO_NODE the kernel
+		 * may spread work across NUMA nodes, which degrades
+		 * performance on Hyper-V.
+		 */
+		set_dev_node(&dev->dev, 0);
+
 		if (hv_dev->desc.flags & HV_PCI_DEVICE_FLAG_NUMA_AFFINITY &&
 		    hv_dev->desc.virtual_numa_node < num_possible_nodes())
 			/*
--
2.43.0