Michael Kelley wrote:
> From: Dan Williams <dan.j.willi...@intel.com>
> Sent: Wednesday, July 16, 2025 9:09 AM
Thanks for taking a look Michael!

[..]
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index e9448d55113b..833ebf2d5213 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -6692,9 +6692,50 @@ static void pci_no_domains(void)
> >  #endif
> >  }
> >
> > +#ifdef CONFIG_PCI_DOMAINS
> > +static DEFINE_IDA(pci_domain_nr_dynamic_ida);
> > +
> > +/*
> > + * Find a free domain_nr either allocated by pci_domain_nr_dynamic_ida or
> > + * fallback to the first free domain number above the last ACPI segment
> > + * number. Caller may have a specific domain number in mind, in which case
> > + * try to reserve it.
> > + *
> > + * Note that this allocation is freed by pci_release_host_bridge_dev().
> > + */
> > +int pci_bus_find_emul_domain_nr(int hint)
> > +{
> > +	if (hint >= 0) {
> > +		hint = ida_alloc_range(&pci_domain_nr_dynamic_ida, hint, hint,
> > +				       GFP_KERNEL);
>
> This almost preserves the existing functionality in pci-hyperv.c. But if the
> "hint" passed in is zero, current code in pci-hyperv.c treats that as a
> collision and allocates some other value. The special treatment of zero is
> necessary per the comment with the definition of HVPCI_DOM_INVALID.
>
> I don't have an opinion on whether the code here should treat a "hint"
> of zero as invalid, or whether that should be handled in pci-hyperv.c.

Oh, I see what you are saying. I made the "hint == 0" case start working
where previously it should have failed. I feel like that's probably best
handled in pci-hyperv.c with something like the following, which also fixes
up a regression I caused with @dom being unsigned:

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index cfe9806bdbe4..813757db98d1 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -3642,9 +3642,9 @@ static int hv_pci_probe(struct hv_device *hdev,
 {
 	struct pci_host_bridge *bridge;
 	struct hv_pcibus_device *hbus;
-	u16 dom_req, dom;
+	int ret, dom = -EINVAL;
+	u16 dom_req;
 	char *name;
-	int ret;
 
 	bridge = devm_pci_alloc_host_bridge(&hdev->device, 0);
 	if (!bridge)
@@ -3673,7 +3673,8 @@ static int hv_pci_probe(struct hv_device *hdev,
 	 * collisions) in the same VM.
 	 */
 	dom_req = hdev->dev_instance.b[5] << 8 | hdev->dev_instance.b[4];
-	dom = pci_bus_find_emul_domain_nr(dom_req);
+	if (dom_req)
+		dom = pci_bus_find_emul_domain_nr(dom_req);
 
 	if (dom < 0) {
 		dev_err(&hdev->device,

> > +
> > +		if (hint >= 0)
> > +			return hint;
> > +	}
> > +
> > +	if (acpi_disabled)
> > +		return ida_alloc(&pci_domain_nr_dynamic_ida, GFP_KERNEL);
> > +
> > +	/*
> > +	 * Emulated domains start at 0x10000 to not clash with ACPI _SEG
> > +	 * domains. Per ACPI r6.0, sec 6.5.6, _SEG returns an integer, of
> > +	 * which the lower 16 bits are the PCI Segment Group (domain) number.
> > +	 * Other bits are currently reserved.
> > +	 */
>
> Back in 2018 and 2019, the Microsoft Linux team encountered problems with
> PCI domain IDs that exceeded 0xFFFF. User space code, such as the Xorg X
> server, assumed PCI domain IDs were at most 16 bits, and retained only the
> low 16 bits if the value was larger. My memory of the details is vague, but
> I believe some or all of this behavior was tied to libpciaccess. As a result
> of these user space limitations, the pci-hyperv.c code made sure to not
> create any domain IDs larger than 0xFFFF. The problem was not just
> theoretical -- Microsoft had customers reporting issues due to the "32-bit
> domain ID problem" and the pci-hyperv.c code was updated to avoid it.
>
> I don't have information on whether user space code has been fixed, or
> the extent to which such a fix has propagated into distro versions. At the
> least, a VM with old user space code might break if the kernel is upgraded
> to one with this patch. How do people see the risks now that it is 6 years
> later?

I don't have enough data to make an assessment. A couple of observations:

- I think it would be reasonable to not fall back in the hint case, with
  something like this:

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 833ebf2d5213..0bd2053dbe8a 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6705,14 +6705,10 @@ static DEFINE_IDA(pci_domain_nr_dynamic_ida);
  */
 int pci_bus_find_emul_domain_nr(int hint)
 {
-	if (hint >= 0) {
-		hint = ida_alloc_range(&pci_domain_nr_dynamic_ida, hint, hint,
+	if (hint >= 0)
+		return ida_alloc_range(&pci_domain_nr_dynamic_ida, hint, hint,
 				       GFP_KERNEL);
 
-		if (hint >= 0)
-			return hint;
-	}
-
 	if (acpi_disabled)
 		return ida_alloc(&pci_domain_nr_dynamic_ida, GFP_KERNEL);
 

- The VMD driver has been allocating 32-bit PCI domain numbers since v4.5,
  commit 185a383ada2e ("x86/PCI: Add driver for Intel Volume Management
  Device (VMD)"). At a minimum, if it is still a problem, it is a shared
  problem, but the significant deployment of VMD since then likely indicates
  it is ok. If not, the above change at least makes the Hyper-V case avoid
  32-bit domain numbers.
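
Putting those two changes together, the helper would end up reading
something like this (just a sketch for discussion, not a tested patch; the
tail of the function was not quoted above, so the final allocation below is
a stand-in based on the 0x10000 comment):

/*
 * Sketch: pci_bus_find_emul_domain_nr() with the "no fallback on hint"
 * change applied. Untested.
 */
int pci_bus_find_emul_domain_nr(int hint)
{
	/*
	 * A caller-supplied hint is all-or-nothing: reserve exactly that
	 * domain number or fail, so a 16-bit hint can never silently
	 * become a >16-bit domain number.
	 */
	if (hint >= 0)
		return ida_alloc_range(&pci_domain_nr_dynamic_ida, hint, hint,
				       GFP_KERNEL);

	/* Without ACPI there are no _SEG values to collide with */
	if (acpi_disabled)
		return ida_alloc(&pci_domain_nr_dynamic_ida, GFP_KERNEL);

	/*
	 * Emulated domains start at 0x10000 to not clash with ACPI _SEG
	 * domains. This lower bound is a stand-in for the unquoted
	 * remainder of the function, which allocates above the last ACPI
	 * segment number.
	 */
	return ida_alloc_range(&pci_domain_nr_dynamic_ida, 0x10000, INT_MAX,
			       GFP_KERNEL);
}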