Re: [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope
On Mon, Jan 13, 2025 at 11:18:57AM +0100, Roger Pau Monné wrote: > On Fri, Jan 10, 2025 at 04:21:29PM -0600, Bjorn Helgaas wrote: > > On Fri, Jan 10, 2025 at 03:01:48PM +0100, Roger Pau Monne wrote: > > > The PCI segment value is limited to 16 bits, however there are buses like > > > VMD > > > that fake being part of the PCI topology by adding segment with a number > > > outside the scope of the PCI firmware specification range (>= 0x1). > > > The > > > MCFG ACPI Table "PCI Segment Group Number" field is defined as having a > > > 16 bit > > > width. > > > > > > Attempting to register or manage those devices with Xen would result in > > > errors > > > at best, or overlaps with existing devices living on the truncated > > > equivalent > > > segment values. > > > > The ACPI _SEG method (ACPI r6.5, sec 6.5.6) and the corresponding > > value in the MCFG table (PCI Firmware r3.3, sec 4.1.2) are clearly > > 16-bit values. > > > > But otherwise, the segment value is pretty much an arbitrary software > > value, and the kernel works fine with the larger domain values from > > vmd_find_free_domain(), so this isn't quite enough to explain what the > > issue with Xen is. > > > > Does Xen truncate the domain to 16 bits or use it to look up something > > in ACPI? > > In the interface between Xen and Linux the segment field is 16 bit > width, so with the current interface is not possible to reference > devices that are past the 0x segment. I think this specific reason (and maybe even struct physdev_pci_device_add) would be more useful than the ACPI _SEG and MCFG things, which are not as directly connected here. Bjorn
Re: [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope
On Fri, Jan 10, 2025 at 04:21:29PM -0600, Bjorn Helgaas wrote: > On Fri, Jan 10, 2025 at 03:01:48PM +0100, Roger Pau Monne wrote: > > The PCI segment value is limited to 16 bits, however there are buses like > > VMD > > that fake being part of the PCI topology by adding segment with a number > > outside the scope of the PCI firmware specification range (>= 0x1). The > > MCFG ACPI Table "PCI Segment Group Number" field is defined as having a 16 > > bit > > width. > > > > Attempting to register or manage those devices with Xen would result in > > errors > > at best, or overlaps with existing devices living on the truncated > > equivalent > > segment values. > > The ACPI _SEG method (ACPI r6.5, sec 6.5.6) and the corresponding > value in the MCFG table (PCI Firmware r3.3, sec 4.1.2) are clearly > 16-bit values. > > But otherwise, the segment value is pretty much an arbitrary software > value, and the kernel works fine with the larger domain values from > vmd_find_free_domain(), so this isn't quite enough to explain what the > issue with Xen is. > > Does Xen truncate the domain to 16 bits or use it to look up something > in ACPI? In the interface between Xen and Linux the segment field is 16 bit width, so with the current interface is not possible to reference devices that are past the 0x segment. I also wonder whether Xen and Linux (or guest OSes in general) would agree on how to reference such devices. AFAICT VMD segment numbers are purely a software construct, but not something enforced by the specification. Could for example FreeBSD assign a different segment to VMD devices? If so we would need some kind of specification about how VMD segment values are assigned so that OSes have a coherent way of referencing VMD devices without ambiguity. Thanks, Roger.
Re: [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope
On Mon, Jan 13, 2025 at 08:17:23AM +0100, Jan Beulich wrote: > On 10.01.2025 15:01, Roger Pau Monne wrote: > > The PCI segment value is limited to 16 bits, however there are buses like > > VMD > > that fake being part of the PCI topology by adding segment with a number > > outside the scope of the PCI firmware specification range (>= 0x1). The > > MCFG ACPI Table "PCI Segment Group Number" field is defined as having a 16 > > bit > > width. > > > > Attempting to register or manage those devices with Xen would result in > > errors > > at best, or overlaps with existing devices living on the truncated > > equivalent > > segment values. > > > > Skip notifying Xen about those devices. > > Hmm, is simply omitting the notification really all it takes? How would Xen > manage MSI / MSI-X, for example, without knowing of the device? As per the > BoF on the summit in Prague(?), I continue to think we need partial driver > logic in Xen for VMD ... The basic mode of operation of devices behind a VMD bridge is to reference the interrupts of the bridge device in its MSI(-X) entries, so the VMD bridge acts as a de-multiplexer and forwards the interrupts to the device behind the VMD bridge. See vmd_alloc_irqs() (and calling context) in the VMD driver for a reference about how this is setup and operates. Note also that devices behind a VMD bridge operate using a different MSI domain, that uses a custom irq_compose_msi_msg hook. Thanks, Roger.
Re: [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope
On 10.01.2025 23:21, Bjorn Helgaas wrote: > On Fri, Jan 10, 2025 at 03:01:48PM +0100, Roger Pau Monne wrote: >> The PCI segment value is limited to 16 bits, however there are buses like VMD >> that fake being part of the PCI topology by adding segment with a number >> outside the scope of the PCI firmware specification range (>= 0x1). The >> MCFG ACPI Table "PCI Segment Group Number" field is defined as having a 16 >> bit >> width. >> >> Attempting to register or manage those devices with Xen would result in >> errors >> at best, or overlaps with existing devices living on the truncated equivalent >> segment values. > > The ACPI _SEG method (ACPI r6.5, sec 6.5.6) and the corresponding > value in the MCFG table (PCI Firmware r3.3, sec 4.1.2) are clearly > 16-bit values. > > But otherwise, the segment value is pretty much an arbitrary software > value, and the kernel works fine with the larger domain values from > vmd_find_free_domain(), so this isn't quite enough to explain what the > issue with Xen is. > > Does Xen truncate the domain to 16 bits or use it to look up something > in ACPI? One of the involved public interface structs starts like this: struct physdev_pci_device_add { /* IN */ uint16_t seg; uint8_t bus; uint8_t devfn; ... So yes, wider segment values would be truncated. Plus, even if they weren't, there would need to be coordination between Dom0 and Xen on which devices gets which segment number, since - as you say - the assignment in Linux is pretty much arbitrary. Jan
Re: [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope
On 10.01.2025 15:01, Roger Pau Monne wrote: > The PCI segment value is limited to 16 bits, however there are buses like VMD > that fake being part of the PCI topology by adding segment with a number > outside the scope of the PCI firmware specification range (>= 0x1). The > MCFG ACPI Table "PCI Segment Group Number" field is defined as having a 16 bit > width. > > Attempting to register or manage those devices with Xen would result in errors > at best, or overlaps with existing devices living on the truncated equivalent > segment values. > > Skip notifying Xen about those devices. Hmm, is simply omitting the notification really all it takes? How would Xen manage MSI / MSI-X, for example, without knowing of the device? As per the BoF on the summit in Prague(?), I continue to think we need partial driver logic in Xen for VMD ... Jan
Re: [PATCH 1/3] xen/pci: do not register devices outside of PCI segment scope
On Fri, Jan 10, 2025 at 03:01:48PM +0100, Roger Pau Monne wrote: > The PCI segment value is limited to 16 bits, however there are buses like VMD > that fake being part of the PCI topology by adding segment with a number > outside the scope of the PCI firmware specification range (>= 0x1). The > MCFG ACPI Table "PCI Segment Group Number" field is defined as having a 16 bit > width. > > Attempting to register or manage those devices with Xen would result in errors > at best, or overlaps with existing devices living on the truncated equivalent > segment values. The ACPI _SEG method (ACPI r6.5, sec 6.5.6) and the corresponding value in the MCFG table (PCI Firmware r3.3, sec 4.1.2) are clearly 16-bit values. But otherwise, the segment value is pretty much an arbitrary software value, and the kernel works fine with the larger domain values from vmd_find_free_domain(), so this isn't quite enough to explain what the issue with Xen is. Does Xen truncate the domain to 16 bits or use it to look up something in ACPI? > Skip notifying Xen about those devices. > > Signed-off-by: Roger Pau Monné > --- > drivers/xen/pci.c | 19 +++ > 1 file changed, 19 insertions(+) > > diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c > index 416f231809cb..08e82fd1263e 100644 > --- a/drivers/xen/pci.c > +++ b/drivers/xen/pci.c > @@ -43,6 +43,13 @@ static int xen_add_device(struct device *dev) > pci_mcfg_reserved = true; > } > #endif > + > + if (pci_domain_nr(pci_dev->bus) >> 16) { > + dev_info(dev, > + "not registering with Xen: invalid PCI segment\n"); > + return 0; > + } > + > if (pci_seg_supported) { > DEFINE_RAW_FLEX(struct physdev_pci_device_add, add, optarr, 1); > > @@ -149,6 +156,12 @@ static int xen_remove_device(struct device *dev) > int r; > struct pci_dev *pci_dev = to_pci_dev(dev); > > + if (pci_domain_nr(pci_dev->bus) >> 16) { > + dev_info(dev, > + "not unregistering with Xen: invalid PCI segment\n"); > + return 0; > + } > + > if (pci_seg_supported) { > struct physdev_pci_device device = { > .seg = pci_domain_nr(pci_dev->bus), > @@ -182,6 +195,12 @@ int xen_reset_device(const struct pci_dev *dev) > .flags = PCI_DEVICE_RESET_FLR, > }; > > + if (pci_domain_nr(dev->bus) >> 16) { > + dev_info(&dev->dev, > + "unable to notify Xen of device reset: invalid PCI > segment\n"); > + return 0; > + } > + > return HYPERVISOR_physdev_op(PHYSDEVOP_pci_device_reset, &device); > } > EXPORT_SYMBOL_GPL(xen_reset_device); > -- > 2.46.0 >