Re: [PATCH v6 2/3] PCI: Add parameter nr_devfns to pci_add_dma_alias
> On 10 Dec 2019, at 22:37, Bjorn Helgaas wrote: > > [+cc Joerg] > > On Tue, Dec 03, 2019 at 03:43:53PM +, James Sewart wrote: >> pci_add_dma_alias can now be used to create a dma alias for a range of >> devfns. >> >> Reviewed-by: Logan Gunthorpe >> Signed-off-by: James Sewart >> --- >> drivers/pci/pci.c| 22 +- >> drivers/pci/quirks.c | 14 +++--- >> include/linux/pci.h | 2 +- >> 3 files changed, 25 insertions(+), 13 deletions(-) > > Heads up Joerg: I also updated drivers/iommu/amd_iommu.c (this is the > one reported by the kbuild test robot) and removed the printk there > that prints the same thing as the one in pci_add_dma_alias(), and I > updated a PCI quirk that was merged after this patch was posted. > >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >> index d3c83248f3ce..dbb01aceafda 100644 >> --- a/drivers/pci/pci.c >> +++ b/drivers/pci/pci.c >> @@ -5857,7 +5857,8 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode, >> /** >> * pci_add_dma_alias - Add a DMA devfn alias for a device >> * @dev: the PCI device for which alias is added >> - * @devfn: alias slot and function >> + * @devfn_from: alias slot and function >> + * @nr_devfns: Number of subsequent devfns to alias >> * >> * This helper encodes an 8-bit devfn as a bit number in dma_alias_mask >> * which is used to program permissible bus-devfn source addresses for DMA >> @@ -5873,8 +5874,13 @@ int pci_set_vga_state(struct pci_dev *dev, bool >> decode, >> * cannot be left as a userspace activity). DMA aliases should therefore >> * be configured via quirks, such as the PCI fixup header quirk. >> */ >> -void pci_add_dma_alias(struct pci_dev *dev, u8 devfn) >> +void pci_add_dma_alias(struct pci_dev *dev, u8 devfn_from, unsigned >> nr_devfns) >> { >> +int devfn_to; >> + >> +nr_devfns = min(nr_devfns, (unsigned)MAX_NR_DEVFNS); >> +devfn_to = devfn_from + nr_devfns - 1; > > I made this look like: > > + devfn_to = min(devfn_from + nr_devfns - 1, > + (unsigned) MAX_NR_DEVFNS - 1); > > so devfn_from=0xf0, nr_devfns=0x20 doesn't cause devfn_to to wrap > around. > > I did keep Logan's reviewed-by, so let me know if I broke something. I think nr_devfns still needs updating as it is used for bitmap_set. Although thinking about it now we should limit the number to alias to be maximum (MAX_NR_DEVFNS - devfn_from), so that we don’t set past the end of the bitmap: nr_devfns = min(nr_devfns, (unsigned) MAX_NR_DEVFNS - devfn_from); I think with this change we wont need to clip devfn_to. > >> if (!dev->dma_alias_mask) >> dev->dma_alias_mask = bitmap_zalloc(MAX_NR_DEVFNS, GFP_KERNEL); >> if (!dev->dma_alias_mask) { >> @@ -5882,9 +5888,15 @@ void pci_add_dma_alias(struct pci_dev *dev, u8 devfn) >> return; >> } >> >> -set_bit(devfn, dev->dma_alias_mask); >> -pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n", >> - PCI_SLOT(devfn), PCI_FUNC(devfn)); >> +bitmap_set(dev->dma_alias_mask, devfn_from, nr_devfns); >> + >> +if (nr_devfns == 1) >> +pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n", >> +PCI_SLOT(devfn_from), PCI_FUNC(devfn_from)); >> +else if(nr_devfns > 1) >> +pci_info(dev, "Enabling fixed DMA alias for devfn range from >> %02x.%d to %02x.%d\n", >> +PCI_SLOT(devfn_from), PCI_FUNC(devfn_from), >> +PCI_SLOT(devfn_to), PCI_FUNC(devfn_to)); >> } ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v6 3/3] PCI: Add DMA alias quirk for PLX PEX NTB
The PLX PEX NTB forwards DMA transactions using Requester ID's that don't exist as PCI devices. The devfn for a transaction is used as an index into a lookup table storing the origin of a transaction on the other side of the bridge. This patch aliases all possible devfn's to the NTB device so that any transaction coming in is governed by the mappings for the NTB. Reviewed-by: Logan Gunthorpe Signed-off-by: James Sewart --- drivers/pci/quirks.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 0f3f5afc73fd..3a67049ca630 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5315,6 +5315,21 @@ SWITCHTEC_QUIRK(0x8574); /* PFXI 64XG3 */ SWITCHTEC_QUIRK(0x8575); /* PFXI 80XG3 */ SWITCHTEC_QUIRK(0x8576); /* PFXI 96XG3 */ +/* + * PLX NTB uses devfn proxy IDs to move TLPs between NT endpoints. These IDs + * are used to forward responses to the originator on the other side of the + * NTB. Alias all possible IDs to the NTB to permit access when the IOMMU is + * turned on. + */ +static void quirk_plx_ntb_dma_alias(struct pci_dev *pdev) +{ + pci_info(pdev, "Setting PLX NTB proxy ID aliases\n"); + /* PLX NTB may use all 256 devfns */ + pci_add_dma_alias(pdev, 0, 256); +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b0, quirk_plx_ntb_dma_alias); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b1, quirk_plx_ntb_dma_alias); + /* * On Lenovo Thinkpad P50 SKUs with a Nvidia Quadro M1000M, the BIOS does * not always reset the secondary Nvidia GPU between reboots if the system -- 2.24.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v6 2/3] PCI: Add parameter nr_devfns to pci_add_dma_alias
pci_add_dma_alias can now be used to create a dma alias for a range of devfns. Reviewed-by: Logan Gunthorpe Signed-off-by: James Sewart --- drivers/pci/pci.c| 22 +- drivers/pci/quirks.c | 14 +++--- include/linux/pci.h | 2 +- 3 files changed, 25 insertions(+), 13 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index d3c83248f3ce..dbb01aceafda 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -5857,7 +5857,8 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode, /** * pci_add_dma_alias - Add a DMA devfn alias for a device * @dev: the PCI device for which alias is added - * @devfn: alias slot and function + * @devfn_from: alias slot and function + * @nr_devfns: Number of subsequent devfns to alias * * This helper encodes an 8-bit devfn as a bit number in dma_alias_mask * which is used to program permissible bus-devfn source addresses for DMA @@ -5873,8 +5874,13 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode, * cannot be left as a userspace activity). DMA aliases should therefore * be configured via quirks, such as the PCI fixup header quirk. */ -void pci_add_dma_alias(struct pci_dev *dev, u8 devfn) +void pci_add_dma_alias(struct pci_dev *dev, u8 devfn_from, unsigned nr_devfns) { + int devfn_to; + + nr_devfns = min(nr_devfns, (unsigned)MAX_NR_DEVFNS); + devfn_to = devfn_from + nr_devfns - 1; + if (!dev->dma_alias_mask) dev->dma_alias_mask = bitmap_zalloc(MAX_NR_DEVFNS, GFP_KERNEL); if (!dev->dma_alias_mask) { @@ -5882,9 +5888,15 @@ void pci_add_dma_alias(struct pci_dev *dev, u8 devfn) return; } - set_bit(devfn, dev->dma_alias_mask); - pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n", -PCI_SLOT(devfn), PCI_FUNC(devfn)); + bitmap_set(dev->dma_alias_mask, devfn_from, nr_devfns); + + if (nr_devfns == 1) + pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n", + PCI_SLOT(devfn_from), PCI_FUNC(devfn_from)); + else if(nr_devfns > 1) + pci_info(dev, "Enabling fixed DMA alias for devfn range from %02x.%d to %02x.%d\n", + PCI_SLOT(devfn_from), PCI_FUNC(devfn_from), + PCI_SLOT(devfn_to), PCI_FUNC(devfn_to)); } bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 320255e5e8f8..0f3f5afc73fd 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -3932,7 +3932,7 @@ int pci_dev_specific_reset(struct pci_dev *dev, int probe) static void quirk_dma_func0_alias(struct pci_dev *dev) { if (PCI_FUNC(dev->devfn) != 0) - pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0)); + pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0), 1); } /* @@ -3946,7 +3946,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe476, quirk_dma_func0_alias); static void quirk_dma_func1_alias(struct pci_dev *dev) { if (PCI_FUNC(dev->devfn) != 1) - pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1)); + pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1), 1); } /* @@ -4031,7 +4031,7 @@ static void quirk_fixed_dma_alias(struct pci_dev *dev) id = pci_match_id(fixed_dma_alias_tbl, dev); if (id) - pci_add_dma_alias(dev, id->driver_data); + pci_add_dma_alias(dev, id->driver_data, 1); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ADAPTEC2, 0x0285, quirk_fixed_dma_alias); @@ -4073,9 +4073,9 @@ DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias); */ static void quirk_mic_x200_dma_alias(struct pci_dev *pdev) { - pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0)); - pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0)); - pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3)); + pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0), 1); + pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0), 1); + pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3), 1); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2260, quirk_mic_x200_dma_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, quirk_mic_x200_dma_alias); @@ -5273,7 +5273,7 @@ static void quirk_switchtec_ntb_dma_alias(struct pci_dev *pdev) pci_dbg(pdev, "Aliasing Partition %d Proxy ID %02x.%d\n", pp, PCI_SLOT(devfn), PCI_FUNC(devfn)); - pci_add_dma_alias(pdev, devfn); + pci_add_dma_alias(pdev, devfn, 1); } } diff --git a/include/linux/pci.h b/include/linux/pci.h index 6481da29d667..e7a9c8c92a93 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -2325,7 +2325,7 @@ static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev)
[PATCH v6 1/3] PCI: Fix off by one in dma_alias_mask allocation size
The number of possible devfns is 256, add def and correct uses. Reviewed-by: Logan Gunthorpe Signed-off-by: James Sewart --- drivers/pci/pci.c| 2 +- drivers/pci/search.c | 2 +- include/linux/pci.h | 2 ++ 3 files changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index a97e2571a527..d3c83248f3ce 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -5876,7 +5876,7 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode, void pci_add_dma_alias(struct pci_dev *dev, u8 devfn) { if (!dev->dma_alias_mask) - dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL); + dev->dma_alias_mask = bitmap_zalloc(MAX_NR_DEVFNS, GFP_KERNEL); if (!dev->dma_alias_mask) { pci_warn(dev, "Unable to allocate DMA alias mask\n"); return; diff --git a/drivers/pci/search.c b/drivers/pci/search.c index bade14002fd8..9e4dfae47252 100644 --- a/drivers/pci/search.c +++ b/drivers/pci/search.c @@ -43,7 +43,7 @@ int pci_for_each_dma_alias(struct pci_dev *pdev, if (unlikely(pdev->dma_alias_mask)) { u8 devfn; - for_each_set_bit(devfn, pdev->dma_alias_mask, U8_MAX) { + for_each_set_bit(devfn, pdev->dma_alias_mask, MAX_NR_DEVFNS) { ret = fn(pdev, PCI_DEVID(pdev->bus->number, devfn), data); if (ret) diff --git a/include/linux/pci.h b/include/linux/pci.h index 1a6cf19eac2d..6481da29d667 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -57,6 +57,8 @@ #define PCI_DEVID(bus, devfn) u16)(bus)) << 8) | (devfn)) /* return bus from PCI devid = ((u16)bus_number) << 8) | devfn */ #define PCI_BUS_NUM(x) (((x) >> 8) & 0xff) +/* Number of possible devfns. devfns can be from 0.0 to 1f.7 inclusive */ +#define MAX_NR_DEVFNS 256 /* pci_slot represents a physical slot */ struct pci_slot { -- 2.24.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 2/3] PCI: Add parameter nr_devfns to pci_add_dma_alias
pci_add_dma_alias can now be used to create a dma alias for a range of devfns. Signed-off-by: James Sewart --- drivers/pci/pci.c| 23 ++- drivers/pci/quirks.c | 14 +++--- include/linux/pci.h | 2 +- 3 files changed, 26 insertions(+), 13 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 0a4449a30ace..f9800a610ca1 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -5857,7 +5857,8 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode, /** * pci_add_dma_alias - Add a DMA devfn alias for a device * @dev: the PCI device for which alias is added - * @devfn: alias slot and function + * @devfn_from: alias slot and function + * @nr_devfns: Number of subsequent devfns to alias * * This helper encodes an 8-bit devfn as a bit number in dma_alias_mask * which is used to program permissible bus-devfn source addresses for DMA @@ -5873,8 +5874,14 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode, * cannot be left as a userspace activity). DMA aliases should therefore * be configured via quirks, such as the PCI fixup header quirk. */ -void pci_add_dma_alias(struct pci_dev *dev, u8 devfn) +void pci_add_dma_alias(struct pci_dev *dev, u8 devfn_from, unsigned nr_devfns) { + int devfn_to; + + if (nr_devfns > U8_MAX+1) + nr_devfns = U8_MAX+1; + devfn_to = devfn_from + nr_devfns - 1; + if (!dev->dma_alias_mask) dev->dma_alias_mask = bitmap_zalloc(U8_MAX+1, GFP_KERNEL); if (!dev->dma_alias_mask) { @@ -5882,9 +5889,15 @@ void pci_add_dma_alias(struct pci_dev *dev, u8 devfn) return; } - set_bit(devfn, dev->dma_alias_mask); - pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n", -PCI_SLOT(devfn), PCI_FUNC(devfn)); + bitmap_set(dev->dma_alias_mask, devfn_from, nr_devfns); + + if (nr_devfns == 1) + pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n", + PCI_SLOT(devfn_from), PCI_FUNC(devfn_from)); + else if (nr_devfns > 1) + pci_info(dev, "Enabling fixed DMA alias for devfn range from %02x.%d to %02x.%d\n", + PCI_SLOT(devfn_from), PCI_FUNC(devfn_from), + PCI_SLOT(devfn_to), PCI_FUNC(devfn_to)); } bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 320255e5e8f8..0f3f5afc73fd 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -3932,7 +3932,7 @@ int pci_dev_specific_reset(struct pci_dev *dev, int probe) static void quirk_dma_func0_alias(struct pci_dev *dev) { if (PCI_FUNC(dev->devfn) != 0) - pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0)); + pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0), 1); } /* @@ -3946,7 +3946,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe476, quirk_dma_func0_alias); static void quirk_dma_func1_alias(struct pci_dev *dev) { if (PCI_FUNC(dev->devfn) != 1) - pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1)); + pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1), 1); } /* @@ -4031,7 +4031,7 @@ static void quirk_fixed_dma_alias(struct pci_dev *dev) id = pci_match_id(fixed_dma_alias_tbl, dev); if (id) - pci_add_dma_alias(dev, id->driver_data); + pci_add_dma_alias(dev, id->driver_data, 1); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ADAPTEC2, 0x0285, quirk_fixed_dma_alias); @@ -4073,9 +4073,9 @@ DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias); */ static void quirk_mic_x200_dma_alias(struct pci_dev *pdev) { - pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0)); - pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0)); - pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3)); + pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0), 1); + pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0), 1); + pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3), 1); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2260, quirk_mic_x200_dma_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, quirk_mic_x200_dma_alias); @@ -5273,7 +5273,7 @@ static void quirk_switchtec_ntb_dma_alias(struct pci_dev *pdev) pci_dbg(pdev, "Aliasing Partition %d Proxy ID %02x.%d\n", pp, PCI_SLOT(devfn), PCI_FUNC(devfn)); - pci_add_dma_alias(pdev, devfn); + pci_add_dma_alias(pdev, devfn, 1); } } diff --git a/include/linux/pci.h b/include/linux/pci.h index 1a6cf19eac2d..84a8d4c2b24e 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -2323,7 +2323,7 @@ static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev) } #endif -void
[PATCH v5 1/3] PCI: Fix off by one in dma_alias_mask allocation size
The number of possible devfns is 256 so the size of the bitmap for allocations should be U8_MAX+1. Signed-off-by: James Sewart --- drivers/pci/pci.c| 2 +- drivers/pci/search.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index a97e2571a527..0a4449a30ace 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -5876,7 +5876,7 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode, void pci_add_dma_alias(struct pci_dev *dev, u8 devfn) { if (!dev->dma_alias_mask) - dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL); + dev->dma_alias_mask = bitmap_zalloc(U8_MAX+1, GFP_KERNEL); if (!dev->dma_alias_mask) { pci_warn(dev, "Unable to allocate DMA alias mask\n"); return; diff --git a/drivers/pci/search.c b/drivers/pci/search.c index bade14002fd8..b3633af1743b 100644 --- a/drivers/pci/search.c +++ b/drivers/pci/search.c @@ -43,7 +43,7 @@ int pci_for_each_dma_alias(struct pci_dev *pdev, if (unlikely(pdev->dma_alias_mask)) { u8 devfn; - for_each_set_bit(devfn, pdev->dma_alias_mask, U8_MAX) { + for_each_set_bit(devfn, pdev->dma_alias_mask, U8_MAX+1) { ret = fn(pdev, PCI_DEVID(pdev->bus->number, devfn), data); if (ret) -- 2.24.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 3/3] PCI: Add DMA alias quirk for PLX PEX NTB
The PLX PEX NTB forwards DMA transactions using Requester ID's that don't exist as PCI devices. The devfn for a transaction is used as an index into a lookup table storing the origin of a transaction on the other side of the bridge. This patch aliases all possible devfn's to the NTB device so that any transaction coming in is governed by the mappings for the NTB. Reviewed-by: Logan Gunthorpe Signed-off-by: James Sewart --- drivers/pci/quirks.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 0f3f5afc73fd..3a67049ca630 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5315,6 +5315,21 @@ SWITCHTEC_QUIRK(0x8574); /* PFXI 64XG3 */ SWITCHTEC_QUIRK(0x8575); /* PFXI 80XG3 */ SWITCHTEC_QUIRK(0x8576); /* PFXI 96XG3 */ +/* + * PLX NTB uses devfn proxy IDs to move TLPs between NT endpoints. These IDs + * are used to forward responses to the originator on the other side of the + * NTB. Alias all possible IDs to the NTB to permit access when the IOMMU is + * turned on. + */ +static void quirk_plx_ntb_dma_alias(struct pci_dev *pdev) +{ + pci_info(pdev, "Setting PLX NTB proxy ID aliases\n"); + /* PLX NTB may use all 256 devfns */ + pci_add_dma_alias(pdev, 0, 256); +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b0, quirk_plx_ntb_dma_alias); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b1, quirk_plx_ntb_dma_alias); + /* * On Lenovo Thinkpad P50 SKUs with a Nvidia Quadro M1000M, the BIOS does * not always reset the secondary Nvidia GPU between reboots if the system -- 2.24.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 1/2] PCI: Add parameter nr_devfns to pci_add_dma_alias
> On 29 Nov 2019, at 16:50, Logan Gunthorpe wrote: > > > > On 2019-11-29 5:49 a.m., James Sewart wrote: >> pci_add_dma_alias can now be used to create a dma alias for a range of >> devfns. >> >> Signed-off-by: James Sewart >> --- >> drivers/pci/pci.c| 23 ++- >> drivers/pci/quirks.c | 14 +++--- >> include/linux/pci.h | 2 +- >> 3 files changed, 26 insertions(+), 13 deletions(-) >> >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >> index a97e2571a527..9b0e3481fe17 100644 >> --- a/drivers/pci/pci.c >> +++ b/drivers/pci/pci.c >> @@ -5857,7 +5857,8 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode, >> /** >> * pci_add_dma_alias - Add a DMA devfn alias for a device >> * @dev: the PCI device for which alias is added >> - * @devfn: alias slot and function >> + * @devfn_from: alias slot and function >> + * @nr_devfns: Number of subsequent devfns to alias >> * >> * This helper encodes an 8-bit devfn as a bit number in dma_alias_mask >> * which is used to program permissible bus-devfn source addresses for DMA >> @@ -5873,8 +5874,14 @@ int pci_set_vga_state(struct pci_dev *dev, bool >> decode, >> * cannot be left as a userspace activity). DMA aliases should therefore >> * be configured via quirks, such as the PCI fixup header quirk. >> */ >> -void pci_add_dma_alias(struct pci_dev *dev, u8 devfn) >> +void pci_add_dma_alias(struct pci_dev *dev, u8 devfn_from, unsigned >> nr_devfns) >> { >> +int devfn_to; >> + >> +if (nr_devfns > U8_MAX+1) >> +nr_devfns = U8_MAX+1; > > Why +1? That doesn't seem right to me…. U8_MAX is the max number U8 can represent(255) but is one less than the number of values it can represent(256). devfns can be 0.0 to 1f.7 inclusive(I think) so the number of possible devfns is 256. Thinking about it, maybe the zalloc should be U8_MAX+1 too? I might be wrong though, what do you reckon? > >> +devfn_to = devfn_from + nr_devfns - 1; >> + >> if (!dev->dma_alias_mask) >> dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL); >> if (!dev->dma_alias_mask) { >> @@ -5882,9 +5889,15 @@ void pci_add_dma_alias(struct pci_dev *dev, u8 devfn) >> return; >> } >> >> -set_bit(devfn, dev->dma_alias_mask); >> -pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n", >> - PCI_SLOT(devfn), PCI_FUNC(devfn)); >> +bitmap_set(dev->dma_alias_mask, devfn_from, nr_devfns); >> + >> +if (nr_devfns == 1) >> +pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n", >> +PCI_SLOT(devfn_from), PCI_FUNC(devfn_from)); >> +else if(nr_devfns > 1) >> +pci_info(dev, "Enabling fixed DMA alias for devfn range from >> %02x.%d to %02x.%d\n", >> +PCI_SLOT(devfn_from), PCI_FUNC(devfn_from), >> +PCI_SLOT(devfn_to), PCI_FUNC(devfn_to)); >> } >> >> bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2) >> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c >> index 320255e5e8f8..0f3f5afc73fd 100644 >> --- a/drivers/pci/quirks.c >> +++ b/drivers/pci/quirks.c >> @@ -3932,7 +3932,7 @@ int pci_dev_specific_reset(struct pci_dev *dev, int >> probe) >> static void quirk_dma_func0_alias(struct pci_dev *dev) >> { >> if (PCI_FUNC(dev->devfn) != 0) >> -pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0)); >> +pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0), 1); >> } >> >> /* >> @@ -3946,7 +3946,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe476, >> quirk_dma_func0_alias); >> static void quirk_dma_func1_alias(struct pci_dev *dev) >> { >> if (PCI_FUNC(dev->devfn) != 1) >> -pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1)); >> +pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1), 1); >> } >> >> /* >> @@ -4031,7 +4031,7 @@ static void quirk_fixed_dma_alias(struct pci_dev *dev) >> >> id = pci_match_id(fixed_dma_alias_tbl, dev); >> if (id) >> -pci_add_dma_alias(dev, id->driver_data); >> +pci_add_dma_alias(dev, id->driver_data, 1); >> } >> >> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ADAPTEC2, 0x0285, >> quirk_fixed_dma_alias); >> @@ -4073,9 +4073,9 @@ DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, >> quirk_use_pcie_bridge_dma_alias); >> */ >> static void quirk_mic_x200_dma_alias(struct pci_dev *pdev) >> { >> -pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0)); >> -pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0)); >> -pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3)); >> +pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0), 1); >> +pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0), 1); >> +pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3), 1); >> } >> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2260, >> quirk_mic_x200_dma_alias); >> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, >> quirk_mic_x200_dma_alias); >> @@ -5273,7 +5273,7 @@
[PATCH v4 2/2] PCI: Add DMA alias quirk for PLX PEX NTB
The PLX PEX NTB forwards DMA transactions using Requester ID's that don't exist as PCI devices. The devfn for a transaction is used as an index into a lookup table storing the origin of a transaction on the other side of the bridge. This patch aliases all possible devfn's to the NTB device so that any transaction coming in is governed by the mappings for the NTB. Reviewed-by: Logan Gunthorpe Signed-off-by: James Sewart --- drivers/pci/quirks.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 0f3f5afc73fd..3a67049ca630 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5315,6 +5315,21 @@ SWITCHTEC_QUIRK(0x8574); /* PFXI 64XG3 */ SWITCHTEC_QUIRK(0x8575); /* PFXI 80XG3 */ SWITCHTEC_QUIRK(0x8576); /* PFXI 96XG3 */ +/* + * PLX NTB uses devfn proxy IDs to move TLPs between NT endpoints. These IDs + * are used to forward responses to the originator on the other side of the + * NTB. Alias all possible IDs to the NTB to permit access when the IOMMU is + * turned on. + */ +static void quirk_plx_ntb_dma_alias(struct pci_dev *pdev) +{ + pci_info(pdev, "Setting PLX NTB proxy ID aliases\n"); + /* PLX NTB may use all 256 devfns */ + pci_add_dma_alias(pdev, 0, 256); +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b0, quirk_plx_ntb_dma_alias); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b1, quirk_plx_ntb_dma_alias); + /* * On Lenovo Thinkpad P50 SKUs with a Nvidia Quadro M1000M, the BIOS does * not always reset the secondary Nvidia GPU between reboots if the system -- 2.24.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v4 1/2] PCI: Add parameter nr_devfns to pci_add_dma_alias
pci_add_dma_alias can now be used to create a dma alias for a range of devfns. Signed-off-by: James Sewart --- drivers/pci/pci.c| 23 ++- drivers/pci/quirks.c | 14 +++--- include/linux/pci.h | 2 +- 3 files changed, 26 insertions(+), 13 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index a97e2571a527..9b0e3481fe17 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -5857,7 +5857,8 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode, /** * pci_add_dma_alias - Add a DMA devfn alias for a device * @dev: the PCI device for which alias is added - * @devfn: alias slot and function + * @devfn_from: alias slot and function + * @nr_devfns: Number of subsequent devfns to alias * * This helper encodes an 8-bit devfn as a bit number in dma_alias_mask * which is used to program permissible bus-devfn source addresses for DMA @@ -5873,8 +5874,14 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode, * cannot be left as a userspace activity). DMA aliases should therefore * be configured via quirks, such as the PCI fixup header quirk. */ -void pci_add_dma_alias(struct pci_dev *dev, u8 devfn) +void pci_add_dma_alias(struct pci_dev *dev, u8 devfn_from, unsigned nr_devfns) { + int devfn_to; + + if (nr_devfns > U8_MAX+1) + nr_devfns = U8_MAX+1; + devfn_to = devfn_from + nr_devfns - 1; + if (!dev->dma_alias_mask) dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL); if (!dev->dma_alias_mask) { @@ -5882,9 +5889,15 @@ void pci_add_dma_alias(struct pci_dev *dev, u8 devfn) return; } - set_bit(devfn, dev->dma_alias_mask); - pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n", -PCI_SLOT(devfn), PCI_FUNC(devfn)); + bitmap_set(dev->dma_alias_mask, devfn_from, nr_devfns); + + if (nr_devfns == 1) + pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n", + PCI_SLOT(devfn_from), PCI_FUNC(devfn_from)); + else if(nr_devfns > 1) + pci_info(dev, "Enabling fixed DMA alias for devfn range from %02x.%d to %02x.%d\n", + PCI_SLOT(devfn_from), PCI_FUNC(devfn_from), + PCI_SLOT(devfn_to), PCI_FUNC(devfn_to)); } bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 320255e5e8f8..0f3f5afc73fd 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -3932,7 +3932,7 @@ int pci_dev_specific_reset(struct pci_dev *dev, int probe) static void quirk_dma_func0_alias(struct pci_dev *dev) { if (PCI_FUNC(dev->devfn) != 0) - pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0)); + pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0), 1); } /* @@ -3946,7 +3946,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe476, quirk_dma_func0_alias); static void quirk_dma_func1_alias(struct pci_dev *dev) { if (PCI_FUNC(dev->devfn) != 1) - pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1)); + pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1), 1); } /* @@ -4031,7 +4031,7 @@ static void quirk_fixed_dma_alias(struct pci_dev *dev) id = pci_match_id(fixed_dma_alias_tbl, dev); if (id) - pci_add_dma_alias(dev, id->driver_data); + pci_add_dma_alias(dev, id->driver_data, 1); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ADAPTEC2, 0x0285, quirk_fixed_dma_alias); @@ -4073,9 +4073,9 @@ DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias); */ static void quirk_mic_x200_dma_alias(struct pci_dev *pdev) { - pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0)); - pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0)); - pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3)); + pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0), 1); + pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0), 1); + pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3), 1); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2260, quirk_mic_x200_dma_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, quirk_mic_x200_dma_alias); @@ -5273,7 +5273,7 @@ static void quirk_switchtec_ntb_dma_alias(struct pci_dev *pdev) pci_dbg(pdev, "Aliasing Partition %d Proxy ID %02x.%d\n", pp, PCI_SLOT(devfn), PCI_FUNC(devfn)); - pci_add_dma_alias(pdev, devfn); + pci_add_dma_alias(pdev, devfn, 1); } } diff --git a/include/linux/pci.h b/include/linux/pci.h index 1a6cf19eac2d..84a8d4c2b24e 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -2323,7 +2323,7 @@ static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev) } #endif -void
[PATCH v3 2/2] PCI: Add DMA alias quirk for PLX PEX NTB
The PLX PEX NTB forwards DMA transactions using Requester ID's that don't exist as PCI devices. The devfn for a transaction is used as an index into a lookup table storing the origin of a transaction on the other side of the bridge. This patch aliases all possible devfn's to the NTB device so that any transaction coming in is governed by the mappings for the NTB. Reviewed-by: Logan Gunthorpe Signed-off-by: James Sewart --- drivers/pci/quirks.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 0f3f5afc73fd..3a67049ca630 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5315,6 +5315,21 @@ SWITCHTEC_QUIRK(0x8574); /* PFXI 64XG3 */ SWITCHTEC_QUIRK(0x8575); /* PFXI 80XG3 */ SWITCHTEC_QUIRK(0x8576); /* PFXI 96XG3 */ +/* + * PLX NTB uses devfn proxy IDs to move TLPs between NT endpoints. These IDs + * are used to forward responses to the originator on the other side of the + * NTB. Alias all possible IDs to the NTB to permit access when the IOMMU is + * turned on. + */ +static void quirk_plx_ntb_dma_alias(struct pci_dev *pdev) +{ + pci_info(pdev, "Setting PLX NTB proxy ID aliases\n"); + /* PLX NTB may use all 256 devfns */ + pci_add_dma_alias(pdev, 0, 256); +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b0, quirk_plx_ntb_dma_alias); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b1, quirk_plx_ntb_dma_alias); + /* * On Lenovo Thinkpad P50 SKUs with a Nvidia Quadro M1000M, the BIOS does * not always reset the secondary Nvidia GPU between reboots if the system -- 2.24.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 1/2] PCI: Add parameter nr_devfns to pci_add_dma_alias
pci_add_dma_alias can now be used to create a dma alias for a range of devfns. Signed-off-by: James Sewart --- drivers/pci/pci.c| 21 - drivers/pci/quirks.c | 14 +++--- include/linux/pci.h | 2 +- 3 files changed, 24 insertions(+), 13 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index a97e2571a527..71dbabf5ee3d 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -5857,7 +5857,8 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode, /** * pci_add_dma_alias - Add a DMA devfn alias for a device * @dev: the PCI device for which alias is added - * @devfn: alias slot and function + * @devfn_from: alias slot and function + * @nr_devfns: Number of subsequent devfns to alias * * This helper encodes an 8-bit devfn as a bit number in dma_alias_mask * which is used to program permissible bus-devfn source addresses for DMA @@ -5873,8 +5874,12 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode, * cannot be left as a userspace activity). DMA aliases should therefore * be configured via quirks, such as the PCI fixup header quirk. */ -void pci_add_dma_alias(struct pci_dev *dev, u8 devfn) +void pci_add_dma_alias(struct pci_dev *dev, u8 devfn_from, int nr_devfns) { + int devfn_to = devfn_from + nr_devfns - 1; + + BUG_ON(nr_devfns < 1); + if (!dev->dma_alias_mask) dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL); if (!dev->dma_alias_mask) { @@ -5882,9 +5887,15 @@ void pci_add_dma_alias(struct pci_dev *dev, u8 devfn) return; } - set_bit(devfn, dev->dma_alias_mask); - pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n", -PCI_SLOT(devfn), PCI_FUNC(devfn)); + bitmap_set(dev->dma_alias_mask, devfn_from, nr_devfns); + + if (nr_devfns == 1) + pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n", + PCI_SLOT(devfn_from), PCI_FUNC(devfn_from)); + else + pci_info(dev, "Enabling fixed DMA alias for devfn range from %02x.%d to %02x.%d\n", + PCI_SLOT(devfn_from), PCI_FUNC(devfn_from), + PCI_SLOT(devfn_to), PCI_FUNC(devfn_to)); } bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 320255e5e8f8..0f3f5afc73fd 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -3932,7 +3932,7 @@ int pci_dev_specific_reset(struct pci_dev *dev, int probe) static void quirk_dma_func0_alias(struct pci_dev *dev) { if (PCI_FUNC(dev->devfn) != 0) - pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0)); + pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0), 1); } /* @@ -3946,7 +3946,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe476, quirk_dma_func0_alias); static void quirk_dma_func1_alias(struct pci_dev *dev) { if (PCI_FUNC(dev->devfn) != 1) - pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1)); + pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1), 1); } /* @@ -4031,7 +4031,7 @@ static void quirk_fixed_dma_alias(struct pci_dev *dev) id = pci_match_id(fixed_dma_alias_tbl, dev); if (id) - pci_add_dma_alias(dev, id->driver_data); + pci_add_dma_alias(dev, id->driver_data, 1); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ADAPTEC2, 0x0285, quirk_fixed_dma_alias); @@ -4073,9 +4073,9 @@ DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias); */ static void quirk_mic_x200_dma_alias(struct pci_dev *pdev) { - pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0)); - pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0)); - pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3)); + pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0), 1); + pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0), 1); + pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3), 1); } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2260, quirk_mic_x200_dma_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, quirk_mic_x200_dma_alias); @@ -5273,7 +5273,7 @@ static void quirk_switchtec_ntb_dma_alias(struct pci_dev *pdev) pci_dbg(pdev, "Aliasing Partition %d Proxy ID %02x.%d\n", pp, PCI_SLOT(devfn), PCI_FUNC(devfn)); - pci_add_dma_alias(pdev, devfn); + pci_add_dma_alias(pdev, devfn, 1); } } diff --git a/include/linux/pci.h b/include/linux/pci.h index 1a6cf19eac2d..f51bdda49e9a 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -2323,7 +2323,7 @@ static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev) } #endif -void pci_add_dma_alias(struct pci_dev *dev, u8 devfn); +void pci_add_dma_alias(struct pci_dev
[PATCH v3 2/2] PCI: Add DMA alias quirk for PLX PEX NTB
The PLX PEX NTB forwards DMA transactions using Requester ID's that don't exist as PCI devices. The devfn for a transaction is used as an index into a lookup table storing the origin of a transaction on the other side of the bridge. This patch aliases all possible devfn's to the NTB device so that any transaction coming in is governed by the mappings for the NTB. Reviewed-by: Logan Gunthorpe Signed-off-by: James Sewart --- drivers/pci/quirks.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 320255e5e8f8..1ed230f739a4 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5315,6 +5315,21 @@ SWITCHTEC_QUIRK(0x8574); /* PFXI 64XG3 */ SWITCHTEC_QUIRK(0x8575); /* PFXI 80XG3 */ SWITCHTEC_QUIRK(0x8576); /* PFXI 96XG3 */ +/* + * PLX NTB uses devfn proxy IDs to move TLPs between NT endpoints. These IDs + * are used to forward responses to the originator on the other side of the + * NTB. Alias all possible IDs to the NTB to permit access when the IOMMU is + * turned on. + */ +static void quirk_plx_ntb_dma_alias(struct pci_dev *pdev) +{ + pci_info(pdev, "Setting PLX NTB proxy ID aliases\n"); + /* PLX NTB may use all 256 devfns */ + pci_add_dma_alias_range(pdev, 0, 256); +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b0, quirk_plx_ntb_dma_alias); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b1, quirk_plx_ntb_dma_alias); + /* * On Lenovo Thinkpad P50 SKUs with a Nvidia Quadro M1000M, the BIOS does * not always reset the secondary Nvidia GPU between reboots if the system -- 2.24.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 1/2] PCI: Add helper pci_add_dma_alias_range
pci_add_dma_alias_range can be used to create a dma alias for range of devfns. Reviewed-by: Logan Gunthorpe Signed-off-by: James Sewart --- drivers/pci/pci.c | 30 +++--- include/linux/pci.h | 1 + 2 files changed, 24 insertions(+), 7 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index a97e2571a527..68339309c0f4 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -5854,6 +5854,18 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode, return 0; } +int _pci_add_dma_alias_range(struct pci_dev *dev, u8 devfn_from, int len) +{ + if (!dev->dma_alias_mask) + dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL); + if (!dev->dma_alias_mask) { + pci_warn(dev, "Unable to allocate DMA alias mask\n"); + return -ENOMEM; + } + bitmap_set(dev->dma_alias_mask, devfn_from, len); + return 0; +} + /** * pci_add_dma_alias - Add a DMA devfn alias for a device * @dev: the PCI device for which alias is added @@ -5875,18 +5887,22 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode, */ void pci_add_dma_alias(struct pci_dev *dev, u8 devfn) { - if (!dev->dma_alias_mask) - dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL); - if (!dev->dma_alias_mask) { - pci_warn(dev, "Unable to allocate DMA alias mask\n"); + if (_pci_add_dma_alias_range(dev, devfn, 1) != 0) return; - } - - set_bit(devfn, dev->dma_alias_mask); pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n", PCI_SLOT(devfn), PCI_FUNC(devfn)); } +void pci_add_dma_alias_range(struct pci_dev *dev, u8 devfn_from, int len) +{ + int devfn_to = devfn_from + len - 1; + + if (_pci_add_dma_alias_range(dev, devfn_from, len) != 0) + return; + pci_info(dev, "Enabling fixed DMA alias for devfn range from %02x.%d to %02x.%d\n", +PCI_SLOT(devfn_from), PCI_FUNC(devfn_from), PCI_SLOT(devfn_to), PCI_FUNC(devfn_to)); +} + bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2) { return (dev1->dma_alias_mask && diff --git a/include/linux/pci.h b/include/linux/pci.h index 1a6cf19eac2d..6765f3d0102b 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -2324,6 +2324,7 @@ static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev) #endif void pci_add_dma_alias(struct pci_dev *dev, u8 devfn); +void pci_add_dma_alias_range(struct pci_dev *dev, u8 devfn_from, int len); bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2); int pci_for_each_dma_alias(struct pci_dev *pdev, int (*fn)(struct pci_dev *pdev, -- 2.24.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2] PCI: Add DMA alias quirk for PLX PEX NTB
The PLX PEX NTB forwards DMA transactions using Requester ID's that don't exist as PCI devices. The devfn for a transaction is used as an index into a lookup table storing the origin of a transaction on the other side of the bridge. Add helper pci_add_dma_alias_range that can alias a range of devfns in dma_alias_mask. This patch aliases all possible devfn's to the NTB device so that any transaction coming in is governed by the mappings for the NTB. Signed-off-by: James Sewart --- drivers/pci/pci.c| 29 ++--- drivers/pci/quirks.c | 15 +++ include/linux/pci.h | 1 + 3 files changed, 38 insertions(+), 7 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index a97e2571a527..f502af1b5d10 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -5854,6 +5854,18 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode, return 0; } +int _pci_add_dma_alias_range(struct pci_dev *dev, u8 devfn_from, int len) +{ + if (!dev->dma_alias_mask) + dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL); + if (!dev->dma_alias_mask) { + pci_warn(dev, "Unable to allocate DMA alias mask\n"); + return -ENOMEM; + } + bitmap_set(dev->dma_alias_mask, devfn_from, len); + return 0; +} + /** * pci_add_dma_alias - Add a DMA devfn alias for a device * @dev: the PCI device for which alias is added @@ -5875,18 +5887,21 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode, */ void pci_add_dma_alias(struct pci_dev *dev, u8 devfn) { - if (!dev->dma_alias_mask) - dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL); - if (!dev->dma_alias_mask) { - pci_warn(dev, "Unable to allocate DMA alias mask\n"); + if (_pci_add_dma_alias_range(dev, devfn, 1) != 0) return; - } - - set_bit(devfn, dev->dma_alias_mask); pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n", PCI_SLOT(devfn), PCI_FUNC(devfn)); } +void pci_add_dma_alias_range(struct pci_dev *dev, u8 devfn_from, int len) +{ + int devfn_to = devfn_from + len - 1; + if (_pci_add_dma_alias_range(dev, devfn_from, len) != 0) + return; + pci_info(dev, "Enabling fixed DMA alias for devfn range from %02x.%d to %02x.%d\n", +PCI_SLOT(devfn_from), PCI_FUNC(devfn_from), PCI_SLOT(devfn_to), PCI_FUNC(devfn_to)); +} + bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2) { return (dev1->dma_alias_mask && diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 320255e5e8f8..1ed230f739a4 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5315,6 +5315,21 @@ SWITCHTEC_QUIRK(0x8574); /* PFXI 64XG3 */ SWITCHTEC_QUIRK(0x8575); /* PFXI 80XG3 */ SWITCHTEC_QUIRK(0x8576); /* PFXI 96XG3 */ +/* + * PLX NTB uses devfn proxy IDs to move TLPs between NT endpoints. These IDs + * are used to forward responses to the originator on the other side of the + * NTB. Alias all possible IDs to the NTB to permit access when the IOMMU is + * turned on. + */ +static void quirk_plx_ntb_dma_alias(struct pci_dev *pdev) +{ + pci_info(pdev, "Setting PLX NTB proxy ID aliases\n"); + /* PLX NTB may use all 256 devfns */ + pci_add_dma_alias_range(pdev, 0, 256); +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b0, quirk_plx_ntb_dma_alias); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b1, quirk_plx_ntb_dma_alias); + /* * On Lenovo Thinkpad P50 SKUs with a Nvidia Quadro M1000M, the BIOS does * not always reset the secondary Nvidia GPU between reboots if the system diff --git a/include/linux/pci.h b/include/linux/pci.h index 1a6cf19eac2d..6765f3d0102b 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -2324,6 +2324,7 @@ static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev) #endif void pci_add_dma_alias(struct pci_dev *dev, u8 devfn); +void pci_add_dma_alias_range(struct pci_dev *dev, u8 devfn_from, int len); bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2); int pci_for_each_dma_alias(struct pci_dev *pdev, int (*fn)(struct pci_dev *pdev, -- 2.24.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] Ensure pci transactions coming from PLX NTB are handled when IOMMU is turned on
Any comments on this? Cheers, James. > On 24 Oct 2019, at 13:52, James Sewart wrote: > > The PLX PEX NTB forwards DMA transactions using Requester ID's that don't > exist as > PCI devices. The devfn for a transaction is used as an index into a lookup > table > storing the origin of a transaction on the other side of the bridge. > > This patch aliases all possible devfn's to the NTB device so that any > transaction > coming in is governed by the mappings for the NTB. > > Signed-Off-By: James Sewart > --- > drivers/pci/quirks.c | 22 ++ > 1 file changed, 22 insertions(+) > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > index 320255e5e8f8..647f546e427f 100644 > --- a/drivers/pci/quirks.c > +++ b/drivers/pci/quirks.c > @@ -5315,6 +5315,28 @@ SWITCHTEC_QUIRK(0x8574); /* PFXI 64XG3 */ > SWITCHTEC_QUIRK(0x8575); /* PFXI 80XG3 */ > SWITCHTEC_QUIRK(0x8576); /* PFXI 96XG3 */ > > +/* > + * PLX NTB uses devfn proxy IDs to move TLPs between NT endpoints. These IDs > + * are used to forward responses to the originator on the other side of the > + * NTB. Alias all possible IDs to the NTB to permit access when the IOMMU is > + * turned on. > + */ > +static void quirk_PLX_NTB_dma_alias(struct pci_dev *pdev) > +{ > + if (!pdev->dma_alias_mask) > + pdev->dma_alias_mask = kcalloc(BITS_TO_LONGS(U8_MAX), > + sizeof(long), GFP_KERNEL); > + if (!pdev->dma_alias_mask) { > + dev_warn(>dev, "Unable to allocate DMA alias mask\n"); > + return; > + } > + > + // PLX NTB may use all 256 devfns > + memset(pdev->dma_alias_mask, U8_MAX, (U8_MAX+1)/BITS_PER_BYTE); > +} > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b0, quirk_PLX_NTB_dma_alias); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b1, quirk_PLX_NTB_dma_alias); > + > /* > * On Lenovo Thinkpad P50 SKUs with a Nvidia Quadro M1000M, the BIOS does > * not always reset the secondary Nvidia GPU between reboots if the system > -- > 2.19.1 > > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] Ensure pci transactions coming from PLX NTB are handled when IOMMU is turned on
The PLX PEX NTB forwards DMA transactions using Requester ID's that don't exist as PCI devices. The devfn for a transaction is used as an index into a lookup table storing the origin of a transaction on the other side of the bridge. This patch aliases all possible devfn's to the NTB device so that any transaction coming in is governed by the mappings for the NTB. Signed-Off-By: James Sewart --- drivers/pci/quirks.c | 22 ++ 1 file changed, 22 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 320255e5e8f8..647f546e427f 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5315,6 +5315,28 @@ SWITCHTEC_QUIRK(0x8574); /* PFXI 64XG3 */ SWITCHTEC_QUIRK(0x8575); /* PFXI 80XG3 */ SWITCHTEC_QUIRK(0x8576); /* PFXI 96XG3 */ +/* + * PLX NTB uses devfn proxy IDs to move TLPs between NT endpoints. These IDs + * are used to forward responses to the originator on the other side of the + * NTB. Alias all possible IDs to the NTB to permit access when the IOMMU is + * turned on. + */ +static void quirk_PLX_NTB_dma_alias(struct pci_dev *pdev) +{ + if (!pdev->dma_alias_mask) + pdev->dma_alias_mask = kcalloc(BITS_TO_LONGS(U8_MAX), + sizeof(long), GFP_KERNEL); + if (!pdev->dma_alias_mask) { + dev_warn(>dev, "Unable to allocate DMA alias mask\n"); + return; + } + + // PLX NTB may use all 256 devfns + memset(pdev->dma_alias_mask, U8_MAX, (U8_MAX+1)/BITS_PER_BYTE); +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b0, quirk_PLX_NTB_dma_alias); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b1, quirk_PLX_NTB_dma_alias); + /* * On Lenovo Thinkpad P50 SKUs with a Nvidia Quadro M1000M, the BIOS does * not always reset the secondary Nvidia GPU between reboots if the system -- 2.19.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
Hey Lu, > On 10 Apr 2019, at 06:22, Lu Baolu wrote: > > Hi James, > > On 4/6/19 2:02 AM, James Sewart wrote: >> Hey Lu, >> My bad, did some debugging on my end. The issue was swapping out >> find_domain for iommu_get_domain_for_dev. It seems in some situations the >> domain is not attached to the group but the device is expected to have the >> domain still stored in its archdata. >> I’ve attached the final patch with find_domain unremoved which seems to >> work in my testing. > > Just looked into your v3 patch set and some thoughts from my end posted > here just for your information. > > Let me post the problems we want to address. > > 1. When allocating a new group for a device, how should we determine the > type of the default domain? > > 2. If we need to put a device into an existing group which uses a > different type of domain from what the device desires to use, we might > break the functionality of the device. > > My new thought is letting the iommu generic code to determine the > default domain type (hence my proposed vendor specific default domain > type patches could be dropped). > > If the default domain type is dynamical mapping, and later in > iommu_no_mapping(), we determines that we must use an identity domain, > we then call iommu_request_dm_for_dev(dev). > > If the default domain type is identity mapping, and later in > iommu_no_mapping(), we determined that we must use a dynamical domain, > we then call iommu_request_dma_domain_for_dev(dev). > > We already have iommu_request_dm_for_dev() in iommu.c. We only need to > implement iommu_request_dma_domain_for_dev(). > > With this done, your patch titled "Create an IOMMU group for devices > that require an identity map" could also be dropped. > > Any thoughts? Theres a couple issues I can think of. iommu_request_dm_for_dev changes the domain for all devices within the devices group, if we may have devices with different domain requirements in the same group then only the last initialised device will get the domain it wants. If we want to add iommu_request_dma_domain_for_dev then we would still need another IOMMU group for identity domain devices. Both with v3 and the proposed method here, iommu_should_identity_map is determining whether the device requires an identity map. In v3 this is called once up front by the generic code to determine for the IOMMU group which domain type to use. In the proposed method I think this would happen after initial domain allocation and would trigger the domain to be replaced with a different domain. I prefer the solution in v3. What do you think? > >> Cheers, >> James. > > Best regards, > Lu Baolu Cheers, James. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
Hey Lu, My bad, did some debugging on my end. The issue was swapping out find_domain for iommu_get_domain_for_dev. It seems in some situations the domain is not attached to the group but the device is expected to have the domain still stored in its archdata. I’ve attached the final patch with find_domain unremoved which seems to work in my testing. Cheers, James. 0009-iommu-vt-d-Remove-lazy-allocation-of-domains.patch Description: Binary data > On 4 Apr 2019, at 07:49, Lu Baolu wrote: > > Hi James, > > I did a sanity test from my end. The test machine fails to boot. I > haven't seen any valuable kernel log. It results in a purple screen. > > Best regards, > Lu Baolu > > On 3/29/19 11:26 PM, James Sewart wrote: >> Hey Lu, >> I’ve attached a preliminary v3, if you could take a look and run some tests >> that would be great. >> Since v2 i’ve added your default domain type patches, the new device_group >> handler, and incorporated Jacob’s feedback. >>> On 28 Mar 2019, at 18:37, James Sewart wrote: >>> >>> Hey Lu, >>> On 26 Mar 2019, at 01:24, Lu Baolu wrote: Hi James, On 3/25/19 8:57 PM, James Sewart wrote: >>> Theres an issue that if we choose to alloc a new resv_region with type >>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions >>> to free this entry type which means refactoring the rmrr regions in >>> get_resv_regions. Should this work be in this patchset? >> Do you mean the rmrr regions are not allocated in get_resv_regions, but >> are freed in put_resv_regions? I think we should fix this in this patch >> set since this might impact the device passthrough if we don't do it. > They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is > freed in put_resv_regions. If we allocate a new resv_region with type > IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify > put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free > the static RMRR regions. > Either the ISA region is static and not freed as with my implementation, > or the RMRR regions are converted to be allocated on each call to > get_resv_regions and freed in put_resv_regions. By the way, there's another way in my mind. Let's add a new region type for LPC devices, e.x. IOMMU_RESV_LPC, and then handle it in the same way as those MSI regions. Just FYI. >>> >>> This solution would require adding some extra code to >>> iommu_group_create_direct_mappings as currently only type >>> IOMMU_RESV_DIRECT is identity mapped, other types are only reserved. >>> >>> Best regards, Lu Baolu >>> >>> Cheers, >>> James. >> Cheers, >> James. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
Hey Lu, I’ve attached a preliminary v3, if you could take a look and run some tests that would be great. Since v2 i’ve added your default domain type patches, the new device_group handler, and incorporated Jacob’s feedback. > On 28 Mar 2019, at 18:37, James Sewart wrote: > > Hey Lu, > >> On 26 Mar 2019, at 01:24, Lu Baolu wrote: >> >> Hi James, >> >> On 3/25/19 8:57 PM, James Sewart wrote: > Theres an issue that if we choose to alloc a new resv_region with type > IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions > to free this entry type which means refactoring the rmrr regions in > get_resv_regions. Should this work be in this patchset? Do you mean the rmrr regions are not allocated in get_resv_regions, but are freed in put_resv_regions? I think we should fix this in this patch set since this might impact the device passthrough if we don't do it. >>> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is >>> freed in put_resv_regions. If we allocate a new resv_region with type >>> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify >>> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free >>> the static RMRR regions. >>> Either the ISA region is static and not freed as with my implementation, >>> or the RMRR regions are converted to be allocated on each call to >>> get_resv_regions and freed in put_resv_regions. >> >> By the way, there's another way in my mind. Let's add a new region type >> for LPC devices, e.x. IOMMU_RESV_LPC, and then handle it in the same way >> as those MSI regions. Just FYI. > > This solution would require adding some extra code to > iommu_group_create_direct_mappings as currently only type > IOMMU_RESV_DIRECT is identity mapped, other types are only reserved. > > >> >> Best regards, >> Lu Baolu > > Cheers, > James. Cheers, James. 0001-iommu-Move-iommu_group_create_direct_mappings-to-aft.patch Description: Binary data 0002-iommu-vt-d-Implement-apply_resv_region-for-reserving.patch Description: Binary data 0003-iommu-vt-d-Expose-ISA-direct-mapping-region-via-iomm.patch Description: Binary data 0004-iommu-vt-d-Create-an-IOMMU-group-for-devices-that-re.patch Description: Binary data 0005-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch Description: Binary data 0006-iommu-vt-d-Add-is_identity_map-ops-entry.patch Description: Binary data 0007-iommu-vt-d-Enable-DMA-remapping-after-rmrr-mapped.patch Description: Binary data 0008-iommu-vt-d-Allow-IOMMU_DOMAIN_DMA-to-be-allocated-by.patch Description: Binary data 0009-iommu-vt-d-Remove-lazy-allocation-of-domains.patch Description: Binary data ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
Hey Lu, > On 26 Mar 2019, at 01:24, Lu Baolu wrote: > > Hi James, > > On 3/25/19 8:57 PM, James Sewart wrote: Theres an issue that if we choose to alloc a new resv_region with type IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions to free this entry type which means refactoring the rmrr regions in get_resv_regions. Should this work be in this patchset? >>> Do you mean the rmrr regions are not allocated in get_resv_regions, but >>> are freed in put_resv_regions? I think we should fix this in this patch >>> set since this might impact the device passthrough if we don't do it. >> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is >> freed in put_resv_regions. If we allocate a new resv_region with type >> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify >> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free >> the static RMRR regions. >> Either the ISA region is static and not freed as with my implementation, >> or the RMRR regions are converted to be allocated on each call to >> get_resv_regions and freed in put_resv_regions. > > By the way, there's another way in my mind. Let's add a new region type > for LPC devices, e.x. IOMMU_RESV_LPC, and then handle it in the same way > as those MSI regions. Just FYI. This solution would require adding some extra code to iommu_group_create_direct_mappings as currently only type IOMMU_RESV_DIRECT is identity mapped, other types are only reserved. > > Best regards, > Lu Baolu Cheers, James. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
Hey Lu, > On 25 Mar 2019, at 02:03, Lu Baolu wrote: > > Hi James, > > On 3/22/19 5:57 PM, James Sewart wrote: >> Hey Lu, >>> On 15 Mar 2019, at 02:19, Lu Baolu wrote: >>> >>> Hi James, >>> >>> On 3/14/19 7:58 PM, James Sewart wrote: To support mapping ISA region via iommu_group_create_direct_mappings, make sure its exposed by iommu_get_resv_regions. This allows deduplication of reserved region mappings Signed-off-by: James Sewart --- drivers/iommu/intel-iommu.c | 42 + 1 file changed, 33 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 8e0a4e2ff77f..2e00e8708f06 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units); #define for_each_rmrr_units(rmrr) \ list_for_each_entry(rmrr, _rmrr_units, list) +static struct iommu_resv_region *isa_resv_region; + /* bitmap for indexing intel_iommus */ static int g_num_of_iommus; @@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr, rmrr->end_address); } +static inline struct iommu_resv_region *iommu_get_isa_resv_region(void) +{ + if (!isa_resv_region) + isa_resv_region = iommu_alloc_resv_region(0, + 16*1024*1024, + 0, IOMMU_RESV_DIRECT); + + return isa_resv_region; +} + #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA -static inline void iommu_prepare_isa(void) +static inline void iommu_prepare_isa(struct pci_dev *pdev) { - struct pci_dev *pdev; int ret; + struct iommu_resv_region *reg = iommu_get_isa_resv_region(); - pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL); - if (!pdev) + if (!reg) return; pr_info("Prepare 0-16MiB unity mapping for LPC\n"); - ret = iommu_prepare_identity_map(>dev, 0, 16*1024*1024 - 1); + ret = iommu_prepare_identity_map(>dev, reg->start, + reg->start + reg->length - 1); if (ret) pr_err("Failed to create 0-16MiB identity map - floppy might not work\n"); - - pci_dev_put(pdev); } #else -static inline void iommu_prepare_isa(void) +static inline void iommu_prepare_isa(struct pci_dev *pdev) { return; } @@ -3289,6 +3299,7 @@ static int __init init_dmars(void) struct dmar_rmrr_unit *rmrr; bool copied_tables = false; struct device *dev; + struct pci_dev *pdev; struct intel_iommu *iommu; int i, ret; @@ -3469,7 +3480,11 @@ static int __init init_dmars(void) } } - iommu_prepare_isa(); + pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL); + if (pdev) { + iommu_prepare_isa(pdev); + pci_dev_put(pdev); + } domains_done: @@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct device *device, struct iommu_resv_region *reg; struct dmar_rmrr_unit *rmrr; struct device *i_dev; + struct pci_dev *pdev; int i; rcu_read_lock(); @@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct device *device, } rcu_read_unlock(); + if (dev_is_pci(device)) { + pdev = to_pci_dev(device); + if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) { + reg = iommu_get_isa_resv_region(); + list_add_tail(>list, head); + } + } + >>> >>> Just wondering why not just >>> >>> +#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA >>> + if (dev_is_pci(device)) { >>> + pdev = to_pci_dev(device); >>> + if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) { >>> + reg = iommu_alloc_resv_region(0, >>> + 16*1024*1024, >>> + 0, IOMMU_RESV_DIRECT); >>> + if (reg) >>> + list_add_tail(>list, head); >>> + } >>> + } >>> +#endif >>> >>> and, remove all other related code? >> At this point in the patchset if we remove iommu_prepare_isa then the ISA >> region won’t be mapped to the device. Only once the dma domain is allocable >> will the reserved regions be mapped by iommu_group_create_direct_mappings. > > Yes. So if we put the allocation code here, it won't impact anything and > will take effect as soon as the dma domain is allocatable. > >> Theres an issue that if we choose to alloc a new resv_region with type >> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions >> to free this entry type which means refactoring the
Re: [PATCH v2 7/7] iommu/vt-d: Remove lazy allocation of domains
Hey Jacob, > On 14 Mar 2019, at 23:35, Jacob Pan wrote: > > On Thu, 14 Mar 2019 11:59:36 + > James Sewart wrote: > >> >> -domain = get_valid_domain_for_dev(dev); >> +domain = find_domain(dev); >> if (!domain) >> return DMA_MAPPING_ERROR; >> >> @@ -3914,7 +3624,7 @@ static int intel_map_sg(struct device *dev, >> struct scatterlist *sglist, int nele if (iommu_no_mapping(dev)) >> return intel_nontranslate_map_sg(dev, sglist, >> nelems, dir); >> -domain = get_valid_domain_for_dev(dev); >> +domain = find_domain(dev); > This patchset looks like a very good clean up, I am wondering why we > can't use the generic iommu_get_domain_for_dev() here, since VT-d has a > default DMA domain after your patch. This should be possible, only downside is we get an iommu_domain from iommu_get_domain_for_dev and will have to check its not null before getting the dmar_domain from it. We will be able to remove find_domain though. Cheers, James. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.
Hey Lu, > On 20 Mar 2019, at 01:26, Lu Baolu wrote: > > Hi James, > > On 3/19/19 9:35 PM, James Sewart wrote: >> Hey Lu, >>> On 15 Mar 2019, at 03:13, Lu Baolu wrote: >>> >>> Hi James, >>> >>> On 3/14/19 7:56 PM, James Sewart wrote: Patches 1 and 2 are the same as v1. v1-v2: Refactored ISA direct mappings to be returned by iommu_get_resv_regions. Integrated patch by Lu to defer turning on DMAR until iommu.c has mapped reserved regions. Integrated patches by Lu to remove more unused code in cleanup. Lu: I didn't integrate your patch to set the default domain type as it isn't directly related to the aim of this patchset. Instead patch 4 >>> >>> Without those patches, user experience will be affected and some devices >>> will not work on Intel platforms anymore. >>> >>> For a long time, Intel IOMMU driver has its own logic to determine >>> whether a device requires an identity domain. For example, when user >>> specifies "iommu=pt" in kernel parameter, all device will be attached >>> with the identity domain. Further more, some quirky devices require >>> an identity domain to be used before enabling DMA remapping, otherwise, >>> it will not work. This was done by adding quirk bits in Intel IOMMU >>> driver. >>> >>> So from my point of view, one way is porting all those quirks and kernel >>> parameters into IOMMU generic layer, or opening a door for vendor IOMMU >>> driver to determine the default domain type by their own. I prefer the >>> latter option since it will not impact any behaviors on other >>> architectures. >> I see your point. I’m not confident that using the proposed door to set a >> groups default domain has the desired behaviour. As discussed before the >> default domain type will be set based on the desired type for only the >> first device attached to a group. I think to change the default domain >> type you would need a slightly different door that wasn’t conditioned on >> device. > > I think this as another problem. Just a summarize for the ease of > discussion. We saw two problems: > > 1. When allocating a new group for a device, how should we determine the > type of the default domain? This is what my proposal patches trying to > address. This will work as expected only if all devices within a group get the same result from is_identity_map. Otherwise wee see issue 2. > > 2. If we need to put a device into an existing group which uses a > different type of domain from what the device desires to use, we might > break the functionality of the device. For this problem I'd second your > proposal below if I get your point correctly. > >> For situations where individual devices require an identity domain because >> of quirks then maybe calling is_identity_map per device in >> iommu_group_get_for_dev is a better solution than the one I proposed. > > Do you mean if we see a quirky device requires a different domain type > other than the default domain type of the group, we will assign a new > group to it? That looks good to me as far as I can see. I suppose this > should be done in vt-d's ops callback. I have thought about this a bit and I think the cleanest approach is to put devices that require an identity domain into their own group, this can be done in the device_group callback, avoiding any situation where we have to deal with creating a new group based on domain type in iommu_group_get_for_dev. This way we shouldn’t ever be in a situation with multiple different domain types per group. This way your patches will work as expected. See below for a possible implementation. > > Best regards, > Lu Baolu Cheers, James. Devices that require an identity map because of quirks or other reasons should be put in their own IOMMU group so as to not end up with multiple different domains per group. Signed-off-by: James Sewart diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 3cb8b36abf50..0f5a121d31a0 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5421,6 +5421,22 @@ struct intel_iommu *intel_svm_device_to_iommu(struct device *dev) } #endif /* CONFIG_INTEL_IOMMU_SVM */ +static struct iommu_group *intel_identity_group; + +struct iommu_group *intel_iommu_pci_device_group(struct device *dev) +{ + if (iommu_no_mapping(dev)) { + if (!intel_identity_group) { + intel_identity_group = iommu_group_alloc(); + if (IS_ERR(intel_identity_group)) + return NULL; + } + return intel_identity_group; + } + + return pci_device_group(dev); +} + const struct iommu_ops intel_iommu_ops = { .capable= intel_iommu_capable, .domain_alloc = intel_iommu_domain_alloc, @@ -5435,7 +5451,7 @@ const struct iommu_ops intel_iommu_ops = { .get_resv_regions = intel_iommu_get_resv_regions, .put_resv_regions
Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
Hey Lu, > On 15 Mar 2019, at 02:19, Lu Baolu wrote: > > Hi James, > > On 3/14/19 7:58 PM, James Sewart wrote: >> To support mapping ISA region via iommu_group_create_direct_mappings, >> make sure its exposed by iommu_get_resv_regions. This allows >> deduplication of reserved region mappings >> Signed-off-by: James Sewart >> --- >> drivers/iommu/intel-iommu.c | 42 + >> 1 file changed, 33 insertions(+), 9 deletions(-) >> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c >> index 8e0a4e2ff77f..2e00e8708f06 100644 >> --- a/drivers/iommu/intel-iommu.c >> +++ b/drivers/iommu/intel-iommu.c >> @@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units); >> #define for_each_rmrr_units(rmrr) \ >> list_for_each_entry(rmrr, _rmrr_units, list) >> +static struct iommu_resv_region *isa_resv_region; >> + >> /* bitmap for indexing intel_iommus */ >> static int g_num_of_iommus; >> @@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct >> dmar_rmrr_unit *rmrr, >>rmrr->end_address); >> } >> +static inline struct iommu_resv_region *iommu_get_isa_resv_region(void) >> +{ >> +if (!isa_resv_region) >> +isa_resv_region = iommu_alloc_resv_region(0, >> +16*1024*1024, >> +0, IOMMU_RESV_DIRECT); >> + >> +return isa_resv_region; >> +} >> + >> #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA >> -static inline void iommu_prepare_isa(void) >> +static inline void iommu_prepare_isa(struct pci_dev *pdev) >> { >> -struct pci_dev *pdev; >> int ret; >> +struct iommu_resv_region *reg = iommu_get_isa_resv_region(); >> - pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL); >> -if (!pdev) >> +if (!reg) >> return; >> pr_info("Prepare 0-16MiB unity mapping for LPC\n"); >> -ret = iommu_prepare_identity_map(>dev, 0, 16*1024*1024 - 1); >> +ret = iommu_prepare_identity_map(>dev, reg->start, >> +reg->start + reg->length - 1); >> if (ret) >> pr_err("Failed to create 0-16MiB identity map - floppy might >> not work\n"); >> - >> -pci_dev_put(pdev); >> } >> #else >> -static inline void iommu_prepare_isa(void) >> +static inline void iommu_prepare_isa(struct pci_dev *pdev) >> { >> return; >> } >> @@ -3289,6 +3299,7 @@ static int __init init_dmars(void) >> struct dmar_rmrr_unit *rmrr; >> bool copied_tables = false; >> struct device *dev; >> +struct pci_dev *pdev; >> struct intel_iommu *iommu; >> int i, ret; >> @@ -3469,7 +3480,11 @@ static int __init init_dmars(void) >> } >> } >> - iommu_prepare_isa(); >> +pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL); >> +if (pdev) { >> +iommu_prepare_isa(pdev); >> +pci_dev_put(pdev); >> +} >>domains_done: >> @@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct >> device *device, >> struct iommu_resv_region *reg; >> struct dmar_rmrr_unit *rmrr; >> struct device *i_dev; >> +struct pci_dev *pdev; >> int i; >> rcu_read_lock(); >> @@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct >> device *device, >> } >> rcu_read_unlock(); >> + if (dev_is_pci(device)) { >> +pdev = to_pci_dev(device); >> +if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) { >> +reg = iommu_get_isa_resv_region(); >> +list_add_tail(>list, head); >> +} >> +} >> + > > Just wondering why not just > > +#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA > + if (dev_is_pci(device)) { > + pdev = to_pci_dev(device); > + if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) { > + reg = iommu_alloc_resv_region(0, > + 16*1024*1024, > + 0, IOMMU_RESV_DIRECT); > + if (reg) > + list_add_tail(>list, head); > + } > + } > +#endif > > and, remove all other related code? At this point in the patchset if we remove iommu_prepare_isa then the ISA region won’t be mapped to the device. Only once the dma domain is allocable will the reserved regions be mapped by iommu_group_create_direct_mappings. Theres an issue that if we choose to alloc a new resv_region with type IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions to free this entry type which means refactoring the rmrr regions in get_resv_regions. Should this work be in this patchset? > > Best regards, > Lu Baolu Cheers, James. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.
Hey Lu, > On 15 Mar 2019, at 03:13, Lu Baolu wrote: > > Hi James, > > On 3/14/19 7:56 PM, James Sewart wrote: >> Patches 1 and 2 are the same as v1. >> v1-v2: >> Refactored ISA direct mappings to be returned by iommu_get_resv_regions. >> Integrated patch by Lu to defer turning on DMAR until iommu.c has mapped >> reserved regions. >> Integrated patches by Lu to remove more unused code in cleanup. >> Lu: I didn't integrate your patch to set the default domain type as it >> isn't directly related to the aim of this patchset. Instead patch 4 > > Without those patches, user experience will be affected and some devices > will not work on Intel platforms anymore. > > For a long time, Intel IOMMU driver has its own logic to determine > whether a device requires an identity domain. For example, when user > specifies "iommu=pt" in kernel parameter, all device will be attached > with the identity domain. Further more, some quirky devices require > an identity domain to be used before enabling DMA remapping, otherwise, > it will not work. This was done by adding quirk bits in Intel IOMMU > driver. > > So from my point of view, one way is porting all those quirks and kernel > parameters into IOMMU generic layer, or opening a door for vendor IOMMU > driver to determine the default domain type by their own. I prefer the > latter option since it will not impact any behaviors on other > architectures. I see your point. I’m not confident that using the proposed door to set a groups default domain has the desired behaviour. As discussed before the default domain type will be set based on the desired type for only the first device attached to a group. I think to change the default domain type you would need a slightly different door that wasn’t conditioned on device. For situations where individual devices require an identity domain because of quirks then maybe calling is_identity_map per device in iommu_group_get_for_dev is a better solution than the one I proposed. > >> addresses the issue of a device requiring an identity domain by ignoring >> the domain param in attach_device and printing a warning. > > This will not work as I commented in that thread. > >> I booted some of our devices with this patchset and haven't seen any >> issues. It doesn't look like we have any devices with RMRR's though so >> those codepaths aren't tested. >> James Sewart (7): >> iommu: Move iommu_group_create_direct_mappings to after device_attach >> iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges >> iommu/vt-d: Expose ISA direct mapping region via >> iommu_get_resv_regions >> iommu/vt-d: Ignore domain parameter in attach_device if device >> requires identity map >> iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by iommu_ops >> iommu/vt-d: Remove lazy allocation of domains >> Lu Baolu (1): >> iommu/vt-d: Enable DMA remapping after rmrr mapped >> drivers/iommu/intel-iommu.c | 444 +++- >> drivers/iommu/iommu.c | 4 +- >> 2 files changed, 131 insertions(+), 317 deletions(-) > > Best regards, > Lu Baolu Cheers, James. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 6/7] iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by iommu_ops
Allowing IOMMU_DOMAIN_DMA type IOMMU domain to be allocated allows the default_domain of an iommu_group to be set. This delegates device-domain relationships to the generic IOMMU code. Signed-off-by: James Sewart --- drivers/iommu/intel-iommu.c | 99 + 1 file changed, 78 insertions(+), 21 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 35821df70f78..2c9d793af394 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -309,6 +309,18 @@ static int hw_pass_through = 1; /* si_domain contains mulitple devices */ #define DOMAIN_FLAG_STATIC_IDENTITY(1 << 1) +/* + * Domain managed externally, don't cleanup if it isn't attached + * to any devices. + */ +#define DOMAIN_FLAG_MANAGED_EXTERNALLY (1 << 2) + +/* + * Set after domain initialisation. Used when allocating dma domains to + * defer domain initialisation until it is attached to a device + */ +#define DOMAIN_FLAG_INITIALISED(1 << 3) + #define for_each_domain_iommu(idx, domain) \ for (idx = 0; idx < g_num_of_iommus; idx++) \ if (domain->iommu_refcnt[idx]) @@ -560,6 +572,16 @@ static inline int domain_type_is_vm_or_si(struct dmar_domain *domain) DOMAIN_FLAG_STATIC_IDENTITY); } +static inline int domain_managed_externally(struct dmar_domain *domain) +{ + return domain->flags & DOMAIN_FLAG_MANAGED_EXTERNALLY; +} + +static inline int domain_is_initialised(struct dmar_domain *domain) +{ + return domain->flags & DOMAIN_FLAG_INITIALISED; +} + static inline int domain_pfn_supported(struct dmar_domain *domain, unsigned long pfn) { @@ -1664,7 +1686,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu) __dmar_remove_one_dev_info(info); - if (!domain_type_is_vm_or_si(domain)) { + if (!domain_managed_externally(domain)) { /* * The domain_exit() function can't be called under * device_domain_lock, as it takes this lock itself. @@ -1897,6 +1919,7 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu, domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid); if (!domain->pgd) return -ENOMEM; + domain->flags |= DOMAIN_FLAG_INITIALISED; __iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE); return 0; } @@ -1909,6 +1932,9 @@ static void domain_exit(struct dmar_domain *domain) if (!domain) return; + if (!domain_is_initialised(domain)) + goto free_mem; + /* Remove associated devices and clear attached or cached domains */ rcu_read_lock(); domain_remove_dev_info(domain); @@ -1921,6 +1947,7 @@ static void domain_exit(struct dmar_domain *domain) dma_free_pagelist(freelist); +free_mem: free_domain_mem(domain); } @@ -4585,7 +4612,7 @@ static int device_notifier(struct notifier_block *nb, return 0; dmar_remove_one_dev_info(domain, dev); - if (!domain_type_is_vm_or_si(domain) && list_empty(>devices)) + if (!domain_managed_externally(domain) && list_empty(>devices)) domain_exit(domain); return 0; @@ -5039,6 +5066,7 @@ static int md_domain_init(struct dmar_domain *domain, int guest_width) domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid); if (!domain->pgd) return -ENOMEM; + domain->flags |= DOMAIN_FLAG_INITIALISED; domain_flush_cache(domain, domain->pgd, PAGE_SIZE); return 0; } @@ -5047,28 +5075,43 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type) { struct dmar_domain *dmar_domain; struct iommu_domain *domain; + int flags = DOMAIN_FLAG_MANAGED_EXTERNALLY; - if (type != IOMMU_DOMAIN_UNMANAGED) - return NULL; + switch (type) { + case IOMMU_DOMAIN_UNMANAGED: + flags |= DOMAIN_FLAG_VIRTUAL_MACHINE | DOMAIN_FLAG_INITIALISED; + dmar_domain = alloc_domain(flags); + if (!dmar_domain) + return NULL; - dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE); - if (!dmar_domain) { - pr_err("Can't allocate dmar_domain\n"); - return NULL; - } - if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) { - pr_err("Domain initialization failed\n"); - domain_exit(dmar_domain); + if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) { + pr_err("Domain initialization failed\n"); + domain_exit(dmar_domain); + return NULL; + } + domain_update_iommu_cap(dmar_domain); +
[PATCH v2 7/7] iommu/vt-d: Remove lazy allocation of domains
The generic IOMMU code will allocate and attach a default domain to each device that comes online. This patch removes the lazy domain allocation and early reserved region mapping that is now redundant. Signed-off-by: James Sewart --- drivers/iommu/intel-iommu.c | 300 +--- 1 file changed, 5 insertions(+), 295 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 2c9d793af394..f8c0c3e16935 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -2605,118 +2605,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu, return domain; } -static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque) -{ - *(u16 *)opaque = alias; - return 0; -} - -static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw) -{ - struct device_domain_info *info = NULL; - struct dmar_domain *domain = NULL; - struct intel_iommu *iommu; - u16 dma_alias; - unsigned long flags; - u8 bus, devfn; - - iommu = device_to_iommu(dev, , ); - if (!iommu) - return NULL; - - if (dev_is_pci(dev)) { - struct pci_dev *pdev = to_pci_dev(dev); - - pci_for_each_dma_alias(pdev, get_last_alias, _alias); - - spin_lock_irqsave(_domain_lock, flags); - info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus), - PCI_BUS_NUM(dma_alias), - dma_alias & 0xff); - if (info) { - iommu = info->iommu; - domain = info->domain; - } - spin_unlock_irqrestore(_domain_lock, flags); - - /* DMA alias already has a domain, use it */ - if (info) - goto out; - } - - /* Allocate and initialize new domain for the device */ - domain = alloc_domain(0); - if (!domain) - return NULL; - if (domain_init(domain, iommu, gaw)) { - domain_exit(domain); - return NULL; - } - -out: - - return domain; -} - -static struct dmar_domain *set_domain_for_dev(struct device *dev, - struct dmar_domain *domain) -{ - struct intel_iommu *iommu; - struct dmar_domain *tmp; - u16 req_id, dma_alias; - u8 bus, devfn; - - iommu = device_to_iommu(dev, , ); - if (!iommu) - return NULL; - - req_id = ((u16)bus << 8) | devfn; - - if (dev_is_pci(dev)) { - struct pci_dev *pdev = to_pci_dev(dev); - - pci_for_each_dma_alias(pdev, get_last_alias, _alias); - - /* register PCI DMA alias device */ - if (req_id != dma_alias) { - tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias), - dma_alias & 0xff, NULL, domain); - - if (!tmp || tmp != domain) - return tmp; - } - } - - tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain); - if (!tmp || tmp != domain) - return tmp; - - return domain; -} - -static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw) -{ - struct dmar_domain *domain, *tmp; - - domain = find_domain(dev); - if (domain) - goto out; - - domain = find_or_alloc_domain(dev, gaw); - if (!domain) - goto out; - - tmp = set_domain_for_dev(dev, domain); - if (!tmp || domain != tmp) { - domain_exit(domain); - domain = tmp; - } - -out: - - return domain; -} - static int iommu_domain_identity_map(struct dmar_domain *domain, unsigned long long start, unsigned long long end) @@ -2742,73 +2630,6 @@ static int iommu_domain_identity_map(struct dmar_domain *domain, DMA_PTE_READ|DMA_PTE_WRITE); } -static int domain_prepare_identity_map(struct device *dev, - struct dmar_domain *domain, - unsigned long long start, - unsigned long long end) -{ - /* For _hardware_ passthrough, don't bother. But for software - passthrough, we do it anyway -- it may indicate a memory - range which is reserved in E820, so which didn't get set - up to start with in si_domain */ - if (domain == si_domain && hw_pass_through) { - pr_warn("Ignoring identity map for HW passthrough device %s [0x%Lx - 0x%Lx]\n", - dev_name(dev), start, end); - return 0; - } - -
[PATCH v2 5/7] iommu/vt-d: Enable DMA remapping after rmrr mapped
The rmrr devices require identity map of the rmrr regions before enabling DMA remapping. Otherwise, there will be a window during which DMA from/to the rmrr regions will be blocked. In order to alleviate this, we move enabling DMA remapping after all rmrr regions get mapped. Signed-off-by: Lu Baolu --- drivers/iommu/intel-iommu.c | 16 ++-- 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 104d36f225a7..35821df70f78 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -3518,11 +3518,6 @@ static int __init init_dmars(void) ret = dmar_set_interrupt(iommu); if (ret) goto free_iommu; - - if (!translation_pre_enabled(iommu)) - iommu_enable_translation(iommu); - - iommu_disable_protect_mem_regions(iommu); } return 0; @@ -4912,7 +4907,6 @@ int __init intel_iommu_init(void) goto out_free_reserved_range; } up_write(_global_lock); - pr_info("Intel(R) Virtualization Technology for Directed I/O\n"); #if defined(CONFIG_X86) && defined(CONFIG_SWIOTLB) swiotlb = 0; @@ -4935,6 +4929,16 @@ int __init intel_iommu_init(void) register_memory_notifier(_iommu_memory_nb); cpuhp_setup_state(CPUHP_IOMMU_INTEL_DEAD, "iommu/intel:dead", NULL, intel_iommu_cpu_dead); + + /* Finally, we enable the DMA remapping hardware. */ + for_each_iommu(iommu, drhd) { + if (!translation_pre_enabled(iommu)) + iommu_enable_translation(iommu); + + iommu_disable_protect_mem_regions(iommu); + } + pr_info("Intel(R) Virtualization Technology for Directed I/O\n"); + intel_iommu_enabled = 1; intel_iommu_debugfs_init(); -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 4/7] iommu/vt-d: Ignore domain parameter in attach_device if device requires identity map
If a device requires an identity map then it is not safe to attach a domain that can remap addresses. Warn the user if this occurs. Signed-off-by: James Sewart --- drivers/iommu/intel-iommu.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 2e00e8708f06..104d36f225a7 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5101,6 +5101,11 @@ static int intel_iommu_attach_device(struct iommu_domain *domain, } } + if (iommu_no_mapping(dev)) { + dmar_domain = si_domain; + dev_warn(dev, "VT-d: Device is required to use identity IOMMU mapping, ignoring domain attached\n"); + } + iommu = device_to_iommu(dev, , ); if (!iommu) return -ENODEV; -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
To support mapping ISA region via iommu_group_create_direct_mappings, make sure its exposed by iommu_get_resv_regions. This allows deduplication of reserved region mappings Signed-off-by: James Sewart --- drivers/iommu/intel-iommu.c | 42 + 1 file changed, 33 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 8e0a4e2ff77f..2e00e8708f06 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units); #define for_each_rmrr_units(rmrr) \ list_for_each_entry(rmrr, _rmrr_units, list) +static struct iommu_resv_region *isa_resv_region; + /* bitmap for indexing intel_iommus */ static int g_num_of_iommus; @@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr, rmrr->end_address); } +static inline struct iommu_resv_region *iommu_get_isa_resv_region(void) +{ + if (!isa_resv_region) + isa_resv_region = iommu_alloc_resv_region(0, + 16*1024*1024, + 0, IOMMU_RESV_DIRECT); + + return isa_resv_region; +} + #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA -static inline void iommu_prepare_isa(void) +static inline void iommu_prepare_isa(struct pci_dev *pdev) { - struct pci_dev *pdev; int ret; + struct iommu_resv_region *reg = iommu_get_isa_resv_region(); - pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL); - if (!pdev) + if (!reg) return; pr_info("Prepare 0-16MiB unity mapping for LPC\n"); - ret = iommu_prepare_identity_map(>dev, 0, 16*1024*1024 - 1); + ret = iommu_prepare_identity_map(>dev, reg->start, + reg->start + reg->length - 1); if (ret) pr_err("Failed to create 0-16MiB identity map - floppy might not work\n"); - - pci_dev_put(pdev); } #else -static inline void iommu_prepare_isa(void) +static inline void iommu_prepare_isa(struct pci_dev *pdev) { return; } @@ -3289,6 +3299,7 @@ static int __init init_dmars(void) struct dmar_rmrr_unit *rmrr; bool copied_tables = false; struct device *dev; + struct pci_dev *pdev; struct intel_iommu *iommu; int i, ret; @@ -3469,7 +3480,11 @@ static int __init init_dmars(void) } } - iommu_prepare_isa(); + pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL); + if (pdev) { + iommu_prepare_isa(pdev); + pci_dev_put(pdev); + } domains_done: @@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct device *device, struct iommu_resv_region *reg; struct dmar_rmrr_unit *rmrr; struct device *i_dev; + struct pci_dev *pdev; int i; rcu_read_lock(); @@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct device *device, } rcu_read_unlock(); + if (dev_is_pci(device)) { + pdev = to_pci_dev(device); + if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) { + reg = iommu_get_isa_resv_region(); + list_add_tail(>list, head); + } + } + reg = iommu_alloc_resv_region(IOAPIC_RANGE_START, IOAPIC_RANGE_END - IOAPIC_RANGE_START + 1, 0, IOMMU_RESV_MSI); -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 2/7] iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
Used by iommu.c before creating identity mappings for reserved ranges to ensure dma-ops won't ever remap these ranges Signed-off-by: James Sewart --- drivers/iommu/intel-iommu.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 78188bf7e90d..8e0a4e2ff77f 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5299,6 +5299,19 @@ static void intel_iommu_put_resv_regions(struct device *dev, } } +static void intel_iommu_apply_resv_region(struct device *dev, + struct iommu_domain *domain, + struct iommu_resv_region *region) +{ + struct dmar_domain *dmar_domain = to_dmar_domain(domain); + unsigned long start, end; + + start = IOVA_PFN(region->start); + end = IOVA_PFN(region->start + region->length - 1); + + WARN_ON_ONCE(reserve_iova(_domain->iovad, start, end) == NULL); +} + #ifdef CONFIG_INTEL_IOMMU_SVM int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev) { @@ -5392,6 +5405,7 @@ const struct iommu_ops intel_iommu_ops = { .remove_device = intel_iommu_remove_device, .get_resv_regions = intel_iommu_get_resv_regions, .put_resv_regions = intel_iommu_put_resv_regions, + .apply_resv_region = intel_iommu_apply_resv_region, .device_group = pci_device_group, .pgsize_bitmap = INTEL_IOMMU_PGSIZES, }; -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 1/7] iommu: Move iommu_group_create_direct_mappings to after device_attach
If an IOMMU driver requires to know which IOMMU a domain is associated to initialise a domain then it will do so in device_attach, before which it is not safe to call iommu_ops. Signed-off-by: James Sewart --- drivers/iommu/iommu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 3ed4db334341..1c6ffbb2d20e 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -652,8 +652,6 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev) dev->iommu_group = group; - iommu_group_create_direct_mappings(group, dev); - mutex_lock(>mutex); list_add_tail(>list, >devices); if (group->domain) @@ -662,6 +660,8 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev) if (ret) goto err_put_group; + iommu_group_create_direct_mappings(group, dev); + /* Notify any listeners about change to group. */ blocking_notifier_call_chain(>notifier, IOMMU_GROUP_NOTIFY_ADD_DEVICE, dev); -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.
Patches 1 and 2 are the same as v1. v1-v2: Refactored ISA direct mappings to be returned by iommu_get_resv_regions. Integrated patch by Lu to defer turning on DMAR until iommu.c has mapped reserved regions. Integrated patches by Lu to remove more unused code in cleanup. Lu: I didn't integrate your patch to set the default domain type as it isn't directly related to the aim of this patchset. Instead patch 4 addresses the issue of a device requiring an identity domain by ignoring the domain param in attach_device and printing a warning. I booted some of our devices with this patchset and haven't seen any issues. It doesn't look like we have any devices with RMRR's though so those codepaths aren't tested. James Sewart (7): iommu: Move iommu_group_create_direct_mappings to after device_attach iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions iommu/vt-d: Ignore domain parameter in attach_device if device requires identity map iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by iommu_ops iommu/vt-d: Remove lazy allocation of domains Lu Baolu (1): iommu/vt-d: Enable DMA remapping after rmrr mapped drivers/iommu/intel-iommu.c | 444 +++- drivers/iommu/iommu.c | 4 +- 2 files changed, 131 insertions(+), 317 deletions(-) -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
Hey Lu, > On 8 Mar 2019, at 03:09, Lu Baolu wrote: > > Hi James, > > On 3/7/19 6:21 PM, James Sewart wrote: >> Hey Lu, >>> On 7 Mar 2019, at 06:31, Lu Baolu wrote: >>> >>> Hi James, >>> >>> On 3/7/19 2:08 AM, James Sewart wrote: - /* - * For each rmrr - * for each dev attached to rmrr - * do - * locate drhd for dev, alloc domain for dev - * allocate free domain - * allocate page table entries for rmrr - * if context not allocated for bus - * allocate and init context - * set present in root table for this bus - * init context with domain, translation etc - *endfor - * endfor - */ - pr_info("Setting RMRR:\n"); - for_each_rmrr_units(rmrr) { - /* some BIOS lists non-exist devices in DMAR table. */ - for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt, -i, dev) { - ret = iommu_prepare_rmrr_dev(rmrr, dev); - if (ret) - pr_err("Mapping reserved region failed\n"); - } - } - - iommu_prepare_isa(); >>> Why do you want to remove this segment of code? >> This will only work if the lazy allocation of domains exists, these >> mappings will disappear once a default domain is attached to a device and >> then remade by iommu_group_create_direct_mappings. This code is redundant >> and removing it allows us to remove all the lazy allocation logic. > No exactly. > > We need to setup the rmrr mapping before enabling dma remapping, > otherwise, there will be a window after dma remapping enabling and > rmrr getting mapped, during which people might see dma faults. Do you think this patch instead should be a refactoring to simplify this initial domain setup before the default domain takes over? It seems like duplicated effort to have both lazy allocated domains and externally managed domains. We could allocate a domain here for any devices with RMRR and call get_resv_regions to avoid duplicating RMRR loop. >>> >>> Agree. We should replace the lazy allocated domains with default domain >>> in a clean way. Actually, your patches look great to me. But I do think >>> we need further cleanups. The rmrr code is one example, and the identity >>> domain (si_domain) is another. >>> >>> Do you mind if I work on top of your patches for further cleanups and >>> sign off a v2 together with you? >> Sure, sounds good. I’ll fixup patch 3 and have a go at integrating >> iommu_prepare_isa into get_resv_regions. This should make the initial >> domain logic here quite concise. > > Here attached three extra patches which I think should be added before > PATCH 3/4, and some further cleanup changes which you can merge with > PATCH 4/4. > > > > 0006-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch > 0007-iommu-vt-d-Add-is_identity_map-ops-entry.patch > > These two patches aim to add a generic method for vendor specific iommu > drivers to specify the type of the default domain for a device. Intel > iommu driver will register an ops for this since it already has its own > identity map adjudicator for a long time. This seems like a good idea, but as domain alloc is only called for the default domain on first device attached to a group, we may miss checking whether a device added later should have an identity domain. Should there be paths to downgrade a groups domain if one of the devices requires one? > > > > 0008-iommu-vt-d-Enable-DMA-remapping-after-rmrr-mapped.patch > > This aims to address the requirement of rmrr identity map before > enabling DMA remapping. With default domain mechanism, the default > domain will be allocated and attached to the device in bus_set_iommu() > during boot. Move enabling DMA remapping engines after bus_set_iommu() > will fix the rmrr issue. Thats pretty neat, avoids any temporary domain allocation, nice! > > > > 0009-return-si_domain-directly.patch > > I suggest that we should keep current si_domain implementation since > changing si_domain is not the purpose of this patch set. Please merge > this with PATCH 3/4 if you like it. Seems reasonable. > > > > 0010-iommu-vt-d-remove-floopy-workaround.patch > 0011-iommu-vt-d-remove-prepare-rmrr-helpers.patch > 0012-iommu-vt-d-remove-prepare-identity-map-during-boot.patch > > Above patches are further cleanups as the result of switching to default > domain. Please consider to merge them with PATCH 4/4. Nice, good catch. > > > > I
Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
Hey Lu, > On 7 Mar 2019, at 06:31, Lu Baolu wrote: > > Hi James, > > On 3/7/19 2:08 AM, James Sewart wrote: >> -/* >> - * For each rmrr >> - * for each dev attached to rmrr >> - * do >> - * locate drhd for dev, alloc domain for dev >> - * allocate free domain >> - * allocate page table entries for rmrr >> - * if context not allocated for bus >> - * allocate and init context >> - * set present in root table for this bus >> - * init context with domain, translation etc >> - *endfor >> - * endfor >> - */ >> -pr_info("Setting RMRR:\n"); >> -for_each_rmrr_units(rmrr) { >> -/* some BIOS lists non-exist devices in DMAR table. */ >> -for_each_active_dev_scope(rmrr->devices, >> rmrr->devices_cnt, >> - i, dev) { >> -ret = iommu_prepare_rmrr_dev(rmrr, dev); >> -if (ret) >> -pr_err("Mapping reserved region >> failed\n"); >> -} >> -} >> - >> -iommu_prepare_isa(); > Why do you want to remove this segment of code? This will only work if the lazy allocation of domains exists, these mappings will disappear once a default domain is attached to a device and then remade by iommu_group_create_direct_mappings. This code is redundant and removing it allows us to remove all the lazy allocation logic. >>> No exactly. >>> >>> We need to setup the rmrr mapping before enabling dma remapping, >>> otherwise, there will be a window after dma remapping enabling and >>> rmrr getting mapped, during which people might see dma faults. >> Do you think this patch instead should be a refactoring to simplify this >> initial domain setup before the default domain takes over? It seems like >> duplicated effort to have both lazy allocated domains and externally managed >> domains. We could allocate a domain here for any devices with RMRR and call >> get_resv_regions to avoid duplicating RMRR loop. > > Agree. We should replace the lazy allocated domains with default domain > in a clean way. Actually, your patches look great to me. But I do think > we need further cleanups. The rmrr code is one example, and the identity > domain (si_domain) is another. > > Do you mind if I work on top of your patches for further cleanups and > sign off a v2 together with you? Sure, sounds good. I’ll fixup patch 3 and have a go at integrating iommu_prepare_isa into get_resv_regions. This should make the initial domain logic here quite concise. > > Best regards, > Lu Baolu Cheers, James. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
Hey, > On 6 Mar 2019, at 07:00, Lu Baolu wrote: > > Hi, > > On 3/5/19 7:46 PM, James Sewart wrote: >> Hey Lu, >>> On 5 Mar 2019, at 06:59, Lu Baolu wrote: >>> >>> Hi, >>> >>> It's hard for me to understand why do we remove the rmrr related >>> code in this patch. >> The RMRR code removed here requires the lazy allocation of domains to >> exist, as it is run before iommu.c would assign IOMMU groups and attach a >> domain. Before patch 3, removing this code would mean the RMRR regions are >> never mapped for a domain: iommu.c will allocate a default domain for the >> group that a device is about to be put in, it will attach the domain to >> the device, then for each region returned by get_resv_regions it will >> create an identity map, this is where the RMRRs are setup for the default >> domain. > >>> >>> And, now we have two places to hold a domain for a device: group and >>> dev->info. We can consider to optimize the use of per device iommu >>> arch data. This should be later work anyway. >>> >>> More comments inline. >>> >>> On 3/4/19 11:47 PM, James Sewart wrote: The generic IOMMU code will allocate and attach a dma ops domain to each device that comes online, replacing any lazy allocated domain. Removes RMRR application at iommu init time as we won't have a domain attached to any device. iommu.c will do this after attaching a device using create_direct_mappings. Signed-off-by: James Sewart --- drivers/iommu/intel-iommu.c | 202 ++-- 1 file changed, 8 insertions(+), 194 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 71cd6bbfec05..282257e2628d 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -2595,118 +2595,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu, return domain; } -static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque) -{ - *(u16 *)opaque = alias; - return 0; -} - -static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw) -{ - struct device_domain_info *info = NULL; - struct dmar_domain *domain = NULL; - struct intel_iommu *iommu; - u16 dma_alias; - unsigned long flags; - u8 bus, devfn; - - iommu = device_to_iommu(dev, , ); - if (!iommu) - return NULL; - - if (dev_is_pci(dev)) { - struct pci_dev *pdev = to_pci_dev(dev); - - pci_for_each_dma_alias(pdev, get_last_alias, _alias); - - spin_lock_irqsave(_domain_lock, flags); - info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus), -PCI_BUS_NUM(dma_alias), -dma_alias & 0xff); - if (info) { - iommu = info->iommu; - domain = info->domain; - } - spin_unlock_irqrestore(_domain_lock, flags); - - /* DMA alias already has a domain, use it */ - if (info) - goto out; - } - - /* Allocate and initialize new domain for the device */ - domain = alloc_domain(0); - if (!domain) - return NULL; - if (domain_init(domain, iommu, gaw)) { - domain_exit(domain); - return NULL; - } - -out: - - return domain; -} - -static struct dmar_domain *set_domain_for_dev(struct device *dev, -struct dmar_domain *domain) -{ - struct intel_iommu *iommu; - struct dmar_domain *tmp; - u16 req_id, dma_alias; - u8 bus, devfn; - - iommu = device_to_iommu(dev, , ); - if (!iommu) - return NULL; - - req_id = ((u16)bus << 8) | devfn; - - if (dev_is_pci(dev)) { - struct pci_dev *pdev = to_pci_dev(dev); - - pci_for_each_dma_alias(pdev, get_last_alias, _alias); - - /* register PCI DMA alias device */ - if (req_id != dma_alias) { - tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias), - dma_alias & 0xff, NULL, domain); - - if (!tmp || tmp != domain) - return tmp; - } - } - - tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain); - if (!tmp || tmp != domain) - return tmp; - - return domain; -} - -static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw) -{ - struct dmar_domain *domain, *tmp; - - domain = find_domain(dev); - if
Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
Hey Lu, > On 5 Mar 2019, at 06:59, Lu Baolu wrote: > > Hi, > > It's hard for me to understand why do we remove the rmrr related > code in this patch. The RMRR code removed here requires the lazy allocation of domains to exist, as it is run before iommu.c would assign IOMMU groups and attach a domain. Before patch 3, removing this code would mean the RMRR regions are never mapped for a domain: iommu.c will allocate a default domain for the group that a device is about to be put in, it will attach the domain to the device, then for each region returned by get_resv_regions it will create an identity map, this is where the RMRRs are setup for the default domain. > > And, now we have two places to hold a domain for a device: group and > dev->info. We can consider to optimize the use of per device iommu > arch data. This should be later work anyway. > > More comments inline. > > On 3/4/19 11:47 PM, James Sewart wrote: >> The generic IOMMU code will allocate and attach a dma ops domain to each >> device that comes online, replacing any lazy allocated domain. Removes >> RMRR application at iommu init time as we won't have a domain attached >> to any device. iommu.c will do this after attaching a device using >> create_direct_mappings. >> Signed-off-by: James Sewart >> --- >> drivers/iommu/intel-iommu.c | 202 ++-- >> 1 file changed, 8 insertions(+), 194 deletions(-) >> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c >> index 71cd6bbfec05..282257e2628d 100644 >> --- a/drivers/iommu/intel-iommu.c >> +++ b/drivers/iommu/intel-iommu.c >> @@ -2595,118 +2595,6 @@ static struct dmar_domain >> *dmar_insert_one_dev_info(struct intel_iommu *iommu, >> return domain; >> } >> -static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque) >> -{ >> -*(u16 *)opaque = alias; >> -return 0; >> -} >> - >> -static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw) >> -{ >> -struct device_domain_info *info = NULL; >> -struct dmar_domain *domain = NULL; >> -struct intel_iommu *iommu; >> -u16 dma_alias; >> -unsigned long flags; >> -u8 bus, devfn; >> - >> -iommu = device_to_iommu(dev, , ); >> -if (!iommu) >> -return NULL; >> - >> -if (dev_is_pci(dev)) { >> -struct pci_dev *pdev = to_pci_dev(dev); >> - >> -pci_for_each_dma_alias(pdev, get_last_alias, _alias); >> - >> -spin_lock_irqsave(_domain_lock, flags); >> -info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus), >> - PCI_BUS_NUM(dma_alias), >> - dma_alias & 0xff); >> -if (info) { >> -iommu = info->iommu; >> -domain = info->domain; >> -} >> -spin_unlock_irqrestore(_domain_lock, flags); >> - >> -/* DMA alias already has a domain, use it */ >> -if (info) >> -goto out; >> -} >> - >> -/* Allocate and initialize new domain for the device */ >> -domain = alloc_domain(0); >> -if (!domain) >> -return NULL; >> -if (domain_init(domain, iommu, gaw)) { >> -domain_exit(domain); >> -return NULL; >> -} >> - >> -out: >> - >> -return domain; >> -} >> - >> -static struct dmar_domain *set_domain_for_dev(struct device *dev, >> - struct dmar_domain *domain) >> -{ >> -struct intel_iommu *iommu; >> -struct dmar_domain *tmp; >> -u16 req_id, dma_alias; >> -u8 bus, devfn; >> - >> -iommu = device_to_iommu(dev, , ); >> -if (!iommu) >> -return NULL; >> - >> -req_id = ((u16)bus << 8) | devfn; >> - >> -if (dev_is_pci(dev)) { >> -struct pci_dev *pdev = to_pci_dev(dev); >> - >> -pci_for_each_dma_alias(pdev, get_last_alias, _alias); >> - >> -/* register PCI DMA alias device */ >> -if (req_id != dma_alias) { >> -tmp = dmar_insert_one_dev_info(iommu, >> PCI_BUS_NUM(dma_alias), >> -dma_alias & 0xff, NULL, domain); >> - >> -if (!tmp || tmp != domain) >> -return tmp; >> -} >> -} >> - >> -tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain); >> -if (!tmp || tmp != domain) >> -return tmp; >> - >> -return domain; >> -} >> - >> -static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw) >> -{ >> -struct dmar_domain *domain, *tmp; >> - >> -domain = find_domain(dev); >> -if (domain) >> -goto out; >> - >> -domain = find_or_alloc_domain(dev, gaw); >> -if (!domain) >> -goto out; >> - >> -tmp = set_domain_for_dev(dev, domain); >> -if (!tmp || domain != tmp) { >> -
Re: [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated
Hey Lu, > On 5 Mar 2019, at 06:46, Lu Baolu wrote: > > Hi, > > On 3/4/19 11:46 PM, James Sewart wrote: >> Allowing IOMMU_DOMAIN_DMA type IOMMU domain to be allocated allows the >> default_domain of an iommu_group to be set. This delegates device-domain >> relationships to the generic IOMMU code. >> Signed-off-by: James Sewart >> --- >> drivers/iommu/intel-iommu.c | 113 +++- >> 1 file changed, 84 insertions(+), 29 deletions(-) >> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c >> index 8e0a4e2ff77f..71cd6bbfec05 100644 >> --- a/drivers/iommu/intel-iommu.c >> +++ b/drivers/iommu/intel-iommu.c >> @@ -309,6 +309,14 @@ static int hw_pass_through = 1; >> /* si_domain contains mulitple devices */ >> #define DOMAIN_FLAG_STATIC_IDENTITY (1 << 1) >> +/* Domain managed externally, don't cleanup if it isn't attached >> + * to any devices. */ >> +#define DOMAIN_FLAG_MANAGED_EXTERNALLY (1 << 2) >> + >> +/* Set after domain initialisation. Used when allocating dma domains to >> + * defer domain initialisation until it is attached to a device */ >> +#define DOMAIN_FLAG_INITIALISED (1 << 4) > > Why do you skip bit 3? This was an oversight, I will update to use bit 3. > >> + >> #define for_each_domain_iommu(idx, domain) \ >> for (idx = 0; idx < g_num_of_iommus; idx++) \ >> if (domain->iommu_refcnt[idx]) >> @@ -558,6 +566,16 @@ static inline int domain_type_is_vm_or_si(struct >> dmar_domain *domain) >> DOMAIN_FLAG_STATIC_IDENTITY); >> } >> +static inline int domain_managed_externally(struct dmar_domain *domain) >> +{ >> +return domain->flags & DOMAIN_FLAG_MANAGED_EXTERNALLY; >> +} >> + >> +static inline int domain_is_initialised(struct dmar_domain *domain) >> +{ >> +return domain->flags & DOMAIN_FLAG_INITIALISED; >> +} >> + >> static inline int domain_pfn_supported(struct dmar_domain *domain, >> unsigned long pfn) >> { >> @@ -1662,7 +1680,7 @@ static void disable_dmar_iommu(struct intel_iommu >> *iommu) >> __dmar_remove_one_dev_info(info); >> - if (!domain_type_is_vm_or_si(domain)) { >> +if (!domain_managed_externally(domain)) { >> /* >> * The domain_exit() function can't be called under >> * device_domain_lock, as it takes this lock itself. >> @@ -1895,6 +1913,7 @@ static int domain_init(struct dmar_domain *domain, >> struct intel_iommu *iommu, >> domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid); >> if (!domain->pgd) >> return -ENOMEM; >> +domain->flags |= DOMAIN_FLAG_INITIALISED; >> __iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE); >> return 0; >> } >> @@ -2807,14 +2826,10 @@ static inline void iommu_prepare_isa(void) >>static int md_domain_init(struct dmar_domain *domain, int guest_width); >> -static int __init si_domain_init(int hw) >> +static int __init si_domain_init(struct dmar_domain *si_domain, int hw) >> { >> int nid, ret = 0; >> - si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY); >> -if (!si_domain) >> -return -EFAULT; >> - >> if (md_domain_init(si_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) { >> domain_exit(si_domain); >> return -EFAULT; >> @@ -3417,9 +3432,16 @@ static int __init init_dmars(void) >> check_tylersburg_isoch(); >> if (iommu_identity_mapping) { >> -ret = si_domain_init(hw_pass_through); >> -if (ret) >> +si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY); > > Where will this si_domain be used? We still need to keep the global > si_domain? I am unsure of the best thing to do here. The si_domain can be initialised as a hardware passthrough which means it doesn’t have any mappings applied. I think any user allocating a domain here will always want a software passthrough domain. I’m not sure if/how to consolidate these two. > >> +if (!si_domain) { >> +ret = -EFAULT; >> goto free_iommu; >> +} >> +ret = si_domain_init(si_domain, hw_pass_through); >> +if (ret) { >> +domain_exit(si_domain); >> +goto free_iommu; >> +} >> } >>@@ -4575,7 +4597,7 @@ static int device_notifier(struct notifier_block >> *nb, >> return 0; >> dmar_remove_one_dev_info(domain, dev); >> -if (!domain_type_is_vm_or_si(domain) && list_empty(>devices)) >> +if (!domain_managed_externally(domain) && list_empty(>devices)) >> domain_exit(domain); >> return 0; >> @@ -5020,6 +5042,7 @@ static int md_domain_init(struct dmar_domain *domain, >> int guest_width) >> domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid); >> if (!domain->pgd) >> return
Re: [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.
Hey Lu, The motivation for this is buggy domain <-> IOMMU group relationship when using find_or_alloc_domain. From what I have read it should be the case that an IOMMU domain is shared by all devices in the same group, thus the same mappings. This is because the IOMMU group is determined by things like ACS settings on pci switches which determines whether devices can talk to each other. In find_or_alloc_domain we only determine domain sharing based on devices returned by pci_for_each_dma_alias, whereas there are many more checks in pci_device_group(e.g. ACS settings of intermediary pci switches), which is used for determining which IOMMU group a device is in. This discontinuity means it can be the case that each device within an IOMMU group will have different domains. We see issues as it is supposed to be safe to assume that devices within a group should be considered to be in the same address space, but if each device has its own domain then any mapping created won’t be valid for the other devices, and even could be mapped differently for each device. This also could cause issues with a device whose RMRR's are not applied to the domains of other devices within its group, these regions could be remapped for the other devices resulting in unexpected behaviour during peer-to-peer transfers. Cheers, James > On 5 Mar 2019, at 06:05, Lu Baolu wrote: > > Hi James, > > Very glad to see this. Thank you! > > On 3/4/19 11:41 PM, James Sewart wrote: >> Hey, >> This patchset moves IOMMU_DOMAIN_DMA iommu domain management to iommu.c. >> This avoids the use of find_or_alloc_domain whose domain assignment is >> inconsistent with the iommu grouping as determined by pci_device_group. > > Is this a bug fix or an improvement? What's the real issue will it cause > if we go without this patch set? > > Best regards, > Lu Baolu > >> Patch 3 permits domain type IOMMU_DOMAIN_DMA to be allocated via the >> iommu_ops api, allowing the default_domain of an iommu group to be set in >> iommu.c. This domain will be attached to every device that is brought up >> with an iommu group, and the devices reserved regions will be mapped using >> regions returned by get_resv_regions. >> In intel_iommu_domain_alloc we don’t know the IOMMU this domain will be >> associated with so we defer full initialisation until >> intel_iommu_attach_device. Currently iommu.c:iommu_group_add_device will >> try to map a devices reserved regions before attaching the domain which >> would cause issue if the domain is not fully initialised. This is >> addressed in patch 1 by moving the mapping to after attaching. >> Patch 2 implements function apply_resv_region, used in >> iommu_group_create_direct_mappings to mark the reserved regions as non >> mappable for the dma_map_ops api. >> Patch 4 removes the domain lazy allocation logic. Before this patch the >> lazy allocation logic would not be used as any domain allocated using >> these paths would be replaced when attaching the group default domain. >> Default domain allocation has been tested with and without this patch on >> 4.19. >> Cheers, >> James. >> James Sewart (4): >> iommu: Move iommu_group_create_direct_mappings to after device_attach >> iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges >> iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be >> allocated >> iommu/vt-d: Remove lazy allocation of domains >> drivers/iommu/intel-iommu.c | 329 >> drivers/iommu/iommu.c | 4 +- >> 2 files changed, 108 insertions(+), 225 deletions(-) ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated
Allowing IOMMU_DOMAIN_DMA type IOMMU domain to be allocated allows the default_domain of an iommu_group to be set. This delegates device-domain relationships to the generic IOMMU code. Signed-off-by: James Sewart --- drivers/iommu/intel-iommu.c | 113 +++- 1 file changed, 84 insertions(+), 29 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 8e0a4e2ff77f..71cd6bbfec05 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -309,6 +309,14 @@ static int hw_pass_through = 1; /* si_domain contains mulitple devices */ #define DOMAIN_FLAG_STATIC_IDENTITY(1 << 1) +/* Domain managed externally, don't cleanup if it isn't attached + * to any devices. */ +#define DOMAIN_FLAG_MANAGED_EXTERNALLY (1 << 2) + +/* Set after domain initialisation. Used when allocating dma domains to + * defer domain initialisation until it is attached to a device */ +#define DOMAIN_FLAG_INITIALISED(1 << 4) + #define for_each_domain_iommu(idx, domain) \ for (idx = 0; idx < g_num_of_iommus; idx++) \ if (domain->iommu_refcnt[idx]) @@ -558,6 +566,16 @@ static inline int domain_type_is_vm_or_si(struct dmar_domain *domain) DOMAIN_FLAG_STATIC_IDENTITY); } +static inline int domain_managed_externally(struct dmar_domain *domain) +{ + return domain->flags & DOMAIN_FLAG_MANAGED_EXTERNALLY; +} + +static inline int domain_is_initialised(struct dmar_domain *domain) +{ + return domain->flags & DOMAIN_FLAG_INITIALISED; +} + static inline int domain_pfn_supported(struct dmar_domain *domain, unsigned long pfn) { @@ -1662,7 +1680,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu) __dmar_remove_one_dev_info(info); - if (!domain_type_is_vm_or_si(domain)) { + if (!domain_managed_externally(domain)) { /* * The domain_exit() function can't be called under * device_domain_lock, as it takes this lock itself. @@ -1895,6 +1913,7 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu, domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid); if (!domain->pgd) return -ENOMEM; + domain->flags |= DOMAIN_FLAG_INITIALISED; __iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE); return 0; } @@ -2807,14 +2826,10 @@ static inline void iommu_prepare_isa(void) static int md_domain_init(struct dmar_domain *domain, int guest_width); -static int __init si_domain_init(int hw) +static int __init si_domain_init(struct dmar_domain *si_domain, int hw) { int nid, ret = 0; - si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY); - if (!si_domain) - return -EFAULT; - if (md_domain_init(si_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) { domain_exit(si_domain); return -EFAULT; @@ -3417,9 +3432,16 @@ static int __init init_dmars(void) check_tylersburg_isoch(); if (iommu_identity_mapping) { - ret = si_domain_init(hw_pass_through); - if (ret) + si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY); + if (!si_domain) { + ret = -EFAULT; goto free_iommu; + } + ret = si_domain_init(si_domain, hw_pass_through); + if (ret) { + domain_exit(si_domain); + goto free_iommu; + } } @@ -4575,7 +4597,7 @@ static int device_notifier(struct notifier_block *nb, return 0; dmar_remove_one_dev_info(domain, dev); - if (!domain_type_is_vm_or_si(domain) && list_empty(>devices)) + if (!domain_managed_externally(domain) && list_empty(>devices)) domain_exit(domain); return 0; @@ -5020,6 +5042,7 @@ static int md_domain_init(struct dmar_domain *domain, int guest_width) domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid); if (!domain->pgd) return -ENOMEM; + domain->flags |= DOMAIN_FLAG_INITIALISED; domain_flush_cache(domain, domain->pgd, PAGE_SIZE); return 0; } @@ -5028,28 +5051,54 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type) { struct dmar_domain *dmar_domain; struct iommu_domain *domain; + int flags = DOMAIN_FLAG_MANAGED_EXTERNALLY; - if (type != IOMMU_DOMAIN_UNMANAGED) - return NULL; - - dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE); - if (!dmar_domain) { - pr_err("Can't allocate dmar_domain\n"); - return NULL; - } - if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) { -
[PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains
The generic IOMMU code will allocate and attach a dma ops domain to each device that comes online, replacing any lazy allocated domain. Removes RMRR application at iommu init time as we won't have a domain attached to any device. iommu.c will do this after attaching a device using create_direct_mappings. Signed-off-by: James Sewart --- drivers/iommu/intel-iommu.c | 202 ++-- 1 file changed, 8 insertions(+), 194 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 71cd6bbfec05..282257e2628d 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -2595,118 +2595,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu, return domain; } -static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque) -{ - *(u16 *)opaque = alias; - return 0; -} - -static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw) -{ - struct device_domain_info *info = NULL; - struct dmar_domain *domain = NULL; - struct intel_iommu *iommu; - u16 dma_alias; - unsigned long flags; - u8 bus, devfn; - - iommu = device_to_iommu(dev, , ); - if (!iommu) - return NULL; - - if (dev_is_pci(dev)) { - struct pci_dev *pdev = to_pci_dev(dev); - - pci_for_each_dma_alias(pdev, get_last_alias, _alias); - - spin_lock_irqsave(_domain_lock, flags); - info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus), - PCI_BUS_NUM(dma_alias), - dma_alias & 0xff); - if (info) { - iommu = info->iommu; - domain = info->domain; - } - spin_unlock_irqrestore(_domain_lock, flags); - - /* DMA alias already has a domain, use it */ - if (info) - goto out; - } - - /* Allocate and initialize new domain for the device */ - domain = alloc_domain(0); - if (!domain) - return NULL; - if (domain_init(domain, iommu, gaw)) { - domain_exit(domain); - return NULL; - } - -out: - - return domain; -} - -static struct dmar_domain *set_domain_for_dev(struct device *dev, - struct dmar_domain *domain) -{ - struct intel_iommu *iommu; - struct dmar_domain *tmp; - u16 req_id, dma_alias; - u8 bus, devfn; - - iommu = device_to_iommu(dev, , ); - if (!iommu) - return NULL; - - req_id = ((u16)bus << 8) | devfn; - - if (dev_is_pci(dev)) { - struct pci_dev *pdev = to_pci_dev(dev); - - pci_for_each_dma_alias(pdev, get_last_alias, _alias); - - /* register PCI DMA alias device */ - if (req_id != dma_alias) { - tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias), - dma_alias & 0xff, NULL, domain); - - if (!tmp || tmp != domain) - return tmp; - } - } - - tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain); - if (!tmp || tmp != domain) - return tmp; - - return domain; -} - -static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw) -{ - struct dmar_domain *domain, *tmp; - - domain = find_domain(dev); - if (domain) - goto out; - - domain = find_or_alloc_domain(dev, gaw); - if (!domain) - goto out; - - tmp = set_domain_for_dev(dev, domain); - if (!tmp || domain != tmp) { - domain_exit(domain); - domain = tmp; - } - -out: - - return domain; -} - static int iommu_domain_identity_map(struct dmar_domain *domain, unsigned long long start, unsigned long long end) @@ -2779,7 +2667,7 @@ static int iommu_prepare_identity_map(struct device *dev, struct dmar_domain *domain; int ret; - domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH); + domain = find_domain(dev); if (!domain) return -ENOMEM; @@ -3301,11 +3189,9 @@ static int copy_translation_tables(struct intel_iommu *iommu) static int __init init_dmars(void) { struct dmar_drhd_unit *drhd; - struct dmar_rmrr_unit *rmrr; bool copied_tables = false; - struct device *dev; struct intel_iommu *iommu; - int i, ret; + int ret; /* * for each drhd @@ -3466,32 +3352,6 @@ static int __init init_dmars(void) goto free_iommu; } } - /* -
[PATCH 1/4] iommu: Move iommu_group_create_direct_mappings to after device_attach
If an IOMMU driver requires to know which IOMMU a domain is associated to initialise a domain then it will do so in device_attach, before which it is not safe to call iommu_ops. Signed-off-by: James Sewart --- drivers/iommu/iommu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 3ed4db334341..1c6ffbb2d20e 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -652,8 +652,6 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev) dev->iommu_group = group; - iommu_group_create_direct_mappings(group, dev); - mutex_lock(>mutex); list_add_tail(>list, >devices); if (group->domain) @@ -662,6 +660,8 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev) if (ret) goto err_put_group; + iommu_group_create_direct_mappings(group, dev); + /* Notify any listeners about change to group. */ blocking_notifier_call_chain(>notifier, IOMMU_GROUP_NOTIFY_ADD_DEVICE, dev); -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 2/4] iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
Used by iommu.c before creating identity mappings for reserved ranges to ensure dma-map-ops won't ever remap these addresses. Signed-off-by: James Sewart --- drivers/iommu/intel-iommu.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 78188bf7e90d..8e0a4e2ff77f 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5299,6 +5299,19 @@ static void intel_iommu_put_resv_regions(struct device *dev, } } +static void intel_iommu_apply_resv_region(struct device *dev, + struct iommu_domain *domain, + struct iommu_resv_region *region) +{ + struct dmar_domain *dmar_domain = to_dmar_domain(domain); + unsigned long start, end; + + start = IOVA_PFN(region->start); + end = IOVA_PFN(region->start + region->length - 1); + + WARN_ON_ONCE(reserve_iova(_domain->iovad, start, end) == NULL); +} + #ifdef CONFIG_INTEL_IOMMU_SVM int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev) { @@ -5392,6 +5405,7 @@ const struct iommu_ops intel_iommu_ops = { .remove_device = intel_iommu_remove_device, .get_resv_regions = intel_iommu_get_resv_regions, .put_resv_regions = intel_iommu_put_resv_regions, + .apply_resv_region = intel_iommu_apply_resv_region, .device_group = pci_device_group, .pgsize_bitmap = INTEL_IOMMU_PGSIZES, }; -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.
Hey, This patchset moves IOMMU_DOMAIN_DMA iommu domain management to iommu.c. This avoids the use of find_or_alloc_domain whose domain assignment is inconsistent with the iommu grouping as determined by pci_device_group. Patch 3 permits domain type IOMMU_DOMAIN_DMA to be allocated via the iommu_ops api, allowing the default_domain of an iommu group to be set in iommu.c. This domain will be attached to every device that is brought up with an iommu group, and the devices reserved regions will be mapped using regions returned by get_resv_regions. In intel_iommu_domain_alloc we don’t know the IOMMU this domain will be associated with so we defer full initialisation until intel_iommu_attach_device. Currently iommu.c:iommu_group_add_device will try to map a devices reserved regions before attaching the domain which would cause issue if the domain is not fully initialised. This is addressed in patch 1 by moving the mapping to after attaching. Patch 2 implements function apply_resv_region, used in iommu_group_create_direct_mappings to mark the reserved regions as non mappable for the dma_map_ops api. Patch 4 removes the domain lazy allocation logic. Before this patch the lazy allocation logic would not be used as any domain allocated using these paths would be replaced when attaching the group default domain. Default domain allocation has been tested with and without this patch on 4.19. Cheers, James. James Sewart (4): iommu: Move iommu_group_create_direct_mappings to after device_attach iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated iommu/vt-d: Remove lazy allocation of domains drivers/iommu/intel-iommu.c | 329 drivers/iommu/iommu.c | 4 +- 2 files changed, 108 insertions(+), 225 deletions(-) -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC v2] iommu/vt-d: Allow iommu_domain_alloc to allocate IOMMU_DOMAIN_DMA
Hey Jacob, > On 7 Jan 2019, at 20:04, Jacob Pan wrote: > > On Wed, 5 Dec 2018 17:19:35 + > James Sewart wrote: > >> Hey, >> >> There exists an issue in the logic used to determine domain >> association with devices. Currently the driver uses >> find_or_alloc_domain to either reuse an existing domain or allocate a >> new one if one isn’t found. Domains should be shared between all >> members of an IOMMU group as this is the minimum granularity that we >> can guarantee address space isolation. >> >> The intel IOMMU driver exposes pci_device_group in intel_iommu_ops as >> the function to call to determine the group of a device, this is >> implemented in the generic IOMMU code and checks: dma aliases, >> upstream pcie switch ACS, pci aliases, and pci function aliases. The >> find_or_alloc_domain code currently only uses dma aliases to >> determine if a domain is shared. This causes a disconnect between >> IOMMU groups and domains. We have observed devices under a pcie >> switch each having their own domain but assigned the same group. >> >> One solution would be to fix the logic in find_or_alloc_domain to add >> checks for the other conditions that a device may share a domain. >> However, this duplicates code which the generic IOMMU code >> implements. Instead this issue can be fixed by allowing the >> allocation of default_domain on the IOMMU group. This is not >> currently supported as the intel driver does not allow allocation of >> domain type IOMMU_DOMAIN_DMA. >> >> Allowing allocation of DMA domains has the effect that the >> default_domain is non NULL and is attached to a device when >> initialising. This delegates the handling of domains to the generic >> IOMMU code. Once this is implemented it is possible to remove the >> lazy allocation of domains entirely. >> > This can also consolidate the domain storage, i.e. move domain from > device_domain_info to iommu_group. The plan was to have a patchset that first added the functionality below, making use of the group domain storage but keeping the lazy allocation. Then subsequent patches that remove the lazy allocation. To also remove the device_domain_info it looks like some of the information there might need moving into the domain? >> This patch implements DMA and identity domains to be allocated for >> external management. As it isn’t known which device will be attached >> to a domain, the dma domain is not initialised at alloc time. Instead >> it is allocated when attached. As we may lose RMRR mappings when >> attaching a device to a new domain, we also ensure these are mapped >> at attach time. >> >> This will likely conflict with the work done for auxiliary domains by >> Baolu but the code to accommodate won’t change much. >> >> I had also started on a patch to remove find_or_alloc_domain and >> various functions that call it but had issues with edge cases such as >> iommu_prepare_isa that is doing domain operations at IOMMU init time. >> >> Cheers, >> James. >> >> >> --- >> drivers/iommu/intel-iommu.c | 159 >> +--- 1 file changed, 110 >> insertions(+), 49 deletions(-) >> >> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c >> index 41a4b8808802..6437cb2e9b22 100644 >> --- a/drivers/iommu/intel-iommu.c >> +++ b/drivers/iommu/intel-iommu.c >> @@ -351,6 +351,14 @@ static int hw_pass_through = 1; >> /* si_domain contains mulitple devices */ >> #define DOMAIN_FLAG_STATIC_IDENTITY (1 << 1) >> >> +/* Domain managed externally, don't cleanup if it isn't attached >> + * to any devices. */ >> +#define DOMAIN_FLAG_NO_CLEANUP (1 << 2) >> + > the name NO_CLEANUP is a little counter intuitive to me, should it be > called UNINITIALISED? I don’t think uninitialised is accurate, the domain may be initialised. It is used to distinguish between domains allocated by the external API, which we shouldn’t automatically cleanup, and domains allocated internally, which should. I agree a better name could be found. >> +/* Set after domain initialisation. Used when allocating dma domains >> to >> + * defer domain initialisation until it is attached to a device */ >> +#define DOMAIN_FLAG_INITIALISED (1 << 4) >> + >> #define for_each_domain_iommu(idx, domain) \ >> for (idx = 0; idx < g_num_of_iommus; idx++) \ >> if (domain->iommu_refcnt[idx]) >> @@ -624,6 +632,16 @@ static inline int domain_type_is_vm_or_si(struct >> dmar_domain *domain) DOMAIN_FLAG_STATIC_IDENTITY); >> } >> >> +static inline int domain_managed_externally(struct dmar_domain >> *domain) +{ >> +return domain->flags & DOMAIN_FLAG_NO_CLEANUP; >> +} >> + >> +static inline int domain_is_initialised(struct dmar_domain *domain) >> +{ >> +return domain->flags & DOMAIN_FLAG_INITIALISED; >> +} >> + >> static inline int domain_pfn_supported(struct dmar_domain *domain, >> unsigned long pfn) >> { >> @@ -1717,7 +1735,7 @@ static void
Re: [RFC v2] iommu/vt-d: Allow iommu_domain_alloc to allocate IOMMU_DOMAIN_DMA
Bump > On 5 Dec 2018, at 17:19, James Sewart wrote: > > Hey, > > There exists an issue in the logic used to determine domain association > with devices. Currently the driver uses find_or_alloc_domain to either > reuse an existing domain or allocate a new one if one isn’t found. Domains > should be shared between all members of an IOMMU group as this is the > minimum granularity that we can guarantee address space isolation. > > The intel IOMMU driver exposes pci_device_group in intel_iommu_ops as the > function to call to determine the group of a device, this is implemented > in the generic IOMMU code and checks: dma aliases, upstream pcie switch > ACS, pci aliases, and pci function aliases. The find_or_alloc_domain code > currently only uses dma aliases to determine if a domain is shared. This > causes a disconnect between IOMMU groups and domains. We have observed > devices under a pcie switch each having their own domain but assigned the > same group. > > One solution would be to fix the logic in find_or_alloc_domain to add > checks for the other conditions that a device may share a domain. However, > this duplicates code which the generic IOMMU code implements. Instead this > issue can be fixed by allowing the allocation of default_domain on the > IOMMU group. This is not currently supported as the intel driver does not > allow allocation of domain type IOMMU_DOMAIN_DMA. > > Allowing allocation of DMA domains has the effect that the default_domain > is non NULL and is attached to a device when initialising. This delegates > the handling of domains to the generic IOMMU code. Once this is > implemented it is possible to remove the lazy allocation of domains > entirely. > > This patch implements DMA and identity domains to be allocated for > external management. As it isn’t known which device will be attached to a > domain, the dma domain is not initialised at alloc time. Instead it is > allocated when attached. As we may lose RMRR mappings when attaching a > device to a new domain, we also ensure these are mapped at attach time. > > This will likely conflict with the work done for auxiliary domains by > Baolu but the code to accommodate won’t change much. > > I had also started on a patch to remove find_or_alloc_domain and various > functions that call it but had issues with edge cases such as > iommu_prepare_isa that is doing domain operations at IOMMU init time. > > Cheers, > James. > > > --- > drivers/iommu/intel-iommu.c | 159 +--- > 1 file changed, 110 insertions(+), 49 deletions(-) > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > index 41a4b8808802..6437cb2e9b22 100644 > --- a/drivers/iommu/intel-iommu.c > +++ b/drivers/iommu/intel-iommu.c > @@ -351,6 +351,14 @@ static int hw_pass_through = 1; > /* si_domain contains mulitple devices */ > #define DOMAIN_FLAG_STATIC_IDENTITY (1 << 1) > > +/* Domain managed externally, don't cleanup if it isn't attached > + * to any devices. */ > +#define DOMAIN_FLAG_NO_CLEANUP (1 << 2) > + > +/* Set after domain initialisation. Used when allocating dma domains to > + * defer domain initialisation until it is attached to a device */ > +#define DOMAIN_FLAG_INITIALISED (1 << 4) > + > #define for_each_domain_iommu(idx, domain)\ > for (idx = 0; idx < g_num_of_iommus; idx++) \ > if (domain->iommu_refcnt[idx]) > @@ -624,6 +632,16 @@ static inline int domain_type_is_vm_or_si(struct > dmar_domain *domain) > DOMAIN_FLAG_STATIC_IDENTITY); > } > > +static inline int domain_managed_externally(struct dmar_domain *domain) > +{ > + return domain->flags & DOMAIN_FLAG_NO_CLEANUP; > +} > + > +static inline int domain_is_initialised(struct dmar_domain *domain) > +{ > + return domain->flags & DOMAIN_FLAG_INITIALISED; > +} > + > static inline int domain_pfn_supported(struct dmar_domain *domain, > unsigned long pfn) > { > @@ -1717,7 +1735,7 @@ static void disable_dmar_iommu(struct intel_iommu > *iommu) > > __dmar_remove_one_dev_info(info); > > - if (!domain_type_is_vm_or_si(domain)) { > + if (!domain_managed_externally(domain)) { > /* >* The domain_exit() function can't be called under >* device_domain_lock, as it takes this lock itself. > @@ -1951,6 +1969,7 @@ static int domain_init(struct dmar_domain *domain, > struct intel_iommu *iommu, > domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid); > if (!domain->pgd) > return -ENOMEM; > + domain->flags |= DOMAIN_FLAG_INITIALISED; > __iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE); > return 0; > } > @@ -3234,11 +3253,9 @@ static int copy_translation_tables(struct intel_iommu > *iommu) > static int __init init_dmars(void) > { >
[RFC v2] iommu/vt-d: Allow iommu_domain_alloc to allocate IOMMU_DOMAIN_DMA
Hey, There exists an issue in the logic used to determine domain association with devices. Currently the driver uses find_or_alloc_domain to either reuse an existing domain or allocate a new one if one isn’t found. Domains should be shared between all members of an IOMMU group as this is the minimum granularity that we can guarantee address space isolation. The intel IOMMU driver exposes pci_device_group in intel_iommu_ops as the function to call to determine the group of a device, this is implemented in the generic IOMMU code and checks: dma aliases, upstream pcie switch ACS, pci aliases, and pci function aliases. The find_or_alloc_domain code currently only uses dma aliases to determine if a domain is shared. This causes a disconnect between IOMMU groups and domains. We have observed devices under a pcie switch each having their own domain but assigned the same group. One solution would be to fix the logic in find_or_alloc_domain to add checks for the other conditions that a device may share a domain. However, this duplicates code which the generic IOMMU code implements. Instead this issue can be fixed by allowing the allocation of default_domain on the IOMMU group. This is not currently supported as the intel driver does not allow allocation of domain type IOMMU_DOMAIN_DMA. Allowing allocation of DMA domains has the effect that the default_domain is non NULL and is attached to a device when initialising. This delegates the handling of domains to the generic IOMMU code. Once this is implemented it is possible to remove the lazy allocation of domains entirely. This patch implements DMA and identity domains to be allocated for external management. As it isn’t known which device will be attached to a domain, the dma domain is not initialised at alloc time. Instead it is allocated when attached. As we may lose RMRR mappings when attaching a device to a new domain, we also ensure these are mapped at attach time. This will likely conflict with the work done for auxiliary domains by Baolu but the code to accommodate won’t change much. I had also started on a patch to remove find_or_alloc_domain and various functions that call it but had issues with edge cases such as iommu_prepare_isa that is doing domain operations at IOMMU init time. Cheers, James. --- drivers/iommu/intel-iommu.c | 159 +--- 1 file changed, 110 insertions(+), 49 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 41a4b8808802..6437cb2e9b22 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -351,6 +351,14 @@ static int hw_pass_through = 1; /* si_domain contains mulitple devices */ #define DOMAIN_FLAG_STATIC_IDENTITY(1 << 1) +/* Domain managed externally, don't cleanup if it isn't attached + * to any devices. */ +#define DOMAIN_FLAG_NO_CLEANUP (1 << 2) + +/* Set after domain initialisation. Used when allocating dma domains to + * defer domain initialisation until it is attached to a device */ +#define DOMAIN_FLAG_INITIALISED(1 << 4) + #define for_each_domain_iommu(idx, domain) \ for (idx = 0; idx < g_num_of_iommus; idx++) \ if (domain->iommu_refcnt[idx]) @@ -624,6 +632,16 @@ static inline int domain_type_is_vm_or_si(struct dmar_domain *domain) DOMAIN_FLAG_STATIC_IDENTITY); } +static inline int domain_managed_externally(struct dmar_domain *domain) +{ + return domain->flags & DOMAIN_FLAG_NO_CLEANUP; +} + +static inline int domain_is_initialised(struct dmar_domain *domain) +{ + return domain->flags & DOMAIN_FLAG_INITIALISED; +} + static inline int domain_pfn_supported(struct dmar_domain *domain, unsigned long pfn) { @@ -1717,7 +1735,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu) __dmar_remove_one_dev_info(info); - if (!domain_type_is_vm_or_si(domain)) { + if (!domain_managed_externally(domain)) { /* * The domain_exit() function can't be called under * device_domain_lock, as it takes this lock itself. @@ -1951,6 +1969,7 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu, domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid); if (!domain->pgd) return -ENOMEM; + domain->flags |= DOMAIN_FLAG_INITIALISED; __iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE); return 0; } @@ -3234,11 +3253,9 @@ static int copy_translation_tables(struct intel_iommu *iommu) static int __init init_dmars(void) { struct dmar_drhd_unit *drhd; - struct dmar_rmrr_unit *rmrr; bool copied_tables = false; - struct device *dev; struct intel_iommu *iommu; - int i, ret; + int ret; /* * for each
Re: [RFC] iommu/vt-d: Group and domain relationship
ment apply_resv_region() in VT-d driver and > move RMRR setup here. > - DMA map API, will use per group default domain instead of per device > private domain > > The change is rather pervasive so i am trying to set a balance for > short term functionality and complete clean up. Any ideas welcome. > Sounds good, thanks. I am eager to test the patch. >>> >>>>> So we can't just open the door but not cleanup the things right? >>>> A user of domain_alloc and attach_device is responsible for >>>> detaching a domain if it is no longer needed and calling >>>> domain_free. >>> >>> Currently DMA API calls get_valid_domain_for_dev() to retrieve a DMA >>> domain. If the domain has already been allocated, return directly. >>> Otherwise, allocate and initialize a new one for the device. Let's >>> call domains allocated by get_valid_domain_for_dev() as "A". >>> >>> If we open the door and allow another component to manage the DMA >>> domains through domain iommu_domain_alloc/free(). Let's call domains >>> allocated through iommu_domain_alloc() as "B". >>> >>> So how can we sync between A and B? >> >> I’m not sure we need to sync them. Domain A would be replaced >> entirely for a certain device with domain B by attach_device. As we >> saw above domain A would be freed if there are no more devices >> attached to it. Domain B would be managed by the user of the iommu >> api. If a device is detached from domain B, the find_or_alloc_domain >> logic will take over once more and allocate a fresh domain. >> > Agreed, it is more like switching domains when you switching APIs. But > the reserved region better be preserved. >> Cheers, >> James. >> >>> >>> Need to go through the code to find out more. >>> >>> Best regards, >>> Lu Baolu >>>> Cheers, >>>> James. >>>>> >>>>> I haven't spent time on details. So I cc'ed Jacob for corrections. >>>>> >>>>> Best regards, >>>>> Lu Baolu >>>>> >>>>>> Cheers, >>>>>> James. >>>>>> On Fri, Nov 2, 2018 at 2:43 AM Lu Baolu >>>>>> wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> On 10/30/18 10:18 PM, James Sewart via iommu wrote: >>>>>>>> Hey, >>>>>>>> >>>>>>>> I’ve been investigating the relationship between iommu groups >>>>>>>> and domains on our systems and have a few question. Why does >>>>>>>> the intel iommu code not allow allocating IOMMU_DOMAIN_DMA? >>>>>>>> Returning NULL when given this domain type has the side effect >>>>>>>> that the default_domain for an iommu group is not set, which, >>>>>>>> when using for e.g. dma_map_ops.map_page, means a domain is >>>>>>>> allocated per device. >>>>>>> >>>>>>> Intel vt-d driver doesn't implement the default domain and >>>>>>> allocates domain only on demanded. There are lots of things to >>>>>>> do before we allow iommu API to allocate domains other than >>>>>>> IOMMU_DOMAIN_UNMANAGED. >>>>>>> >>>>>>> Best regards, >>>>>>> Lu Baolu >>>>>>> >>>>>>>> >>>>>>>> This seems to be the opposite behaviour to the AMD iommu code >>>>>>>> which supports allocating an IOMMU_DOMAIN_DMA and will only >>>>>>>> look to the iommu group if a domain is not attached to the >>>>>>>> device rather than allocating a new one. On AMD every device >>>>>>>> in an iommu group will share the same domain. >>>>>>>> >>>>>>>> Appended is what I think a patch to implement domain_alloc for >>>>>>>> IOMMU_DOMAIN_DMA and also IOMMU_DOMAIN_IDENTITY would look >>>>>>>> like. Testing shows each device in a group will share a domain >>>>>>>> by default, it also allows allocating a new dma domain that >>>>>>>> can be successfully attached to a group with >>>>>>>> iommu_attach_group. >>>>>
Re: [RFC] iommu/vt-d: Group and domain relationship
Hey Yi, > On 9 Nov 2018, at 06:54, Liu, Yi L wrote: > > Hi James, > > Regards to the relationship of iommu group and domain, the blog written by > Alex > may help you. The blog explained very well on how iommu group is determined > and > why. > > http://vfio.blogspot.com/2014/08/iommu-groups-inside-and-out.html Thanks for the link, this explains how I expected the group domain relationship to work. > >> From: iommu-boun...@lists.linux-foundation.org [mailto:iommu- >> boun...@lists.linux-foundation.org] On Behalf Of James Sewart via iommu >> Sent: Thursday, November 8, 2018 7:30 PM >> Subject: Re: [RFC] iommu/vt-d: Group and domain relationship >> >> Hey, >> >>> On 8 Nov 2018, at 01:42, Lu Baolu wrote: >>> >>> Hi, >>> >>> On 11/8/18 1:55 AM, James Sewart wrote: >>>> Hey, >>>>> On 7 Nov 2018, at 02:10, Lu Baolu wrote: >>>>> >>>>> Hi, >>>>> >>>>> On 11/6/18 6:40 PM, James Sewart wrote: >>>>>> Hey Lu, >>>>>> Would you be able to go into more detail about the issues with > > [...] > >>>>> >>>>> Why do we want to open this door? Probably we want the generic iommu >>>>> layer to handle these things (it's called default domain). >>>> I’d like to allocate a domain and attach it to multiple devices in a >>>> group/multiple groups so that they share address translation, but still >>>> allow drivers for devices in those groups to use the dma_map_ops api. >>> >>> Just out of curiosity, why do you want to share a single domain across >>> multiple groups? By default, the groups and DMA domains are normally >>> 1-1 mapped, right? >> >> Currently we see each device in a group with their own domain. >> find_or_alloc_domain looks at dma aliases to determine who shares domains >> whereas pci_device_group in the generic iommu code determines groups using >> a few other checks. We have observed internally that devices under a pcie >> switch will be put in the same group but they do not share a domain within >> that group. > > Really? iommu group is DMA isolation unit. You said they are not sharing a > domain. > Do you mean they have different IOVA address space by mentioning they are not > sharing a domain? Is there any special things(e.g. special IOVA allocation to > avoid > unexpected P2P) done on your system? Normally, devices within an iommu group > should share an IOVA address space. We see a unique set of mappings for each device within a group. Adding debug output to the kernel shows find_or_alloc_domain allocating a new domain for each device in a group. We don’t have any special configuration with regard to IOVA allocation. I think the issue is that find_or_alloc_domain only uses pci_for_each_dma_alias to determine domain sharing. This doesn’t check the configuration of upstream pcie-pcie switches and doesn’t look to see if ACS is enabled or disabled. pci_device_group in drivers/iommu/iommu.c, which is used to determine which group a device is assigned, performs an extra walk of the upstream buses to see if there exists a bus without ACS which would allow peer-to-peer dma and groups the device accordingly. One solution would be to allow the generic iommu code to assign domains, which it does based on the device_group function of the iommu_ops. The reason it doesn’t currently is that the default domain type is IOMMU_DOMAIN_DMA which the intel driver currently disallows allocating via the domain_alloc api. Cheers, James. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC] iommu/vt-d: Group and domain relationship
Hey, > On 8 Nov 2018, at 01:42, Lu Baolu wrote: > > Hi, > > On 11/8/18 1:55 AM, James Sewart wrote: >> Hey, >>> On 7 Nov 2018, at 02:10, Lu Baolu wrote: >>> >>> Hi, >>> >>> On 11/6/18 6:40 PM, James Sewart wrote: >>>> Hey Lu, >>>> Would you be able to go into more detail about the issues with >>>> allowing IOMMU_DOMAIN_DMA to be allocated via domain_alloc? >>> >>> This door is closed because intel iommu driver does everything for >>> IOMMU_DOMAIN_DMA: allocating a domain and setup the context entries >>> for the domain. >> As far as I can tell, attach_device in the intel driver will handle >> cleaning up any old domain context mapping and ensure the new domain is >> mapped with calls to dmar_remove_one_dev_info and domain_add_dev_info. > > That's only for domains of IOMMU_DOMAIN_UNMANAGED, right? attach_device has logic for cleaning up domains that are allocated by the intel driver. old_domain = find_domain(dev); if (old_domain) { rcu_read_lock(); dmar_remove_one_dev_info(old_domain, dev); rcu_read_unlock(); if (!domain_type_is_vm_or_si(old_domain) && list_empty(_domain->devices)) domain_exit(old_domain); } This is checking the type of the old domain only, freeing it if is not attached to any devices. Looking at this now, maybe the solution would be to distinguish between internally allocated dma domains and dma domains allocated via the external api, so we avoid freeing a domain a driver has reference to. >>> >>> Why do we want to open this door? Probably we want the generic iommu >>> layer to handle these things (it's called default domain). >> I’d like to allocate a domain and attach it to multiple devices in a >> group/multiple groups so that they share address translation, but still >> allow drivers for devices in those groups to use the dma_map_ops api. > > Just out of curiosity, why do you want to share a single domain across > multiple groups? By default, the groups and DMA domains are normally > 1-1 mapped, right? Currently we see each device in a group with their own domain. find_or_alloc_domain looks at dma aliases to determine who shares domains whereas pci_device_group in the generic iommu code determines groups using a few other checks. We have observed internally that devices under a pcie switch will be put in the same group but they do not share a domain within that group. Getting every device within a group to share a domain would get us 90% of the way to what we want. But we have some configurations where there exist devices put in other groups that we want to share translations with. > >>> So we can't just open the door but not cleanup the things right? >> A user of domain_alloc and attach_device is responsible for detaching a >> domain if it is no longer needed and calling domain_free. > > Currently DMA API calls get_valid_domain_for_dev() to retrieve a DMA > domain. If the domain has already been allocated, return directly. > Otherwise, allocate and initialize a new one for the device. Let's call > domains allocated by get_valid_domain_for_dev() as "A". > > If we open the door and allow another component to manage the DMA > domains through domain iommu_domain_alloc/free(). Let's call domains > allocated through iommu_domain_alloc() as "B". > > So how can we sync between A and B? I’m not sure we need to sync them. Domain A would be replaced entirely for a certain device with domain B by attach_device. As we saw above domain A would be freed if there are no more devices attached to it. Domain B would be managed by the user of the iommu api. If a device is detached from domain B, the find_or_alloc_domain logic will take over once more and allocate a fresh domain. Cheers, James. > > Need to go through the code to find out more. > > Best regards, > Lu Baolu >> Cheers, >> James. >>> >>> I haven't spent time on details. So I cc'ed Jacob for corrections. >>> >>> Best regards, >>> Lu Baolu >>> >>>> Cheers, >>>> James. >>>> On Fri, Nov 2, 2018 at 2:43 AM Lu Baolu wrote: >>>>> >>>>> Hi, >>>>> >>>>> On 10/30/18 10:18 PM, James Sewart via iommu wrote: >>>>>> Hey, >>>>>> >>>>>> I’ve been investigating the relationship between iommu groups and domains >>>>>> on our systems and have a few question. Why
Re: [RFC] iommu/vt-d: Group and domain relationship
Hey, > On 7 Nov 2018, at 02:10, Lu Baolu wrote: > > Hi, > > On 11/6/18 6:40 PM, James Sewart wrote: >> Hey Lu, >> Would you be able to go into more detail about the issues with >> allowing IOMMU_DOMAIN_DMA to be allocated via domain_alloc? > > This door is closed because intel iommu driver does everything for > IOMMU_DOMAIN_DMA: allocating a domain and setup the context entries > for the domain. As far as I can tell, attach_device in the intel driver will handle cleaning up any old domain context mapping and ensure the new domain is mapped with calls to dmar_remove_one_dev_info and domain_add_dev_info. > > Why do we want to open this door? Probably we want the generic iommu > layer to handle these things (it's called default domain). I’d like to allocate a domain and attach it to multiple devices in a group/multiple groups so that they share address translation, but still allow drivers for devices in those groups to use the dma_map_ops api. > So we can't just open the door but not cleanup the things right? A user of domain_alloc and attach_device is responsible for detaching a domain if it is no longer needed and calling domain_free. Cheers, James. > > I haven't spent time on details. So I cc'ed Jacob for corrections. > > Best regards, > Lu Baolu > >> Cheers, >> James. >> On Fri, Nov 2, 2018 at 2:43 AM Lu Baolu wrote: >>> >>> Hi, >>> >>> On 10/30/18 10:18 PM, James Sewart via iommu wrote: >>>> Hey, >>>> >>>> I’ve been investigating the relationship between iommu groups and domains >>>> on our systems and have a few question. Why does the intel iommu code not >>>> allow allocating IOMMU_DOMAIN_DMA? Returning NULL when given this domain >>>> type has the side effect that the default_domain for an iommu group is not >>>> set, which, when using for e.g. dma_map_ops.map_page, means a domain is >>>> allocated per device. >>> >>> Intel vt-d driver doesn't implement the default domain and allocates >>> domain only on demanded. There are lots of things to do before we allow >>> iommu API to allocate domains other than IOMMU_DOMAIN_UNMANAGED. >>> >>> Best regards, >>> Lu Baolu >>> >>>> >>>> This seems to be the opposite behaviour to the AMD iommu code which >>>> supports allocating an IOMMU_DOMAIN_DMA and will only look to the iommu >>>> group if a domain is not attached to the device rather than allocating a >>>> new one. On AMD every device in an iommu group will share the same domain. >>>> >>>> Appended is what I think a patch to implement domain_alloc for >>>> IOMMU_DOMAIN_DMA and also IOMMU_DOMAIN_IDENTITY would look like. Testing >>>> shows each device in a group will share a domain by default, it also >>>> allows allocating a new dma domain that can be successfully attached to a >>>> group with iommu_attach_group. >>>> >>>> Looking for comment on why the behaviour is how it is currently and if >>>> there are any issues with the solution I’ve been testing. >>>> >>>> Cheers, >>>> James. >>>> >>>> >>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c >>>> index bff2abd6..3a58389f 100644 >>>> --- a/drivers/iommu/intel-iommu.c >>>> +++ b/drivers/iommu/intel-iommu.c >>>> @@ -5170,10 +5170,15 @@ static struct iommu_domain >>>> *intel_iommu_domain_alloc(unsigned type) >>>> struct dmar_domain *dmar_domain; >>>> struct iommu_domain *domain; >>>> >>>> - if (type != IOMMU_DOMAIN_UNMANAGED) >>>> + if (type == IOMMU_DOMAIN_UNMANAGED) >>>> + dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE); >>>> + else if(type == IOMMU_DOMAIN_DMA) >>>> + dmar_domain = alloc_domain(0); >>>> + else if(type == IOMMU_DOMAIN_IDENTITY) >>>> + dmar_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY); >>>> + else >>>> return NULL; >>>> >>>> - dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE); >>>> if (!dmar_domain) { >>>> pr_err("Can't allocate dmar_domain\n"); >>>> return NULL; >>>> @@ -5186,9 +5191,12 @@ static struct iommu_domain >>>> *intel_iommu_domain_alloc(unsigned type) >>>> domain_update_iommu_cap(dmar_do
Re: [RFC] iommu/vt-d: Group and domain relationship
Hey Lu, Would you be able to go into more detail about the issues with allowing IOMMU_DOMAIN_DMA to be allocated via domain_alloc? Cheers, James. On Fri, Nov 2, 2018 at 2:43 AM Lu Baolu wrote: > > Hi, > > On 10/30/18 10:18 PM, James Sewart via iommu wrote: > > Hey, > > > > I’ve been investigating the relationship between iommu groups and domains > > on our systems and have a few question. Why does the intel iommu code not > > allow allocating IOMMU_DOMAIN_DMA? Returning NULL when given this domain > > type has the side effect that the default_domain for an iommu group is not > > set, which, when using for e.g. dma_map_ops.map_page, means a domain is > > allocated per device. > > Intel vt-d driver doesn't implement the default domain and allocates > domain only on demanded. There are lots of things to do before we allow > iommu API to allocate domains other than IOMMU_DOMAIN_UNMANAGED. > > Best regards, > Lu Baolu > > > > > This seems to be the opposite behaviour to the AMD iommu code which > > supports allocating an IOMMU_DOMAIN_DMA and will only look to the iommu > > group if a domain is not attached to the device rather than allocating a > > new one. On AMD every device in an iommu group will share the same domain. > > > > Appended is what I think a patch to implement domain_alloc for > > IOMMU_DOMAIN_DMA and also IOMMU_DOMAIN_IDENTITY would look like. Testing > > shows each device in a group will share a domain by default, it also > > allows allocating a new dma domain that can be successfully attached to a > > group with iommu_attach_group. > > > > Looking for comment on why the behaviour is how it is currently and if > > there are any issues with the solution I’ve been testing. > > > > Cheers, > > James. > > > > > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > > index bff2abd6..3a58389f 100644 > > --- a/drivers/iommu/intel-iommu.c > > +++ b/drivers/iommu/intel-iommu.c > > @@ -5170,10 +5170,15 @@ static struct iommu_domain > > *intel_iommu_domain_alloc(unsigned type) > > struct dmar_domain *dmar_domain; > > struct iommu_domain *domain; > > > > - if (type != IOMMU_DOMAIN_UNMANAGED) > > + if (type == IOMMU_DOMAIN_UNMANAGED) > > + dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE); > > + else if(type == IOMMU_DOMAIN_DMA) > > + dmar_domain = alloc_domain(0); > > + else if(type == IOMMU_DOMAIN_IDENTITY) > > + dmar_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY); > > + else > > return NULL; > > > > - dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE); > > if (!dmar_domain) { > > pr_err("Can't allocate dmar_domain\n"); > > return NULL; > > @@ -5186,9 +5191,12 @@ static struct iommu_domain > > *intel_iommu_domain_alloc(unsigned type) > > domain_update_iommu_cap(dmar_domain); > > > > domain = _domain->domain; > > - domain->geometry.aperture_start = 0; > > - domain->geometry.aperture_end = __DOMAIN_MAX_ADDR(dmar_domain->gaw); > > - domain->geometry.force_aperture = true; > > + > > + if (type == IOMMU_DOMAIN_UNMANAGED) { > > + domain->geometry.aperture_start = 0; > > + domain->geometry.aperture_end = > > __DOMAIN_MAX_ADDR(dmar_domain->gaw); > > + domain->geometry.force_aperture = true; > > + } > > > > return domain; > > } > > ___ > > iommu mailing list > > iommu@lists.linux-foundation.org > > https://lists.linuxfoundation.org/mailman/listinfo/iommu > > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[RFC] iommu/vt-d: Group and domain relationship
Hey, I’ve been investigating the relationship between iommu groups and domains on our systems and have a few question. Why does the intel iommu code not allow allocating IOMMU_DOMAIN_DMA? Returning NULL when given this domain type has the side effect that the default_domain for an iommu group is not set, which, when using for e.g. dma_map_ops.map_page, means a domain is allocated per device. This seems to be the opposite behaviour to the AMD iommu code which supports allocating an IOMMU_DOMAIN_DMA and will only look to the iommu group if a domain is not attached to the device rather than allocating a new one. On AMD every device in an iommu group will share the same domain. Appended is what I think a patch to implement domain_alloc for IOMMU_DOMAIN_DMA and also IOMMU_DOMAIN_IDENTITY would look like. Testing shows each device in a group will share a domain by default, it also allows allocating a new dma domain that can be successfully attached to a group with iommu_attach_group. Looking for comment on why the behaviour is how it is currently and if there are any issues with the solution I’ve been testing. Cheers, James. diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index bff2abd6..3a58389f 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5170,10 +5170,15 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type) struct dmar_domain *dmar_domain; struct iommu_domain *domain; - if (type != IOMMU_DOMAIN_UNMANAGED) + if (type == IOMMU_DOMAIN_UNMANAGED) + dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE); + else if(type == IOMMU_DOMAIN_DMA) + dmar_domain = alloc_domain(0); + else if(type == IOMMU_DOMAIN_IDENTITY) + dmar_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY); + else return NULL; - dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE); if (!dmar_domain) { pr_err("Can't allocate dmar_domain\n"); return NULL; @@ -5186,9 +5191,12 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type) domain_update_iommu_cap(dmar_domain); domain = _domain->domain; - domain->geometry.aperture_start = 0; - domain->geometry.aperture_end = __DOMAIN_MAX_ADDR(dmar_domain->gaw); - domain->geometry.force_aperture = true; + + if (type == IOMMU_DOMAIN_UNMANAGED) { + domain->geometry.aperture_start = 0; + domain->geometry.aperture_end = __DOMAIN_MAX_ADDR(dmar_domain->gaw); + domain->geometry.force_aperture = true; + } return domain; } ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu