Re: [PATCH v6 2/3] PCI: Add parameter nr_devfns to pci_add_dma_alias

2019-12-11 Thread James Sewart via iommu


> On 10 Dec 2019, at 22:37, Bjorn Helgaas  wrote:
> 
> [+cc Joerg]
> 
> On Tue, Dec 03, 2019 at 03:43:53PM +, James Sewart wrote:
>> pci_add_dma_alias can now be used to create a dma alias for a range of
>> devfns.
>> 
>> Reviewed-by: Logan Gunthorpe 
>> Signed-off-by: James Sewart 
>> ---
>> drivers/pci/pci.c| 22 +-
>> drivers/pci/quirks.c | 14 +++---
>> include/linux/pci.h  |  2 +-
>> 3 files changed, 25 insertions(+), 13 deletions(-)
> 
> Heads up Joerg: I also updated drivers/iommu/amd_iommu.c (this is the
> one reported by the kbuild test robot) and removed the printk there
> that prints the same thing as the one in pci_add_dma_alias(), and I
> updated a PCI quirk that was merged after this patch was posted.
> 
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index d3c83248f3ce..dbb01aceafda 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -5857,7 +5857,8 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
>> /**
>>  * pci_add_dma_alias - Add a DMA devfn alias for a device
>>  * @dev: the PCI device for which alias is added
>> - * @devfn: alias slot and function
>> + * @devfn_from: alias slot and function
>> + * @nr_devfns: Number of subsequent devfns to alias
>>  *
>>  * This helper encodes an 8-bit devfn as a bit number in dma_alias_mask
>>  * which is used to program permissible bus-devfn source addresses for DMA
>> @@ -5873,8 +5874,13 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
>>  * cannot be left as a userspace activity).  DMA aliases should therefore
>>  * be configured via quirks, such as the PCI fixup header quirk.
>>  */
>> -void pci_add_dma_alias(struct pci_dev *dev, u8 devfn)
>> +void pci_add_dma_alias(struct pci_dev *dev, u8 devfn_from, unsigned nr_devfns)
>> {
>> +int devfn_to;
>> +
>> +nr_devfns = min(nr_devfns, (unsigned)MAX_NR_DEVFNS);
>> +devfn_to = devfn_from + nr_devfns - 1;
> 
> I made this look like:
> 
> +   devfn_to = min(devfn_from + nr_devfns - 1,
> +  (unsigned) MAX_NR_DEVFNS - 1);
> 
> so devfn_from=0xf0, nr_devfns=0x20 doesn't cause devfn_to to wrap
> around.
> 
> I did keep Logan's reviewed-by, so let me know if I broke something.

I think nr_devfns still needs updating, as it is also used for bitmap_set.
Although thinking about it now, we should limit the number of devfns to
alias to at most (MAX_NR_DEVFNS - devfn_from), so that we don't set bits
past the end of the bitmap:

	nr_devfns = min(nr_devfns, (unsigned) MAX_NR_DEVFNS - devfn_from);

I think with this change we won't need to clip devfn_to.

> 
>>  if (!dev->dma_alias_mask)
>>  dev->dma_alias_mask = bitmap_zalloc(MAX_NR_DEVFNS, GFP_KERNEL);
>>  if (!dev->dma_alias_mask) {
>> @@ -5882,9 +5888,15 @@ void pci_add_dma_alias(struct pci_dev *dev, u8 devfn)
>>  return;
>>  }
>> 
>> -set_bit(devfn, dev->dma_alias_mask);
>> -pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n",
>> - PCI_SLOT(devfn), PCI_FUNC(devfn));
>> +bitmap_set(dev->dma_alias_mask, devfn_from, nr_devfns);
>> +
>> +if (nr_devfns == 1)
>> +pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n",
>> +PCI_SLOT(devfn_from), PCI_FUNC(devfn_from));
>> +else if(nr_devfns > 1)
>> +pci_info(dev, "Enabling fixed DMA alias for devfn range from %02x.%d to %02x.%d\n",
>> +PCI_SLOT(devfn_from), PCI_FUNC(devfn_from),
>> +PCI_SLOT(devfn_to), PCI_FUNC(devfn_to));
>> }

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v6 3/3] PCI: Add DMA alias quirk for PLX PEX NTB

2019-12-03 Thread James Sewart via iommu
The PLX PEX NTB forwards DMA transactions using Requester IDs that
don't exist as PCI devices. The devfn of a transaction is used as an
index into a lookup table storing the origin of the transaction on the
other side of the bridge.

This patch aliases all possible devfns to the NTB device so that any
incoming transaction is governed by the mappings for the NTB.

Reviewed-by: Logan Gunthorpe 
Signed-off-by: James Sewart 
---
 drivers/pci/quirks.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 0f3f5afc73fd..3a67049ca630 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5315,6 +5315,21 @@ SWITCHTEC_QUIRK(0x8574);  /* PFXI 64XG3 */
 SWITCHTEC_QUIRK(0x8575);  /* PFXI 80XG3 */
 SWITCHTEC_QUIRK(0x8576);  /* PFXI 96XG3 */
 
+/*
+ * PLX NTB uses devfn proxy IDs to move TLPs between NT endpoints. These IDs
+ * are used to forward responses to the originator on the other side of the
+ * NTB. Alias all possible IDs to the NTB to permit access when the IOMMU is
+ * turned on.
+ */
+static void quirk_plx_ntb_dma_alias(struct pci_dev *pdev)
+{
+   pci_info(pdev, "Setting PLX NTB proxy ID aliases\n");
+   /* PLX NTB may use all 256 devfns */
+   pci_add_dma_alias(pdev, 0, 256);
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b0, quirk_plx_ntb_dma_alias);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b1, quirk_plx_ntb_dma_alias);
+
 /*
  * On Lenovo Thinkpad P50 SKUs with a Nvidia Quadro M1000M, the BIOS does
  * not always reset the secondary Nvidia GPU between reboots if the system
-- 
2.24.0



[PATCH v6 2/3] PCI: Add parameter nr_devfns to pci_add_dma_alias

2019-12-03 Thread James Sewart via iommu
pci_add_dma_alias() can now be used to create a DMA alias for a range of
devfns.

Reviewed-by: Logan Gunthorpe 
Signed-off-by: James Sewart 
---
 drivers/pci/pci.c| 22 +-
 drivers/pci/quirks.c | 14 +++---
 include/linux/pci.h  |  2 +-
 3 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d3c83248f3ce..dbb01aceafda 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5857,7 +5857,8 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
 /**
  * pci_add_dma_alias - Add a DMA devfn alias for a device
  * @dev: the PCI device for which alias is added
- * @devfn: alias slot and function
+ * @devfn_from: alias slot and function
+ * @nr_devfns: Number of subsequent devfns to alias
  *
  * This helper encodes an 8-bit devfn as a bit number in dma_alias_mask
  * which is used to program permissible bus-devfn source addresses for DMA
@@ -5873,8 +5874,13 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
  * cannot be left as a userspace activity).  DMA aliases should therefore
  * be configured via quirks, such as the PCI fixup header quirk.
  */
-void pci_add_dma_alias(struct pci_dev *dev, u8 devfn)
+void pci_add_dma_alias(struct pci_dev *dev, u8 devfn_from, unsigned nr_devfns)
 {
+   int devfn_to;
+
+   nr_devfns = min(nr_devfns, (unsigned)MAX_NR_DEVFNS);
+   devfn_to = devfn_from + nr_devfns - 1;
+
if (!dev->dma_alias_mask)
dev->dma_alias_mask = bitmap_zalloc(MAX_NR_DEVFNS, GFP_KERNEL);
if (!dev->dma_alias_mask) {
@@ -5882,9 +5888,15 @@ void pci_add_dma_alias(struct pci_dev *dev, u8 devfn)
return;
}
 
-   set_bit(devfn, dev->dma_alias_mask);
-   pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n",
-PCI_SLOT(devfn), PCI_FUNC(devfn));
+   bitmap_set(dev->dma_alias_mask, devfn_from, nr_devfns);
+
+   if (nr_devfns == 1)
+   pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n",
+   PCI_SLOT(devfn_from), PCI_FUNC(devfn_from));
+   else if(nr_devfns > 1)
+   pci_info(dev, "Enabling fixed DMA alias for devfn range from %02x.%d to %02x.%d\n",
+   PCI_SLOT(devfn_from), PCI_FUNC(devfn_from),
+   PCI_SLOT(devfn_to), PCI_FUNC(devfn_to));
 }
 
 bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 320255e5e8f8..0f3f5afc73fd 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3932,7 +3932,7 @@ int pci_dev_specific_reset(struct pci_dev *dev, int probe)
 static void quirk_dma_func0_alias(struct pci_dev *dev)
 {
if (PCI_FUNC(dev->devfn) != 0)
-   pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0));
+   pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0), 1);
 }
 
 /*
@@ -3946,7 +3946,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe476, quirk_dma_func0_alias);
 static void quirk_dma_func1_alias(struct pci_dev *dev)
 {
if (PCI_FUNC(dev->devfn) != 1)
-   pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1));
+   pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1), 1);
 }
 
 /*
@@ -4031,7 +4031,7 @@ static void quirk_fixed_dma_alias(struct pci_dev *dev)
 
id = pci_match_id(fixed_dma_alias_tbl, dev);
if (id)
-   pci_add_dma_alias(dev, id->driver_data);
+   pci_add_dma_alias(dev, id->driver_data, 1);
 }
 
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ADAPTEC2, 0x0285, quirk_fixed_dma_alias);
@@ -4073,9 +4073,9 @@ DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias);
  */
 static void quirk_mic_x200_dma_alias(struct pci_dev *pdev)
 {
-   pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0));
-   pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0));
-   pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3));
+   pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0), 1);
+   pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0), 1);
+   pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3), 1);
 }
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2260, quirk_mic_x200_dma_alias);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, quirk_mic_x200_dma_alias);
@@ -5273,7 +5273,7 @@ static void quirk_switchtec_ntb_dma_alias(struct pci_dev *pdev)
pci_dbg(pdev,
"Aliasing Partition %d Proxy ID %02x.%d\n",
pp, PCI_SLOT(devfn), PCI_FUNC(devfn));
-   pci_add_dma_alias(pdev, devfn);
+   pci_add_dma_alias(pdev, devfn, 1);
}
}
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 6481da29d667..e7a9c8c92a93 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2325,7 +2325,7 @@ static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev)
 

[PATCH v6 1/3] PCI: Fix off by one in dma_alias_mask allocation size

2019-12-03 Thread James Sewart via iommu
The number of possible devfns is 256; add a define for it and correct its uses.

Reviewed-by: Logan Gunthorpe 
Signed-off-by: James Sewart 
---
 drivers/pci/pci.c| 2 +-
 drivers/pci/search.c | 2 +-
 include/linux/pci.h  | 2 ++
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a97e2571a527..d3c83248f3ce 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5876,7 +5876,7 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
 void pci_add_dma_alias(struct pci_dev *dev, u8 devfn)
 {
if (!dev->dma_alias_mask)
-   dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL);
+   dev->dma_alias_mask = bitmap_zalloc(MAX_NR_DEVFNS, GFP_KERNEL);
if (!dev->dma_alias_mask) {
pci_warn(dev, "Unable to allocate DMA alias mask\n");
return;
diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index bade14002fd8..9e4dfae47252 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -43,7 +43,7 @@ int pci_for_each_dma_alias(struct pci_dev *pdev,
if (unlikely(pdev->dma_alias_mask)) {
u8 devfn;
 
-   for_each_set_bit(devfn, pdev->dma_alias_mask, U8_MAX) {
+   for_each_set_bit(devfn, pdev->dma_alias_mask, MAX_NR_DEVFNS) {
ret = fn(pdev, PCI_DEVID(pdev->bus->number, devfn),
 data);
if (ret)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 1a6cf19eac2d..6481da29d667 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -57,6 +57,8 @@
 #define PCI_DEVID(bus, devfn)	((((u16)(bus)) << 8) | (devfn))
 /* return bus from PCI devid = (((u16)bus_number) << 8) | devfn */
 #define PCI_BUS_NUM(x) (((x) >> 8) & 0xff)
+/* Number of possible devfns. devfns can be from 0.0 to 1f.7 inclusive */
+#define MAX_NR_DEVFNS 256
 
 /* pci_slot represents a physical slot */
 struct pci_slot {
-- 
2.24.0




[PATCH v5 2/3] PCI: Add parameter nr_devfns to pci_add_dma_alias

2019-11-29 Thread James Sewart via iommu
pci_add_dma_alias() can now be used to create a DMA alias for a range of
devfns.

Signed-off-by: James Sewart 
---
 drivers/pci/pci.c| 23 ++-
 drivers/pci/quirks.c | 14 +++---
 include/linux/pci.h  |  2 +-
 3 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 0a4449a30ace..f9800a610ca1 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5857,7 +5857,8 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
 /**
  * pci_add_dma_alias - Add a DMA devfn alias for a device
  * @dev: the PCI device for which alias is added
- * @devfn: alias slot and function
+ * @devfn_from: alias slot and function
+ * @nr_devfns: Number of subsequent devfns to alias
  *
  * This helper encodes an 8-bit devfn as a bit number in dma_alias_mask
  * which is used to program permissible bus-devfn source addresses for DMA
@@ -5873,8 +5874,14 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
  * cannot be left as a userspace activity).  DMA aliases should therefore
  * be configured via quirks, such as the PCI fixup header quirk.
  */
-void pci_add_dma_alias(struct pci_dev *dev, u8 devfn)
+void pci_add_dma_alias(struct pci_dev *dev, u8 devfn_from, unsigned nr_devfns)
 {
+   int devfn_to;
+
+   if (nr_devfns > U8_MAX+1)
+   nr_devfns = U8_MAX+1;
+   devfn_to = devfn_from + nr_devfns - 1;
+
if (!dev->dma_alias_mask)
dev->dma_alias_mask = bitmap_zalloc(U8_MAX+1, GFP_KERNEL);
if (!dev->dma_alias_mask) {
@@ -5882,9 +5889,15 @@ void pci_add_dma_alias(struct pci_dev *dev, u8 devfn)
return;
}
 
-   set_bit(devfn, dev->dma_alias_mask);
-   pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n",
-PCI_SLOT(devfn), PCI_FUNC(devfn));
+   bitmap_set(dev->dma_alias_mask, devfn_from, nr_devfns);
+
+   if (nr_devfns == 1)
+   pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n",
+   PCI_SLOT(devfn_from), PCI_FUNC(devfn_from));
+   else if (nr_devfns > 1)
+   pci_info(dev, "Enabling fixed DMA alias for devfn range from %02x.%d to %02x.%d\n",
+   PCI_SLOT(devfn_from), PCI_FUNC(devfn_from),
+   PCI_SLOT(devfn_to), PCI_FUNC(devfn_to));
 }
 
 bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 320255e5e8f8..0f3f5afc73fd 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3932,7 +3932,7 @@ int pci_dev_specific_reset(struct pci_dev *dev, int probe)
 static void quirk_dma_func0_alias(struct pci_dev *dev)
 {
if (PCI_FUNC(dev->devfn) != 0)
-   pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0));
+   pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0), 1);
 }
 
 /*
@@ -3946,7 +3946,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe476, quirk_dma_func0_alias);
 static void quirk_dma_func1_alias(struct pci_dev *dev)
 {
if (PCI_FUNC(dev->devfn) != 1)
-   pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1));
+   pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1), 1);
 }
 
 /*
@@ -4031,7 +4031,7 @@ static void quirk_fixed_dma_alias(struct pci_dev *dev)
 
id = pci_match_id(fixed_dma_alias_tbl, dev);
if (id)
-   pci_add_dma_alias(dev, id->driver_data);
+   pci_add_dma_alias(dev, id->driver_data, 1);
 }
 
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ADAPTEC2, 0x0285, quirk_fixed_dma_alias);
@@ -4073,9 +4073,9 @@ DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias);
  */
 static void quirk_mic_x200_dma_alias(struct pci_dev *pdev)
 {
-   pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0));
-   pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0));
-   pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3));
+   pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0), 1);
+   pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0), 1);
+   pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3), 1);
 }
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2260, quirk_mic_x200_dma_alias);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, quirk_mic_x200_dma_alias);
@@ -5273,7 +5273,7 @@ static void quirk_switchtec_ntb_dma_alias(struct pci_dev *pdev)
pci_dbg(pdev,
"Aliasing Partition %d Proxy ID %02x.%d\n",
pp, PCI_SLOT(devfn), PCI_FUNC(devfn));
-   pci_add_dma_alias(pdev, devfn);
+   pci_add_dma_alias(pdev, devfn, 1);
}
}
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 1a6cf19eac2d..84a8d4c2b24e 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2323,7 +2323,7 @@ static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev)
 }
 #endif
 
-void 

[PATCH v5 1/3] PCI: Fix off by one in dma_alias_mask allocation size

2019-11-29 Thread James Sewart via iommu
The number of possible devfns is 256, so the size of the bitmap
allocated should be U8_MAX+1.

Signed-off-by: James Sewart 
---
 drivers/pci/pci.c| 2 +-
 drivers/pci/search.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a97e2571a527..0a4449a30ace 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5876,7 +5876,7 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
 void pci_add_dma_alias(struct pci_dev *dev, u8 devfn)
 {
if (!dev->dma_alias_mask)
-   dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL);
+   dev->dma_alias_mask = bitmap_zalloc(U8_MAX+1, GFP_KERNEL);
if (!dev->dma_alias_mask) {
pci_warn(dev, "Unable to allocate DMA alias mask\n");
return;
diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index bade14002fd8..b3633af1743b 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -43,7 +43,7 @@ int pci_for_each_dma_alias(struct pci_dev *pdev,
if (unlikely(pdev->dma_alias_mask)) {
u8 devfn;
 
-   for_each_set_bit(devfn, pdev->dma_alias_mask, U8_MAX) {
+   for_each_set_bit(devfn, pdev->dma_alias_mask, U8_MAX+1) {
ret = fn(pdev, PCI_DEVID(pdev->bus->number, devfn),
 data);
if (ret)
-- 
2.24.0




[PATCH v5 3/3] PCI: Add DMA alias quirk for PLX PEX NTB

2019-11-29 Thread James Sewart via iommu
The PLX PEX NTB forwards DMA transactions using Requester IDs that
don't exist as PCI devices. The devfn of a transaction is used as an
index into a lookup table storing the origin of the transaction on the
other side of the bridge.

This patch aliases all possible devfns to the NTB device so that any
incoming transaction is governed by the mappings for the NTB.

Reviewed-by: Logan Gunthorpe 
Signed-off-by: James Sewart 
---
 drivers/pci/quirks.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 0f3f5afc73fd..3a67049ca630 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5315,6 +5315,21 @@ SWITCHTEC_QUIRK(0x8574);  /* PFXI 64XG3 */
 SWITCHTEC_QUIRK(0x8575);  /* PFXI 80XG3 */
 SWITCHTEC_QUIRK(0x8576);  /* PFXI 96XG3 */
 
+/*
+ * PLX NTB uses devfn proxy IDs to move TLPs between NT endpoints. These IDs
+ * are used to forward responses to the originator on the other side of the
+ * NTB. Alias all possible IDs to the NTB to permit access when the IOMMU is
+ * turned on.
+ */
+static void quirk_plx_ntb_dma_alias(struct pci_dev *pdev)
+{
+   pci_info(pdev, "Setting PLX NTB proxy ID aliases\n");
+   /* PLX NTB may use all 256 devfns */
+   pci_add_dma_alias(pdev, 0, 256);
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b0, quirk_plx_ntb_dma_alias);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b1, quirk_plx_ntb_dma_alias);
+
 /*
  * On Lenovo Thinkpad P50 SKUs with a Nvidia Quadro M1000M, the BIOS does
  * not always reset the secondary Nvidia GPU between reboots if the system
-- 
2.24.0



Re: [PATCH v4 1/2] PCI: Add parameter nr_devfns to pci_add_dma_alias

2019-11-29 Thread James Sewart via iommu


> On 29 Nov 2019, at 16:50, Logan Gunthorpe  wrote:
> 
> 
> 
> On 2019-11-29 5:49 a.m., James Sewart wrote:
>> pci_add_dma_alias can now be used to create a dma alias for a range of
>> devfns.
>> 
>> Signed-off-by: James Sewart 
>> ---
>> drivers/pci/pci.c| 23 ++-
>> drivers/pci/quirks.c | 14 +++---
>> include/linux/pci.h  |  2 +-
>> 3 files changed, 26 insertions(+), 13 deletions(-)
>> 
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index a97e2571a527..9b0e3481fe17 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -5857,7 +5857,8 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
>> /**
>>  * pci_add_dma_alias - Add a DMA devfn alias for a device
>>  * @dev: the PCI device for which alias is added
>> - * @devfn: alias slot and function
>> + * @devfn_from: alias slot and function
>> + * @nr_devfns: Number of subsequent devfns to alias
>>  *
>>  * This helper encodes an 8-bit devfn as a bit number in dma_alias_mask
>>  * which is used to program permissible bus-devfn source addresses for DMA
>> @@ -5873,8 +5874,14 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
>>  * cannot be left as a userspace activity).  DMA aliases should therefore
>>  * be configured via quirks, such as the PCI fixup header quirk.
>>  */
>> -void pci_add_dma_alias(struct pci_dev *dev, u8 devfn)
>> +void pci_add_dma_alias(struct pci_dev *dev, u8 devfn_from, unsigned nr_devfns)
>> {
>> +int devfn_to;
>> +
>> +if (nr_devfns > U8_MAX+1)
>> +nr_devfns = U8_MAX+1;
> 
> Why +1? That doesn't seem right to me….

U8_MAX is the largest value a u8 can represent (255), but that is one less
than the number of values it can represent (256). devfns can be 0.0 to 1f.7
inclusive (I think), so the number of possible devfns is 256. Thinking about
it, maybe the zalloc should be U8_MAX+1 too?

I might be wrong though, what do you reckon?

> 
>> +devfn_to = devfn_from + nr_devfns - 1;
>> +
>>  if (!dev->dma_alias_mask)
>>  dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL);
>>  if (!dev->dma_alias_mask) {
>> @@ -5882,9 +5889,15 @@ void pci_add_dma_alias(struct pci_dev *dev, u8 devfn)
>>  return;
>>  }
>> 
>> -set_bit(devfn, dev->dma_alias_mask);
>> -pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n",
>> - PCI_SLOT(devfn), PCI_FUNC(devfn));
>> +bitmap_set(dev->dma_alias_mask, devfn_from, nr_devfns);
>> +
>> +if (nr_devfns == 1)
>> +pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n",
>> +PCI_SLOT(devfn_from), PCI_FUNC(devfn_from));
>> +else if(nr_devfns > 1)
>> +pci_info(dev, "Enabling fixed DMA alias for devfn range from %02x.%d to %02x.%d\n",
>> +PCI_SLOT(devfn_from), PCI_FUNC(devfn_from),
>> +PCI_SLOT(devfn_to), PCI_FUNC(devfn_to));
>> }
>> 
>> bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2)
>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> index 320255e5e8f8..0f3f5afc73fd 100644
>> --- a/drivers/pci/quirks.c
>> +++ b/drivers/pci/quirks.c
>> @@ -3932,7 +3932,7 @@ int pci_dev_specific_reset(struct pci_dev *dev, int probe)
>> static void quirk_dma_func0_alias(struct pci_dev *dev)
>> {
>>  if (PCI_FUNC(dev->devfn) != 0)
>> -pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0));
>> +pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0), 1);
>> }
>> 
>> /*
>> @@ -3946,7 +3946,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe476, quirk_dma_func0_alias);
>> static void quirk_dma_func1_alias(struct pci_dev *dev)
>> {
>>  if (PCI_FUNC(dev->devfn) != 1)
>> -pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1));
>> +pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1), 1);
>> }
>> 
>> /*
>> @@ -4031,7 +4031,7 @@ static void quirk_fixed_dma_alias(struct pci_dev *dev)
>> 
>>  id = pci_match_id(fixed_dma_alias_tbl, dev);
>>  if (id)
>> -pci_add_dma_alias(dev, id->driver_data);
>> +pci_add_dma_alias(dev, id->driver_data, 1);
>> }
>> 
>> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ADAPTEC2, 0x0285, quirk_fixed_dma_alias);
>> @@ -4073,9 +4073,9 @@ DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias);
>>  */
>> static void quirk_mic_x200_dma_alias(struct pci_dev *pdev)
>> {
>> -pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0));
>> -pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0));
>> -pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3));
>> +pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0), 1);
>> +pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0), 1);
>> +pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3), 1);
>> }
>> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2260, quirk_mic_x200_dma_alias);
>> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, quirk_mic_x200_dma_alias);
>> @@ -5273,7 +5273,7 @@ 

[PATCH v4 2/2] PCI: Add DMA alias quirk for PLX PEX NTB

2019-11-29 Thread James Sewart via iommu
The PLX PEX NTB forwards DMA transactions using Requester IDs that
don't exist as PCI devices. The devfn of a transaction is used as an
index into a lookup table storing the origin of the transaction on the
other side of the bridge.

This patch aliases all possible devfns to the NTB device so that any
incoming transaction is governed by the mappings for the NTB.

Reviewed-by: Logan Gunthorpe 
Signed-off-by: James Sewart 
---
 drivers/pci/quirks.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 0f3f5afc73fd..3a67049ca630 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5315,6 +5315,21 @@ SWITCHTEC_QUIRK(0x8574);  /* PFXI 64XG3 */
 SWITCHTEC_QUIRK(0x8575);  /* PFXI 80XG3 */
 SWITCHTEC_QUIRK(0x8576);  /* PFXI 96XG3 */
 
+/*
+ * PLX NTB uses devfn proxy IDs to move TLPs between NT endpoints. These IDs
+ * are used to forward responses to the originator on the other side of the
+ * NTB. Alias all possible IDs to the NTB to permit access when the IOMMU is
+ * turned on.
+ */
+static void quirk_plx_ntb_dma_alias(struct pci_dev *pdev)
+{
+   pci_info(pdev, "Setting PLX NTB proxy ID aliases\n");
+   /* PLX NTB may use all 256 devfns */
+   pci_add_dma_alias(pdev, 0, 256);
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b0, quirk_plx_ntb_dma_alias);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b1, quirk_plx_ntb_dma_alias);
+
 /*
  * On Lenovo Thinkpad P50 SKUs with a Nvidia Quadro M1000M, the BIOS does
  * not always reset the secondary Nvidia GPU between reboots if the system
-- 
2.24.0



[PATCH v4 1/2] PCI: Add parameter nr_devfns to pci_add_dma_alias

2019-11-29 Thread James Sewart via iommu
pci_add_dma_alias() can now be used to create a DMA alias for a range of
devfns.

Signed-off-by: James Sewart 
---
 drivers/pci/pci.c| 23 ++-
 drivers/pci/quirks.c | 14 +++---
 include/linux/pci.h  |  2 +-
 3 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a97e2571a527..9b0e3481fe17 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5857,7 +5857,8 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
 /**
  * pci_add_dma_alias - Add a DMA devfn alias for a device
  * @dev: the PCI device for which alias is added
- * @devfn: alias slot and function
+ * @devfn_from: alias slot and function
+ * @nr_devfns: Number of subsequent devfns to alias
  *
  * This helper encodes an 8-bit devfn as a bit number in dma_alias_mask
  * which is used to program permissible bus-devfn source addresses for DMA
@@ -5873,8 +5874,14 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
  * cannot be left as a userspace activity).  DMA aliases should therefore
  * be configured via quirks, such as the PCI fixup header quirk.
  */
-void pci_add_dma_alias(struct pci_dev *dev, u8 devfn)
+void pci_add_dma_alias(struct pci_dev *dev, u8 devfn_from, unsigned nr_devfns)
 {
+   int devfn_to;
+
+   if (nr_devfns > U8_MAX+1)
+   nr_devfns = U8_MAX+1;
+   devfn_to = devfn_from + nr_devfns - 1;
+
if (!dev->dma_alias_mask)
dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL);
if (!dev->dma_alias_mask) {
@@ -5882,9 +5889,15 @@ void pci_add_dma_alias(struct pci_dev *dev, u8 devfn)
return;
}
 
-   set_bit(devfn, dev->dma_alias_mask);
-   pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n",
-PCI_SLOT(devfn), PCI_FUNC(devfn));
+   bitmap_set(dev->dma_alias_mask, devfn_from, nr_devfns);
+
+   if (nr_devfns == 1)
+   pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n",
+   PCI_SLOT(devfn_from), PCI_FUNC(devfn_from));
+   else if(nr_devfns > 1)
+   pci_info(dev, "Enabling fixed DMA alias for devfn range from %02x.%d to %02x.%d\n",
+   PCI_SLOT(devfn_from), PCI_FUNC(devfn_from),
+   PCI_SLOT(devfn_to), PCI_FUNC(devfn_to));
 }
 
 bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 320255e5e8f8..0f3f5afc73fd 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3932,7 +3932,7 @@ int pci_dev_specific_reset(struct pci_dev *dev, int probe)
 static void quirk_dma_func0_alias(struct pci_dev *dev)
 {
if (PCI_FUNC(dev->devfn) != 0)
-   pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0));
+   pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0), 1);
 }
 
 /*
@@ -3946,7 +3946,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe476, quirk_dma_func0_alias);
 static void quirk_dma_func1_alias(struct pci_dev *dev)
 {
if (PCI_FUNC(dev->devfn) != 1)
-   pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1));
+   pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1), 1);
 }
 
 /*
@@ -4031,7 +4031,7 @@ static void quirk_fixed_dma_alias(struct pci_dev *dev)
 
id = pci_match_id(fixed_dma_alias_tbl, dev);
if (id)
-   pci_add_dma_alias(dev, id->driver_data);
+   pci_add_dma_alias(dev, id->driver_data, 1);
 }
 
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ADAPTEC2, 0x0285, quirk_fixed_dma_alias);
@@ -4073,9 +4073,9 @@ DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias);
  */
 static void quirk_mic_x200_dma_alias(struct pci_dev *pdev)
 {
-   pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0));
-   pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0));
-   pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3));
+   pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0), 1);
+   pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0), 1);
+   pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3), 1);
 }
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2260, quirk_mic_x200_dma_alias);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, quirk_mic_x200_dma_alias);
@@ -5273,7 +5273,7 @@ static void quirk_switchtec_ntb_dma_alias(struct pci_dev *pdev)
pci_dbg(pdev,
"Aliasing Partition %d Proxy ID %02x.%d\n",
pp, PCI_SLOT(devfn), PCI_FUNC(devfn));
-   pci_add_dma_alias(pdev, devfn);
+   pci_add_dma_alias(pdev, devfn, 1);
}
}
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 1a6cf19eac2d..84a8d4c2b24e 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2323,7 +2323,7 @@ static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev)
 }
 #endif
 
-void 

[PATCH v3 2/2] PCI: Add DMA alias quirk for PLX PEX NTB

2019-11-27 Thread James Sewart via iommu
The PLX PEX NTB forwards DMA transactions using Requester IDs that
don't exist as PCI devices. The devfn of a transaction is used as an
index into a lookup table storing the origin of the transaction on the
other side of the bridge.

This patch aliases all possible devfns to the NTB device so that any
incoming transaction is governed by the mappings for the NTB.

Reviewed-by: Logan Gunthorpe 
Signed-off-by: James Sewart 
---
 drivers/pci/quirks.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 0f3f5afc73fd..3a67049ca630 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5315,6 +5315,21 @@ SWITCHTEC_QUIRK(0x8574);  /* PFXI 64XG3 */
 SWITCHTEC_QUIRK(0x8575);  /* PFXI 80XG3 */
 SWITCHTEC_QUIRK(0x8576);  /* PFXI 96XG3 */
 
+/*
+ * PLX NTB uses devfn proxy IDs to move TLPs between NT endpoints. These IDs
+ * are used to forward responses to the originator on the other side of the
+ * NTB. Alias all possible IDs to the NTB to permit access when the IOMMU is
+ * turned on.
+ */
+static void quirk_plx_ntb_dma_alias(struct pci_dev *pdev)
+{
+   pci_info(pdev, "Setting PLX NTB proxy ID aliases\n");
+   /* PLX NTB may use all 256 devfns */
+   pci_add_dma_alias(pdev, 0, 256);
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b0, quirk_plx_ntb_dma_alias);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b1, quirk_plx_ntb_dma_alias);
+
 /*
  * On Lenovo Thinkpad P50 SKUs with a Nvidia Quadro M1000M, the BIOS does
  * not always reset the secondary Nvidia GPU between reboots if the system
-- 
2.24.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 1/2] PCI: Add parameter nr_devfns to pci_add_dma_alias

2019-11-27 Thread James Sewart via iommu
pci_add_dma_alias() can now be used to create a DMA alias for a range of
devfns.

Signed-off-by: James Sewart 
---
 drivers/pci/pci.c| 21 -
 drivers/pci/quirks.c | 14 +++---
 include/linux/pci.h  |  2 +-
 3 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a97e2571a527..71dbabf5ee3d 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5857,7 +5857,8 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
 /**
  * pci_add_dma_alias - Add a DMA devfn alias for a device
  * @dev: the PCI device for which alias is added
- * @devfn: alias slot and function
+ * @devfn_from: alias slot and function
+ * @nr_devfns: Number of subsequent devfns to alias
  *
  * This helper encodes an 8-bit devfn as a bit number in dma_alias_mask
  * which is used to program permissible bus-devfn source addresses for DMA
@@ -5873,8 +5874,12 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
  * cannot be left as a userspace activity).  DMA aliases should therefore
  * be configured via quirks, such as the PCI fixup header quirk.
  */
-void pci_add_dma_alias(struct pci_dev *dev, u8 devfn)
+void pci_add_dma_alias(struct pci_dev *dev, u8 devfn_from, int nr_devfns)
 {
+   int devfn_to = devfn_from + nr_devfns - 1;
+
+   BUG_ON(nr_devfns < 1);
+
if (!dev->dma_alias_mask)
dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL);
if (!dev->dma_alias_mask) {
@@ -5882,9 +5887,15 @@ void pci_add_dma_alias(struct pci_dev *dev, u8 devfn)
return;
}
 
-   set_bit(devfn, dev->dma_alias_mask);
-   pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n",
-PCI_SLOT(devfn), PCI_FUNC(devfn));
+   bitmap_set(dev->dma_alias_mask, devfn_from, nr_devfns);
+
+   if (nr_devfns == 1)
+   pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n",
+   PCI_SLOT(devfn_from), PCI_FUNC(devfn_from));
+   else
+   pci_info(dev, "Enabling fixed DMA alias for devfn range from %02x.%d to %02x.%d\n",
+   PCI_SLOT(devfn_from), PCI_FUNC(devfn_from),
+   PCI_SLOT(devfn_to), PCI_FUNC(devfn_to));
 }
 
 bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 320255e5e8f8..0f3f5afc73fd 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3932,7 +3932,7 @@ int pci_dev_specific_reset(struct pci_dev *dev, int probe)
 static void quirk_dma_func0_alias(struct pci_dev *dev)
 {
if (PCI_FUNC(dev->devfn) != 0)
-   pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0));
+   pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 0), 1);
 }
 
 /*
@@ -3946,7 +3946,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe476, quirk_dma_func0_alias);
 static void quirk_dma_func1_alias(struct pci_dev *dev)
 {
if (PCI_FUNC(dev->devfn) != 1)
-   pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1));
+   pci_add_dma_alias(dev, PCI_DEVFN(PCI_SLOT(dev->devfn), 1), 1);
 }
 
 /*
@@ -4031,7 +4031,7 @@ static void quirk_fixed_dma_alias(struct pci_dev *dev)
 
id = pci_match_id(fixed_dma_alias_tbl, dev);
if (id)
-   pci_add_dma_alias(dev, id->driver_data);
+   pci_add_dma_alias(dev, id->driver_data, 1);
 }
 
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ADAPTEC2, 0x0285, quirk_fixed_dma_alias);
@@ -4073,9 +4073,9 @@ DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias);
  */
 static void quirk_mic_x200_dma_alias(struct pci_dev *pdev)
 {
-   pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0));
-   pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0));
-   pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3));
+   pci_add_dma_alias(pdev, PCI_DEVFN(0x10, 0x0), 1);
+   pci_add_dma_alias(pdev, PCI_DEVFN(0x11, 0x0), 1);
+   pci_add_dma_alias(pdev, PCI_DEVFN(0x12, 0x3), 1);
 }
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2260, quirk_mic_x200_dma_alias);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, quirk_mic_x200_dma_alias);
@@ -5273,7 +5273,7 @@ static void quirk_switchtec_ntb_dma_alias(struct pci_dev *pdev)
pci_dbg(pdev,
"Aliasing Partition %d Proxy ID %02x.%d\n",
pp, PCI_SLOT(devfn), PCI_FUNC(devfn));
-   pci_add_dma_alias(pdev, devfn);
+   pci_add_dma_alias(pdev, devfn, 1);
}
}
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 1a6cf19eac2d..f51bdda49e9a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2323,7 +2323,7 @@ static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev)
 }
 #endif
 
-void pci_add_dma_alias(struct pci_dev *dev, u8 devfn);
+void pci_add_dma_alias(struct pci_dev *dev, u8 devfn_from, int nr_devfns);

[PATCH v3 2/2] PCI: Add DMA alias quirk for PLX PEX NTB

2019-11-26 Thread James Sewart via iommu
The PLX PEX NTB forwards DMA transactions using Requester IDs that
don't exist as PCI devices. The devfn for a transaction is used as an
index into a lookup table storing the origin of a transaction on the
other side of the bridge.

This patch aliases all possible devfns to the NTB device so that any
incoming transaction is governed by the mappings for the NTB.

Reviewed-by: Logan Gunthorpe 
Signed-off-by: James Sewart 
---
 drivers/pci/quirks.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 320255e5e8f8..1ed230f739a4 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5315,6 +5315,21 @@ SWITCHTEC_QUIRK(0x8574);  /* PFXI 64XG3 */
 SWITCHTEC_QUIRK(0x8575);  /* PFXI 80XG3 */
 SWITCHTEC_QUIRK(0x8576);  /* PFXI 96XG3 */
 
+/*
+ * PLX NTB uses devfn proxy IDs to move TLPs between NT endpoints. These IDs
+ * are used to forward responses to the originator on the other side of the
+ * NTB. Alias all possible IDs to the NTB to permit access when the IOMMU is
+ * turned on.
+ */
+static void quirk_plx_ntb_dma_alias(struct pci_dev *pdev)
+{
+   pci_info(pdev, "Setting PLX NTB proxy ID aliases\n");
+   /* PLX NTB may use all 256 devfns */
+   pci_add_dma_alias_range(pdev, 0, 256);
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b0, quirk_plx_ntb_dma_alias);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b1, quirk_plx_ntb_dma_alias);
+
 /*
  * On Lenovo Thinkpad P50 SKUs with a Nvidia Quadro M1000M, the BIOS does
  * not always reset the secondary Nvidia GPU between reboots if the system
-- 
2.24.0



[PATCH v3 1/2] PCI: Add helper pci_add_dma_alias_range

2019-11-26 Thread James Sewart via iommu
pci_add_dma_alias_range() can be used to create a DMA alias for a range of
devfns.

Reviewed-by: Logan Gunthorpe 
Signed-off-by: James Sewart 
---
 drivers/pci/pci.c   | 30 +++---
 include/linux/pci.h |  1 +
 2 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a97e2571a527..68339309c0f4 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5854,6 +5854,18 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
return 0;
 }
 
+int _pci_add_dma_alias_range(struct pci_dev *dev, u8 devfn_from, int len)
+{
+   if (!dev->dma_alias_mask)
+   dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL);
+   if (!dev->dma_alias_mask) {
+   pci_warn(dev, "Unable to allocate DMA alias mask\n");
+   return -ENOMEM;
+   }
+   bitmap_set(dev->dma_alias_mask, devfn_from, len);
+   return 0;
+}
+
 /**
  * pci_add_dma_alias - Add a DMA devfn alias for a device
  * @dev: the PCI device for which alias is added
@@ -5875,18 +5887,22 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
  */
 void pci_add_dma_alias(struct pci_dev *dev, u8 devfn)
 {
-   if (!dev->dma_alias_mask)
-   dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL);
-   if (!dev->dma_alias_mask) {
-   pci_warn(dev, "Unable to allocate DMA alias mask\n");
+   if (_pci_add_dma_alias_range(dev, devfn, 1) != 0)
return;
-   }
-
-   set_bit(devfn, dev->dma_alias_mask);
pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n",
 PCI_SLOT(devfn), PCI_FUNC(devfn));
 }
 
+void pci_add_dma_alias_range(struct pci_dev *dev, u8 devfn_from, int len)
+{
+   int devfn_to = devfn_from + len - 1;
+
+   if (_pci_add_dma_alias_range(dev, devfn_from, len) != 0)
+   return;
+   pci_info(dev, "Enabling fixed DMA alias for devfn range from %02x.%d to %02x.%d\n",
+PCI_SLOT(devfn_from), PCI_FUNC(devfn_from), PCI_SLOT(devfn_to), PCI_FUNC(devfn_to));
+}
+
 bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2)
 {
return (dev1->dma_alias_mask &&
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 1a6cf19eac2d..6765f3d0102b 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2324,6 +2324,7 @@ static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev)
 #endif
 
 void pci_add_dma_alias(struct pci_dev *dev, u8 devfn);
+void pci_add_dma_alias_range(struct pci_dev *dev, u8 devfn_from, int len);
 bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2);
 int pci_for_each_dma_alias(struct pci_dev *pdev,
   int (*fn)(struct pci_dev *pdev,
-- 
2.24.0



[PATCH v2] PCI: Add DMA alias quirk for PLX PEX NTB

2019-11-26 Thread James Sewart via iommu
The PLX PEX NTB forwards DMA transactions using Requester IDs that
don't exist as PCI devices. The devfn for a transaction is used as an
index into a lookup table storing the origin of a transaction on the
other side of the bridge.

Add helper pci_add_dma_alias_range that can alias a range of devfns in
dma_alias_mask.

This patch aliases all possible devfns to the NTB device so that any
incoming transaction is governed by the mappings for the NTB.

Signed-off-by: James Sewart 
---
 drivers/pci/pci.c| 29 ++---
 drivers/pci/quirks.c | 15 +++
 include/linux/pci.h  |  1 +
 3 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a97e2571a527..f502af1b5d10 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5854,6 +5854,18 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
return 0;
 }
 
+int _pci_add_dma_alias_range(struct pci_dev *dev, u8 devfn_from, int len)
+{
+   if (!dev->dma_alias_mask)
+   dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL);
+   if (!dev->dma_alias_mask) {
+   pci_warn(dev, "Unable to allocate DMA alias mask\n");
+   return -ENOMEM;
+   }
+   bitmap_set(dev->dma_alias_mask, devfn_from, len);
+   return 0;
+}
+
 /**
  * pci_add_dma_alias - Add a DMA devfn alias for a device
  * @dev: the PCI device for which alias is added
@@ -5875,18 +5887,21 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode,
  */
 void pci_add_dma_alias(struct pci_dev *dev, u8 devfn)
 {
-   if (!dev->dma_alias_mask)
-   dev->dma_alias_mask = bitmap_zalloc(U8_MAX, GFP_KERNEL);
-   if (!dev->dma_alias_mask) {
-   pci_warn(dev, "Unable to allocate DMA alias mask\n");
+   if (_pci_add_dma_alias_range(dev, devfn, 1) != 0)
return;
-   }
-
-   set_bit(devfn, dev->dma_alias_mask);
pci_info(dev, "Enabling fixed DMA alias to %02x.%d\n",
 PCI_SLOT(devfn), PCI_FUNC(devfn));
 }
 
+void pci_add_dma_alias_range(struct pci_dev *dev, u8 devfn_from, int len)
+{
+   int devfn_to = devfn_from + len - 1;
+   if (_pci_add_dma_alias_range(dev, devfn_from, len) != 0)
+   return;
+   pci_info(dev, "Enabling fixed DMA alias for devfn range from %02x.%d to %02x.%d\n",
+PCI_SLOT(devfn_from), PCI_FUNC(devfn_from), PCI_SLOT(devfn_to), PCI_FUNC(devfn_to));
+}
+
 bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2)
 {
return (dev1->dma_alias_mask &&
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 320255e5e8f8..1ed230f739a4 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5315,6 +5315,21 @@ SWITCHTEC_QUIRK(0x8574);  /* PFXI 64XG3 */
 SWITCHTEC_QUIRK(0x8575);  /* PFXI 80XG3 */
 SWITCHTEC_QUIRK(0x8576);  /* PFXI 96XG3 */
 
+/*
+ * PLX NTB uses devfn proxy IDs to move TLPs between NT endpoints. These IDs
+ * are used to forward responses to the originator on the other side of the
+ * NTB. Alias all possible IDs to the NTB to permit access when the IOMMU is
+ * turned on.
+ */
+static void quirk_plx_ntb_dma_alias(struct pci_dev *pdev)
+{
+   pci_info(pdev, "Setting PLX NTB proxy ID aliases\n");
+   /* PLX NTB may use all 256 devfns */
+   pci_add_dma_alias_range(pdev, 0, 256);
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b0, quirk_plx_ntb_dma_alias);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b1, quirk_plx_ntb_dma_alias);
+
 /*
  * On Lenovo Thinkpad P50 SKUs with a Nvidia Quadro M1000M, the BIOS does
  * not always reset the secondary Nvidia GPU between reboots if the system
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 1a6cf19eac2d..6765f3d0102b 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2324,6 +2324,7 @@ static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev)
 #endif
 
 void pci_add_dma_alias(struct pci_dev *dev, u8 devfn);
+void pci_add_dma_alias_range(struct pci_dev *dev, u8 devfn_from, int len);
 bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2);
 int pci_for_each_dma_alias(struct pci_dev *pdev,
   int (*fn)(struct pci_dev *pdev,
-- 
2.24.0



Re: [PATCH] Ensure pci transactions coming from PLX NTB are handled when IOMMU is turned on

2019-11-05 Thread James Sewart via iommu
Any comments on this?

Cheers,
James.

> On 24 Oct 2019, at 13:52, James Sewart  wrote:
> 
> The PLX PEX NTB forwards DMA transactions using Requester ID's that don't 
> exist as
> PCI devices. The devfn for a transaction is used as an index into a lookup 
> table
> storing the origin of a transaction on the other side of the bridge.
> 
> This patch aliases all possible devfn's to the NTB device so that any 
> transaction
> coming in is governed by the mappings for the NTB.
> 
> Signed-Off-By: James Sewart 
> ---
> drivers/pci/quirks.c | 22 ++
> 1 file changed, 22 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 320255e5e8f8..647f546e427f 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5315,6 +5315,28 @@ SWITCHTEC_QUIRK(0x8574);  /* PFXI 64XG3 */
> SWITCHTEC_QUIRK(0x8575);  /* PFXI 80XG3 */
> SWITCHTEC_QUIRK(0x8576);  /* PFXI 96XG3 */
> 
> +/*
> + * PLX NTB uses devfn proxy IDs to move TLPs between NT endpoints. These IDs
> + * are used to forward responses to the originator on the other side of the
> + * NTB. Alias all possible IDs to the NTB to permit access when the IOMMU is
> + * turned on.
> + */
> +static void quirk_PLX_NTB_dma_alias(struct pci_dev *pdev)
> +{
> + if (!pdev->dma_alias_mask)
> + pdev->dma_alias_mask = kcalloc(BITS_TO_LONGS(U8_MAX),
> +   sizeof(long), GFP_KERNEL);
> + if (!pdev->dma_alias_mask) {
> + dev_warn(&pdev->dev, "Unable to allocate DMA alias mask\n");
> + return;
> + }
> +
> + // PLX NTB may use all 256 devfns
> + memset(pdev->dma_alias_mask, U8_MAX, (U8_MAX+1)/BITS_PER_BYTE);
> +}
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b0, quirk_PLX_NTB_dma_alias);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b1, quirk_PLX_NTB_dma_alias);
> +
> /*
>  * On Lenovo Thinkpad P50 SKUs with a Nvidia Quadro M1000M, the BIOS does
>  * not always reset the secondary Nvidia GPU between reboots if the system
> -- 
> 2.19.1
> 
> 



[PATCH] Ensure pci transactions coming from PLX NTB are handled when IOMMU is turned on

2019-10-24 Thread James Sewart via iommu
The PLX PEX NTB forwards DMA transactions using Requester IDs that don't
exist as PCI devices. The devfn for a transaction is used as an index
into a lookup table storing the origin of a transaction on the other
side of the bridge.

This patch aliases all possible devfns to the NTB device so that any
incoming transaction is governed by the mappings for the NTB.

Signed-off-by: James Sewart 
---
 drivers/pci/quirks.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 320255e5e8f8..647f546e427f 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5315,6 +5315,28 @@ SWITCHTEC_QUIRK(0x8574);  /* PFXI 64XG3 */
 SWITCHTEC_QUIRK(0x8575);  /* PFXI 80XG3 */
 SWITCHTEC_QUIRK(0x8576);  /* PFXI 96XG3 */
 
+/*
+ * PLX NTB uses devfn proxy IDs to move TLPs between NT endpoints. These IDs
+ * are used to forward responses to the originator on the other side of the
+ * NTB. Alias all possible IDs to the NTB to permit access when the IOMMU is
+ * turned on.
+ */
+static void quirk_PLX_NTB_dma_alias(struct pci_dev *pdev)
+{
+   if (!pdev->dma_alias_mask)
+   pdev->dma_alias_mask = kcalloc(BITS_TO_LONGS(U8_MAX),
+ sizeof(long), GFP_KERNEL);
+   if (!pdev->dma_alias_mask) {
+   dev_warn(&pdev->dev, "Unable to allocate DMA alias mask\n");
+   return;
+   }
+
+   // PLX NTB may use all 256 devfns
+   memset(pdev->dma_alias_mask, U8_MAX, (U8_MAX+1)/BITS_PER_BYTE);
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b0, quirk_PLX_NTB_dma_alias);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_PLX, 0x87b1, quirk_PLX_NTB_dma_alias);
+
 /*
  * On Lenovo Thinkpad P50 SKUs with a Nvidia Quadro M1000M, the BIOS does
  * not always reset the secondary Nvidia GPU between reboots if the system
-- 
2.19.1




Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

2019-04-15 Thread James Sewart via iommu
Hey Lu,

> On 10 Apr 2019, at 06:22, Lu Baolu  wrote:
> 
> Hi James,
> 
> On 4/6/19 2:02 AM, James Sewart wrote:
>> Hey Lu,
>> My bad, did some debugging on my end. The issue was swapping out
>> find_domain for iommu_get_domain_for_dev. It seems in some situations the
>> domain is not attached to the group but the device is expected to have the
>> domain still stored in its archdata.
>> I’ve attached the final patch with find_domain unremoved which seems to
>> work in my testing.
> 
> Just looked into your v3 patch set and some thoughts from my end posted
> here just for your information.
> 
> Let me post the problems we want to address.
> 
> 1. When allocating a new group for a device, how should we determine the
> type of the default domain?
> 
> 2. If we need to put a device into an existing group which uses a
> different type of domain from what the device desires to use, we might
> break the functionality of the device.
> 
> My new thought is letting the iommu generic code to determine the
> default domain type (hence my proposed vendor specific default domain
> type patches could be dropped).
> 
> If the default domain type is dynamical mapping, and later in 
> iommu_no_mapping(), we determines that we must use an identity domain,
> we then call iommu_request_dm_for_dev(dev).
> 
> If the default domain type is identity mapping, and later in
> iommu_no_mapping(), we determined that we must use a dynamical domain,
> we then call iommu_request_dma_domain_for_dev(dev).
> 
> We already have iommu_request_dm_for_dev() in iommu.c. We only need to
> implement iommu_request_dma_domain_for_dev().
> 
> With this done, your patch titled "Create an IOMMU group for devices
> that require an identity map" could also be dropped.
> 
> Any thoughts?

There are a couple of issues I can think of. iommu_request_dm_for_dev 
changes the domain for all devices within the device's group; if we have 
devices with different domain requirements in the same group then only the 
last initialised device will get the domain it wants. If we want to add 
iommu_request_dma_domain_for_dev then we would still need another IOMMU 
group for identity-domain devices.

Both with v3 and the proposed method here, iommu_should_identity_map is 
determining whether the device requires an identity map. In v3 this is 
called once up front by the generic code to determine for the IOMMU group 
which domain type to use. In the proposed method I think this would happen 
after initial domain allocation and would trigger the domain to be 
replaced with a different domain.

I prefer the solution in v3. What do you think?

> 
>> Cheers,
>> James.
> 
> Best regards,
> Lu Baolu

Cheers,
James.


Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

2019-04-05 Thread James Sewart via iommu
Hey Lu,

My bad, did some debugging on my end. The issue was swapping out 
find_domain for iommu_get_domain_for_dev. It seems in some situations the 
domain is not attached to the group but the device is expected to have the 
domain still stored in its archdata.

I’ve attached the final patch with find_domain unremoved which seems to 
work in my testing.

Cheers,
James.



0009-iommu-vt-d-Remove-lazy-allocation-of-domains.patch
Description: Binary data



> On 4 Apr 2019, at 07:49, Lu Baolu  wrote:
> 
> Hi James,
> 
> I did a sanity test from my end. The test machine fails to boot. I
> haven't seen any valuable kernel log. It results in a purple screen.
> 
> Best regards,
> Lu Baolu
> 
> On 3/29/19 11:26 PM, James Sewart wrote:
>> Hey Lu,
>> I’ve attached a preliminary v3, if you could take a look and run some tests
>> that would be great.
>> Since v2 i’ve added your default domain type patches, the new device_group
>> handler, and incorporated Jacob’s feedback.
>>> On 28 Mar 2019, at 18:37, James Sewart  wrote:
>>> 
>>> Hey Lu,
>>> 
 On 26 Mar 2019, at 01:24, Lu Baolu  wrote:
 
 Hi James,
 
 On 3/25/19 8:57 PM, James Sewart wrote:
>>> Theres an issue that if we choose to alloc a new resv_region with type
>>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>>> to free this entry type which means refactoring the rmrr regions in
>>> get_resv_regions. Should this work be in this patchset?
>> Do you mean the rmrr regions are not allocated in get_resv_regions, but
>> are freed in put_resv_regions? I think we should fix this in this patch
>> set since this might impact the device passthrough if we don't do it.
> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
> freed in put_resv_regions. If we allocate a new resv_region with type
> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
> the static RMRR regions.
> Either the ISA region is static and not freed as with my implementation,
> or the RMRR regions are converted to be allocated on each call to
> get_resv_regions and freed in put_resv_regions.
 
 By the way, there's another way in my mind. Let's add a new region type
 for LPC devices, e.x. IOMMU_RESV_LPC, and then handle it in the same way
 as those MSI regions. Just FYI.
>>> 
>>> This solution would require adding some extra code to
>>> iommu_group_create_direct_mappings as currently only type
>>> IOMMU_RESV_DIRECT is identity mapped, other types are only reserved.
>>> 
>>> 
 
 Best regards,
 Lu Baolu
>>> 
>>> Cheers,
>>> James.
>> Cheers,
>> James.


Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

2019-03-29 Thread James Sewart via iommu
Hey Lu,

I’ve attached a preliminary v3, if you could take a look and run some tests 
that would be great.

Since v2 i’ve added your default domain type patches, the new device_group 
handler, and incorporated Jacob’s feedback.

> On 28 Mar 2019, at 18:37, James Sewart  wrote:
> 
> Hey Lu,
> 
>> On 26 Mar 2019, at 01:24, Lu Baolu  wrote:
>> 
>> Hi James,
>> 
>> On 3/25/19 8:57 PM, James Sewart wrote:
> Theres an issue that if we choose to alloc a new resv_region with type
> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
> to free this entry type which means refactoring the rmrr regions in
> get_resv_regions. Should this work be in this patchset?
 Do you mean the rmrr regions are not allocated in get_resv_regions, but
 are freed in put_resv_regions? I think we should fix this in this patch
 set since this might impact the device passthrough if we don't do it.
>>> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
>>> freed in put_resv_regions. If we allocate a new resv_region with type
>>> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
>>> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
>>> the static RMRR regions.
>>> Either the ISA region is static and not freed as with my implementation,
>>> or the RMRR regions are converted to be allocated on each call to
>>> get_resv_regions and freed in put_resv_regions.
>> 
>> By the way, there's another way in my mind. Let's add a new region type
>> for LPC devices, e.x. IOMMU_RESV_LPC, and then handle it in the same way
>> as those MSI regions. Just FYI.
> 
> This solution would require adding some extra code to 
> iommu_group_create_direct_mappings as currently only type 
> IOMMU_RESV_DIRECT is identity mapped, other types are only reserved.
> 
> 
>> 
>> Best regards,
>> Lu Baolu
> 
> Cheers,
> James.

Cheers,
James.



0001-iommu-Move-iommu_group_create_direct_mappings-to-aft.patch
Description: Binary data


0002-iommu-vt-d-Implement-apply_resv_region-for-reserving.patch
Description: Binary data


0003-iommu-vt-d-Expose-ISA-direct-mapping-region-via-iomm.patch
Description: Binary data


0004-iommu-vt-d-Create-an-IOMMU-group-for-devices-that-re.patch
Description: Binary data


0005-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch
Description: Binary data


0006-iommu-vt-d-Add-is_identity_map-ops-entry.patch
Description: Binary data


0007-iommu-vt-d-Enable-DMA-remapping-after-rmrr-mapped.patch
Description: Binary data


0008-iommu-vt-d-Allow-IOMMU_DOMAIN_DMA-to-be-allocated-by.patch
Description: Binary data


0009-iommu-vt-d-Remove-lazy-allocation-of-domains.patch
Description: Binary data

Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

2019-03-28 Thread James Sewart via iommu
Hey Lu,

> On 26 Mar 2019, at 01:24, Lu Baolu  wrote:
> 
> Hi James,
> 
> On 3/25/19 8:57 PM, James Sewart wrote:
 Theres an issue that if we choose to alloc a new resv_region with type
 IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
 to free this entry type which means refactoring the rmrr regions in
 get_resv_regions. Should this work be in this patchset?
>>> Do you mean the rmrr regions are not allocated in get_resv_regions, but
>>> are freed in put_resv_regions? I think we should fix this in this patch
>>> set since this might impact the device passthrough if we don't do it.
>> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
>> freed in put_resv_regions. If we allocate a new resv_region with type
>> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
>> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
>> the static RMRR regions.
>> Either the ISA region is static and not freed as with my implementation,
>> or the RMRR regions are converted to be allocated on each call to
>> get_resv_regions and freed in put_resv_regions.
> 
> By the way, there's another way in my mind. Let's add a new region type
> for LPC devices, e.x. IOMMU_RESV_LPC, and then handle it in the same way
> as those MSI regions. Just FYI.

This solution would require adding some extra code to 
iommu_group_create_direct_mappings as currently only type 
IOMMU_RESV_DIRECT is identity mapped, other types are only reserved.


> 
> Best regards,
> Lu Baolu

Cheers,
James.

Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

2019-03-25 Thread James Sewart via iommu
Hey Lu,

> On 25 Mar 2019, at 02:03, Lu Baolu  wrote:
> 
> Hi James,
> 
> On 3/22/19 5:57 PM, James Sewart wrote:
>> Hey Lu,
>>> On 15 Mar 2019, at 02:19, Lu Baolu  wrote:
>>> 
>>> Hi James,
>>> 
>>> On 3/14/19 7:58 PM, James Sewart wrote:
 To support mapping ISA region via iommu_group_create_direct_mappings,
 make sure its exposed by iommu_get_resv_regions. This allows
 deduplication of reserved region mappings
 Signed-off-by: James Sewart 
 ---
  drivers/iommu/intel-iommu.c | 42 +
  1 file changed, 33 insertions(+), 9 deletions(-)
 diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
 index 8e0a4e2ff77f..2e00e8708f06 100644
 --- a/drivers/iommu/intel-iommu.c
 +++ b/drivers/iommu/intel-iommu.c
 @@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units);
  #define for_each_rmrr_units(rmrr) \
	list_for_each_entry(rmrr, &dmar_rmrr_units, list)
  +static struct iommu_resv_region *isa_resv_region;
 +
  /* bitmap for indexing intel_iommus */
  static int g_num_of_iommus;
  @@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
  rmrr->end_address);
  }
  +static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
 +{
 +  if (!isa_resv_region)
 +  isa_resv_region = iommu_alloc_resv_region(0,
 +  16*1024*1024,
 +  0, IOMMU_RESV_DIRECT);
 +
 +  return isa_resv_region;
 +}
 +
  #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
 -static inline void iommu_prepare_isa(void)
 +static inline void iommu_prepare_isa(struct pci_dev *pdev)
  {
 -  struct pci_dev *pdev;
int ret;
 +  struct iommu_resv_region *reg = iommu_get_isa_resv_region();
  - pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
 -  if (!pdev)
 +  if (!reg)
return;
pr_info("Prepare 0-16MiB unity mapping for LPC\n");
 -  ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
 +  ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
 +  reg->start + reg->length - 1);
if (ret)
	pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
 -
 -  pci_dev_put(pdev);
  }
  #else
 -static inline void iommu_prepare_isa(void)
 +static inline void iommu_prepare_isa(struct pci_dev *pdev)
  {
return;
  }
 @@ -3289,6 +3299,7 @@ static int __init init_dmars(void)
struct dmar_rmrr_unit *rmrr;
bool copied_tables = false;
struct device *dev;
 +  struct pci_dev *pdev;
struct intel_iommu *iommu;
int i, ret;
  @@ -3469,7 +3480,11 @@ static int __init init_dmars(void)
}
}
  - iommu_prepare_isa();
 +  pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
 +  if (pdev) {
 +  iommu_prepare_isa(pdev);
 +  pci_dev_put(pdev);
 +  }
domains_done:
  @@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
struct iommu_resv_region *reg;
struct dmar_rmrr_unit *rmrr;
struct device *i_dev;
 +  struct pci_dev *pdev;
int i;
rcu_read_lock();
  @@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct device *device,
}
rcu_read_unlock();
  + if (dev_is_pci(device)) {
 +  pdev = to_pci_dev(device);
 +  if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
 +  reg = iommu_get_isa_resv_region();
 +  list_add_tail(&reg->list, head);
 +  }
 +  }
 +
>>> 
>>> Just wondering why not just
>>> 
>>> +#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
>>> +   if (dev_is_pci(device)) {
>>> +   pdev = to_pci_dev(device);
>>> +   if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
>>> +   reg = iommu_alloc_resv_region(0,
>>> +   16*1024*1024,
>>> +   0, IOMMU_RESV_DIRECT);
>>> +   if (reg)
>>> +   list_add_tail(&reg->list, head);
>>> +   }
>>> +   }
>>> +#endif
>>> 
>>> and, remove all other related code?
>> At this point in the patchset, if we remove iommu_prepare_isa then the ISA
>> region won’t be mapped to the device. Only once the dma domain is allocable
>> will the reserved regions be mapped by iommu_group_create_direct_mappings.
> 
> Yes. So if we put the allocation code here, it won't impact anything and
> will take effect as soon as the dma domain is allocatable.
> 
>> There’s an issue that if we choose to alloc a new resv_region with type
>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>> to free this entry type, which means refactoring the rmrr regions in
>> get_resv_regions. Should this work be in this patchset?

Re: [PATCH v2 7/7] iommu/vt-d: Remove lazy allocation of domains

2019-03-22 Thread James Sewart via iommu
Hey Jacob,

> On 14 Mar 2019, at 23:35, Jacob Pan  wrote:
> 
> On Thu, 14 Mar 2019 11:59:36 +
> James Sewart  wrote:
> 
>> 
>> -domain = get_valid_domain_for_dev(dev);
>> +domain = find_domain(dev);
>>  if (!domain)
>>  return DMA_MAPPING_ERROR;
>> 
>> @@ -3914,7 +3624,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nelems,
>>  if (iommu_no_mapping(dev))
>>  return intel_nontranslate_map_sg(dev, sglist, nelems, dir);
>> -domain = get_valid_domain_for_dev(dev);
>> +domain = find_domain(dev);
> This patchset looks like a very good clean up, I am wondering why we
> can't use the generic iommu_get_domain_for_dev() here, since VT-d has a
> default DMA domain after your patch.

This should be possible; the only downside is that we get an iommu_domain 
from iommu_get_domain_for_dev and will have to check it’s not NULL before 
getting the dmar_domain from it. We will be able to remove find_domain, 
though.

Cheers,
James.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.

2019-03-22 Thread James Sewart via iommu
Hey Lu,

> On 20 Mar 2019, at 01:26, Lu Baolu  wrote:
> 
> Hi James,
> 
> On 3/19/19 9:35 PM, James Sewart wrote:
>> Hey Lu,
>>> On 15 Mar 2019, at 03:13, Lu Baolu  wrote:
>>> 
>>> Hi James,
>>> 
>>> On 3/14/19 7:56 PM, James Sewart wrote:
 Patches 1 and 2 are the same as v1.
 v1-v2:
   Refactored ISA direct mappings to be returned by iommu_get_resv_regions.
   Integrated patch by Lu to defer turning on DMAR until iommu.c has mapped
 reserved regions.
   Integrated patches by Lu to remove more unused code in cleanup.
 Lu: I didn't integrate your patch to set the default domain type as it
 isn't directly related to the aim of this patchset. Instead patch 4
>>> 
>>> Without those patches, user experience will be affected and some devices
>>> will not work on Intel platforms anymore.
>>> 
>>> For a long time, Intel IOMMU driver has its own logic to determine
>>> whether a device requires an identity domain. For example, when user
>>> specifies "iommu=pt" in kernel parameter, all devices will be attached
>>> with the identity domain. Furthermore, some quirky devices require
>>> an identity domain to be used before enabling DMA remapping, otherwise,
>>> it will not work. This was done by adding quirk bits in Intel IOMMU
>>> driver.
>>> 
>>> So from my point of view, one way is porting all those quirks and kernel
>>> parameters into IOMMU generic layer, or opening a door for vendor IOMMU
>>> driver to determine the default domain type by their own. I prefer the
>>> latter option since it will not impact any behaviors on other
>>> architectures.
>> I see your point. I’m not confident that using the proposed door to set a
>> group's default domain has the desired behaviour. As discussed before, the
>> default domain type will be set based on the desired type for only the
>> first device attached to a group. I think to change the default domain
>> type you would need a slightly different door that wasn’t conditioned on
>> device.
> 
> I think this as another problem. Just a summarize for the ease of
> discussion. We saw two problems:
> 
> 1. When allocating a new group for a device, how should we determine the
> type of the default domain? This is what my proposal patches trying to
> address.

This will work as expected only if all devices within a group get the same 
result from is_identity_map. Otherwise we see issue 2.

> 
> 2. If we need to put a device into an existing group which uses a
> different type of domain from what the device desires to use, we might
> break the functionality of the device. For this problem I'd second your
> proposal below if I get your point correctly.
> 
>> For situations where individual devices require an identity domain because
>> of quirks then maybe calling is_identity_map per device in
>> iommu_group_get_for_dev is a better solution than the one I proposed.
> 
> Do you mean if we see a quirky device requires a different domain type
> other than the default domain type of the group, we will assign a new
> group to it? That looks good to me as far as I can see. I suppose this
> should be done in vt-d's ops callback.

I have thought about this a bit and I think the cleanest approach is to 
put devices that require an identity domain into their own group. This can 
be done in the device_group callback, avoiding any situation where we have 
to deal with creating a new group based on domain type in 
iommu_group_get_for_dev. That way we shouldn’t ever end up with multiple 
different domain types per group, and your patches will work as expected. 
See below for a possible implementation.

> 
> Best regards,
> Lu Baolu

Cheers,
James.

Devices that require an identity map because of quirks or other reasons
should be put in their own IOMMU group so as to not end up with multiple
different domains per group.

Signed-off-by: James Sewart 

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 3cb8b36abf50..0f5a121d31a0 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5421,6 +5421,22 @@ struct intel_iommu *intel_svm_device_to_iommu(struct 
device *dev)
 }
 #endif /* CONFIG_INTEL_IOMMU_SVM */

+static struct iommu_group *intel_identity_group;
+
+struct iommu_group *intel_iommu_pci_device_group(struct device *dev)
+{
+   if (iommu_no_mapping(dev)) {
+   if (!intel_identity_group) {
+   intel_identity_group = iommu_group_alloc();
+   if (IS_ERR(intel_identity_group))
+   return NULL;
+   }
+   return intel_identity_group;
+   }
+
+   return pci_device_group(dev);
+}
+
 const struct iommu_ops intel_iommu_ops = {
.capable= intel_iommu_capable,
.domain_alloc   = intel_iommu_domain_alloc,
@@ -5435,7 +5451,7 @@ const struct iommu_ops intel_iommu_ops = {
.get_resv_regions   = intel_iommu_get_resv_regions,
.put_resv_regions 

Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

2019-03-22 Thread James Sewart via iommu
Hey Lu,

> On 15 Mar 2019, at 02:19, Lu Baolu  wrote:
> 
> Hi James,
> 
> On 3/14/19 7:58 PM, James Sewart wrote:
>> To support mapping ISA region via iommu_group_create_direct_mappings,
>> make sure it's exposed by iommu_get_resv_regions. This allows
>> deduplication of reserved region mappings.
>> Signed-off-by: James Sewart 
>> ---
>>  drivers/iommu/intel-iommu.c | 42 +
>>  1 file changed, 33 insertions(+), 9 deletions(-)
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 8e0a4e2ff77f..2e00e8708f06 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units);
>>  #define for_each_rmrr_units(rmrr) \
>>  list_for_each_entry(rmrr, &dmar_rmrr_units, list)
>>  +static struct iommu_resv_region *isa_resv_region;
>> +
>>  /* bitmap for indexing intel_iommus */
>>  static int g_num_of_iommus;
>>  @@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct 
>> dmar_rmrr_unit *rmrr,
>>rmrr->end_address);
>>  }
>>  +static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
>> +{
>> +if (!isa_resv_region)
>> +isa_resv_region = iommu_alloc_resv_region(0,
>> +16*1024*1024,
>> +0, IOMMU_RESV_DIRECT);
>> +
>> +return isa_resv_region;
>> +}
>> +
>>  #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
>> -static inline void iommu_prepare_isa(void)
>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>>  {
>> -struct pci_dev *pdev;
>>  int ret;
>> +struct iommu_resv_region *reg = iommu_get_isa_resv_region();
>>  -   pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>> -if (!pdev)
>> +if (!reg)
>>  return;
>>  pr_info("Prepare 0-16MiB unity mapping for LPC\n");
>> -ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
>> +ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
>> +reg->start + reg->length - 1);
>>  if (ret)
>>  pr_err("Failed to create 0-16MiB identity map - floppy might 
>> not work\n");
>> -
>> -pci_dev_put(pdev);
>>  }
>>  #else
>> -static inline void iommu_prepare_isa(void)
>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>>  {
>>  return;
>>  }
>> @@ -3289,6 +3299,7 @@ static int __init init_dmars(void)
>>  struct dmar_rmrr_unit *rmrr;
>>  bool copied_tables = false;
>>  struct device *dev;
>> +struct pci_dev *pdev;
>>  struct intel_iommu *iommu;
>>  int i, ret;
>>  @@ -3469,7 +3480,11 @@ static int __init init_dmars(void)
>>  }
>>  }
>>  -   iommu_prepare_isa();
>> +pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>> +if (pdev) {
>> +iommu_prepare_isa(pdev);
>> +pci_dev_put(pdev);
>> +}
>>domains_done:
>>  @@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct 
>> device *device,
>>  struct iommu_resv_region *reg;
>>  struct dmar_rmrr_unit *rmrr;
>>  struct device *i_dev;
>> +struct pci_dev *pdev;
>>  int i;
>>  rcu_read_lock();
>> @@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct 
>> device *device,
>>  }
>>  rcu_read_unlock();
>>  +   if (dev_is_pci(device)) {
>> +pdev = to_pci_dev(device);
>> +if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
>> +reg = iommu_get_isa_resv_region();
>> +list_add_tail(&reg->list, head);
>> +}
>> +}
>> +
> 
> Just wondering why not just
> 
> +#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
> + if (dev_is_pci(device)) {
> + pdev = to_pci_dev(device);
> + if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
> + reg = iommu_alloc_resv_region(0,
> + 16*1024*1024,
> + 0, IOMMU_RESV_DIRECT);
> + if (reg)
> + list_add_tail(>list, head);
> + }
> + }
> +#endif
> 
> and, remove all other related code?

At this point in the patchset, if we remove iommu_prepare_isa then the ISA 
region won’t be mapped to the device. Only once the dma domain is allocable 
will the reserved regions be mapped by iommu_group_create_direct_mappings.

There’s an issue that if we choose to alloc a new resv_region with type 
IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions 
to free this entry type, which means refactoring the rmrr regions in 
get_resv_regions. Should this work be in this patchset?

> 
> Best regards,
> Lu Baolu

Cheers,
James.


Re: [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.

2019-03-19 Thread James Sewart via iommu
Hey Lu,

> On 15 Mar 2019, at 03:13, Lu Baolu  wrote:
> 
> Hi James,
> 
> On 3/14/19 7:56 PM, James Sewart wrote:
>> Patches 1 and 2 are the same as v1.
>> v1-v2:
>>   Refactored ISA direct mappings to be returned by iommu_get_resv_regions.
>>   Integrated patch by Lu to defer turning on DMAR until iommu.c has mapped
>> reserved regions.
>>   Integrated patches by Lu to remove more unused code in cleanup.
>> Lu: I didn't integrate your patch to set the default domain type as it
>> isn't directly related to the aim of this patchset. Instead patch 4
> 
> Without those patches, user experience will be affected and some devices
> will not work on Intel platforms anymore.
> 
> For a long time, Intel IOMMU driver has its own logic to determine
> whether a device requires an identity domain. For example, when user
> specifies "iommu=pt" in kernel parameter, all devices will be attached
> with the identity domain. Furthermore, some quirky devices require
> an identity domain to be used before enabling DMA remapping, otherwise,
> it will not work. This was done by adding quirk bits in Intel IOMMU
> driver.
> 
> So from my point of view, one way is porting all those quirks and kernel
> parameters into IOMMU generic layer, or opening a door for vendor IOMMU
> driver to determine the default domain type by their own. I prefer the
> latter option since it will not impact any behaviors on other
> architectures.

I see your point. I’m not confident that using the proposed door to set a 
group's default domain has the desired behaviour. As discussed before, the 
default domain type will be set based on the desired type for only the 
first device attached to a group. I think to change the default domain 
type you would need a slightly different door that wasn’t conditioned on 
device.

For situations where individual devices require an identity domain because 
of quirks, calling is_identity_map per device in iommu_group_get_for_dev 
may be a better solution than the one I proposed.

> 
>> addresses the issue of a device requiring an identity domain by ignoring
>> the domain param in attach_device and printing a warning.
> 
> This will not work as I commented in that thread.
> 
>> I booted some of our devices with this patchset and haven't seen any
>> issues. It doesn't look like we have any devices with RMRR's though so
>> those codepaths aren't tested.
>> James Sewart (7):
>>   iommu: Move iommu_group_create_direct_mappings to after device_attach
>>   iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
>>   iommu/vt-d: Expose ISA direct mapping region via
>> iommu_get_resv_regions
>>   iommu/vt-d: Ignore domain parameter in attach_device if device
>> requires identity map
>>   iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by iommu_ops
>>   iommu/vt-d: Remove lazy allocation of domains
>> Lu Baolu (1):
>>   iommu/vt-d: Enable DMA remapping after rmrr mapped
>>  drivers/iommu/intel-iommu.c | 444 +++-
>>  drivers/iommu/iommu.c   |   4 +-
>>  2 files changed, 131 insertions(+), 317 deletions(-)
> 
> Best regards,
> Lu Baolu

Cheers,
James.


[PATCH v2 6/7] iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by iommu_ops

2019-03-14 Thread James Sewart via iommu
Allowing IOMMU_DOMAIN_DMA type IOMMU domains to be allocated allows the
default_domain of an iommu_group to be set. This delegates device-domain
relationships to the generic IOMMU code.

Signed-off-by: James Sewart 
---
 drivers/iommu/intel-iommu.c | 99 +
 1 file changed, 78 insertions(+), 21 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 35821df70f78..2c9d793af394 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -309,6 +309,18 @@ static int hw_pass_through = 1;
 /* si_domain contains mulitple devices */
 #define DOMAIN_FLAG_STATIC_IDENTITY(1 << 1)
 
+/*
+ * Domain managed externally, don't cleanup if it isn't attached
+ * to any devices.
+ */
+#define DOMAIN_FLAG_MANAGED_EXTERNALLY (1 << 2)
+
+/*
+ * Set after domain initialisation. Used when allocating dma domains to
+ * defer domain initialisation until it is attached to a device
+ */
+#define DOMAIN_FLAG_INITIALISED(1 << 3)
+
 #define for_each_domain_iommu(idx, domain) \
for (idx = 0; idx < g_num_of_iommus; idx++) \
if (domain->iommu_refcnt[idx])
@@ -560,6 +572,16 @@ static inline int domain_type_is_vm_or_si(struct 
dmar_domain *domain)
DOMAIN_FLAG_STATIC_IDENTITY);
 }
 
+static inline int domain_managed_externally(struct dmar_domain *domain)
+{
+   return domain->flags & DOMAIN_FLAG_MANAGED_EXTERNALLY;
+}
+
+static inline int domain_is_initialised(struct dmar_domain *domain)
+{
+   return domain->flags & DOMAIN_FLAG_INITIALISED;
+}
+
 static inline int domain_pfn_supported(struct dmar_domain *domain,
   unsigned long pfn)
 {
@@ -1664,7 +1686,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
 
__dmar_remove_one_dev_info(info);
 
-   if (!domain_type_is_vm_or_si(domain)) {
+   if (!domain_managed_externally(domain)) {
/*
 * The domain_exit() function  can't be called under
 * device_domain_lock, as it takes this lock itself.
@@ -1897,6 +1919,7 @@ static int domain_init(struct dmar_domain *domain, struct 
intel_iommu *iommu,
domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
if (!domain->pgd)
return -ENOMEM;
+   domain->flags |= DOMAIN_FLAG_INITIALISED;
__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
return 0;
 }
@@ -1909,6 +1932,9 @@ static void domain_exit(struct dmar_domain *domain)
if (!domain)
return;
 
+   if (!domain_is_initialised(domain))
+   goto free_mem;
+
/* Remove associated devices and clear attached or cached domains */
rcu_read_lock();
domain_remove_dev_info(domain);
@@ -1921,6 +1947,7 @@ static void domain_exit(struct dmar_domain *domain)
 
dma_free_pagelist(freelist);
 
+free_mem:
free_domain_mem(domain);
 }
 
@@ -4585,7 +4612,7 @@ static int device_notifier(struct notifier_block *nb,
return 0;
 
dmar_remove_one_dev_info(domain, dev);
-   if (!domain_type_is_vm_or_si(domain) && list_empty(&domain->devices))
+   if (!domain_managed_externally(domain) && list_empty(&domain->devices))
domain_exit(domain);
 
return 0;
@@ -5039,6 +5066,7 @@ static int md_domain_init(struct dmar_domain *domain, int 
guest_width)
domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
if (!domain->pgd)
return -ENOMEM;
+   domain->flags |= DOMAIN_FLAG_INITIALISED;
domain_flush_cache(domain, domain->pgd, PAGE_SIZE);
return 0;
 }
@@ -5047,28 +5075,43 @@ static struct iommu_domain 
*intel_iommu_domain_alloc(unsigned type)
 {
struct dmar_domain *dmar_domain;
struct iommu_domain *domain;
+   int flags = DOMAIN_FLAG_MANAGED_EXTERNALLY;
 
-   if (type != IOMMU_DOMAIN_UNMANAGED)
-   return NULL;
+   switch (type) {
+   case IOMMU_DOMAIN_UNMANAGED:
+   flags |= DOMAIN_FLAG_VIRTUAL_MACHINE | DOMAIN_FLAG_INITIALISED;
+   dmar_domain = alloc_domain(flags);
+   if (!dmar_domain)
+   return NULL;
 
-   dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
-   if (!dmar_domain) {
-   pr_err("Can't allocate dmar_domain\n");
-   return NULL;
-   }
-   if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
-   pr_err("Domain initialization failed\n");
-   domain_exit(dmar_domain);
+   if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
+   pr_err("Domain initialization failed\n");
+   domain_exit(dmar_domain);
+   return NULL;
+   }
+   domain_update_iommu_cap(dmar_domain);
+  

[PATCH v2 7/7] iommu/vt-d: Remove lazy allocation of domains

2019-03-14 Thread James Sewart via iommu
The generic IOMMU code will allocate and attach a default domain to each
device that comes online. This patch removes the lazy domain allocation
and early reserved region mapping that is now redundant.

Signed-off-by: James Sewart 
---
 drivers/iommu/intel-iommu.c | 300 +---
 1 file changed, 5 insertions(+), 295 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 2c9d793af394..f8c0c3e16935 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2605,118 +2605,6 @@ static struct dmar_domain 
*dmar_insert_one_dev_info(struct intel_iommu *iommu,
return domain;
 }
 
-static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
-{
-   *(u16 *)opaque = alias;
-   return 0;
-}
-
-static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
-{
-   struct device_domain_info *info = NULL;
-   struct dmar_domain *domain = NULL;
-   struct intel_iommu *iommu;
-   u16 dma_alias;
-   unsigned long flags;
-   u8 bus, devfn;
-
-   iommu = device_to_iommu(dev, &bus, &devfn);
-   if (!iommu)
-   return NULL;
-
-   if (dev_is_pci(dev)) {
-   struct pci_dev *pdev = to_pci_dev(dev);
-
-   pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
-   spin_lock_irqsave(&device_domain_lock, flags);
-   info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
- PCI_BUS_NUM(dma_alias),
- dma_alias & 0xff);
-   if (info) {
-   iommu = info->iommu;
-   domain = info->domain;
-   }
-   spin_unlock_irqrestore(&device_domain_lock, flags);
-
-   /* DMA alias already has a domain, use it */
-   if (info)
-   goto out;
-   }
-
-   /* Allocate and initialize new domain for the device */
-   domain = alloc_domain(0);
-   if (!domain)
-   return NULL;
-   if (domain_init(domain, iommu, gaw)) {
-   domain_exit(domain);
-   return NULL;
-   }
-
-out:
-
-   return domain;
-}
-
-static struct dmar_domain *set_domain_for_dev(struct device *dev,
- struct dmar_domain *domain)
-{
-   struct intel_iommu *iommu;
-   struct dmar_domain *tmp;
-   u16 req_id, dma_alias;
-   u8 bus, devfn;
-
-   iommu = device_to_iommu(dev, &bus, &devfn);
-   if (!iommu)
-   return NULL;
-
-   req_id = ((u16)bus << 8) | devfn;
-
-   if (dev_is_pci(dev)) {
-   struct pci_dev *pdev = to_pci_dev(dev);
-
-   pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
-   /* register PCI DMA alias device */
-   if (req_id != dma_alias) {
-   tmp = dmar_insert_one_dev_info(iommu, 
PCI_BUS_NUM(dma_alias),
-   dma_alias & 0xff, NULL, domain);
-
-   if (!tmp || tmp != domain)
-   return tmp;
-   }
-   }
-
-   tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
-   if (!tmp || tmp != domain)
-   return tmp;
-
-   return domain;
-}
-
-static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
-{
-   struct dmar_domain *domain, *tmp;
-
-   domain = find_domain(dev);
-   if (domain)
-   goto out;
-
-   domain = find_or_alloc_domain(dev, gaw);
-   if (!domain)
-   goto out;
-
-   tmp = set_domain_for_dev(dev, domain);
-   if (!tmp || domain != tmp) {
-   domain_exit(domain);
-   domain = tmp;
-   }
-
-out:
-
-   return domain;
-}
-
 static int iommu_domain_identity_map(struct dmar_domain *domain,
 unsigned long long start,
 unsigned long long end)
@@ -2742,73 +2630,6 @@ static int iommu_domain_identity_map(struct dmar_domain 
*domain,
DMA_PTE_READ|DMA_PTE_WRITE);
 }
 
-static int domain_prepare_identity_map(struct device *dev,
-  struct dmar_domain *domain,
-  unsigned long long start,
-  unsigned long long end)
-{
-   /* For _hardware_ passthrough, don't bother. But for software
-  passthrough, we do it anyway -- it may indicate a memory
-  range which is reserved in E820, so which didn't get set
-  up to start with in si_domain */
-   if (domain == si_domain && hw_pass_through) {
-   pr_warn("Ignoring identity map for HW passthrough device %s 
[0x%Lx - 0x%Lx]\n",
-   dev_name(dev), start, end);
-   return 0;
-   }
-
-   

[PATCH v2 5/7] iommu/vt-d: Enable DMA remapping after rmrr mapped

2019-03-14 Thread James Sewart via iommu
The rmrr devices require identity map of the rmrr regions before
enabling DMA remapping. Otherwise, there will be a window during
which DMA from/to the rmrr regions will be blocked. In order to
alleviate this, we move enabling DMA remapping after all rmrr
regions get mapped.

Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel-iommu.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 104d36f225a7..35821df70f78 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3518,11 +3518,6 @@ static int __init init_dmars(void)
ret = dmar_set_interrupt(iommu);
if (ret)
goto free_iommu;
-
-   if (!translation_pre_enabled(iommu))
-   iommu_enable_translation(iommu);
-
-   iommu_disable_protect_mem_regions(iommu);
}
 
return 0;
@@ -4912,7 +4907,6 @@ int __init intel_iommu_init(void)
goto out_free_reserved_range;
}
up_write(&dmar_global_lock);
-   pr_info("Intel(R) Virtualization Technology for Directed I/O\n");
 
 #if defined(CONFIG_X86) && defined(CONFIG_SWIOTLB)
swiotlb = 0;
@@ -4935,6 +4929,16 @@ int __init intel_iommu_init(void)
register_memory_notifier(&intel_iommu_memory_nb);
cpuhp_setup_state(CPUHP_IOMMU_INTEL_DEAD, "iommu/intel:dead", NULL,
  intel_iommu_cpu_dead);
+
+   /* Finally, we enable the DMA remapping hardware. */
+   for_each_iommu(iommu, drhd) {
+   if (!translation_pre_enabled(iommu))
+   iommu_enable_translation(iommu);
+
+   iommu_disable_protect_mem_regions(iommu);
+   }
+   pr_info("Intel(R) Virtualization Technology for Directed I/O\n");
+
intel_iommu_enabled = 1;
intel_iommu_debugfs_init();
 
-- 
2.17.1



[PATCH v2 4/7] iommu/vt-d: Ignore domain parameter in attach_device if device requires identity map

2019-03-14 Thread James Sewart via iommu
If a device requires an identity map then it is not safe to attach a
domain that can remap addresses. Warn the user if this occurs.

Signed-off-by: James Sewart 
---
 drivers/iommu/intel-iommu.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 2e00e8708f06..104d36f225a7 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5101,6 +5101,11 @@ static int intel_iommu_attach_device(struct iommu_domain 
*domain,
}
}
 
+   if (iommu_no_mapping(dev)) {
+   dmar_domain = si_domain;
+   dev_warn(dev, "VT-d: Device is required to use identity IOMMU 
mapping, ignoring domain attached\n");
+   }
+
iommu = device_to_iommu(dev, &bus, &devfn);
if (!iommu)
return -ENODEV;
-- 
2.17.1



[PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

2019-03-14 Thread James Sewart via iommu
To support mapping ISA region via iommu_group_create_direct_mappings,
make sure it's exposed by iommu_get_resv_regions. This allows
deduplication of reserved region mappings.

Signed-off-by: James Sewart 
---
 drivers/iommu/intel-iommu.c | 42 +
 1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 8e0a4e2ff77f..2e00e8708f06 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units);
 #define for_each_rmrr_units(rmrr) \
list_for_each_entry(rmrr, &dmar_rmrr_units, list)
 
+static struct iommu_resv_region *isa_resv_region;
+
 /* bitmap for indexing intel_iommus */
 static int g_num_of_iommus;
 
@@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct 
dmar_rmrr_unit *rmrr,
  rmrr->end_address);
 }
 
+static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
+{
+   if (!isa_resv_region)
+   isa_resv_region = iommu_alloc_resv_region(0,
+   16*1024*1024,
+   0, IOMMU_RESV_DIRECT);
+
+   return isa_resv_region;
+}
+
 #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
-static inline void iommu_prepare_isa(void)
+static inline void iommu_prepare_isa(struct pci_dev *pdev)
 {
-   struct pci_dev *pdev;
int ret;
+   struct iommu_resv_region *reg = iommu_get_isa_resv_region();
 
-   pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
-   if (!pdev)
+   if (!reg)
return;
 
pr_info("Prepare 0-16MiB unity mapping for LPC\n");
-   ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
+   ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
+   reg->start + reg->length - 1);
 
if (ret)
pr_err("Failed to create 0-16MiB identity map - floppy might 
not work\n");
-
-   pci_dev_put(pdev);
 }
 #else
-static inline void iommu_prepare_isa(void)
+static inline void iommu_prepare_isa(struct pci_dev *pdev)
 {
return;
 }
@@ -3289,6 +3299,7 @@ static int __init init_dmars(void)
struct dmar_rmrr_unit *rmrr;
bool copied_tables = false;
struct device *dev;
+   struct pci_dev *pdev;
struct intel_iommu *iommu;
int i, ret;
 
@@ -3469,7 +3480,11 @@ static int __init init_dmars(void)
}
}
 
-   iommu_prepare_isa();
+   pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
+   if (pdev) {
+   iommu_prepare_isa(pdev);
+   pci_dev_put(pdev);
+   }
 
 domains_done:
 
@@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct device 
*device,
struct iommu_resv_region *reg;
struct dmar_rmrr_unit *rmrr;
struct device *i_dev;
+   struct pci_dev *pdev;
int i;
 
rcu_read_lock();
@@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct device 
*device,
}
rcu_read_unlock();
 
+   if (dev_is_pci(device)) {
+   pdev = to_pci_dev(device);
+   if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
+   reg = iommu_get_isa_resv_region();
+   list_add_tail(&reg->list, head);
+   }
+   }
+
reg = iommu_alloc_resv_region(IOAPIC_RANGE_START,
  IOAPIC_RANGE_END - IOAPIC_RANGE_START + 1,
  0, IOMMU_RESV_MSI);
-- 
2.17.1



[PATCH v2 2/7] iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges

2019-03-14 Thread James Sewart via iommu
Used by iommu.c before creating identity mappings for reserved ranges to
ensure dma-ops won't ever remap these ranges.

Signed-off-by: James Sewart 
---
 drivers/iommu/intel-iommu.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 78188bf7e90d..8e0a4e2ff77f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5299,6 +5299,19 @@ static void intel_iommu_put_resv_regions(struct device 
*dev,
}
 }
 
+static void intel_iommu_apply_resv_region(struct device *dev,
+   struct iommu_domain *domain,
+   struct iommu_resv_region *region)
+{
+   struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+   unsigned long start, end;
+
+   start = IOVA_PFN(region->start);
+   end   = IOVA_PFN(region->start + region->length - 1);
+
WARN_ON_ONCE(reserve_iova(&dmar_domain->iovad, start, end) == NULL);
+}
+
 #ifdef CONFIG_INTEL_IOMMU_SVM
 int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev 
*sdev)
 {
@@ -5392,6 +5405,7 @@ const struct iommu_ops intel_iommu_ops = {
.remove_device  = intel_iommu_remove_device,
.get_resv_regions   = intel_iommu_get_resv_regions,
.put_resv_regions   = intel_iommu_put_resv_regions,
+   .apply_resv_region  = intel_iommu_apply_resv_region,
.device_group   = pci_device_group,
.pgsize_bitmap  = INTEL_IOMMU_PGSIZES,
 };
-- 
2.17.1



[PATCH v2 1/7] iommu: Move iommu_group_create_direct_mappings to after device_attach

2019-03-14 Thread James Sewart via iommu
If an IOMMU driver needs to know which IOMMU a domain is associated with
in order to initialise the domain, it will do so in device_attach, before
which it is not safe to call iommu_ops.

Signed-off-by: James Sewart 
---
 drivers/iommu/iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3ed4db334341..1c6ffbb2d20e 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -652,8 +652,6 @@ int iommu_group_add_device(struct iommu_group *group, 
struct device *dev)
 
dev->iommu_group = group;
 
-   iommu_group_create_direct_mappings(group, dev);
-
mutex_lock(&group->mutex);
list_add_tail(&dev->list, &group->devices);
if (group->domain)
@@ -662,6 +660,8 @@ int iommu_group_add_device(struct iommu_group *group, 
struct device *dev)
if (ret)
goto err_put_group;
 
+   iommu_group_create_direct_mappings(group, dev);
+
/* Notify any listeners about change to group. */
blocking_notifier_call_chain(&group->notifier,
 IOMMU_GROUP_NOTIFY_ADD_DEVICE, dev);
-- 
2.17.1



[PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.

2019-03-14 Thread James Sewart via iommu
Patches 1 and 2 are the same as v1.

v1-v2:
  Refactored ISA direct mappings to be returned by iommu_get_resv_regions.
  Integrated patch by Lu to defer turning on DMAR until iommu.c has mapped 
reserved regions.
  Integrated patches by Lu to remove more unused code in cleanup.

Lu: I didn't integrate your patch to set the default domain type as it 
isn't directly related to the aim of this patchset. Instead patch 4 
addresses the issue of a device requiring an identity domain by ignoring 
the domain param in attach_device and printing a warning.

I booted some of our devices with this patchset and haven't seen any 
issues. It doesn't look like we have any devices with RMRRs though, so 
those codepaths aren't tested.

James Sewart (7):
  iommu: Move iommu_group_create_direct_mappings to after device_attach
  iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
  iommu/vt-d: Expose ISA direct mapping region via
iommu_get_resv_regions
  iommu/vt-d: Ignore domain parameter in attach_device if device
requires identity map
  iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by iommu_ops
  iommu/vt-d: Remove lazy allocation of domains

Lu Baolu (1):
  iommu/vt-d: Enable DMA remapping after rmrr mapped

 drivers/iommu/intel-iommu.c | 444 +++-
 drivers/iommu/iommu.c   |   4 +-
 2 files changed, 131 insertions(+), 317 deletions(-)

-- 
2.17.1



Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

2019-03-08 Thread James Sewart via iommu
Hey Lu,

> On 8 Mar 2019, at 03:09, Lu Baolu  wrote:
> 
> Hi James,
> 
> On 3/7/19 6:21 PM, James Sewart wrote:
>> Hey Lu,
>>> On 7 Mar 2019, at 06:31, Lu Baolu  wrote:
>>> 
>>> Hi James,
>>> 
>>> On 3/7/19 2:08 AM, James Sewart wrote:
 -  /*
 -   * For each rmrr
 -   *   for each dev attached to rmrr
 -   *   do
 -   * locate drhd for dev, alloc domain for dev
 -   * allocate free domain
 -   * allocate page table entries for rmrr
 -   * if context not allocated for bus
 -   *   allocate and init context
 -   *   set present in root table for this bus
 -   * init context with domain, translation etc
 -   *endfor
 -   * endfor
 -   */
 -  pr_info("Setting RMRR:\n");
 -  for_each_rmrr_units(rmrr) {
 -  /* some BIOS lists non-exist devices in DMAR table. */
 -  for_each_active_dev_scope(rmrr->devices, 
 rmrr->devices_cnt,
 -i, dev) {
 -  ret = iommu_prepare_rmrr_dev(rmrr, dev);
 -  if (ret)
 -  pr_err("Mapping reserved region 
 failed\n");
 -  }
 -  }
 -
 -  iommu_prepare_isa();
>>> Why do you want to remove this segment of code?
>> This will only work if the lazy allocation of domains exists, these
>> mappings will disappear once a default domain is attached to a device and
>> then remade by iommu_group_create_direct_mappings. This code is redundant
>> and removing it allows us to remove all the lazy allocation logic.
> Not exactly.
> 
> We need to setup the rmrr mapping before enabling dma remapping,
> otherwise, there will be a window after dma remapping enabling and
> rmrr getting mapped, during which people might see dma faults.
 Do you think this patch instead should be a refactoring to simplify this 
 initial domain setup before the default domain takes over? It seems like 
 duplicated effort to have both lazy allocated domains and externally 
 managed domains. We could allocate a domain here for any devices with RMRR 
 and call get_resv_regions to avoid duplicating RMRR loop.
>>> 
>>> Agree. We should replace the lazy allocated domains with default domain
>>> in a clean way. Actually, your patches look great to me. But I do think
>>> we need further cleanups. The rmrr code is one example, and the identity
>>> domain (si_domain) is another.
>>> 
>>> Do you mind if I work on top of your patches for further cleanups and
>>> sign off a v2 together with you?
>> Sure, sounds good. I’ll fixup patch 3 and have a go at integrating
>> iommu_prepare_isa into get_resv_regions. This should make the initial
>> domain logic here quite concise.
> 
> Here attached three extra patches which I think should be added before
> PATCH 3/4, and some further cleanup changes which you can merge with
> PATCH 4/4.
> 
> 
> 
> 0006-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch
> 0007-iommu-vt-d-Add-is_identity_map-ops-entry.patch
> 
> These two patches aim to add a generic method for vendor specific iommu
> drivers to specify the type of the default domain for a device. Intel
> iommu driver will register an ops for this since it already has its own
> identity map adjudicator for a long time.

This seems like a good idea, but as domain alloc is only called for the 
default domain on the first device attached to a group, we may miss 
checking whether a device added later should have an identity domain. 
Should there be paths to downgrade a group's domain if one of its devices 
requires one?

> 
> 
> 
> 0008-iommu-vt-d-Enable-DMA-remapping-after-rmrr-mapped.patch
> 
> This aims to address the requirement of rmrr identity map before
> enabling DMA remapping. With default domain mechanism, the default
> domain will be allocated and attached to the device in bus_set_iommu()
> during boot. Move enabling DMA remapping engines after bus_set_iommu()
> will fix the rmrr issue.

Thats pretty neat, avoids any temporary domain allocation, nice!

> 
> 
> 
> 0009-return-si_domain-directly.patch
> 
> I suggest that we should keep current si_domain implementation since
> changing si_domain is not the purpose of this patch set. Please merge
> this with PATCH 3/4 if you like it.

Seems reasonable.

> 
> 
> 
> 0010-iommu-vt-d-remove-floopy-workaround.patch
> 0011-iommu-vt-d-remove-prepare-rmrr-helpers.patch
> 0012-iommu-vt-d-remove-prepare-identity-map-during-boot.patch
> 
> Above patches are further cleanups as the result of switching to default
> domain. Please consider to merge them with PATCH 4/4.

Nice, good catch.

> 
> 
> 
> I 

Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

2019-03-07 Thread James Sewart via iommu
Hey Lu,

> On 7 Mar 2019, at 06:31, Lu Baolu  wrote:
> 
> Hi James,
> 
> On 3/7/19 2:08 AM, James Sewart wrote:
>> -/*
>> - * For each rmrr
>> - *   for each dev attached to rmrr
>> - *   do
>> - * locate drhd for dev, alloc domain for dev
>> - * allocate free domain
>> - * allocate page table entries for rmrr
>> - * if context not allocated for bus
>> - *   allocate and init context
>> - *   set present in root table for this bus
>> - * init context with domain, translation etc
>> - *endfor
>> - * endfor
>> - */
>> -pr_info("Setting RMRR:\n");
>> -for_each_rmrr_units(rmrr) {
>> -/* some BIOS lists non-exist devices in DMAR table. */
>> -for_each_active_dev_scope(rmrr->devices, 
>> rmrr->devices_cnt,
>> -  i, dev) {
>> -ret = iommu_prepare_rmrr_dev(rmrr, dev);
>> -if (ret)
>> -pr_err("Mapping reserved region 
>> failed\n");
>> -}
>> -}
>> -
>> -iommu_prepare_isa();
> Why do you want to remove this segment of code?
 This will only work if the lazy allocation of domains exists, these
 mappings will disappear once a default domain is attached to a device and
 then remade by iommu_group_create_direct_mappings. This code is redundant
 and removing it allows us to remove all the lazy allocation logic.
>>> Not exactly.
>>> 
>>> We need to setup the rmrr mapping before enabling dma remapping,
>>> otherwise, there will be a window after dma remapping enabling and
>>> rmrr getting mapped, during which people might see dma faults.
>> Do you think this patch instead should be a refactoring to simplify this 
>> initial domain setup before the default domain takes over? It seems like 
>> duplicated effort to have both lazy allocated domains and externally managed 
>> domains. We could allocate a domain here for any devices with RMRR and call 
>> get_resv_regions to avoid duplicating RMRR loop.
> 
> Agree. We should replace the lazy allocated domains with default domain
> in a clean way. Actually, your patches look great to me. But I do think
> we need further cleanups. The rmrr code is one example, and the identity
> domain (si_domain) is another.
> 
> Do you mind if I work on top of your patches for further cleanups and
> sign off a v2 together with you?

Sure, sounds good. I’ll fixup patch 3 and have a go at integrating 
iommu_prepare_isa into get_resv_regions. This should make the initial 
domain logic here quite concise.

> 
> Best regards,
> Lu Baolu

Cheers,
James.

Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

2019-03-06 Thread James Sewart via iommu
Hey,

> On 6 Mar 2019, at 07:00, Lu Baolu  wrote:
> 
> Hi,
> 
> On 3/5/19 7:46 PM, James Sewart wrote:
>> Hey Lu,
>>> On 5 Mar 2019, at 06:59, Lu Baolu  wrote:
>>> 
>>> Hi,
>>> 
> It's hard for me to understand why we remove the rmrr related
>>> code in this patch.
>> The RMRR code removed here requires the lazy allocation of domains to
>> exist, as it is run before iommu.c would assign IOMMU groups and attach a
>> domain. Before patch 3, removing this code would mean the RMRR regions are
>> never mapped for a domain: iommu.c will allocate a default domain for the
>> group that a device is about to be put in, it will attach the domain to
>> the device, then for each region returned by get_resv_regions it will
>> create an identity map, this is where the RMRRs are setup for the default
>> domain. >
>>> 
>>> And, now we have two places to hold a domain for a device: group and
>>> dev->info. We can consider to optimize the use of per device iommu
>>> arch data. This should be later work anyway.
>>> 
>>> More comments inline.
>>> 
>>> On 3/4/19 11:47 PM, James Sewart wrote:
 The generic IOMMU code will allocate and attach a dma ops domain to each
 device that comes online, replacing any lazy allocated domain. Removes
 RMRR application at iommu init time as we won't have a domain attached
 to any device. iommu.c will do this after attaching a device using
 create_direct_mappings.
 Signed-off-by: James Sewart 
 ---
  drivers/iommu/intel-iommu.c | 202 ++--
  1 file changed, 8 insertions(+), 194 deletions(-)
 diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
 index 71cd6bbfec05..282257e2628d 100644
 --- a/drivers/iommu/intel-iommu.c
 +++ b/drivers/iommu/intel-iommu.c
 @@ -2595,118 +2595,6 @@ static struct dmar_domain 
 *dmar_insert_one_dev_info(struct intel_iommu *iommu,
return domain;
  }
  -static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
 -{
 -  *(u16 *)opaque = alias;
 -  return 0;
 -}
 -
 -static struct dmar_domain *find_or_alloc_domain(struct device *dev, int 
 gaw)
 -{
 -  struct device_domain_info *info = NULL;
 -  struct dmar_domain *domain = NULL;
 -  struct intel_iommu *iommu;
 -  u16 dma_alias;
 -  unsigned long flags;
 -  u8 bus, devfn;
 -
 -  iommu = device_to_iommu(dev, &bus, &devfn);
 -  if (!iommu)
 -  return NULL;
 -
 -  if (dev_is_pci(dev)) {
 -  struct pci_dev *pdev = to_pci_dev(dev);
 -
 -  pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
 -
 -  spin_lock_irqsave(&device_domain_lock, flags);
 -  info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
 -PCI_BUS_NUM(dma_alias),
 -dma_alias & 0xff);
 -  if (info) {
 -  iommu = info->iommu;
 -  domain = info->domain;
 -  }
 -  spin_unlock_irqrestore(&device_domain_lock, flags);
 -
 -  /* DMA alias already has a domain, use it */
 -  if (info)
 -  goto out;
 -  }
 -
 -  /* Allocate and initialize new domain for the device */
 -  domain = alloc_domain(0);
 -  if (!domain)
 -  return NULL;
 -  if (domain_init(domain, iommu, gaw)) {
 -  domain_exit(domain);
 -  return NULL;
 -  }
 -
 -out:
 -
 -  return domain;
 -}
 -
 -static struct dmar_domain *set_domain_for_dev(struct device *dev,
 -struct dmar_domain *domain)
 -{
 -  struct intel_iommu *iommu;
 -  struct dmar_domain *tmp;
 -  u16 req_id, dma_alias;
 -  u8 bus, devfn;
 -
 -  iommu = device_to_iommu(dev, &bus, &devfn);
 -  if (!iommu)
 -  return NULL;
 -
 -  req_id = ((u16)bus << 8) | devfn;
 -
 -  if (dev_is_pci(dev)) {
 -  struct pci_dev *pdev = to_pci_dev(dev);
 -
 -  pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
 -
 -  /* register PCI DMA alias device */
 -  if (req_id != dma_alias) {
 -  tmp = dmar_insert_one_dev_info(iommu, 
 PCI_BUS_NUM(dma_alias),
 -  dma_alias & 0xff, NULL, domain);
 -
 -  if (!tmp || tmp != domain)
 -  return tmp;
 -  }
 -  }
 -
 -  tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
 -  if (!tmp || tmp != domain)
 -  return tmp;
 -
 -  return domain;
 -}
 -
 -static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
 -{
 -  struct dmar_domain *domain, *tmp;
 -
 -  domain = find_domain(dev);
 -  if 

Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

2019-03-05 Thread James Sewart via iommu
Hey Lu,

> On 5 Mar 2019, at 06:59, Lu Baolu  wrote:
> 
> Hi,
> 
> It's hard for me to understand why we remove the rmrr related
> code in this patch.

The RMRR code removed here requires the lazy allocation of domains to 
exist, as it is run before iommu.c would assign IOMMU groups and attach a 
domain. Before patch 3, removing this code would mean the RMRR regions are 
never mapped for a domain: iommu.c will allocate a default domain for the 
group that a device is about to be put in, it will attach the domain to 
the device, then for each region returned by get_resv_regions it will 
create an identity map, this is where the RMRRs are setup for the default 
domain.

> 
> And, now we have two places to hold a domain for a device: group and
> dev->info. We can consider to optimize the use of per device iommu
> arch data. This should be later work anyway.
> 
> More comments inline.
> 
> On 3/4/19 11:47 PM, James Sewart wrote:
>> The generic IOMMU code will allocate and attach a dma ops domain to each
>> device that comes online, replacing any lazy allocated domain. Removes
>> RMRR application at iommu init time as we won't have a domain attached
>> to any device. iommu.c will do this after attaching a device using
>> create_direct_mappings.
>> Signed-off-by: James Sewart 
>> ---
>>  drivers/iommu/intel-iommu.c | 202 ++--
>>  1 file changed, 8 insertions(+), 194 deletions(-)
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 71cd6bbfec05..282257e2628d 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -2595,118 +2595,6 @@ static struct dmar_domain 
>> *dmar_insert_one_dev_info(struct intel_iommu *iommu,
>>  return domain;
>>  }
>>  -static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
>> -{
>> -*(u16 *)opaque = alias;
>> -return 0;
>> -}
>> -
>> -static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
>> -{
>> -struct device_domain_info *info = NULL;
>> -struct dmar_domain *domain = NULL;
>> -struct intel_iommu *iommu;
>> -u16 dma_alias;
>> -unsigned long flags;
>> -u8 bus, devfn;
>> -
>> -iommu = device_to_iommu(dev, &bus, &devfn);
>> -if (!iommu)
>> -return NULL;
>> -
>> -if (dev_is_pci(dev)) {
>> -struct pci_dev *pdev = to_pci_dev(dev);
>> -
>> -pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
>> -
>> -spin_lock_irqsave(&device_domain_lock, flags);
>> -info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
>> -  PCI_BUS_NUM(dma_alias),
>> -  dma_alias & 0xff);
>> -if (info) {
>> -iommu = info->iommu;
>> -domain = info->domain;
>> -}
>> -spin_unlock_irqrestore(&device_domain_lock, flags);
>> -
>> -/* DMA alias already has a domain, use it */
>> -if (info)
>> -goto out;
>> -}
>> -
>> -/* Allocate and initialize new domain for the device */
>> -domain = alloc_domain(0);
>> -if (!domain)
>> -return NULL;
>> -if (domain_init(domain, iommu, gaw)) {
>> -domain_exit(domain);
>> -return NULL;
>> -}
>> -
>> -out:
>> -
>> -return domain;
>> -}
>> -
>> -static struct dmar_domain *set_domain_for_dev(struct device *dev,
>> -  struct dmar_domain *domain)
>> -{
>> -struct intel_iommu *iommu;
>> -struct dmar_domain *tmp;
>> -u16 req_id, dma_alias;
>> -u8 bus, devfn;
>> -
>> -iommu = device_to_iommu(dev, &bus, &devfn);
>> -if (!iommu)
>> -return NULL;
>> -
>> -req_id = ((u16)bus << 8) | devfn;
>> -
>> -if (dev_is_pci(dev)) {
>> -struct pci_dev *pdev = to_pci_dev(dev);
>> -
>> -pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
>> -
>> -/* register PCI DMA alias device */
>> -if (req_id != dma_alias) {
>> -tmp = dmar_insert_one_dev_info(iommu, 
>> PCI_BUS_NUM(dma_alias),
>> -dma_alias & 0xff, NULL, domain);
>> -
>> -if (!tmp || tmp != domain)
>> -return tmp;
>> -}
>> -}
>> -
>> -tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
>> -if (!tmp || tmp != domain)
>> -return tmp;
>> -
>> -return domain;
>> -}
>> -
>> -static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
>> -{
>> -struct dmar_domain *domain, *tmp;
>> -
>> -domain = find_domain(dev);
>> -if (domain)
>> -goto out;
>> -
>> -domain = find_or_alloc_domain(dev, gaw);
>> -if (!domain)
>> -goto out;
>> -
>> -tmp = set_domain_for_dev(dev, domain);
>> -if (!tmp || domain != tmp) {
>> -

Re: [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated

2019-03-05 Thread James Sewart via iommu
Hey Lu,

> On 5 Mar 2019, at 06:46, Lu Baolu  wrote:
> 
> Hi,
> 
> On 3/4/19 11:46 PM, James Sewart wrote:
>> Allowing IOMMU_DOMAIN_DMA type IOMMU domain to be allocated allows the
>> default_domain of an iommu_group to be set. This delegates device-domain
>> relationships to the generic IOMMU code.
>> Signed-off-by: James Sewart 
>> ---
>>  drivers/iommu/intel-iommu.c | 113 +++-
>>  1 file changed, 84 insertions(+), 29 deletions(-)
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 8e0a4e2ff77f..71cd6bbfec05 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -309,6 +309,14 @@ static int hw_pass_through = 1;
>>  /* si_domain contains mulitple devices */
>>  #define DOMAIN_FLAG_STATIC_IDENTITY (1 << 1)
>>  +/* Domain managed externally, don't cleanup if it isn't attached
>> + * to any devices. */
>> +#define DOMAIN_FLAG_MANAGED_EXTERNALLY  (1 << 2)
>> +
>> +/* Set after domain initialisation. Used when allocating dma domains to
>> + * defer domain initialisation until it is attached to a device */
>> +#define DOMAIN_FLAG_INITIALISED (1 << 4)
> 
> Why do you skip bit 3?

This was an oversight, I will update to use bit 3.

> 
>> +
>>  #define for_each_domain_iommu(idx, domain)  \
>>  for (idx = 0; idx < g_num_of_iommus; idx++) \
>>  if (domain->iommu_refcnt[idx])
>> @@ -558,6 +566,16 @@ static inline int domain_type_is_vm_or_si(struct 
>> dmar_domain *domain)
>>  DOMAIN_FLAG_STATIC_IDENTITY);
>>  }
>>  +static inline int domain_managed_externally(struct dmar_domain *domain)
>> +{
>> +return domain->flags & DOMAIN_FLAG_MANAGED_EXTERNALLY;
>> +}
>> +
>> +static inline int domain_is_initialised(struct dmar_domain *domain)
>> +{
>> +return domain->flags & DOMAIN_FLAG_INITIALISED;
>> +}
>> +
>>  static inline int domain_pfn_supported(struct dmar_domain *domain,
>> unsigned long pfn)
>>  {
>> @@ -1662,7 +1680,7 @@ static void disable_dmar_iommu(struct intel_iommu 
>> *iommu)
>>  __dmar_remove_one_dev_info(info);
>>  -   if (!domain_type_is_vm_or_si(domain)) {
>> +if (!domain_managed_externally(domain)) {
>>  /*
>>   * The domain_exit() function  can't be called under
>>   * device_domain_lock, as it takes this lock itself.
>> @@ -1895,6 +1913,7 @@ static int domain_init(struct dmar_domain *domain, 
>> struct intel_iommu *iommu,
>>  domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
>>  if (!domain->pgd)
>>  return -ENOMEM;
>> +domain->flags |= DOMAIN_FLAG_INITIALISED;
>>  __iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
>>  return 0;
>>  }
>> @@ -2807,14 +2826,10 @@ static inline void iommu_prepare_isa(void)
>>static int md_domain_init(struct dmar_domain *domain, int guest_width);
>>  -static int __init si_domain_init(int hw)
>> +static int __init si_domain_init(struct dmar_domain *si_domain, int hw)
>>  {
>>  int nid, ret = 0;
>>  -   si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
>> -if (!si_domain)
>> -return -EFAULT;
>> -
>>  if (md_domain_init(si_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
>>  domain_exit(si_domain);
>>  return -EFAULT;
>> @@ -3417,9 +3432,16 @@ static int __init init_dmars(void)
>>  check_tylersburg_isoch();
>>  if (iommu_identity_mapping) {
>> -ret = si_domain_init(hw_pass_through);
>> -if (ret)
>> +si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
> 
> Where will this si_domain be used? We still need to keep the global
> si_domain?

I am unsure of the best thing to do here. The si_domain can be initialised 
as a hardware passthrough which means it doesn’t have any mappings applied. 
I think any user allocating a domain here will always want a software 
passthrough domain. I’m not sure if/how to consolidate these two.

> 
>> +if (!si_domain) {
>> +ret = -EFAULT;
>>  goto free_iommu;
>> +}
>> +ret = si_domain_init(si_domain, hw_pass_through);
>> +if (ret) {
>> +domain_exit(si_domain);
>> +goto free_iommu;
>> +}
>>  }
>>@@ -4575,7 +4597,7 @@ static int device_notifier(struct notifier_block 
>> *nb,
>>  return 0;
>>  dmar_remove_one_dev_info(domain, dev);
>> -if (!domain_type_is_vm_or_si(domain) && list_empty(&domain->devices))
>> +if (!domain_managed_externally(domain) && list_empty(&domain->devices))
>>  domain_exit(domain);
>>  return 0;
>> @@ -5020,6 +5042,7 @@ static int md_domain_init(struct dmar_domain *domain, 
>> int guest_width)
>>  domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
>>  if (!domain->pgd)
>>  return 

Re: [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.

2019-03-05 Thread James Sewart via iommu
Hey Lu,

The motivation for this is the buggy domain <-> IOMMU group relationship 
when using find_or_alloc_domain. From what I have read, an IOMMU domain 
should be shared by all devices in the same group, and thus so should its 
mappings. This is because the IOMMU group is determined by things like 
ACS settings on PCI switches, which determine whether devices can talk to 
each other.

In find_or_alloc_domain we only determine domain sharing based on devices 
returned by pci_for_each_dma_alias, whereas there are many more checks in 
pci_device_group (e.g. ACS settings of intermediary PCI switches), which 
is used for determining which IOMMU group a device is in. This 
discontinuity means each device within an IOMMU group can end up with a 
different domain.

We see issues because it should be safe to assume that devices within a 
group share the same address space, but if each device has its own domain 
then any mapping created won't be valid for the other devices, and the 
same IOVA could even be mapped differently for each device. It could also 
be the case that a device's RMRRs are not applied to the domains of other 
devices within its group; those regions could be remapped for the other 
devices, resulting in unexpected behaviour during peer-to-peer 
transfers.

Cheers,
James


> On 5 Mar 2019, at 06:05, Lu Baolu  wrote:
> 
> Hi James,
> 
> Very glad to see this. Thank you!
> 
> On 3/4/19 11:41 PM, James Sewart wrote:
>> Hey,
>> This patchset moves IOMMU_DOMAIN_DMA iommu domain management to iommu.c.
>> This avoids the use of find_or_alloc_domain whose domain assignment is
>> inconsistent with the iommu grouping as determined by pci_device_group.
> 
> Is this a bug fix or an improvement? What's the real issue will it cause
> if we go without this patch set?
> 
> Best regards,
> Lu Baolu
> 
>> Patch 3 permits domain type IOMMU_DOMAIN_DMA to be allocated via the
>> iommu_ops api, allowing the default_domain of an iommu group to be set in
>> iommu.c. This domain will be attached to every device that is brought up
>> with an iommu group, and the devices reserved regions will be mapped using
>> regions returned by get_resv_regions.
>> In intel_iommu_domain_alloc we don’t know the IOMMU this domain will be
>> associated with so we defer full initialisation until
>> intel_iommu_attach_device. Currently iommu.c:iommu_group_add_device will
>> try to map a device's reserved regions before attaching the domain, which
>> would cause issues if the domain is not fully initialised. This is
>> addressed in patch 1 by moving the mapping to after attaching.
>> Patch 2 implements function apply_resv_region, used in
>> iommu_group_create_direct_mappings to mark the reserved regions as
>> non-mappable for the dma_map_ops api.
>> Patch 4 removes the domain lazy allocation logic. Before this patch the
>> lazy allocation logic would not be used as any domain allocated using
>> these paths would be replaced when attaching the group default domain.
>> Default domain allocation has been tested with and without this patch on
>> 4.19.
>> Cheers,
>> James.
>> James Sewart (4):
>>   iommu: Move iommu_group_create_direct_mappings to after device_attach
>>   iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
>>   iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be 
>> allocated
>>   iommu/vt-d: Remove lazy allocation of domains
>>  drivers/iommu/intel-iommu.c | 329 
>>  drivers/iommu/iommu.c   |   4 +-
>>  2 files changed, 108 insertions(+), 225 deletions(-)


[PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated

2019-03-04 Thread James Sewart via iommu
Allowing IOMMU_DOMAIN_DMA type IOMMU domain to be allocated allows the
default_domain of an iommu_group to be set. This delegates device-domain
relationships to the generic IOMMU code.

Signed-off-by: James Sewart 
---
 drivers/iommu/intel-iommu.c | 113 +++-
 1 file changed, 84 insertions(+), 29 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 8e0a4e2ff77f..71cd6bbfec05 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -309,6 +309,14 @@ static int hw_pass_through = 1;
 /* si_domain contains mulitple devices */
 #define DOMAIN_FLAG_STATIC_IDENTITY(1 << 1)
 
+/* Domain managed externally, don't cleanup if it isn't attached
+ * to any devices. */
+#define DOMAIN_FLAG_MANAGED_EXTERNALLY (1 << 2)
+
+/* Set after domain initialisation. Used when allocating dma domains to
+ * defer domain initialisation until it is attached to a device */
+#define DOMAIN_FLAG_INITIALISED(1 << 4)
+
 #define for_each_domain_iommu(idx, domain) \
for (idx = 0; idx < g_num_of_iommus; idx++) \
if (domain->iommu_refcnt[idx])
@@ -558,6 +566,16 @@ static inline int domain_type_is_vm_or_si(struct 
dmar_domain *domain)
DOMAIN_FLAG_STATIC_IDENTITY);
 }
 
+static inline int domain_managed_externally(struct dmar_domain *domain)
+{
+   return domain->flags & DOMAIN_FLAG_MANAGED_EXTERNALLY;
+}
+
+static inline int domain_is_initialised(struct dmar_domain *domain)
+{
+   return domain->flags & DOMAIN_FLAG_INITIALISED;
+}
+
 static inline int domain_pfn_supported(struct dmar_domain *domain,
   unsigned long pfn)
 {
@@ -1662,7 +1680,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
 
__dmar_remove_one_dev_info(info);
 
-   if (!domain_type_is_vm_or_si(domain)) {
+   if (!domain_managed_externally(domain)) {
/*
 * The domain_exit() function  can't be called under
 * device_domain_lock, as it takes this lock itself.
@@ -1895,6 +1913,7 @@ static int domain_init(struct dmar_domain *domain, struct 
intel_iommu *iommu,
domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
if (!domain->pgd)
return -ENOMEM;
+   domain->flags |= DOMAIN_FLAG_INITIALISED;
__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
return 0;
 }
@@ -2807,14 +2826,10 @@ static inline void iommu_prepare_isa(void)
 
 static int md_domain_init(struct dmar_domain *domain, int guest_width);
 
-static int __init si_domain_init(int hw)
+static int __init si_domain_init(struct dmar_domain *si_domain, int hw)
 {
int nid, ret = 0;
 
-   si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
-   if (!si_domain)
-   return -EFAULT;
-
if (md_domain_init(si_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
domain_exit(si_domain);
return -EFAULT;
@@ -3417,9 +3432,16 @@ static int __init init_dmars(void)
check_tylersburg_isoch();
 
if (iommu_identity_mapping) {
-   ret = si_domain_init(hw_pass_through);
-   if (ret)
+   si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
+   if (!si_domain) {
+   ret = -EFAULT;
goto free_iommu;
+   }
+   ret = si_domain_init(si_domain, hw_pass_through);
+   if (ret) {
+   domain_exit(si_domain);
+   goto free_iommu;
+   }
}
 
 
@@ -4575,7 +4597,7 @@ static int device_notifier(struct notifier_block *nb,
return 0;
 
dmar_remove_one_dev_info(domain, dev);
-   if (!domain_type_is_vm_or_si(domain) && list_empty(&domain->devices))
+   if (!domain_managed_externally(domain) && list_empty(&domain->devices))
domain_exit(domain);
 
return 0;
@@ -5020,6 +5042,7 @@ static int md_domain_init(struct dmar_domain *domain, int 
guest_width)
domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
if (!domain->pgd)
return -ENOMEM;
+   domain->flags |= DOMAIN_FLAG_INITIALISED;
domain_flush_cache(domain, domain->pgd, PAGE_SIZE);
return 0;
 }
@@ -5028,28 +5051,54 @@ static struct iommu_domain 
*intel_iommu_domain_alloc(unsigned type)
 {
struct dmar_domain *dmar_domain;
struct iommu_domain *domain;
+   int flags = DOMAIN_FLAG_MANAGED_EXTERNALLY;
 
-   if (type != IOMMU_DOMAIN_UNMANAGED)
-   return NULL;
-
-   dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
-   if (!dmar_domain) {
-   pr_err("Can't allocate dmar_domain\n");
-   return NULL;
-   }
-   if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
-

[PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

2019-03-04 Thread James Sewart via iommu
The generic IOMMU code will allocate and attach a dma ops domain to each
device that comes online, replacing any lazy allocated domain. Removes
RMRR application at iommu init time as we won't have a domain attached
to any device. iommu.c will do this after attaching a device using
create_direct_mappings.

Signed-off-by: James Sewart 
---
 drivers/iommu/intel-iommu.c | 202 ++--
 1 file changed, 8 insertions(+), 194 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 71cd6bbfec05..282257e2628d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2595,118 +2595,6 @@ static struct dmar_domain 
*dmar_insert_one_dev_info(struct intel_iommu *iommu,
return domain;
 }
 
-static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
-{
-   *(u16 *)opaque = alias;
-   return 0;
-}
-
-static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
-{
-   struct device_domain_info *info = NULL;
-   struct dmar_domain *domain = NULL;
-   struct intel_iommu *iommu;
-   u16 dma_alias;
-   unsigned long flags;
-   u8 bus, devfn;
-
-   iommu = device_to_iommu(dev, &bus, &devfn);
-   if (!iommu)
-   return NULL;
-
-   if (dev_is_pci(dev)) {
-   struct pci_dev *pdev = to_pci_dev(dev);
-
-   pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
-   spin_lock_irqsave(&device_domain_lock, flags);
-   info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
- PCI_BUS_NUM(dma_alias),
- dma_alias & 0xff);
-   if (info) {
-   iommu = info->iommu;
-   domain = info->domain;
-   }
-   spin_unlock_irqrestore(&device_domain_lock, flags);
-
-   /* DMA alias already has a domain, use it */
-   if (info)
-   goto out;
-   }
-
-   /* Allocate and initialize new domain for the device */
-   domain = alloc_domain(0);
-   if (!domain)
-   return NULL;
-   if (domain_init(domain, iommu, gaw)) {
-   domain_exit(domain);
-   return NULL;
-   }
-
-out:
-
-   return domain;
-}
-
-static struct dmar_domain *set_domain_for_dev(struct device *dev,
- struct dmar_domain *domain)
-{
-   struct intel_iommu *iommu;
-   struct dmar_domain *tmp;
-   u16 req_id, dma_alias;
-   u8 bus, devfn;
-
-   iommu = device_to_iommu(dev, &bus, &devfn);
-   if (!iommu)
-   return NULL;
-
-   req_id = ((u16)bus << 8) | devfn;
-
-   if (dev_is_pci(dev)) {
-   struct pci_dev *pdev = to_pci_dev(dev);
-
-   pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
-   /* register PCI DMA alias device */
-   if (req_id != dma_alias) {
-   tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias),
-   dma_alias & 0xff, NULL, domain);
-
-   if (!tmp || tmp != domain)
-   return tmp;
-   }
-   }
-
-   tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
-   if (!tmp || tmp != domain)
-   return tmp;
-
-   return domain;
-}
-
-static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
-{
-   struct dmar_domain *domain, *tmp;
-
-   domain = find_domain(dev);
-   if (domain)
-   goto out;
-
-   domain = find_or_alloc_domain(dev, gaw);
-   if (!domain)
-   goto out;
-
-   tmp = set_domain_for_dev(dev, domain);
-   if (!tmp || domain != tmp) {
-   domain_exit(domain);
-   domain = tmp;
-   }
-
-out:
-
-   return domain;
-}
-
 static int iommu_domain_identity_map(struct dmar_domain *domain,
 unsigned long long start,
 unsigned long long end)
@@ -2779,7 +2667,7 @@ static int iommu_prepare_identity_map(struct device *dev,
struct dmar_domain *domain;
int ret;
 
-   domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
+   domain = find_domain(dev);
if (!domain)
return -ENOMEM;
 
@@ -3301,11 +3189,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
 static int __init init_dmars(void)
 {
struct dmar_drhd_unit *drhd;
-   struct dmar_rmrr_unit *rmrr;
bool copied_tables = false;
-   struct device *dev;
struct intel_iommu *iommu;
-   int i, ret;
+   int ret;
 
/*
 * for each drhd
@@ -3466,32 +3352,6 @@ static int __init init_dmars(void)
goto free_iommu;
}
}
-   /*
-

[PATCH 1/4] iommu: Move iommu_group_create_direct_mappings to after device_attach

2019-03-04 Thread James Sewart via iommu
If an IOMMU driver requires to know which IOMMU a domain is associated
to initialise a domain then it will do so in device_attach, before which
it is not safe to call iommu_ops.

Signed-off-by: James Sewart 
---
 drivers/iommu/iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3ed4db334341..1c6ffbb2d20e 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -652,8 +652,6 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 
dev->iommu_group = group;
 
-   iommu_group_create_direct_mappings(group, dev);
-
mutex_lock(&group->mutex);
list_add_tail(&device->list, &group->devices);
if (group->domain)
@@ -662,6 +660,8 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
if (ret)
goto err_put_group;
 
+   iommu_group_create_direct_mappings(group, dev);
+
/* Notify any listeners about change to group. */
blocking_notifier_call_chain(&group->notifier,
 IOMMU_GROUP_NOTIFY_ADD_DEVICE, dev);
-- 
2.17.1



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 2/4] iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges

2019-03-04 Thread James Sewart via iommu
Used by iommu.c before creating identity mappings for reserved ranges to
ensure dma-map-ops won't ever remap these addresses.

Signed-off-by: James Sewart 
---
 drivers/iommu/intel-iommu.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 78188bf7e90d..8e0a4e2ff77f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5299,6 +5299,19 @@ static void intel_iommu_put_resv_regions(struct device *dev,
}
 }
 
+static void intel_iommu_apply_resv_region(struct device *dev,
+   struct iommu_domain *domain,
+   struct iommu_resv_region *region)
+{
+   struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+   unsigned long start, end;
+
+   start = IOVA_PFN(region->start);
+   end   = IOVA_PFN(region->start + region->length - 1);
+
+   WARN_ON_ONCE(reserve_iova(&dmar_domain->iovad, start, end) == NULL);
+}
+
 #ifdef CONFIG_INTEL_IOMMU_SVM
 int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev)
 {
@@ -5392,6 +5405,7 @@ const struct iommu_ops intel_iommu_ops = {
.remove_device  = intel_iommu_remove_device,
.get_resv_regions   = intel_iommu_get_resv_regions,
.put_resv_regions   = intel_iommu_put_resv_regions,
+   .apply_resv_region  = intel_iommu_apply_resv_region,
.device_group   = pci_device_group,
.pgsize_bitmap  = INTEL_IOMMU_PGSIZES,
 };
-- 
2.17.1



[PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.

2019-03-04 Thread James Sewart via iommu
Hey,

This patchset moves IOMMU_DOMAIN_DMA iommu domain management to iommu.c. 
This avoids the use of find_or_alloc_domain whose domain assignment is 
inconsistent with the iommu grouping as determined by pci_device_group.

Patch 3 permits domain type IOMMU_DOMAIN_DMA to be allocated via the 
iommu_ops api, allowing the default_domain of an iommu group to be set in 
iommu.c. This domain will be attached to every device that is brought up 
with an iommu group, and the device's reserved regions will be mapped using 
regions returned by get_resv_regions.

In intel_iommu_domain_alloc we don’t know the IOMMU this domain will be 
associated with so we defer full initialisation until 
intel_iommu_attach_device. Currently iommu.c:iommu_group_add_device will 
try to map a device's reserved regions before attaching the domain, which 
would cause issue if the domain is not fully initialised. This is 
addressed in patch 1 by moving the mapping to after attaching.

Patch 2 implements function apply_resv_region, used in 
iommu_group_create_direct_mappings to mark the reserved regions as 
non-mappable for the dma_map_ops api.

Patch 4 removes the domain lazy allocation logic. Before this patch the 
lazy allocation logic would not be used as any domain allocated using 
these paths would be replaced when attaching the group default domain. 
Default domain allocation has been tested with and without this patch on 
4.19.

Cheers,
James.

James Sewart (4):
  iommu: Move iommu_group_create_direct_mappings to after device_attach
  iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
  iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated
  iommu/vt-d: Remove lazy allocation of domains

 drivers/iommu/intel-iommu.c | 329 
 drivers/iommu/iommu.c   |   4 +-
 2 files changed, 108 insertions(+), 225 deletions(-)

-- 
2.17.1


Re: [RFC v2] iommu/vt-d: Allow iommu_domain_alloc to allocate IOMMU_DOMAIN_DMA

2019-01-15 Thread James Sewart via iommu
Hey Jacob,

> On 7 Jan 2019, at 20:04, Jacob Pan  wrote:
> 
> On Wed, 5 Dec 2018 17:19:35 +
> James Sewart  wrote:
> 
>> Hey,
>> 
>> There exists an issue in the logic used to determine domain
>> association with devices. Currently the driver uses
>> find_or_alloc_domain to either reuse an existing domain or allocate a
>> new one if one isn’t found. Domains should be shared between all
>> members of an IOMMU group as this is the minimum granularity that we
>> can guarantee address space isolation.
>> 
>> The intel IOMMU driver exposes pci_device_group in intel_iommu_ops as
>> the function to call to determine the group of a device, this is
>> implemented in the generic IOMMU code and checks: dma aliases,
>> upstream pcie switch ACS, pci aliases, and pci function aliases. The
>> find_or_alloc_domain code currently only uses dma aliases to
>> determine if a domain is shared. This causes a disconnect between
>> IOMMU groups and domains. We have observed devices under a pcie
>> switch each having their own domain but assigned the same group.
>> 
>> One solution would be to fix the logic in find_or_alloc_domain to add 
>> checks for the other conditions that a device may share a domain.
>> However, this duplicates code which the generic IOMMU code
>> implements. Instead this issue can be fixed by allowing the
>> allocation of default_domain on the IOMMU group. This is not
>> currently supported as the intel driver does not allow allocation of
>> domain type IOMMU_DOMAIN_DMA.
>> 
>> Allowing allocation of DMA domains has the effect that the
>> default_domain is non NULL and is attached to a device when
>> initialising. This delegates the handling of domains to the generic
>> IOMMU code. Once this is implemented it is possible to remove the
>> lazy allocation of domains entirely.
>> 
> This can also consolidate the domain storage, i.e. move domain from
> device_domain_info to iommu_group.

The plan was to have a patchset that first added the functionality below, 
making use of the group domain storage but keeping the lazy allocation. 
Then subsequent patches that remove the lazy allocation. To also remove 
the device_domain_info it looks like some of the information there might 
need moving into the domain?

>> This patch implements DMA and identity domains to be allocated for 
>> external management. As it isn’t known which device will be attached
>> to a domain, the dma domain is not initialised at alloc time. Instead
>> it is allocated when attached. As we may lose RMRR mappings when
>> attaching a device to a new domain, we also ensure these are mapped
>> at attach time.
>> 
>> This will likely conflict with the work done for auxiliary domains by 
>> Baolu but the code to accommodate won’t change much.
>> 
>> I had also started on a patch to remove find_or_alloc_domain and
>> various functions that call it but had issues with edge cases such as 
>> iommu_prepare_isa that is doing domain operations at IOMMU init time.
>> 
>> Cheers,
>> James.
>> 
>> 
>> ---
>> drivers/iommu/intel-iommu.c | 159 +---
>> 1 file changed, 110 insertions(+), 49 deletions(-)
>> 
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 41a4b8808802..6437cb2e9b22 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -351,6 +351,14 @@ static int hw_pass_through = 1;
>> /* si_domain contains mulitple devices */
>> #define DOMAIN_FLAG_STATIC_IDENTITY  (1 << 1)
>> 
>> +/* Domain managed externally, don't cleanup if it isn't attached
>> + * to any devices. */
>> +#define DOMAIN_FLAG_NO_CLEANUP  (1 << 2)
>> +
> the name NO_CLEANUP is a little counter intuitive to me, should it be
> called UNINITIALISED?

I don’t think uninitialised is accurate, the domain may be initialised. It 
is used to distinguish between domains allocated by the external API, 
which we shouldn’t automatically cleanup, and domains allocated internally, 
which should. I agree a better name could be found.

>> +/* Set after domain initialisation. Used when allocating dma domains
>> to
>> + * defer domain initialisation until it is attached to a device */
>> +#define DOMAIN_FLAG_INITIALISED (1 << 4)
>> +
>> #define for_each_domain_iommu(idx, domain)   \
>>  for (idx = 0; idx < g_num_of_iommus; idx++) \
>>  if (domain->iommu_refcnt[idx])
>> @@ -624,6 +632,16 @@ static inline int domain_type_is_vm_or_si(struct dmar_domain *domain)
>> DOMAIN_FLAG_STATIC_IDENTITY);
>> }
>> 
>> +static inline int domain_managed_externally(struct dmar_domain
>> *domain) +{
>> +return domain->flags & DOMAIN_FLAG_NO_CLEANUP;
>> +}
>> +
>> +static inline int domain_is_initialised(struct dmar_domain *domain)
>> +{
>> +return domain->flags & DOMAIN_FLAG_INITIALISED;
>> +}
>> +
>> static inline int domain_pfn_supported(struct dmar_domain *domain,
>> unsigned long pfn)
>> {
>> @@ -1717,7 +1735,7 @@ static void 

Re: [RFC v2] iommu/vt-d: Allow iommu_domain_alloc to allocate IOMMU_DOMAIN_DMA

2019-01-02 Thread James Sewart via iommu
Bump

> On 5 Dec 2018, at 17:19, James Sewart  wrote:
> 
> Hey,
> 
> There exists an issue in the logic used to determine domain association 
> with devices. Currently the driver uses find_or_alloc_domain to either 
> reuse an existing domain or allocate a new one if one isn’t found. Domains 
> should be shared between all members of an IOMMU group as this is the 
> minimum granularity that we can guarantee address space isolation.
> 
> The intel IOMMU driver exposes pci_device_group in intel_iommu_ops as the 
> function to call to determine the group of a device, this is implemented 
> in the generic IOMMU code and checks: dma aliases, upstream pcie switch 
> ACS, pci aliases, and pci function aliases. The find_or_alloc_domain code 
> currently only uses dma aliases to determine if a domain is shared. This 
> causes a disconnect between IOMMU groups and domains. We have observed 
> devices under a pcie switch each having their own domain but assigned the 
> same group.
> 
> One solution would be to fix the logic in find_or_alloc_domain to add 
> checks for the other conditions that a device may share a domain. However, 
> this duplicates code which the generic IOMMU code implements. Instead this 
> issue can be fixed by allowing the allocation of default_domain on the 
> IOMMU group. This is not currently supported as the intel driver does not 
> allow allocation of domain type IOMMU_DOMAIN_DMA.
> 
> Allowing allocation of DMA domains has the effect that the default_domain 
> is non NULL and is attached to a device when initialising. This delegates 
> the handling of domains to the generic IOMMU code. Once this is 
> implemented it is possible to remove the lazy allocation of domains 
> entirely.
> 
> This patch implements DMA and identity domains to be allocated for 
> external management. As it isn’t known which device will be attached to a 
> domain, the dma domain is not initialised at alloc time. Instead it is 
> allocated when attached. As we may lose RMRR mappings when attaching a 
> device to a new domain, we also ensure these are mapped at attach time.
> 
> This will likely conflict with the work done for auxiliary domains by 
> Baolu but the code to accommodate won’t change much.
> 
> I had also started on a patch to remove find_or_alloc_domain and various 
> functions that call it but had issues with edge cases such as 
> iommu_prepare_isa that is doing domain operations at IOMMU init time.
> 
> Cheers,
> James.
> 
> 
> ---
> drivers/iommu/intel-iommu.c | 159 +---
> 1 file changed, 110 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 41a4b8808802..6437cb2e9b22 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -351,6 +351,14 @@ static int hw_pass_through = 1;
> /* si_domain contains mulitple devices */
> #define DOMAIN_FLAG_STATIC_IDENTITY   (1 << 1)
> 
> +/* Domain managed externally, don't cleanup if it isn't attached
> + * to any devices. */
> +#define DOMAIN_FLAG_NO_CLEANUP   (1 << 2)
> +
> +/* Set after domain initialisation. Used when allocating dma domains to
> + * defer domain initialisation until it is attached to a device */
> +#define DOMAIN_FLAG_INITIALISED  (1 << 4)
> +
> #define for_each_domain_iommu(idx, domain)\
>   for (idx = 0; idx < g_num_of_iommus; idx++) \
>   if (domain->iommu_refcnt[idx])
> @@ -624,6 +632,16 @@ static inline int domain_type_is_vm_or_si(struct dmar_domain *domain)
>   DOMAIN_FLAG_STATIC_IDENTITY);
> }
> 
> +static inline int domain_managed_externally(struct dmar_domain *domain)
> +{
> + return domain->flags & DOMAIN_FLAG_NO_CLEANUP;
> +}
> +
> +static inline int domain_is_initialised(struct dmar_domain *domain)
> +{
> + return domain->flags & DOMAIN_FLAG_INITIALISED;
> +}
> +
> static inline int domain_pfn_supported(struct dmar_domain *domain,
>  unsigned long pfn)
> {
> @@ -1717,7 +1735,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
> 
>   __dmar_remove_one_dev_info(info);
> 
> - if (!domain_type_is_vm_or_si(domain)) {
> + if (!domain_managed_externally(domain)) {
>   /*
>* The domain_exit() function  can't be called under
>* device_domain_lock, as it takes this lock itself.
> @@ -1951,6 +1969,7 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
>   domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
>   if (!domain->pgd)
>   return -ENOMEM;
> + domain->flags |= DOMAIN_FLAG_INITIALISED;
>   __iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
>   return 0;
> }
> @@ -3234,11 +3253,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
> static int __init init_dmars(void)
> {
>   

[RFC v2] iommu/vt-d: Allow iommu_domain_alloc to allocate IOMMU_DOMAIN_DMA

2018-12-05 Thread James Sewart via iommu
Hey,

There exists an issue in the logic used to determine domain association 
with devices. Currently the driver uses find_or_alloc_domain to either 
reuse an existing domain or allocate a new one if one isn’t found. Domains 
should be shared between all members of an IOMMU group as this is the 
minimum granularity that we can guarantee address space isolation.

The intel IOMMU driver exposes pci_device_group in intel_iommu_ops as the 
function to call to determine the group of a device, this is implemented 
in the generic IOMMU code and checks: dma aliases, upstream pcie switch 
ACS, pci aliases, and pci function aliases. The find_or_alloc_domain code 
currently only uses dma aliases to determine if a domain is shared. This 
causes a disconnect between IOMMU groups and domains. We have observed 
devices under a pcie switch each having their own domain but assigned the 
same group.

One solution would be to fix the logic in find_or_alloc_domain to add 
checks for the other conditions that a device may share a domain. However, 
this duplicates code which the generic IOMMU code implements. Instead this 
issue can be fixed by allowing the allocation of default_domain on the 
IOMMU group. This is not currently supported as the intel driver does not 
allow allocation of domain type IOMMU_DOMAIN_DMA.

Allowing allocation of DMA domains has the effect that the default_domain 
is non NULL and is attached to a device when initialising. This delegates 
the handling of domains to the generic IOMMU code. Once this is 
implemented it is possible to remove the lazy allocation of domains 
entirely.

This patch implements DMA and identity domains to be allocated for 
external management. As it isn’t known which device will be attached to a 
domain, the dma domain is not initialised at alloc time. Instead it is 
allocated when attached. As we may lose RMRR mappings when attaching a 
device to a new domain, we also ensure these are mapped at attach time.

This will likely conflict with the work done for auxiliary domains by 
Baolu but the code to accommodate won’t change much.

I had also started on a patch to remove find_or_alloc_domain and various 
functions that call it but had issues with edge cases such as 
iommu_prepare_isa that is doing domain operations at IOMMU init time.

Cheers,
James.


---
 drivers/iommu/intel-iommu.c | 159 +---
 1 file changed, 110 insertions(+), 49 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 41a4b8808802..6437cb2e9b22 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -351,6 +351,14 @@ static int hw_pass_through = 1;
 /* si_domain contains mulitple devices */
 #define DOMAIN_FLAG_STATIC_IDENTITY(1 << 1)
 
+/* Domain managed externally, don't cleanup if it isn't attached
+ * to any devices. */
+#define DOMAIN_FLAG_NO_CLEANUP (1 << 2)
+
+/* Set after domain initialisation. Used when allocating dma domains to
+ * defer domain initialisation until it is attached to a device */
+#define DOMAIN_FLAG_INITIALISED(1 << 4)
+
 #define for_each_domain_iommu(idx, domain) \
for (idx = 0; idx < g_num_of_iommus; idx++) \
if (domain->iommu_refcnt[idx])
@@ -624,6 +632,16 @@ static inline int domain_type_is_vm_or_si(struct dmar_domain *domain)
DOMAIN_FLAG_STATIC_IDENTITY);
 }
 
+static inline int domain_managed_externally(struct dmar_domain *domain)
+{
+   return domain->flags & DOMAIN_FLAG_NO_CLEANUP;
+}
+
+static inline int domain_is_initialised(struct dmar_domain *domain)
+{
+   return domain->flags & DOMAIN_FLAG_INITIALISED;
+}
+
 static inline int domain_pfn_supported(struct dmar_domain *domain,
   unsigned long pfn)
 {
@@ -1717,7 +1735,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
 
__dmar_remove_one_dev_info(info);
 
-   if (!domain_type_is_vm_or_si(domain)) {
+   if (!domain_managed_externally(domain)) {
/*
 * The domain_exit() function  can't be called under
 * device_domain_lock, as it takes this lock itself.
@@ -1951,6 +1969,7 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
if (!domain->pgd)
return -ENOMEM;
+   domain->flags |= DOMAIN_FLAG_INITIALISED;
__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
return 0;
 }
@@ -3234,11 +3253,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
 static int __init init_dmars(void)
 {
struct dmar_drhd_unit *drhd;
-   struct dmar_rmrr_unit *rmrr;
bool copied_tables = false;
-   struct device *dev;
struct intel_iommu *iommu;
-   int i, ret;
+   int ret;
 
/*
 * for each 

Re: [RFC] iommu/vt-d: Group and domain relationship

2018-11-12 Thread James Sewart via iommu
ment apply_resv_region() in VT-d driver and
> move RMRR setup here.
> - DMA map API, will use per group default domain instead of per device
>  private domain
> 
> The change is rather pervasive so i am trying to set a balance for
> short term functionality and complete clean up. Any ideas welcome.
> 

Sounds good, thanks. I am eager to test the patch.

>>> 
>>>>> So we can't just open the door but not cleanup the things right?  
>>>> A user of domain_alloc and attach_device is responsible for
>>>> detaching a domain if it is no longer needed and calling
>>>> domain_free.  
>>> 
>>> Currently DMA API calls get_valid_domain_for_dev() to retrieve a DMA
>>> domain. If the domain has already been allocated, return directly.
>>> Otherwise, allocate and initialize a new one for the device. Let's
>>> call domains allocated by get_valid_domain_for_dev() as "A".
>>> 
>>> If we open the door and allow another component to manage the DMA
>>> domains through domain iommu_domain_alloc/free(). Let's call domains
>>> allocated through iommu_domain_alloc() as "B".
>>> 
>>> So how can we sync between A and B?  
>> 
>> I’m not sure we need to sync them. Domain A would be replaced
>> entirely for a certain device with domain B by attach_device. As we
>> saw above domain A would be freed if there are no more devices
>> attached to it. Domain B would be managed by the user of the iommu
>> api. If a device is detached from domain B, the find_or_alloc_domain
>> logic will take over once more and allocate a fresh domain.
>> 
> Agreed, it is more like switching domains when you switching APIs. But
> the reserved region better be preserved.
>> Cheers,
>> James.
>> 
>>> 
>>> Need to go through the code to find out more.
>>> 
>>> Best regards,
>>> Lu Baolu  
>>>> Cheers,
>>>> James.  
>>>>> 
>>>>> I haven't spent time on details. So I cc'ed Jacob for corrections.
>>>>> 
>>>>> Best regards,
>>>>> Lu Baolu
>>>>> 
>>>>>> Cheers,
>>>>>> James.
>>>>>> On Fri, Nov 2, 2018 at 2:43 AM Lu Baolu
>>>>>>  wrote:  
>>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> On 10/30/18 10:18 PM, James Sewart via iommu wrote:  
>>>>>>>> Hey,
>>>>>>>> 
>>>>>>>> I’ve been investigating the relationship between iommu groups
>>>>>>>> and domains on our systems and have a few question. Why does
>>>>>>>> the intel iommu code not allow allocating IOMMU_DOMAIN_DMA?
>>>>>>>> Returning NULL when given this domain type has the side effect
>>>>>>>> that the default_domain for an iommu group is not set, which,
>>>>>>>> when using for e.g. dma_map_ops.map_page, means a domain is
>>>>>>>> allocated per device.  
>>>>>>> 
>>>>>>> Intel vt-d driver doesn't implement the default domain and
>>>>>>> allocates domain only on demanded. There are lots of things to
>>>>>>> do before we allow iommu API to allocate domains other than
>>>>>>> IOMMU_DOMAIN_UNMANAGED.
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> Lu Baolu
>>>>>>> 
>>>>>>>> 
>>>>>>>> This seems to be the opposite behaviour to the AMD iommu code
>>>>>>>> which supports allocating an IOMMU_DOMAIN_DMA and will only
>>>>>>>> look to the iommu group if a domain is not attached to the
>>>>>>>> device rather than allocating a new one. On AMD every device
>>>>>>>> in an iommu group will share the same domain.
>>>>>>>> 
>>>>>>>> Appended is what I think a patch to implement domain_alloc for
>>>>>>>> IOMMU_DOMAIN_DMA and also IOMMU_DOMAIN_IDENTITY would look
>>>>>>>> like. Testing shows each device in a group will share a domain
>>>>>>>> by default, it also allows allocating a new dma domain that
>>>>>>>> can be successfully attached to a group with
>>>>>>>> iommu_attach_group.
>>>>>

Re: [RFC] iommu/vt-d: Group and domain relationship

2018-11-09 Thread James Sewart via iommu
Hey Yi,

> On 9 Nov 2018, at 06:54, Liu, Yi L  wrote:
> 
> Hi James,
> 
> Regards to the relationship of iommu group and domain, the blog written by 
> Alex
> may help you. The blog explained very well on how iommu group is determined 
> and
> why.
> 
> http://vfio.blogspot.com/2014/08/iommu-groups-inside-and-out.html

Thanks for the link, this explains how I expected the group domain 
relationship to work.

> 
>> From: iommu-boun...@lists.linux-foundation.org [mailto:iommu-
>> boun...@lists.linux-foundation.org] On Behalf Of James Sewart via iommu
>> Sent: Thursday, November 8, 2018 7:30 PM
>> Subject: Re: [RFC] iommu/vt-d: Group and domain relationship
>> 
>> Hey,
>> 
>>> On 8 Nov 2018, at 01:42, Lu Baolu  wrote:
>>> 
>>> Hi,
>>> 
>>> On 11/8/18 1:55 AM, James Sewart wrote:
>>>> Hey,
>>>>> On 7 Nov 2018, at 02:10, Lu Baolu  wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> On 11/6/18 6:40 PM, James Sewart wrote:
>>>>>> Hey Lu,
>>>>>> Would you be able to go into more detail about the issues with
> 
> [...]
> 
>>>>> 
>>>>> Why do we want to open this door? Probably we want the generic iommu
>>>>> layer to handle these things (it's called default domain).
>>>> I’d like to allocate a domain and attach it to multiple devices in a
>>>> group/multiple groups so that they share address translation, but still
>>>> allow drivers for devices in those groups to use the dma_map_ops api.
>>> 
>>> Just out of curiosity, why do you want to share a single domain across
>>> multiple groups? By default, the groups and DMA domains are normally
>>> 1-1 mapped, right?
>> 
>> Currently we see each device in a group with their own domain.
>> find_or_alloc_domain looks at dma aliases to determine who shares domains
>> whereas pci_device_group in the generic iommu code determines groups using
>> a few other checks. We have observed internally that devices under a pcie
>> switch will be put in the same group but they do not share a domain within
>> that group.
> 
> Really? iommu group is DMA isolation unit. You said they are not sharing a 
> domain.
> Do you mean they have different IOVA address space by mentioning they are not
> sharing a domain? Is there any special things(e.g. special IOVA allocation to 
> avoid
> unexpected P2P) done on your system? Normally, devices within an iommu group
> should share an IOVA address space.

We see a unique set of mappings for each device within a group. Adding 
debug output to the kernel shows find_or_alloc_domain allocating a new 
domain for each device in a group. We don’t have any special configuration 
with regard to IOVA allocation.

I think the issue is that find_or_alloc_domain only uses 
pci_for_each_dma_alias to determine domain sharing. This doesn’t check the 
configuration of upstream pcie-pcie switches and doesn’t look to see if ACS 
is enabled or disabled. pci_device_group in drivers/iommu/iommu.c, which 
is used to determine which group a device is assigned to, performs an extra 
walk of the upstream buses to see if there exists a bus without ACS which 
would allow peer-to-peer dma and groups the device accordingly.

One solution would be to allow the generic iommu code to assign domains, 
which it does based on the device_group function of the iommu_ops. The 
reason it doesn’t currently is that the default domain type is 
IOMMU_DOMAIN_DMA which the intel driver currently disallows allocating via 
the domain_alloc api.

Cheers,
James.



Re: [RFC] iommu/vt-d: Group and domain relationship

2018-11-08 Thread James Sewart via iommu
Hey,

> On 8 Nov 2018, at 01:42, Lu Baolu  wrote:
> 
> Hi,
> 
> On 11/8/18 1:55 AM, James Sewart wrote:
>> Hey,
>>> On 7 Nov 2018, at 02:10, Lu Baolu  wrote:
>>> 
>>> Hi,
>>> 
>>> On 11/6/18 6:40 PM, James Sewart wrote:
>>>> Hey Lu,
>>>> Would you be able to go into more detail about the issues with
>>>> allowing IOMMU_DOMAIN_DMA to be allocated via domain_alloc?
>>> 
>>> This door is closed because intel iommu driver does everything for
>>> IOMMU_DOMAIN_DMA: allocating a domain and setup the context entries
>>> for the domain.
>> As far as I can tell, attach_device in the intel driver will handle
>> cleaning up any old domain context mapping and ensure the new domain is
>> mapped with calls to dmar_remove_one_dev_info and domain_add_dev_info.
> 
> That's only for domains of IOMMU_DOMAIN_UNMANAGED, right?

attach_device has logic for cleaning up domains that are allocated by the 
intel driver.

old_domain = find_domain(dev);
if (old_domain) {
rcu_read_lock();
dmar_remove_one_dev_info(old_domain, dev);
rcu_read_unlock();

if (!domain_type_is_vm_or_si(old_domain) &&
 list_empty(&old_domain->devices))
domain_exit(old_domain);
}

This is checking the type of the old domain only, freeing it if is not 
attached to any devices. Looking at this now, maybe the solution would be 
to distinguish between internally allocated dma domains and dma domains 
allocated via the external api, so we avoid freeing a domain a driver has 
reference to.

>>> 
>>> Why do we want to open this door? Probably we want the generic iommu
>>> layer to handle these things (it's called default domain).
>> I’d like to allocate a domain and attach it to multiple devices in a
>> group/multiple groups so that they share address translation, but still
>> allow drivers for devices in those groups to use the dma_map_ops api.
> 
> Just out of curiosity, why do you want to share a single domain across
> multiple groups? By default, the groups and DMA domains are normally
> 1-1 mapped, right?

Currently we see each device in a group with their own domain. 
find_or_alloc_domain looks at dma aliases to determine who shares domains 
whereas pci_device_group in the generic iommu code determines groups using 
a few other checks. We have observed internally that devices under a pcie 
switch will be put in the same group but they do not share a domain within 
that group.

Getting every device within a group to share a domain would get us 90% of 
the way to what we want. But we have some configurations where there exist 
devices put in other groups that we want to share translations with.

> 
>>> So we can't just open the door but not cleanup the things right?
>> A user of domain_alloc and attach_device is responsible for detaching a
>> domain if it is no longer needed and calling domain_free.
> 
> Currently the DMA API calls get_valid_domain_for_dev() to retrieve a DMA
> domain. If the domain has already been allocated, it is returned
> directly. Otherwise, a new one is allocated and initialized for the
> device. Let's call domains allocated by get_valid_domain_for_dev() "A".
> 
> If we open the door and allow another component to manage DMA domains
> through iommu_domain_alloc()/iommu_domain_free(), let's call domains
> allocated through iommu_domain_alloc() "B".
> 
> So how can we sync between A and B?

I’m not sure we need to sync them. attach_device would replace domain A 
entirely with domain B for a given device. As we saw above, domain A would 
be freed once no more devices are attached to it. Domain B would be managed 
by the user of the iommu API. If a device is detached from domain B, the 
find_or_alloc_domain logic takes over once more and allocates a fresh 
domain.

Cheers,
James.

> 
> Need to go through the code to find out more.
> 
> Best regards,
> Lu Baolu
>> Cheers,
>> James.
>>> 
>>> I haven't spent time on details. So I cc'ed Jacob for corrections.
>>> 
>>> Best regards,
>>> Lu Baolu
>>> 
>>>> Cheers,
>>>> James.
>>>> On Fri, Nov 2, 2018 at 2:43 AM Lu Baolu  wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> On 10/30/18 10:18 PM, James Sewart via iommu wrote:
>>>>>> Hey,
>>>>>> 
>>>>>> I’ve been investigating the relationship between iommu groups and domains
>>>>>> on our systems and have a few questions. Why 

Re: [RFC] iommu/vt-d: Group and domain relationship

2018-11-07 Thread James Sewart via iommu
Hey,

> On 7 Nov 2018, at 02:10, Lu Baolu  wrote:
> 
> Hi,
> 
> On 11/6/18 6:40 PM, James Sewart wrote:
>> Hey Lu,
>> Would you be able to go into more detail about the issues with
>> allowing IOMMU_DOMAIN_DMA to be allocated via domain_alloc?
> 
> This door is closed because the intel iommu driver does everything for
> IOMMU_DOMAIN_DMA: allocating a domain and setting up the context entries
> for the domain.

As far as I can tell, attach_device in the intel driver will handle 
cleaning up any old domain context mapping and ensure the new domain is 
mapped with calls to dmar_remove_one_dev_info and domain_add_dev_info.

> 
> Why do we want to open this door? Probably we want the generic iommu
> layer to handle these things (it's called default domain).

I’d like to allocate a domain and attach it to multiple devices in a 
group/multiple groups so that they share address translation, but still 
allow drivers for devices in those groups to use the dma_map_ops api.

> So we can't just open the door but not cleanup the things right?

A user of domain_alloc and attach_device is responsible for detaching a 
domain if it is no longer needed and calling domain_free.

Cheers,
James.


> 
> I haven't spent time on details. So I cc'ed Jacob for corrections.
> 
> Best regards,
> Lu Baolu
> 
>> Cheers,
>> James.
>> On Fri, Nov 2, 2018 at 2:43 AM Lu Baolu  wrote:
>>> 
>>> Hi,
>>> 
>>> On 10/30/18 10:18 PM, James Sewart via iommu wrote:
>>>> Hey,
>>>> 
>>>> I’ve been investigating the relationship between iommu groups and domains
>>>> on our systems and have a few questions. Why does the intel iommu code not
>>>> allow allocating IOMMU_DOMAIN_DMA? Returning NULL when given this domain
>>>> type has the side effect that the default_domain for an iommu group is not
>>>> set, which, when used for e.g. dma_map_ops.map_page, means a domain is
>>>> allocated per device.
>>> 
>>> Intel vt-d driver doesn't implement the default domain and allocates
>>> domains only on demand. There are lots of things to do before we allow
>>> iommu API to allocate domains other than IOMMU_DOMAIN_UNMANAGED.
>>> 
>>> Best regards,
>>> Lu Baolu
>>> 
>>>> 
>>>> This seems to be the opposite behaviour to the AMD iommu code which
>>>> supports allocating an IOMMU_DOMAIN_DMA and will only look to the iommu
>>>> group if a domain is not attached to the device rather than allocating a
>>>> new one. On AMD every device in an iommu group will share the same domain.
>>>> 
>>>> Appended is what I think a patch to implement domain_alloc for
>>>> IOMMU_DOMAIN_DMA and also IOMMU_DOMAIN_IDENTITY would look like. Testing
>>>> shows each device in a group will share a domain by default, it also
>>>> allows allocating a new dma domain that can be successfully attached to a
>>>> group with iommu_attach_group.
>>>> 
>>>> Looking for comment on why the behaviour is how it is currently and if
>>>> there are any issues with the solution I’ve been testing.
>>>> 
>>>> Cheers,
>>>> James.
>>>> 
>>>> 
>>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>>> index bff2abd6..3a58389f 100644
>>>> --- a/drivers/iommu/intel-iommu.c
>>>> +++ b/drivers/iommu/intel-iommu.c
>>>> @@ -5170,10 +5170,15 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
>>>>   struct dmar_domain *dmar_domain;
>>>>   struct iommu_domain *domain;
>>>> 
>>>> - if (type != IOMMU_DOMAIN_UNMANAGED)
>>>> + if (type == IOMMU_DOMAIN_UNMANAGED)
>>>> + dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
>>>> + else if(type == IOMMU_DOMAIN_DMA)
>>>> + dmar_domain = alloc_domain(0);
>>>> + else if(type == IOMMU_DOMAIN_IDENTITY)
>>>> + dmar_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
>>>> + else
>>>>   return NULL;
>>>> 
>>>> - dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
>>>>   if (!dmar_domain) {
>>>>   pr_err("Can't allocate dmar_domain\n");
>>>>   return NULL;
>>>> @@ -5186,9 +5191,12 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
>>>>   domain_update_iommu_cap(dmar_domain);

Re: [RFC] iommu/vt-d: Group and domain relationship

2018-11-06 Thread James Sewart via iommu
Hey Lu,

Would you be able to go into more detail about the issues with
allowing IOMMU_DOMAIN_DMA to be allocated via domain_alloc?


Cheers,
James.
On Fri, Nov 2, 2018 at 2:43 AM Lu Baolu  wrote:
>
> Hi,
>
> On 10/30/18 10:18 PM, James Sewart via iommu wrote:
> > Hey,
> >
> > I’ve been investigating the relationship between iommu groups and domains
> > on our systems and have a few questions. Why does the intel iommu code not
> > allow allocating IOMMU_DOMAIN_DMA? Returning NULL when given this domain
> > type has the side effect that the default_domain for an iommu group is not
> > set, which, when used for e.g. dma_map_ops.map_page, means a domain is
> > allocated per device.
>
> Intel vt-d driver doesn't implement the default domain and allocates
> domains only on demand. There are lots of things to do before we allow
> iommu API to allocate domains other than IOMMU_DOMAIN_UNMANAGED.
>
> Best regards,
> Lu Baolu
>
> >
> > This seems to be the opposite behaviour to the AMD iommu code which
> > supports allocating an IOMMU_DOMAIN_DMA and will only look to the iommu
> > group if a domain is not attached to the device rather than allocating a
> > new one. On AMD every device in an iommu group will share the same domain.
> >
> > Appended is what I think a patch to implement domain_alloc for
> > IOMMU_DOMAIN_DMA and also IOMMU_DOMAIN_IDENTITY would look like. Testing
> > shows each device in a group will share a domain by default, it also
> > allows allocating a new dma domain that can be successfully attached to a
> > group with iommu_attach_group.
> >
> > Looking for comment on why the behaviour is how it is currently and if
> > there are any issues with the solution I’ve been testing.
> >
> > Cheers,
> > James.
> >
> >
> > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > index bff2abd6..3a58389f 100644
> > --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -5170,10 +5170,15 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
> >   struct dmar_domain *dmar_domain;
> >   struct iommu_domain *domain;
> >
> > - if (type != IOMMU_DOMAIN_UNMANAGED)
> > + if (type == IOMMU_DOMAIN_UNMANAGED)
> > + dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
> > + else if(type == IOMMU_DOMAIN_DMA)
> > + dmar_domain = alloc_domain(0);
> > + else if(type == IOMMU_DOMAIN_IDENTITY)
> > + dmar_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
> > + else
> >   return NULL;
> >
> > - dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
> >   if (!dmar_domain) {
> >   pr_err("Can't allocate dmar_domain\n");
> >   return NULL;
> > @@ -5186,9 +5191,12 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
> >   domain_update_iommu_cap(dmar_domain);
> >
> >   domain = &dmar_domain->domain;
> > - domain->geometry.aperture_start = 0;
> > - domain->geometry.aperture_end   = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
> > - domain->geometry.force_aperture = true;
> > +
> > + if (type == IOMMU_DOMAIN_UNMANAGED) {
> > + domain->geometry.aperture_start = 0;
> > + domain->geometry.aperture_end   = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
> > + domain->geometry.force_aperture = true;
> > + }
> >
> >   return domain;
> >   }
> > ___
> > iommu mailing list
> > iommu@lists.linux-foundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/iommu
> >

[RFC] iommu/vt-d: Group and domain relationship

2018-10-30 Thread James Sewart via iommu
Hey,

I’ve been investigating the relationship between iommu groups and domains 
on our systems and have a few questions. Why does the intel iommu code not 
allow allocating IOMMU_DOMAIN_DMA? Returning NULL when given this domain 
type has the side effect that the default_domain for an iommu group is not 
set, which, when used for e.g. dma_map_ops.map_page, means a domain is 
allocated per device.

This seems to be the opposite behaviour to the AMD iommu code which 
supports allocating an IOMMU_DOMAIN_DMA and will only look to the iommu 
group if a domain is not attached to the device rather than allocating a 
new one. On AMD every device in an iommu group will share the same domain.

Appended is what I think a patch to implement domain_alloc for 
IOMMU_DOMAIN_DMA and also IOMMU_DOMAIN_IDENTITY would look like. Testing 
shows each device in a group will share a domain by default, it also 
allows allocating a new dma domain that can be successfully attached to a 
group with iommu_attach_group.

Looking for comment on why the behaviour is how it is currently and if 
there are any issues with the solution I’ve been testing.

Cheers,
James.


diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index bff2abd6..3a58389f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5170,10 +5170,15 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
struct dmar_domain *dmar_domain;
struct iommu_domain *domain;
 
-   if (type != IOMMU_DOMAIN_UNMANAGED)
+   if (type == IOMMU_DOMAIN_UNMANAGED)
+   dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
+   else if (type == IOMMU_DOMAIN_DMA)
+   dmar_domain = alloc_domain(0);
+   else if (type == IOMMU_DOMAIN_IDENTITY)
+   dmar_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
+   else
return NULL;
 
-   dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
if (!dmar_domain) {
pr_err("Can't allocate dmar_domain\n");
return NULL;
@@ -5186,9 +5191,12 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
domain_update_iommu_cap(dmar_domain);
 
domain = &dmar_domain->domain;
-   domain->geometry.aperture_start = 0;
-   domain->geometry.aperture_end   = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
-   domain->geometry.force_aperture = true;
+
+   if (type == IOMMU_DOMAIN_UNMANAGED) {
+   domain->geometry.aperture_start = 0;
+   domain->geometry.aperture_end   = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
+   domain->geometry.force_aperture = true;
+   }
 
return domain;
 }