Re: [PATCH v3 7/7] iommu/vt-d: Differentiate relaxable and non relaxable RMRRs

2019-05-16 Thread Lu Baolu

Hi Eric,

On 5/16/19 6:08 PM, Eric Auger wrote:

Now we have a new IOMMU_RESV_DIRECT_RELAXABLE reserved memory
region type, let's report USB and GFX RMRRs as relaxable ones.

This allows to have a finer reporting at IOMMU API level of
reserved memory regions. This will be exploitable by VFIO to
define the usable IOVA range and detect potential conflicts
between the guest physical address space and host reserved
regions.

Signed-off-by: Eric Auger 
---
  drivers/iommu/intel-iommu.c | 10 --
  1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a36604f4900f..af1d65fdedfc 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5493,7 +5493,9 @@ static void intel_iommu_get_resv_regions(struct device 
*device,
for_each_rmrr_units(rmrr) {
for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
  i, i_dev) {
+   struct pci_dev *pdev = to_pci_dev(device);


Probably should be:

struct pci_dev *pdev = dev_is_pci(device) ? to_pci_dev(device) : NULL;

Best regards,
Lu Baolu



[PATCH 3/3] iommu/amd: only free resources once on init error

2019-05-16 Thread Kevin Mitchell
When amd_iommu=off was specified on the command line, free_X_resources
functions were called immediately after early_amd_iommu_init. They were
then called again when amd_iommu_init also failed (as expected).

Instead, call them only once: at the end of state_next() whenever
there's an error. These functions should be safe to call any time and
any number of times. However, since state_next is never called again in
an error state, the cleanup will only ever be run once.

This also ensures that cleanup code is run as soon as possible after an
error is detected rather than waiting for amd_iommu_init() to be called.

Signed-off-by: Kevin Mitchell 
---
 drivers/iommu/amd_iommu_init.c | 27 +--
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 5f3df5ae6ba8..24fc060fe596 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -2638,8 +2638,6 @@ static int __init state_next(void)
init_state = ret ? IOMMU_INIT_ERROR : IOMMU_ACPI_FINISHED;
if (init_state == IOMMU_ACPI_FINISHED && amd_iommu_disabled) {
pr_info("AMD IOMMU disabled on kernel command-line\n");
-   free_dma_resources();
-   free_iommu_resources();
init_state = IOMMU_CMDLINE_DISABLED;
ret = -EINVAL;
}
@@ -2680,6 +2678,19 @@ static int __init state_next(void)
BUG();
}
 
+   if (ret) {
+   free_dma_resources();
+   if (!irq_remapping_enabled) {
+   disable_iommus();
+   free_iommu_resources();
+   } else {
+   struct amd_iommu *iommu;
+
+   uninit_device_table_dma();
+   for_each_iommu(iommu)
+   iommu_flush_all_caches(iommu);
+   }
+   }
return ret;
 }
 
@@ -2753,18 +2764,6 @@ static int __init amd_iommu_init(void)
int ret;
 
ret = iommu_go_to_state(IOMMU_INITIALIZED);
-   if (ret) {
-   free_dma_resources();
-   if (!irq_remapping_enabled) {
-   disable_iommus();
-   free_iommu_resources();
-   } else {
-   uninit_device_table_dma();
-   for_each_iommu(iommu)
-   iommu_flush_all_caches(iommu);
-   }
-   }
-
 #ifdef CONFIG_GART_IOMMU
if (ret && list_empty(_iommu_list)) {
/*
-- 
2.20.1



[PATCH 2/3] iommu/amd: move gart fallback to amd_iommu_init

2019-05-16 Thread Kevin Mitchell
The fallback to the GART driver in the case amd_iommu doesn't work was
executed in a function called free_iommu_resources, which didn't really
make sense. This was even being called twice if amd_iommu=off was
specified on the command line.

The only complication is that it needs to be verified that amd_iommu has
fully relinquished control by calling free_iommu_resources and emptying
the amd_iommu_list.

Signed-off-by: Kevin Mitchell 
---
 drivers/iommu/amd_iommu_init.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 3798d7303c99..5f3df5ae6ba8 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -2345,15 +2345,6 @@ static void __init free_iommu_resources(void)
amd_iommu_dev_table = NULL;
 
free_iommu_all();
-
-#ifdef CONFIG_GART_IOMMU
-   /*
-* We failed to initialize the AMD IOMMU - try fallback to GART
-* if possible.
-*/
-   gart_iommu_init();
-
-#endif
 }
 
 /* SB IOAPIC is always on this device in AMD systems */
@@ -2774,6 +2765,16 @@ static int __init amd_iommu_init(void)
}
}
 
+#ifdef CONFIG_GART_IOMMU
+   if (ret && list_empty(_iommu_list)) {
+   /*
+* We failed to initialize the AMD IOMMU - try fallback
+* to GART if possible.
+*/
+   gart_iommu_init();
+   }
+#endif
+
for_each_iommu(iommu)
amd_iommu_debugfs_setup(iommu);
 
-- 
2.20.1



[PATCH 1/3] iommu/amd: make iommu_disable safer

2019-05-16 Thread Kevin Mitchell via iommu
Make it safe to call iommu_disable during early init error conditions
before mmio_base is set, but after the struct amd_iommu has been added
to the amd_iommu_list. For example, this happens if firmware fails to
fill in mmio_phys in the ACPI table leading to a NULL pointer
dereference in iommu_feature_disable.

Signed-off-by: Kevin Mitchell 
---
 drivers/iommu/amd_iommu_init.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index f773792d77fd..3798d7303c99 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -424,6 +424,9 @@ static void iommu_enable(struct amd_iommu *iommu)
 
 static void iommu_disable(struct amd_iommu *iommu)
 {
+   if (!iommu->mmio_base)
+   return;
+
/* Disable command buffer */
iommu_feature_disable(iommu, CONTROL_CMDBUF_EN);
 
-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 0/3] handle init errors more gracefully in amd_iommu

2019-05-16 Thread Kevin Mitchell via iommu
This series makes error handling more robust in the amd_iommu init
code. It was initially motivated by problematic firmware that does not
set up the physical address of the iommu. This led to a NULL dereference
panic when iommu_disable was called during cleanup.

While the first patch is sufficient to avoid the panic, the subsequent
two move the cleanup closer to the actual error and avoid calling the
cleanup code twice when amd_iommu=off is specified on the command line.

I have tested this series on a variety of AMD CPUs with firmware
exhibiting the issue. I have additionally tested on platforms where the
firmware has been fixed. I tried both with and without amd_iommu=off. I
have also tested on older CPUs where no IOMMU is detected and even one
where the GART driver ends up running.

Thanks,

Kevin

Kevin Mitchell (3):
  iommu/amd: make iommu_disable safer
  iommu/amd: move gart fallback to amd_iommu_init
  iommu/amd: only free resources once on init error

 drivers/iommu/amd_iommu_init.c | 45 ++
 1 file changed, 24 insertions(+), 21 deletions(-)

-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/4] vfio: vfio_iommu_type1: Define VFIO_IOMMU_INFO_CAPABILITIES

2019-05-16 Thread Alex Williamson
On Thu, 16 May 2019 16:57:42 +0200
Christian Borntraeger  wrote:

> Alex, 
> 
> patch 1 and 3 are s390 specific, 2 and 4 are vfio common code.
> Are you ok with the common code changes? If yes, would you prefer to have this
> via the s390 tree (Martin) or your tree?

Hi Christian,

The vfio code still needs work imo, and I'm not sure it isn't somewhat
abusive of the iommu attribute interface as well.  I don't necessarily
have a problem with it ultimately going through the s390 tree, but
let's see what comes in the next revision.  Thanks,

Alex

> On 10.05.19 10:22, Pierre Morel wrote:
> > To use the VFIO_IOMMU_GET_INFO to retrieve IOMMU specific information,
> > we define a new flag VFIO_IOMMU_INFO_CAPABILITIES in the
> > vfio_iommu_type1_info structure and the associated capability
> > information block.
> > 
> > Signed-off-by: Pierre Morel 
> > ---
> >  include/uapi/linux/vfio.h | 10 ++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> > index 8f10748..8f68e0f 100644
> > --- a/include/uapi/linux/vfio.h
> > +++ b/include/uapi/linux/vfio.h
> > @@ -715,6 +715,16 @@ struct vfio_iommu_type1_info {
> > __u32   flags;
> >  #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)   /* supported page sizes info */
> > __u64   iova_pgsizes;   /* Bitmap of supported page sizes */
> > +#define VFIO_IOMMU_INFO_CAPABILITIES (1 << 1)  /* support capabilities 
> > info */
> > +   __u64   cap_offset; /* Offset within info struct of first cap */
> > +};
> > +
> > +#define VFIO_IOMMU_INFO_CAP_QFN1
> > +#define VFIO_IOMMU_INFO_CAP_QGRP   2
> > +
> > +struct vfio_iommu_type1_info_block {
> > +   struct vfio_info_cap_header header;
> > +   __u32 data[];
> >  };
> >  
> >  #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)
> >   
> 



Re: [PATCH 2/4] vfio: vfio_iommu_type1: Define VFIO_IOMMU_INFO_CAPABILITIES

2019-05-16 Thread Alex Williamson
On Fri, 10 May 2019 10:22:33 +0200
Pierre Morel  wrote:

> To use the VFIO_IOMMU_GET_INFO to retrieve IOMMU specific information,
> we define a new flag VFIO_IOMMU_INFO_CAPABILITIES in the
> vfio_iommu_type1_info structure and the associated capability
> information block.
> 
> Signed-off-by: Pierre Morel 
> ---
>  include/uapi/linux/vfio.h | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 8f10748..8f68e0f 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -715,6 +715,16 @@ struct vfio_iommu_type1_info {
>   __u32   flags;
>  #define VFIO_IOMMU_INFO_PGSIZES (1 << 0) /* supported page sizes info */
>   __u64   iova_pgsizes;   /* Bitmap of supported page sizes */
> +#define VFIO_IOMMU_INFO_CAPABILITIES (1 << 1)  /* support capabilities info 
> */
> + __u64   cap_offset; /* Offset within info struct of first cap */
> +};
> +
> +#define VFIO_IOMMU_INFO_CAP_QFN  1
> +#define VFIO_IOMMU_INFO_CAP_QGRP 2

Descriptions?

> +
> +struct vfio_iommu_type1_info_block {
> + struct vfio_info_cap_header header;
> + __u32 data[];
>  };
>  
>  #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)

This is just a blob of data, what's the API?  How do we revision it?
How does the user know how to interpret it?  Dumping kernel internal
structures out to userspace like this is not acceptable, define a user
API. Thanks,

Alex


Re: [PATCH v3 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions

2019-05-16 Thread Robin Murphy

On 16/05/2019 14:23, Auger Eric wrote:

Hi Robin,
On 5/16/19 2:46 PM, Robin Murphy wrote:

On 16/05/2019 11:08, Eric Auger wrote:

Introduce a new type for reserved region. This corresponds
to directly mapped regions which are known to be relaxable
in some specific conditions, such as device assignment use
case. Well known examples are those used by USB controllers
providing PS/2 keyboard emulation for pre-boot BIOS and
early BOOT or RMRRs associated to IGD working in legacy mode.

Since commit c875d2c1b808 ("iommu/vt-d: Exclude devices using RMRRs
from IOMMU API domains") and commit 18436afdc11a ("iommu/vt-d: Allow
RMRR on graphics devices too"), those regions are currently
considered "safe" with respect to device assignment use case
which requires a non direct mapping at IOMMU physical level
(RAM GPA -> HPA mapping).

Those RMRRs currently exist and sometimes the device is
attempting to access it but this has not been considered
an issue until now.

However at the moment, iommu_get_group_resv_regions() is
not able to make any difference between directly mapped
regions: those which must be absolutely enforced and those
like above ones which are known as relaxable.

This is a blocker for reporting severe conflicts between
non relaxable RMRRs (like MSI doorbells) and guest GPA space.

With this new reserved region type we will be able to use
iommu_get_group_resv_regions() to enumerate the IOVA space
that is usable through the IOMMU API without introducing
regressions with respect to existing device assignment
use cases (USB and IGD).

Signed-off-by: Eric Auger 

---

Note: At the moment the sysfs ABI is not changed. However I wonder
whether it wouldn't be preferable to report the direct region as
"direct_relaxed" there. At the moment, in case the same direct
region is used by 2 devices, one USB/GFX and another not belonging
to the previous categories, the direct region will be output twice
with "direct" type.


Hmm, that sounds a bit off - if we have overlapping regions within the
same domain, then we need to do some additional pre-processing to adjust
them anyway, since any part of a relaxable region which overlaps a
non-relaxable region cannot actually be relaxed, and so really should
never be described as such.

In iommu_insert_resv_region(), we are overlapping regions of the same
type. So iommu_get_group_resv_regions() should return both the relaxable
region and non relaxable one. I should test this again using a hacked
kernel though.


We should still consider relaxable regions as being able to merge back 
in to regular direct regions to a degree - If a relaxable region falls 
entirely within a direct one then there's no point exposing it because 
the direct region *has* to take precedence and be enforced. If there is 
an incomplete overlap then we could possibly just trust consumers to see 
it and give the direct region precedence themselves, but since the 
relaxable region is our own in-kernel invention rather than firmware 
gospel I think it would be safer to truncate it to just its 
non-overlapping part.


Robin.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions

2019-05-16 Thread Robin Murphy

On 16/05/2019 18:06, Alex Williamson wrote:

On Thu, 16 May 2019 14:58:08 +0200
Auger Eric  wrote:


Hi Jean-Philippe,

On 5/16/19 2:43 PM, Jean-Philippe Brucker wrote:

On 16/05/2019 12:45, Auger Eric wrote:

Hi Jean-Philippe,

On 5/16/19 1:16 PM, Jean-Philippe Brucker wrote:

On 16/05/2019 11:08, Eric Auger wrote:

Note: At the moment the sysfs ABI is not changed. However I wonder
whether it wouldn't be preferable to report the direct region as
"direct_relaxed" there. At the moment, in case the same direct
region is used by 2 devices, one USB/GFX and another not belonging
to the previous categories, the direct region will be output twice
with "direct" type.

This would unblock Shameer's series:
[PATCH v6 0/7] vfio/type1: Add support for valid iova list management
https://patchwork.kernel.org/patch/10425309/


Thanks for doing this!
  

which failed to get pulled for 4.18 merge window due to IGD
device assignment regression.

v2 -> v3:
- fix direct type check
---
  drivers/iommu/iommu.c | 12 +++-
  include/linux/iommu.h |  6 ++
  2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ae4ea5c0e6f9..28c3d6351832 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -73,10 +73,11 @@ struct iommu_group_attribute {
  };
  
  static const char * const iommu_group_resv_type_string[] = {

-   [IOMMU_RESV_DIRECT] = "direct",
-   [IOMMU_RESV_RESERVED]   = "reserved",
-   [IOMMU_RESV_MSI]= "msi",
-   [IOMMU_RESV_SW_MSI] = "msi",
+   [IOMMU_RESV_DIRECT] = "direct",
+   [IOMMU_RESV_DIRECT_RELAXABLE]   = "direct",
+   [IOMMU_RESV_RESERVED]   = "reserved",
+   [IOMMU_RESV_MSI]= "msi",
+   [IOMMU_RESV_SW_MSI] = "msi",
  };
  
  #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)		\

@@ -573,7 +574,8 @@ static int iommu_group_create_direct_mappings(struct 
iommu_group *group,
start = ALIGN(entry->start, pg_size);
end   = ALIGN(entry->start + entry->length, pg_size);
  
-		if (entry->type != IOMMU_RESV_DIRECT)

+   if (entry->type != IOMMU_RESV_DIRECT &&
+   entry->type != IOMMU_RESV_DIRECT_RELAXABLE)


I'm trying to understand why you need to create direct mappings at all
for these relaxable regions. In the host the region is needed for legacy
device features, which are disabled (and cannot be re-enabled) when
assigning the device to a guest?

This follows Kevin's comment in the thread below:
https://patchwork.kernel.org/patch/10449103/#21957279

In normal DMA API host path, those regions need to be 1-1 mapped. They
are likely to be accessed by the driver or FW at early boot phase or
even during execution, depending on features being used.

That's the reason, according to Kevin we couldn't hide them.

We just know that, in general, they are not used anymore when assigning
the device or if accesses are attempted this generally does not block
the assignment use case. For example, it is said in
https://github.com/qemu/qemu/blob/master/docs/igd-assign.txt that in
legacy IGD assignment use case, there may be "a small numbers of DMAR
faults when initially assigned".


Hmm, fair enough. That doesn't sound too good, if the device might
perform arbitrary writes into guest memory once new IOMMU mappings are
in place. I was wondering if we could report some IOVA ranges as
"available but avoid if possible".

In Shameer's series we currently reject any vfio dma_map that would fall
into an RMRR (hence the regression on existing USB/GFX use case). With
the relaxable RMRR info we could imagine to let the userspace choose
whether we want to proceed with the dma_map despite the risk or
introduce a vfio_iommu_type1 module option (turned off by default for
not regressing existing USB/GFX passthrough) that would forbid dma_map
on relaxable RMRR regions.


Yep, the risk that Jean-Philippe mentions is real, the IGD device has
the stolen memory addresses latched into the hardware and we're unable
to change that.  What we try to do now is trap page table writes to the
device and translate them to a VM allocated stolen memory range, which
is sufficient for getting a BIOS splash screen, but we really want to
assume that the OS level driver just doesn't use the stolen memory
range.  There was a time when it seemed like we could assume the Intel
drivers were heading in that direction, but it seems that's no longer
an actual goal.  To fully support IGD assignment in a way that isn't as
fragile as it is today, we'd want to re-export the RMRR out to
userspace so that QEMU could identity map it into the VM address
space.  That's not trivial, it's only one of several issues around
IGD assignment, and we've got GVT-g (Intel vGPUs) now that don't impose
these requirements, so motivation to tackle the issue is somewhat
reduced.

With the changes here, we might want vfio to 

Re: [PATCH v3 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions

2019-05-16 Thread Alex Williamson
On Thu, 16 May 2019 14:58:08 +0200
Auger Eric  wrote:

> Hi Jean-Philippe,
> 
> On 5/16/19 2:43 PM, Jean-Philippe Brucker wrote:
> > On 16/05/2019 12:45, Auger Eric wrote:  
> >> Hi Jean-Philippe,
> >>
> >> On 5/16/19 1:16 PM, Jean-Philippe Brucker wrote:  
> >>> On 16/05/2019 11:08, Eric Auger wrote:  
>  Note: At the moment the sysfs ABI is not changed. However I wonder
>  whether it wouldn't be preferable to report the direct region as
>  "direct_relaxed" there. At the moment, in case the same direct
>  region is used by 2 devices, one USB/GFX and another not belonging
>  to the previous categories, the direct region will be output twice
>  with "direct" type.
> 
>  This would unblock Shameer's series:
>  [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
>  https://patchwork.kernel.org/patch/10425309/  
> >>>
> >>> Thanks for doing this!
> >>>  
>  which failed to get pulled for 4.18 merge window due to IGD
>  device assignment regression.
> 
>  v2 -> v3:
>  - fix direct type check
>  ---
>   drivers/iommu/iommu.c | 12 +++-
>   include/linux/iommu.h |  6 ++
>   2 files changed, 13 insertions(+), 5 deletions(-)
> 
>  diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>  index ae4ea5c0e6f9..28c3d6351832 100644
>  --- a/drivers/iommu/iommu.c
>  +++ b/drivers/iommu/iommu.c
>  @@ -73,10 +73,11 @@ struct iommu_group_attribute {
>   };
>   
>   static const char * const iommu_group_resv_type_string[] = {
>  -[IOMMU_RESV_DIRECT] = "direct",
>  -[IOMMU_RESV_RESERVED]   = "reserved",
>  -[IOMMU_RESV_MSI]= "msi",
>  -[IOMMU_RESV_SW_MSI] = "msi",
>  +[IOMMU_RESV_DIRECT] = "direct",
>  +[IOMMU_RESV_DIRECT_RELAXABLE]   = "direct",
>  +[IOMMU_RESV_RESERVED]   = "reserved",
>  +[IOMMU_RESV_MSI]= "msi",
>  +[IOMMU_RESV_SW_MSI] = "msi",
>   };
>   
>   #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)   \
>  @@ -573,7 +574,8 @@ static int iommu_group_create_direct_mappings(struct 
>  iommu_group *group,
>   start = ALIGN(entry->start, pg_size);
>   end   = ALIGN(entry->start + entry->length, pg_size);
>   
>  -if (entry->type != IOMMU_RESV_DIRECT)
>  +if (entry->type != IOMMU_RESV_DIRECT &&
>  +entry->type != IOMMU_RESV_DIRECT_RELAXABLE)  
> >>>
> >>> I'm trying to understand why you need to create direct mappings at all
> >>> for these relaxable regions. In the host the region is needed for legacy
> >>> device features, which are disabled (and cannot be re-enabled) when
> >>> assigning the device to a guest?  
> >> This follows Kevin's comment in the thread below:
> >> https://patchwork.kernel.org/patch/10449103/#21957279
> >>
> >> In normal DMA API host path, those regions need to be 1-1 mapped. They
> >> are likely to be accessed by the driver or FW at early boot phase or
> >> even during execution, depending on features being used.
> >>
> >> That's the reason, according to Kevin we couldn't hide them.
> >>
> >> We just know that, in general, they are not used anymore when assigning
> >> the device or if accesses are attempted this generally does not block
> >> the assignment use case. For example, it is said in
> >> https://github.com/qemu/qemu/blob/master/docs/igd-assign.txt that in
> >> legacy IGD assignment use case, there may be "a small numbers of DMAR
> >> faults when initially assigned".  
> > 
> > Hmm, fair enough. That doesn't sound too good, if the device might
> > perform arbitrary writes into guest memory once new IOMMU mappings are
> > in place. I was wondering if we could report some IOVA ranges as
> > "available but avoid if possible".  
> In Shameer's series we currently reject any vfio dma_map that would fall
> into an RMRR (hence the regression on existing USB/GFX use case). With
> the relaxable RMRR info we could imagine to let the userspace choose
> whether we want to proceed with the dma_map despite the risk or
> introduce a vfio_iommu_type1 module option (turned off by default for
> not regressing existing USB/GFX passthrough) that would forbid dma_map
> on relaxable RMRR regions.

Yep, the risk that Jean-Philippe mentions is real, the IGD device has
the stolen memory addresses latched into the hardware and we're unable
to change that.  What we try to do now is trap page table writes to the
device and translate them to a VM allocated stolen memory range, which
is sufficient for getting a BIOS splash screen, but we really want to
assume that the OS level driver just doesn't use the stolen memory
range.  There was a time when it seemed like we could assume the Intel
drivers were heading in that 

Re: [PATCH v3 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions

2019-05-16 Thread Alex Williamson
On Thu, 16 May 2019 15:23:17 +0200
Auger Eric  wrote:

> Hi Robin,
> On 5/16/19 2:46 PM, Robin Murphy wrote:
> >> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> >> index ba91666998fb..14a521f85f14 100644
> >> --- a/include/linux/iommu.h
> >> +++ b/include/linux/iommu.h
> >> @@ -135,6 +135,12 @@ enum iommu_attr {
> >>   enum iommu_resv_type {
> >>   /* Memory regions which must be mapped 1:1 at all times */
> >>   IOMMU_RESV_DIRECT,
> >> +    /*
> >> + * Memory regions which are advertised to be 1:1 but are
> >> + * commonly considered relaxable in some conditions,
> >> + * for instance in device assignment use case (USB, Graphics)
> >> + */
> >> +    IOMMU_RESV_DIRECT_RELAXABLE,  
> > 
> > What do you think of s/RELAXABLE/BOOT/ ? My understanding is that these
> > regions are only considered relevant until Linux has taken full control
> > of the endpoint, and having a slightly more well-defined scope than
> > "some conditions" might be nice.  
> That's not my current understanding. I think those RMRRs may be used
> post-boot (especially the IGD stolen memory covered by RMRR). I
> understand this depends on the video mode or FW in use by the IGD. But I
> am definitively not an expert here.

Nor am I, but generally the distinction I'm trying to achieve is
whether the reserved region is necessary for the device operation or
for the system operation.  If we deny the IGD device its mapping to
stolen memory, then maybe the IGD device doesn't work, no big deal.  If
we deny USB its RMRR, then we assume we're only cutting off PS/2
emulation that we expect isn't used at this point anyway.  Both of these
are choices in how the driver wants to use the device.  On the other
hand if we have a system where management firmware has backdoors to
devices for system health monitoring, then declining to honor the RMRR
has larger implications.  Thanks,

Alex
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 09/16] iommu: Introduce guest PASID bind function

2019-05-16 Thread Jacob Pan
On Thu, 16 May 2019 15:14:40 +0100
Jean-Philippe Brucker  wrote:

> Hi Jacob,
> 
> On 03/05/2019 23:32, Jacob Pan wrote:
> > +/**
> > + * struct gpasid_bind_data - Information about device and guest
> > PASID binding
> > + * @gcr3:  Guest CR3 value from guest mm
> > + * @pasid: Process address space ID used for the guest mm
> > + * @addr_width:Guest address width. Paging mode can also
> > be derived.
> > + */
> > +struct gpasid_bind_data {
> > +   __u64 gcr3;
> > +   __u32 pasid;
> > +   __u32 addr_width;
> > +   __u32 flags;
> > +#defineIOMMU_SVA_GPASID_SREBIT(0) /* supervisor
> > request */
> > +   __u8 padding[4];
> > +};  
> 
> Could you wrap this structure into a generic one like we now do for
> bind_pasid_table? It would make the API easier to extend, because if
> we ever add individual PASID bind on Arm (something I'd like to do for
> virtio-iommu, eventually) it will have different parameters, as our
> PASID table entry has a lot of fields describing the page table
> format.
> 
> Maybe something like the following would do?
> 
> struct gpasid_bind_data {
> #define IOMMU_GPASID_BIND_VERSION_1 1
>   __u32 version;
> #define IOMMU_GPASID_BIND_FORMAT_INTEL_VTD1
>   __u32 format;
>   union {
>   // the current gpasid_bind_data:
>   struct gpasid_bind_intel_vtd vtd;
>   };
> };
> 
OK, sounds great.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/4] vfio: vfio_iommu_type1: Define VFIO_IOMMU_INFO_CAPABILITIES

2019-05-16 Thread Christian Borntraeger
Alex, 

patch 1 and 3 are s390 specific, 2 and 4 are vfio common code.
Are you ok with the common code changes? If yes, would you prefer to have this
via the s390 tree (Martin) or your tree?

On 10.05.19 10:22, Pierre Morel wrote:
> To use the VFIO_IOMMU_GET_INFO to retrieve IOMMU specific information,
> we define a new flag VFIO_IOMMU_INFO_CAPABILITIES in the
> vfio_iommu_type1_info structure and the associated capability
> information block.
> 
> Signed-off-by: Pierre Morel 
> ---
>  include/uapi/linux/vfio.h | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 8f10748..8f68e0f 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -715,6 +715,16 @@ struct vfio_iommu_type1_info {
>   __u32   flags;
>  #define VFIO_IOMMU_INFO_PGSIZES (1 << 0) /* supported page sizes info */
>   __u64   iova_pgsizes;   /* Bitmap of supported page sizes */
> +#define VFIO_IOMMU_INFO_CAPABILITIES (1 << 1)  /* support capabilities info 
> */
> + __u64   cap_offset; /* Offset within info struct of first cap */
> +};
> +
> +#define VFIO_IOMMU_INFO_CAP_QFN  1
> +#define VFIO_IOMMU_INFO_CAP_QGRP 2
> +
> +struct vfio_iommu_type1_info_block {
> + struct vfio_info_cap_header header;
> + __u32 data[];
>  };
>  
>  #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)
> 



Re: [PATCH v3 09/16] iommu: Introduce guest PASID bind function

2019-05-16 Thread Jean-Philippe Brucker
Hi Jacob,

On 03/05/2019 23:32, Jacob Pan wrote:
> +/**
> + * struct gpasid_bind_data - Information about device and guest PASID binding
> + * @gcr3:Guest CR3 value from guest mm
> + * @pasid:   Process address space ID used for the guest mm
> + * @addr_width:  Guest address width. Paging mode can also be derived.
> + */
> +struct gpasid_bind_data {
> + __u64 gcr3;
> + __u32 pasid;
> + __u32 addr_width;
> + __u32 flags;
> +#define  IOMMU_SVA_GPASID_SREBIT(0) /* supervisor request */
> + __u8 padding[4];
> +};

Could you wrap this structure into a generic one like we now do for
bind_pasid_table? It would make the API easier to extend, because if we
ever add individual PASID bind on Arm (something I'd like to do for
virtio-iommu, eventually) it will have different parameters, as our
PASID table entry has a lot of fields describing the page table format.

Maybe something like the following would do?

struct gpasid_bind_data {
#define IOMMU_GPASID_BIND_VERSION_1 1
__u32 version;
#define IOMMU_GPASID_BIND_FORMAT_INTEL_VTD  1
__u32 format;
union {
// the current gpasid_bind_data:
struct gpasid_bind_intel_vtd vtd;
};
};

Thanks,
Jean


Re: Linux crash when using FTDI FT232R USB to serial converter on AMD boards.

2019-05-16 Thread starost...@gmail.com

Dne 16.5.2019 v 15:11 Oliver Neukum napsal(a):


You will be asked whether this worked in earlier version of the
kernel. The answer would be: yes, no, unknown (why)

HTH
Oliver


Thank you, I sent new report.
starosta
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: Linux crash when using FTDI FT232R USB to serial converter on AMD boards.

2019-05-16 Thread Oliver Neukum
On Do, 2019-05-16 at 14:29 +0200,  starost...@gmail.com  wrote:
> Dne 16.5.2019 v 10:34 Oliver Neukum napsal(a):
> > Mention
> > 
> > 1. kernel version
> > 2. whether this is known to be a regression
> 
> Sorry for question, can you more specify what you mean?

You will be asked whether this worked in earlier version of the
kernel. The answer would be: yes, no, unknown (why)

HTH
Oliver

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions

2019-05-16 Thread Auger Eric
Hi Robin,
On 5/16/19 2:46 PM, Robin Murphy wrote:
> On 16/05/2019 11:08, Eric Auger wrote:
>> Introduce a new type for reserved region. This corresponds
>> to directly mapped regions which are known to be relaxable
>> in some specific conditions, such as device assignment use
>> case. Well known examples are those used by USB controllers
>> providing PS/2 keyboard emulation for pre-boot BIOS and
>> early BOOT or RMRRs associated to IGD working in legacy mode.
>>
>> Since commit c875d2c1b808 ("iommu/vt-d: Exclude devices using RMRRs
>> from IOMMU API domains") and commit 18436afdc11a ("iommu/vt-d: Allow
>> RMRR on graphics devices too"), those regions are currently
>> considered "safe" with respect to device assignment use case
>> which requires a non direct mapping at IOMMU physical level
>> (RAM GPA -> HPA mapping).
>>
>> Those RMRRs currently exist and sometimes the device is
>> attempting to access it but this has not been considered
>> an issue until now.
>>
>> However at the moment, iommu_get_group_resv_regions() is
>> not able to make any difference between directly mapped
>> regions: those which must be absolutely enforced and those
>> like above ones which are known as relaxable.
>>
>> This is a blocker for reporting severe conflicts between
>> non relaxable RMRRs (like MSI doorbells) and guest GPA space.
>>
>> With this new reserved region type we will be able to use
>> iommu_get_group_resv_regions() to enumerate the IOVA space
>> that is usable through the IOMMU API without introducing
>> regressions with respect to existing device assignment
>> use cases (USB and IGD).
>>
>> Signed-off-by: Eric Auger 
>>
>> ---
>>
>> Note: At the moment the sysfs ABI is not changed. However I wonder
>> whether it wouldn't be preferable to report the direct region as
>> "direct_relaxed" there. At the moment, in case the same direct
>> region is used by 2 devices, one USB/GFX and another not belonging
>> to the previous categories, the direct region will be output twice
>> with "direct" type.
> 
> Hmm, that sounds a bit off - if we have overlapping regions within the
> same domain, then we need to do some additional pre-processing to adjust
> them anyway, since any part of a relaxable region which overlaps a
> non-relaxable region cannot actually be relaxed, and so really should
> never be described as such.
In iommu_insert_resv_region(), we are overlapping regions of the same
type. So iommu_get_group_resv_regions() should return both the relaxable
region and non relaxable one. I should test this again using a hacked
kernel though.
> 
>> This would unblock Shameer's series:
>> [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
>> https://patchwork.kernel.org/patch/10425309/
>>
>> which failed to get pulled for 4.18 merge window due to IGD
>> device assignment regression.
>>
>> v2 -> v3:
>> - fix direct type check
>> ---
>>   drivers/iommu/iommu.c | 12 +++-
>>   include/linux/iommu.h |  6 ++
>>   2 files changed, 13 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index ae4ea5c0e6f9..28c3d6351832 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -73,10 +73,11 @@ struct iommu_group_attribute {
>>   };
>>     static const char * const iommu_group_resv_type_string[] = {
>> -    [IOMMU_RESV_DIRECT]    = "direct",
>> -    [IOMMU_RESV_RESERVED]    = "reserved",
>> -    [IOMMU_RESV_MSI]    = "msi",
>> -    [IOMMU_RESV_SW_MSI]    = "msi",
>> +    [IOMMU_RESV_DIRECT]    = "direct",
>> +    [IOMMU_RESV_DIRECT_RELAXABLE]    = "direct",
> 
> Wasn't part of the idea that we might not need to expose these to
> userspace at all if they're only relevant to default domains, and not to
> VFIO domains or anything else userspace can get involved with?
Fur sure, today, those regions are exposed as "direct". Isn't this
information relevant to the userspace as it would rather not use those
direct regions. We eventually chose to implement the check inside the
VFIO driver but I understood we wanted this info to be visible. Then it
is arguable whether we should expose the new relaxable type, at the pain
of making the UABI evolve.

> 
>> +    [IOMMU_RESV_RESERVED]    = "reserved",
>> +    [IOMMU_RESV_MSI]    = "msi",
>> +    [IOMMU_RESV_SW_MSI]    = "msi",
>>   };
>>     #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)    \
>> @@ -573,7 +574,8 @@ static int
>> iommu_group_create_direct_mappings(struct iommu_group *group,
>>   start = ALIGN(entry->start, pg_size);
>>   end   = ALIGN(entry->start + entry->length, pg_size);
>>   -    if (entry->type != IOMMU_RESV_DIRECT)
>> +    if (entry->type != IOMMU_RESV_DIRECT &&
>> +    entry->type != IOMMU_RESV_DIRECT_RELAXABLE)
>>   continue;
>>     for (addr = start; addr < end; addr += pg_size) {
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index ba91666998fb..14a521f85f14 100644
>> 

Re: [PATCH v3 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions

2019-05-16 Thread Auger Eric
Hi Jean-Philippe,

On 5/16/19 2:43 PM, Jean-Philippe Brucker wrote:
> On 16/05/2019 12:45, Auger Eric wrote:
>> Hi Jean-Philippe,
>>
>> On 5/16/19 1:16 PM, Jean-Philippe Brucker wrote:
>>> On 16/05/2019 11:08, Eric Auger wrote:
 Note: At the moment the sysfs ABI is not changed. However I wonder
 whether it wouldn't be preferable to report the direct region as
 "direct_relaxed" there. At the moment, in case the same direct
 region is used by 2 devices, one USB/GFX and another not belonging
 to the previous categories, the direct region will be output twice
 with "direct" type.

 This would unblock Shameer's series:
 [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
 https://patchwork.kernel.org/patch/10425309/
>>>
>>> Thanks for doing this!
>>>
 which failed to get pulled for 4.18 merge window due to IGD
 device assignment regression.

 v2 -> v3:
 - fix direct type check
 ---
  drivers/iommu/iommu.c | 12 +++-
  include/linux/iommu.h |  6 ++
  2 files changed, 13 insertions(+), 5 deletions(-)

 diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
 index ae4ea5c0e6f9..28c3d6351832 100644
 --- a/drivers/iommu/iommu.c
 +++ b/drivers/iommu/iommu.c
 @@ -73,10 +73,11 @@ struct iommu_group_attribute {
  };
  
  static const char * const iommu_group_resv_type_string[] = {
 -  [IOMMU_RESV_DIRECT] = "direct",
 -  [IOMMU_RESV_RESERVED]   = "reserved",
 -  [IOMMU_RESV_MSI]= "msi",
 -  [IOMMU_RESV_SW_MSI] = "msi",
 +  [IOMMU_RESV_DIRECT] = "direct",
 +  [IOMMU_RESV_DIRECT_RELAXABLE]   = "direct",
 +  [IOMMU_RESV_RESERVED]   = "reserved",
 +  [IOMMU_RESV_MSI]= "msi",
 +  [IOMMU_RESV_SW_MSI] = "msi",
  };
  
  #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store) \
 @@ -573,7 +574,8 @@ static int iommu_group_create_direct_mappings(struct 
 iommu_group *group,
start = ALIGN(entry->start, pg_size);
end   = ALIGN(entry->start + entry->length, pg_size);
  
 -  if (entry->type != IOMMU_RESV_DIRECT)
 +  if (entry->type != IOMMU_RESV_DIRECT &&
 +  entry->type != IOMMU_RESV_DIRECT_RELAXABLE)
>>>
>>> I'm trying to understand why you need to create direct mappings at all
>>> for these relaxable regions. In the host the region is needed for legacy
>>> device features, which are disabled (and cannot be re-enabled) when
>>> assigning the device to a guest?
>> This follows Kevin's comment in the thread below:
>> https://patchwork.kernel.org/patch/10449103/#21957279
>>
>> In normal DMA API host path, those regions need to be 1-1 mapped. They
>> are likely to be accessed by the driver or FW at early boot phase or
>> even during execution, depending on features being used.
>>
>> That's the reason, according to Kevin we couldn't hide them.
>>
>> We just know that, in general, they are not used anymore when assigning
>> the device or if accesses are attempted this generally does not block
>> the assignment use case. For example, it is said in
>> https://github.com/qemu/qemu/blob/master/docs/igd-assign.txt that in
>> legacy IGD assignment use case, there may be "a small numbers of DMAR
>> faults when initially assigned".
> 
> Hmm, fair enough. That doesn't sound too good, if the device might
> perform arbitrary writes into guest memory once new IOMMU mappings are
> in place. I was wondering if we could report some IOVA ranges as
> "available but avoid if possible".
In Shameer's series we currently reject any vfio dma_map that would fall
into an RMRR (hence the regression on existing USB/GFX use case). With
the relaxable RMRR info we could imagine to let the userspace choose
whether we want to proceed with the dma_map despite the risk or
introduce a vfio_iommu_type1 module option (turned off by default for
not regressing existing USB/GFX passthrough) that would forbid dma_map
on relaxable RMRR regions.

Thanks

Eric

 If the guest has a vIOMMU, they are
> easy to avoid. But I doubt they would ever get used, since probably no
> one is going to instantiate a vIOMMU for a graphics device in legacy mode.
> 
> Thanks,
> Jean
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions

2019-05-16 Thread Robin Murphy

On 16/05/2019 11:08, Eric Auger wrote:

Introduce a new type for reserved region. This corresponds
to directly mapped regions which are known to be relaxable
in some specific conditions, such as device assignment use
case. Well known examples are those used by USB controllers
providing PS/2 keyboard emulation for pre-boot BIOS and
early BOOT or RMRRs associated to IGD working in legacy mode.

Since commit c875d2c1b808 ("iommu/vt-d: Exclude devices using RMRRs
from IOMMU API domains") and commit 18436afdc11a ("iommu/vt-d: Allow
RMRR on graphics devices too"), those regions are currently
considered "safe" with respect to device assignment use case
which requires a non direct mapping at IOMMU physical level
(RAM GPA -> HPA mapping).

Those RMRRs currently exist and sometimes the device is
attempting to access it but this has not been considered
an issue until now.

However at the moment, iommu_get_group_resv_regions() is
not able to make any difference between directly mapped
regions: those which must be absolutely enforced and those
like above ones which are known as relaxable.

This is a blocker for reporting severe conflicts between
non relaxable RMRRs (like MSI doorbells) and guest GPA space.

With this new reserved region type we will be able to use
iommu_get_group_resv_regions() to enumerate the IOVA space
that is usable through the IOMMU API without introducing
regressions with respect to existing device assignment
use cases (USB and IGD).

Signed-off-by: Eric Auger 

---

Note: At the moment the sysfs ABI is not changed. However I wonder
whether it wouldn't be preferable to report the direct region as
"direct_relaxed" there. At the moment, in case the same direct
region is used by 2 devices, one USB/GFX and another not belonging
to the previous categories, the direct region will be output twice
with "direct" type.


Hmm, that sounds a bit off - if we have overlapping regions within the 
same domain, then we need to do some additional pre-processing to adjust 
them anyway, since any part of a relaxable region which overlaps a 
non-relaxable region cannot actually be relaxed, and so really should 
never be described as such.



This would unblock Shameer's series:
[PATCH v6 0/7] vfio/type1: Add support for valid iova list management
https://patchwork.kernel.org/patch/10425309/

which failed to get pulled for 4.18 merge window due to IGD
device assignment regression.

v2 -> v3:
- fix direct type check
---
  drivers/iommu/iommu.c | 12 +++-
  include/linux/iommu.h |  6 ++
  2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ae4ea5c0e6f9..28c3d6351832 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -73,10 +73,11 @@ struct iommu_group_attribute {
  };
  
  static const char * const iommu_group_resv_type_string[] = {

-   [IOMMU_RESV_DIRECT] = "direct",
-   [IOMMU_RESV_RESERVED]   = "reserved",
-   [IOMMU_RESV_MSI]= "msi",
-   [IOMMU_RESV_SW_MSI] = "msi",
+   [IOMMU_RESV_DIRECT] = "direct",
+   [IOMMU_RESV_DIRECT_RELAXABLE]   = "direct",


Wasn't part of the idea that we might not need to expose these to 
userspace at all if they're only relevant to default domains, and not to 
VFIO domains or anything else userspace can get involved with?



+   [IOMMU_RESV_RESERVED]   = "reserved",
+   [IOMMU_RESV_MSI]= "msi",
+   [IOMMU_RESV_SW_MSI] = "msi",
  };
  
  #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)		\

@@ -573,7 +574,8 @@ static int iommu_group_create_direct_mappings(struct 
iommu_group *group,
start = ALIGN(entry->start, pg_size);
end   = ALIGN(entry->start + entry->length, pg_size);
  
-		if (entry->type != IOMMU_RESV_DIRECT)

+   if (entry->type != IOMMU_RESV_DIRECT &&
+   entry->type != IOMMU_RESV_DIRECT_RELAXABLE)
continue;
  
  		for (addr = start; addr < end; addr += pg_size) {

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index ba91666998fb..14a521f85f14 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -135,6 +135,12 @@ enum iommu_attr {
  enum iommu_resv_type {
/* Memory regions which must be mapped 1:1 at all times */
IOMMU_RESV_DIRECT,
+   /*
+* Memory regions which are advertised to be 1:1 but are
+* commonly considered relaxable in some conditions,
+* for instance in device assignment use case (USB, Graphics)
+*/
+   IOMMU_RESV_DIRECT_RELAXABLE,


What do you think of s/RELAXABLE/BOOT/ ? My understanding is that these 
regions are only considered relevant until Linux has taken full control 
of the endpoint, and having a slightly more well-defined scope than 
"some conditions" might be nice.


There's an open question of how to handle things like simple-fb and EFI 
framebuffer handover on 

Re: [PATCH v3 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions

2019-05-16 Thread Jean-Philippe Brucker
On 16/05/2019 12:45, Auger Eric wrote:
> Hi Jean-Philippe,
> 
> On 5/16/19 1:16 PM, Jean-Philippe Brucker wrote:
>> On 16/05/2019 11:08, Eric Auger wrote:
>>> Note: At the moment the sysfs ABI is not changed. However I wonder
>>> whether it wouldn't be preferable to report the direct region as
>>> "direct_relaxed" there. At the moment, in case the same direct
>>> region is used by 2 devices, one USB/GFX and another not belonging
>>> to the previous categories, the direct region will be output twice
>>> with "direct" type.
>>>
>>> This would unblock Shameer's series:
>>> [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
>>> https://patchwork.kernel.org/patch/10425309/
>>
>> Thanks for doing this!
>>
>>> which failed to get pulled for 4.18 merge window due to IGD
>>> device assignment regression.
>>>
>>> v2 -> v3:
>>> - fix direct type check
>>> ---
>>>  drivers/iommu/iommu.c | 12 +++-
>>>  include/linux/iommu.h |  6 ++
>>>  2 files changed, 13 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>>> index ae4ea5c0e6f9..28c3d6351832 100644
>>> --- a/drivers/iommu/iommu.c
>>> +++ b/drivers/iommu/iommu.c
>>> @@ -73,10 +73,11 @@ struct iommu_group_attribute {
>>>  };
>>>  
>>>  static const char * const iommu_group_resv_type_string[] = {
>>> -   [IOMMU_RESV_DIRECT] = "direct",
>>> -   [IOMMU_RESV_RESERVED]   = "reserved",
>>> -   [IOMMU_RESV_MSI]= "msi",
>>> -   [IOMMU_RESV_SW_MSI] = "msi",
>>> +   [IOMMU_RESV_DIRECT] = "direct",
>>> +   [IOMMU_RESV_DIRECT_RELAXABLE]   = "direct",
>>> +   [IOMMU_RESV_RESERVED]   = "reserved",
>>> +   [IOMMU_RESV_MSI]= "msi",
>>> +   [IOMMU_RESV_SW_MSI] = "msi",
>>>  };
>>>  
>>>  #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)  \
>>> @@ -573,7 +574,8 @@ static int iommu_group_create_direct_mappings(struct 
>>> iommu_group *group,
>>> start = ALIGN(entry->start, pg_size);
>>> end   = ALIGN(entry->start + entry->length, pg_size);
>>>  
>>> -   if (entry->type != IOMMU_RESV_DIRECT)
>>> +   if (entry->type != IOMMU_RESV_DIRECT &&
>>> +   entry->type != IOMMU_RESV_DIRECT_RELAXABLE)
>>
>> I'm trying to understand why you need to create direct mappings at all
>> for these relaxable regions. In the host the region is needed for legacy
>> device features, which are disabled (and cannot be re-enabled) when
>> assigning the device to a guest?
> This follows Kevin's comment in the thread below:
> https://patchwork.kernel.org/patch/10449103/#21957279
> 
> In normal DMA API host path, those regions need to be 1-1 mapped. They
> are likely to be accessed by the driver or FW at early boot phase or
> even during execution, depending on features being used.
> 
> That's the reason, according to Kevin we couldn't hide them.
> 
> We just know that, in general, they are not used anymore when assigning
> the device or if accesses are attempted this generally does not block
> the assignment use case. For example, it is said in
> https://github.com/qemu/qemu/blob/master/docs/igd-assign.txt that in
> legacy IGD assignment use case, there may be "a small numbers of DMAR
> faults when initially assigned".

Hmm, fair enough. That doesn't sound too good, if the device might
perform arbitrary writes into guest memory once new IOMMU mappings are
in place. I was wondering if we could report some IOVA ranges as
"available but avoid if possible". If the guest has a vIOMMU, they are
easy to avoid. But I doubt they would ever get used, since probably no
one is going to instantiate a vIOMMU for a graphics device in legacy mode.

Thanks,
Jean


Re: Linux crash when using FTDI FT232R USB to serial converter on AMD boards.

2019-05-16 Thread starost...@gmail.com

Dne 16.5.2019 v 10:34 Oliver Neukum napsal(a):

Mention

1. kernel version
2. whether this is known to be a regression

Sorry for question, can you more specify what you mean?


3. include the log with iommu disabled and mention that you disabled
the IOMMU
4. Include the output of lsusb

HTH
Oliver

Thank you
starosta
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions

2019-05-16 Thread Auger Eric
Hi Jean-Philippe,

On 5/16/19 1:16 PM, Jean-Philippe Brucker wrote:
> On 16/05/2019 11:08, Eric Auger wrote:
>> Note: At the moment the sysfs ABI is not changed. However I wonder
>> whether it wouldn't be preferable to report the direct region as
>> "direct_relaxed" there. At the moment, in case the same direct
>> region is used by 2 devices, one USB/GFX and another not belonging
>> to the previous categories, the direct region will be output twice
>> with "direct" type.
>>
>> This would unblock Shameer's series:
>> [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
>> https://patchwork.kernel.org/patch/10425309/
> 
> Thanks for doing this!
> 
>> which failed to get pulled for 4.18 merge window due to IGD
>> device assignment regression.
>>
>> v2 -> v3:
>> - fix direct type check
>> ---
>>  drivers/iommu/iommu.c | 12 +++-
>>  include/linux/iommu.h |  6 ++
>>  2 files changed, 13 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index ae4ea5c0e6f9..28c3d6351832 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -73,10 +73,11 @@ struct iommu_group_attribute {
>>  };
>>  
>>  static const char * const iommu_group_resv_type_string[] = {
>> -[IOMMU_RESV_DIRECT] = "direct",
>> -[IOMMU_RESV_RESERVED]   = "reserved",
>> -[IOMMU_RESV_MSI]= "msi",
>> -[IOMMU_RESV_SW_MSI] = "msi",
>> +[IOMMU_RESV_DIRECT] = "direct",
>> +[IOMMU_RESV_DIRECT_RELAXABLE]   = "direct",
>> +[IOMMU_RESV_RESERVED]   = "reserved",
>> +[IOMMU_RESV_MSI]= "msi",
>> +[IOMMU_RESV_SW_MSI] = "msi",
>>  };
>>  
>>  #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)   \
>> @@ -573,7 +574,8 @@ static int iommu_group_create_direct_mappings(struct 
>> iommu_group *group,
>>  start = ALIGN(entry->start, pg_size);
>>  end   = ALIGN(entry->start + entry->length, pg_size);
>>  
>> -if (entry->type != IOMMU_RESV_DIRECT)
>> +if (entry->type != IOMMU_RESV_DIRECT &&
>> +entry->type != IOMMU_RESV_DIRECT_RELAXABLE)
> 
> I'm trying to understand why you need to create direct mappings at all
> for these relaxable regions. In the host the region is needed for legacy
> device features, which are disabled (and cannot be re-enabled) when
> assigning the device to a guest?
This follows Kevin's comment in the thread below:
https://patchwork.kernel.org/patch/10449103/#21957279

In normal DMA API host path, those regions need to be 1-1 mapped. They
are likely to be accessed by the driver or FW at early boot phase or
even during execution, depending on features being used.

That's the reason, according to Kevin we couldn't hide them.

We just know that, in general, they are not used anymore when assigning
the device or if accesses are attempted this generally does not block
the assignment use case. For example, it is said in
https://github.com/qemu/qemu/blob/master/docs/igd-assign.txt that in
legacy IGD assignment use case, there may be "a small numbers of DMAR
faults when initially assigned".

Thanks

Eric


> 
> Thanks,
> Jean
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions

2019-05-16 Thread Jean-Philippe Brucker
On 16/05/2019 11:08, Eric Auger wrote:
> Note: At the moment the sysfs ABI is not changed. However I wonder
> whether it wouldn't be preferable to report the direct region as
> "direct_relaxed" there. At the moment, in case the same direct
> region is used by 2 devices, one USB/GFX and another not belonging
> to the previous categories, the direct region will be output twice
> with "direct" type.
> 
> This would unblock Shameer's series:
> [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
> https://patchwork.kernel.org/patch/10425309/

Thanks for doing this!

> which failed to get pulled for 4.18 merge window due to IGD
> device assignment regression.
> 
> v2 -> v3:
> - fix direct type check
> ---
>  drivers/iommu/iommu.c | 12 +++-
>  include/linux/iommu.h |  6 ++
>  2 files changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index ae4ea5c0e6f9..28c3d6351832 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -73,10 +73,11 @@ struct iommu_group_attribute {
>  };
>  
>  static const char * const iommu_group_resv_type_string[] = {
> - [IOMMU_RESV_DIRECT] = "direct",
> - [IOMMU_RESV_RESERVED]   = "reserved",
> - [IOMMU_RESV_MSI]= "msi",
> - [IOMMU_RESV_SW_MSI] = "msi",
> + [IOMMU_RESV_DIRECT] = "direct",
> + [IOMMU_RESV_DIRECT_RELAXABLE]   = "direct",
> + [IOMMU_RESV_RESERVED]   = "reserved",
> + [IOMMU_RESV_MSI]= "msi",
> + [IOMMU_RESV_SW_MSI] = "msi",
>  };
>  
>  #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)\
> @@ -573,7 +574,8 @@ static int iommu_group_create_direct_mappings(struct 
> iommu_group *group,
>   start = ALIGN(entry->start, pg_size);
>   end   = ALIGN(entry->start + entry->length, pg_size);
>  
> - if (entry->type != IOMMU_RESV_DIRECT)
> + if (entry->type != IOMMU_RESV_DIRECT &&
> + entry->type != IOMMU_RESV_DIRECT_RELAXABLE)

I'm trying to understand why you need to create direct mappings at all
for these relaxable regions. In the host the region is needed for legacy
device features, which are disabled (and cannot be re-enabled) when
assigning the device to a guest?

Thanks,
Jean



[PATCH v3 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions

2019-05-16 Thread Eric Auger
Introduce a new type for reserved region. This corresponds
to directly mapped regions which are known to be relaxable
in some specific conditions, such as device assignment use
case. Well known examples are those used by USB controllers
providing PS/2 keyboard emulation for pre-boot BIOS and
early BOOT or RMRRs associated to IGD working in legacy mode.

Since commit c875d2c1b808 ("iommu/vt-d: Exclude devices using RMRRs
from IOMMU API domains") and commit 18436afdc11a ("iommu/vt-d: Allow
RMRR on graphics devices too"), those regions are currently
considered "safe" with respect to device assignment use case
which requires a non direct mapping at IOMMU physical level
(RAM GPA -> HPA mapping).

Those RMRRs currently exist and sometimes the device is
attempting to access it but this has not been considered
an issue until now.

However at the moment, iommu_get_group_resv_regions() is
not able to make any difference between directly mapped
regions: those which must be absolutely enforced and those
like above ones which are known as relaxable.

This is a blocker for reporting severe conflicts between
non relaxable RMRRs (like MSI doorbells) and guest GPA space.

With this new reserved region type we will be able to use
iommu_get_group_resv_regions() to enumerate the IOVA space
that is usable through the IOMMU API without introducing
regressions with respect to existing device assignment
use cases (USB and IGD).

Signed-off-by: Eric Auger 

---

Note: At the moment the sysfs ABI is not changed. However I wonder
whether it wouldn't be preferable to report the direct region as
"direct_relaxed" there. At the moment, in case the same direct
region is used by 2 devices, one USB/GFX and another not belonging
to the previous categories, the direct region will be output twice
with "direct" type.

This would unblock Shameer's series:
[PATCH v6 0/7] vfio/type1: Add support for valid iova list management
https://patchwork.kernel.org/patch/10425309/

which failed to get pulled for 4.18 merge window due to IGD
device assignment regression.

v2 -> v3:
- fix direct type check
---
 drivers/iommu/iommu.c | 12 +++-
 include/linux/iommu.h |  6 ++
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ae4ea5c0e6f9..28c3d6351832 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -73,10 +73,11 @@ struct iommu_group_attribute {
 };
 
 static const char * const iommu_group_resv_type_string[] = {
-   [IOMMU_RESV_DIRECT] = "direct",
-   [IOMMU_RESV_RESERVED]   = "reserved",
-   [IOMMU_RESV_MSI]= "msi",
-   [IOMMU_RESV_SW_MSI] = "msi",
+   [IOMMU_RESV_DIRECT] = "direct",
+   [IOMMU_RESV_DIRECT_RELAXABLE]   = "direct",
+   [IOMMU_RESV_RESERVED]   = "reserved",
+   [IOMMU_RESV_MSI]= "msi",
+   [IOMMU_RESV_SW_MSI] = "msi",
 };
 
 #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)  \
@@ -573,7 +574,8 @@ static int iommu_group_create_direct_mappings(struct 
iommu_group *group,
start = ALIGN(entry->start, pg_size);
end   = ALIGN(entry->start + entry->length, pg_size);
 
-   if (entry->type != IOMMU_RESV_DIRECT)
+   if (entry->type != IOMMU_RESV_DIRECT &&
+   entry->type != IOMMU_RESV_DIRECT_RELAXABLE)
continue;
 
for (addr = start; addr < end; addr += pg_size) {
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index ba91666998fb..14a521f85f14 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -135,6 +135,12 @@ enum iommu_attr {
 enum iommu_resv_type {
/* Memory regions which must be mapped 1:1 at all times */
IOMMU_RESV_DIRECT,
+   /*
+* Memory regions which are advertised to be 1:1 but are
+* commonly considered relaxable in some conditions,
+* for instance in device assignment use case (USB, Graphics)
+*/
+   IOMMU_RESV_DIRECT_RELAXABLE,
/* Arbitrary "never map this or give it to a device" address ranges */
IOMMU_RESV_RESERVED,
/* Hardware MSI region (untranslated) */
-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 7/7] iommu/vt-d: Differentiate relaxable and non relaxable RMRRs

2019-05-16 Thread Eric Auger
Now we have a new IOMMU_RESV_DIRECT_RELAXABLE reserved memory
region type, let's report USB and GFX RMRRs as relaxable ones.

This allows to have a finer reporting at IOMMU API level of
reserved memory regions. This will be exploitable by VFIO to
define the usable IOVA range and detect potential conflicts
between the guest physical address space and host reserved
regions.

Signed-off-by: Eric Auger 
---
 drivers/iommu/intel-iommu.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a36604f4900f..af1d65fdedfc 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5493,7 +5493,9 @@ static void intel_iommu_get_resv_regions(struct device 
*device,
for_each_rmrr_units(rmrr) {
for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
  i, i_dev) {
+   struct pci_dev *pdev = to_pci_dev(device);
struct iommu_resv_region *resv;
+   enum iommu_resv_type type;
size_t length;
 
if (i_dev != device &&
@@ -5501,9 +5503,13 @@ static void intel_iommu_get_resv_regions(struct device 
*device,
continue;
 
length = rmrr->end_address - rmrr->base_address + 1;
+
+   type = (pdev &&
+   (IS_USB_DEVICE(pdev) || IS_GFX_DEVICE(pdev))) ?
+   IOMMU_RESV_DIRECT_RELAXABLE : IOMMU_RESV_DIRECT;
+
resv = iommu_alloc_resv_region(rmrr->base_address,
-  length, prot,
-  IOMMU_RESV_DIRECT,
+  length, prot, type,
   GFP_ATOMIC);
if (!resv)
break;
-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 1/7] iommu: Pass a GFP flag parameter to iommu_alloc_resv_region()

2019-05-16 Thread Eric Auger
We plan to call iommu_alloc_resv_region in a non preemptible section.
Pass a GFP flag to the function and update all the call sites.

Signed-off-by: Eric Auger 
---
 drivers/acpi/arm64/iort.c   | 3 ++-
 drivers/iommu/amd_iommu.c   | 7 ---
 drivers/iommu/arm-smmu-v3.c | 2 +-
 drivers/iommu/arm-smmu.c| 2 +-
 drivers/iommu/intel-iommu.c | 4 ++--
 drivers/iommu/iommu.c   | 7 ---
 include/linux/iommu.h   | 2 +-
 7 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 9058cb084b91..8313b2d1b710 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -868,7 +868,8 @@ int iort_iommu_msi_get_resv_regions(struct device *dev, 
struct list_head *head)
struct iommu_resv_region *region;
 
region = iommu_alloc_resv_region(base + SZ_64K, SZ_64K,
-prot, IOMMU_RESV_MSI);
+prot, IOMMU_RESV_MSI,
+GFP_KERNEL);
if (region) {
list_add_tail(>list, head);
resv++;
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 09c9e45f7fa2..f2eb8e9cd8a6 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3136,7 +3136,8 @@ static void amd_iommu_get_resv_regions(struct device *dev,
type = IOMMU_RESV_RESERVED;
 
region = iommu_alloc_resv_region(entry->address_start,
-length, prot, type);
+length, prot, type,
+GFP_KERNEL);
if (!region) {
dev_err(dev, "Out of memory allocating dm-regions\n");
return;
@@ -3146,14 +3147,14 @@ static void amd_iommu_get_resv_regions(struct device 
*dev,
 
region = iommu_alloc_resv_region(MSI_RANGE_START,
 MSI_RANGE_END - MSI_RANGE_START + 1,
-0, IOMMU_RESV_MSI);
+0, IOMMU_RESV_MSI, GFP_KERNEL);
if (!region)
return;
list_add_tail(>list, head);
 
region = iommu_alloc_resv_region(HT_RANGE_START,
 HT_RANGE_END - HT_RANGE_START + 1,
-0, IOMMU_RESV_RESERVED);
+0, IOMMU_RESV_RESERVED, GFP_KERNEL);
if (!region)
return;
list_add_tail(>list, head);
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 4d5a694f02c2..f9b1279ef5bf 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2226,7 +2226,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
-prot, IOMMU_RESV_SW_MSI);
+prot, IOMMU_RESV_SW_MSI, GFP_KERNEL);
if (!region)
return;
 
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 5e54cc0a28b3..646e76813e91 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1670,7 +1670,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
-prot, IOMMU_RESV_SW_MSI);
+prot, IOMMU_RESV_SW_MSI, GFP_KERNEL);
if (!region)
return;
 
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a209199f3af6..2be36dff189a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4220,7 +4220,7 @@ int __init dmar_parse_one_rmrr(struct acpi_dmar_header 
*header, void *arg)
 
length = rmrr->end_address - rmrr->base_address + 1;
rmrru->resv = iommu_alloc_resv_region(rmrr->base_address, length, prot,
- IOMMU_RESV_DIRECT);
+ IOMMU_RESV_DIRECT, GFP_KERNEL);
if (!rmrru->resv)
goto free_rmrru;
 
@@ -5489,7 +5489,7 @@ static void intel_iommu_get_resv_regions(struct device 
*device,
 
reg = iommu_alloc_resv_region(IOAPIC_RANGE_START,
  IOAPIC_RANGE_END - IOAPIC_RANGE_START + 1,
- 0, IOMMU_RESV_MSI);
+ 0, IOMMU_RESV_MSI, GFP_KERNEL);
if (!reg)
return;
list_add_tail(>list, 

[PATCH v3 5/7] iommu/vt-d: Handle PCI bridge RMRR device scopes in intel_iommu_get_resv_regions

2019-05-16 Thread Eric Auger
In the case the RMRR device scope is a PCI-PCI bridge, let's check
the device belongs to the PCI sub-hierarchy.

Fixes: 0659b8dc45a6 ("iommu/vt-d: Implement reserved region get/put callbacks")

Signed-off-by: Eric Auger 
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 7ed820e79313..a36604f4900f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5496,7 +5496,8 @@ static void intel_iommu_get_resv_regions(struct device 
*device,
struct iommu_resv_region *resv;
size_t length;
 
-   if (i_dev != device)
+   if (i_dev != device &&
+   !is_downstream_to_pci_bridge(device, i_dev))
continue;
 
length = rmrr->end_address - rmrr->base_address + 1;
-- 
2.20.1



[PATCH v3 4/7] iommu/vt-d: Handle RMRR with PCI bridge device scopes

2019-05-16 Thread Eric Auger
When reading the vtd specification and especially the
Reserved Memory Region Reporting Structure chapter,
it is not obvious a device scope element cannot be a
PCI-PCI bridge, in which case all downstream ports are
likely to access the reserved memory region. Let's handle
this case in device_has_rmrr.

Fixes: ea2447f700ca ("intel-iommu: Prevent devices with RMRRs from being placed 
into SI Domain")

Signed-off-by: Eric Auger 

---

v1 -> v2:
- is_downstream_to_pci_bridge helper introduced in a separate patch
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 15c2f9677491..7ed820e79313 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2910,7 +2910,8 @@ static bool device_has_rmrr(struct device *dev)
 */
for_each_active_dev_scope(rmrr->devices,
  rmrr->devices_cnt, i, tmp)
-   if (tmp == dev) {
+   if (tmp == dev ||
+   is_downstream_to_pci_bridge(dev, tmp)) {
rcu_read_unlock();
return true;
}
-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 2/7] iommu/vt-d: Duplicate iommu_resv_region objects per device list

2019-05-16 Thread Eric Auger
intel_iommu_get_resv_regions() aims to return the list of
reserved regions accessible by a given @device. However several
devices can access the same reserved memory region and when
building the list it is not safe to use a single iommu_resv_region
object, whose container is the RMRR. This iommu_resv_region must
be duplicated per device reserved region list.

Let's remove the struct iommu_resv_region from the RMRR unit
and allocate the iommu_resv_region directly in
intel_iommu_get_resv_regions().

Fixes: 0659b8dc45a6 ("iommu/vt-d: Implement reserved region get/put callbacks")
Signed-off-by: Eric Auger 
---
 drivers/iommu/intel-iommu.c | 31 ---
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 2be36dff189a..590a0e78d11d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -322,7 +322,6 @@ struct dmar_rmrr_unit {
u64 end_address;/* reserved end address */
struct dmar_dev_scope *devices; /* target devices */
int devices_cnt;/* target device count */
-   struct iommu_resv_region *resv; /* reserved region handle */
 };
 
 struct dmar_atsr_unit {
@@ -4205,7 +4204,6 @@ static inline void init_iommu_pm_ops(void) {}
 int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg)
 {
struct acpi_dmar_reserved_memory *rmrr;
-   int prot = DMA_PTE_READ|DMA_PTE_WRITE;
struct dmar_rmrr_unit *rmrru;
size_t length;
 
@@ -4219,22 +4217,16 @@ int __init dmar_parse_one_rmrr(struct acpi_dmar_header 
*header, void *arg)
rmrru->end_address = rmrr->end_address;
 
length = rmrr->end_address - rmrr->base_address + 1;
-   rmrru->resv = iommu_alloc_resv_region(rmrr->base_address, length, prot,
- IOMMU_RESV_DIRECT, GFP_KERNEL);
-   if (!rmrru->resv)
-   goto free_rmrru;
 
rmrru->devices = dmar_alloc_dev_scope((void *)(rmrr + 1),
((void *)rmrr) + rmrr->header.length,
>devices_cnt);
if (rmrru->devices_cnt && rmrru->devices == NULL)
-   goto free_all;
+   goto free_rmrru;
 
list_add(>list, _rmrr_units);
 
return 0;
-free_all:
-   kfree(rmrru->resv);
 free_rmrru:
kfree(rmrru);
 out:
@@ -4452,7 +,6 @@ static void intel_iommu_free_dmars(void)
list_for_each_entry_safe(rmrru, rmrr_n, _rmrr_units, list) {
list_del(>list);
dmar_free_dev_scope(>devices, >devices_cnt);
-   kfree(rmrru->resv);
kfree(rmrru);
}
 
@@ -5470,6 +5461,7 @@ static void intel_iommu_remove_device(struct device *dev)
 static void intel_iommu_get_resv_regions(struct device *device,
 struct list_head *head)
 {
+   int prot = DMA_PTE_READ|DMA_PTE_WRITE;
struct iommu_resv_region *reg;
struct dmar_rmrr_unit *rmrr;
struct device *i_dev;
@@ -5479,10 +5471,21 @@ static void intel_iommu_get_resv_regions(struct device 
*device,
for_each_rmrr_units(rmrr) {
for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
  i, i_dev) {
+   struct iommu_resv_region *resv;
+   size_t length;
+
if (i_dev != device)
continue;
 
-   list_add_tail(>resv->list, head);
+   length = rmrr->end_address - rmrr->base_address + 1;
+   resv = iommu_alloc_resv_region(rmrr->base_address,
+  length, prot,
+  IOMMU_RESV_DIRECT,
+  GFP_ATOMIC);
+   if (!resv)
+   break;
+
+   list_add_tail(>list, head);
}
}
rcu_read_unlock();
@@ -5500,10 +5503,8 @@ static void intel_iommu_put_resv_regions(struct device 
*dev,
 {
struct iommu_resv_region *entry, *next;
 
-   list_for_each_entry_safe(entry, next, head, list) {
-   if (entry->type == IOMMU_RESV_MSI)
-   kfree(entry);
-   }
+   list_for_each_entry_safe(entry, next, head, list)
+   kfree(entry);
 }
 
 int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct device *dev)
-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 3/7] iommu/vt-d: Introduce is_downstream_to_pci_bridge helper

2019-05-16 Thread Eric Auger
Several call sites are about to check whether a device belongs
to the PCI sub-hierarchy of a candidate PCI-PCI bridge.
Introduce an helper to perform that check.

Signed-off-by: Eric Auger 
---
 drivers/iommu/intel-iommu.c | 37 +
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 590a0e78d11d..15c2f9677491 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -736,12 +736,39 @@ static int iommu_dummy(struct device *dev)
return dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO;
 }
 
+/* is_downstream_to_pci_bridge - test if a device belongs to the
+ * PCI sub-hierarchy of a candidate PCI-PCI bridge
+ *
+ * @dev: candidate PCI device belonging to @bridge PCI sub-hierarchy
+ * @bridge: the candidate PCI-PCI bridge
+ *
+ * Return: true if @dev belongs to @bridge PCI sub-hierarchy
+ */
+static bool
+is_downstream_to_pci_bridge(struct device *dev, struct device *bridge)
+{
+   struct pci_dev *pdev, *pbridge;
+
+   if (!dev_is_pci(dev) || !dev_is_pci(bridge))
+   return false;
+
+   pdev = to_pci_dev(dev);
+   pbridge = to_pci_dev(bridge);
+
+   if (pbridge->subordinate &&
+   pbridge->subordinate->number <= pdev->bus->number &&
+   pbridge->subordinate->busn_res.end >= pdev->bus->number)
+   return true;
+
+   return false;
+}
+
 static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 
*devfn)
 {
struct dmar_drhd_unit *drhd = NULL;
struct intel_iommu *iommu;
struct device *tmp;
-   struct pci_dev *ptmp, *pdev = NULL;
+   struct pci_dev *pdev = NULL;
u16 segment = 0;
int i;
 
@@ -787,13 +814,7 @@ static struct intel_iommu *device_to_iommu(struct device 
*dev, u8 *bus, u8 *devf
goto out;
}
 
-   if (!pdev || !dev_is_pci(tmp))
-   continue;
-
-   ptmp = to_pci_dev(tmp);
-   if (ptmp->subordinate &&
-   ptmp->subordinate->number <= pdev->bus->number &&
-   ptmp->subordinate->busn_res.end >= 
pdev->bus->number)
+   if (is_downstream_to_pci_bridge(dev, tmp))
goto got_pdev;
}
 
-- 
2.20.1



[PATCH v3 0/7] RMRR related fixes and enhancements

2019-05-16 Thread Eric Auger
Currently the Intel reserved region is attached to the
RMRR unit and when building the list of RMRR seen by a device
we link this unique reserved region without taking care of
potential multiple usage of this reserved region by several devices.

Also while reading the vtd spec it is unclear to me whether
the RMRR device scope referenced by an RMRR ACPI struct could
be a PCI-PCI bridge, in which case I think we also need to
check the device belongs to the PCI sub-hierarchy of the device
referenced in the scope. This would be true for device_has_rmrr()
and intel_iommu_get_resv_regions().

Last, the VFIO subsystem would need to compute the usable IOVA range
by querying the iommu_get_group_resv_regions() API. This would allow,
for instance, to report potential conflicts between the guest physical
address space and host reserved regions.
  
However iommu_get_group_resv_regions() currently fails to differentiate
RMRRs that are known safe for device assignment and RMRRs that must be
enforced. So we introduce a new reserved memory region type (relaxable),
reported when associated to an USB or GFX device. The last 2 patches aim
at unblocking [1] which is stuck since 4.18.

[1-5] are fixes
[6-7] are enhancements

The two parts can be considered separately if needed.

References:
[1] [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
https://patchwork.kernel.org/patch/10425309/

Branch: This series is available at:
https://github.com/eauger/linux/tree/v5.1-rmrr-v3

History:

v2 -> v3:
s/||/&& in iommu_group_create_direct_mappings

v1 -> v2:
- introduce is_downstream_to_pci_bridge() in a separate patch, change param
  names and add kerneldoc comment
- add 6,7

Eric Auger (7):
  iommu: Pass a GFP flag parameter to iommu_alloc_resv_region()
  iommu/vt-d: Duplicate iommu_resv_region objects per device list
  iommu/vt-d: Introduce is_downstream_to_pci_bridge helper
  iommu/vt-d: Handle RMRR with PCI bridge device scopes
  iommu/vt-d: Handle PCI bridge RMRR device scopes in
intel_iommu_get_resv_regions
  iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions
  iommu/vt-d: Differentiate relaxable and non relaxable RMRRs

 drivers/acpi/arm64/iort.c   |  3 +-
 drivers/iommu/amd_iommu.c   |  7 ++--
 drivers/iommu/arm-smmu-v3.c |  2 +-
 drivers/iommu/arm-smmu.c|  2 +-
 drivers/iommu/intel-iommu.c | 82 +
 drivers/iommu/iommu.c   | 19 +
 include/linux/iommu.h   |  8 +++-
 7 files changed, 82 insertions(+), 41 deletions(-)

-- 
2.20.1



Re: [PATCH v2 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions

2019-05-16 Thread Auger Eric
Hi

On 5/16/19 10:47 AM, Eric Auger wrote:
> Introduce a new type for reserved region. This corresponds
> to directly mapped regions which are known to be relaxable
> in some specific conditions, such as device assignment use
> case. Well known examples are those used by USB controllers
> providing PS/2 keyboard emulation for pre-boot BIOS and
> early BOOT or RMRRs associated to IGD working in legacy mode.
> 
> Since commit c875d2c1b808 ("iommu/vt-d: Exclude devices using RMRRs
> from IOMMU API domains") and commit 18436afdc11a ("iommu/vt-d: Allow
> RMRR on graphics devices too"), those regions are currently
> considered "safe" with respect to device assignment use case
> which requires a non direct mapping at IOMMU physical level
> (RAM GPA -> HPA mapping).
> 
> Those RMRRs currently exist and sometimes the device is
> attempting to access it but this has not been considered
> an issue until now.
> 
> However at the moment, iommu_get_group_resv_regions() is
> not able to make any difference between directly mapped
> regions: those which must be absolutely enforced and those
> like above ones which are known as relaxable.
> 
> This is a blocker for reporting severe conflicts between
> non relaxable RMRRs (like MSI doorbells) and guest GPA space.
> 
> With this new reserved region type we will be able to use
> iommu_get_group_resv_regions() to enumerate the IOVA space
> that is usable through the IOMMU API without introducing
> regressions with respect to existing device assignment
> use cases (USB and IGD).
> 
> Signed-off-by: Eric Auger 
> 
> ---
> 
> Note: At the moment the sysfs ABI is not changed. However I wonder
> whether it wouldn't be preferable to report the direct region as
> "direct_relaxed" there. At the moment, in case the same direct
> region is used by 2 devices, one USB/GFX and another not belonging
> to the previous categories, the direct region will be output twice
> with "direct" type.
> 
> This would unblock Shameer's series:
> [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
> https://patchwork.kernel.org/patch/10425309/
> 
> which failed to get pulled for 4.18 merge window due to IGD
> device assignment regression.
> ---
>  drivers/iommu/iommu.c | 12 +++-
>  include/linux/iommu.h |  6 ++
>  2 files changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index ae4ea5c0e6f9..84dcb6af6511 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -73,10 +73,11 @@ struct iommu_group_attribute {
>  };
>  
>  static const char * const iommu_group_resv_type_string[] = {
> - [IOMMU_RESV_DIRECT] = "direct",
> - [IOMMU_RESV_RESERVED]   = "reserved",
> - [IOMMU_RESV_MSI]= "msi",
> - [IOMMU_RESV_SW_MSI] = "msi",
> + [IOMMU_RESV_DIRECT] = "direct",
> + [IOMMU_RESV_DIRECT_RELAXABLE]   = "direct",
> + [IOMMU_RESV_RESERVED]   = "reserved",
> + [IOMMU_RESV_MSI]= "msi",
> + [IOMMU_RESV_SW_MSI] = "msi",
>  };
>  
>  #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)\
> @@ -573,7 +574,8 @@ static int iommu_group_create_direct_mappings(struct 
> iommu_group *group,
>   start = ALIGN(entry->start, pg_size);
>   end   = ALIGN(entry->start + entry->length, pg_size);
>  
> - if (entry->type != IOMMU_RESV_DIRECT)
> + if (entry->type != IOMMU_RESV_DIRECT ||
It must be a "&&" and not "||"

Respinning ...

Eric

> + entry->type != IOMMU_RESV_DIRECT_RELAXABLE)
>   continue;
>  
>   for (addr = start; addr < end; addr += pg_size) {
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index ba91666998fb..14a521f85f14 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -135,6 +135,12 @@ enum iommu_attr {
>  enum iommu_resv_type {
>   /* Memory regions which must be mapped 1:1 at all times */
>   IOMMU_RESV_DIRECT,
> + /*
> +  * Memory regions which are advertised to be 1:1 but are
> +  * commonly considered relaxable in some conditions,
> +  * for instance in device assignment use case (USB, Graphics)
> +  */
> + IOMMU_RESV_DIRECT_RELAXABLE,
>   /* Arbitrary "never map this or give it to a device" address ranges */
>   IOMMU_RESV_RESERVED,
>   /* Hardware MSI region (untranslated) */
> 


[PATCH v5 1/1] iommu/io-pgtable-arm: Add support to use system cache

2019-05-16 Thread Vivek Gautam
Few Qualcomm platforms such as, sdm845 have an additional outer
cache called as System cache, aka. Last level cache (LLC) that
allows non-coherent devices to upgrade to using caching.
This cache sits right before the DDR, and is tightly coupled
with the memory controller. The clients using this cache request
their slices from this system cache, make it active, and can then
start using it.

There is a fundamental assumption that non-coherent devices can't
access caches. This change adds an exception where they *can* use
some level of cache despite still being non-coherent overall.
The coherent devices that use cacheable memory, and CPU make use of
this system cache by default.

Looking at memory types, we have following -
a) Normal uncached :- MAIR 0x44, inner non-cacheable,
  outer non-cacheable;
b) Normal cached :-   MAIR 0xff, inner read write-back non-transient,
  outer read write-back non-transient;
  attribute setting for coherenet I/O devices.
and, for non-coherent i/o devices that can allocate in system cache
another type gets added -
c) Normal sys-cached :- MAIR 0xf4, inner non-cacheable,
outer read write-back non-transient

Coherent I/O devices use system cache by marking the memory as
normal cached.
Non-coherent I/O devices should mark the memory as normal
sys-cached in page tables to use system cache.

Signed-off-by: Vivek Gautam 
---

V3 version of this patch and related series can be found at [1].
V4 of this patch is available at [2].

The example usage of how a smmu master can make use of this protection
flag and set the correct memory attributes to start using system cache,
can be found at [3]; and here at [3] IOMMU_UPSTREAM_HINT is same as
IOMMU_QCOM_SYS_CACHE.

Changes since v4:
 - Changed ARM_LPAE_MAIR_ATTR_QCOM_SYS_CACHE to
   ARM_LPAE_MAIR_ATTR_INC_OWBRWA.
 - Changed ARM_LPAE_MAIR_ATTR_IDX_QCOM_SYS_CACHE to
   ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE.
 - Added comments to iommu protection flag - IOMMU_QCOM_SYS_CACHE.

Changes since v3:
 - Dropping support to cache i/o page tables to system cache. Getting support
   for data buffers is the first step.
   Removed io-pgtable quirk and related change to add domain attribute.

Glmark2 numbers on SDM845 based cheza board:

S.No.|  with LLC support   |without LLC support
 |  for data buffers   |
--- 
1|  4480; 72.3fps  |4042; 65.2fps
2|  4500; 72.6fps  |4039; 65.1fps
3|  4523; 72.9fps  |4106; 66.2fps
4|  4489; 72.4fps  |4104; 66.2fps
5|  4518; 72.9fps  |4072; 65.7fps

[1] https://patchwork.kernel.org/cover/10772629/
[2] https://lore.kernel.org/patchwork/patch/1072936/
[3] https://patchwork.kernel.org/patch/10302791/

 drivers/iommu/io-pgtable-arm.c | 9 -
 include/linux/iommu.h  | 6 ++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 4e21efbc4459..2454ac11aa97 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -167,10 +167,12 @@
 #define ARM_LPAE_MAIR_ATTR_MASK0xff
 #define ARM_LPAE_MAIR_ATTR_DEVICE  0x04
 #define ARM_LPAE_MAIR_ATTR_NC  0x44
+#define ARM_LPAE_MAIR_ATTR_INC_OWBRWA  0xf4
 #define ARM_LPAE_MAIR_ATTR_WBRWA   0xff
 #define ARM_LPAE_MAIR_ATTR_IDX_NC  0
 #define ARM_LPAE_MAIR_ATTR_IDX_CACHE   1
 #define ARM_LPAE_MAIR_ATTR_IDX_DEV 2
+#define ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE  3
 
 #define ARM_MALI_LPAE_TTBR_ADRMODE_TABLE (3u << 0)
 #define ARM_MALI_LPAE_TTBR_READ_INNER  BIT(2)
@@ -470,6 +472,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct 
arm_lpae_io_pgtable *data,
else if (prot & IOMMU_CACHE)
pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
+   else if (prot & IOMMU_QCOM_SYS_CACHE)
+   pte |= (ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE
+   << ARM_LPAE_PTE_ATTRINDX_SHIFT);
}
 
if (prot & IOMMU_NOEXEC)
@@ -857,7 +862,9 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, 
void *cookie)
  (ARM_LPAE_MAIR_ATTR_WBRWA
   << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_CACHE)) |
  (ARM_LPAE_MAIR_ATTR_DEVICE
-  << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_DEV));
+  << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_DEV)) |
+ (ARM_LPAE_MAIR_ATTR_INC_OWBRWA
+  << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE));
 
cfg->arm_lpae_s1_cfg.mair[0] = reg;
cfg->arm_lpae_s1_cfg.mair[1] = 0;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index a815cf6f6f47..8ee3fbaf5855 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -41,6 +41,12 @@
  * if the IOMMU page 

[PATCH v2 7/7] iommu/vt-d: Differentiate relaxable and non relaxable RMRRs

2019-05-16 Thread Eric Auger
Now we have a new IOMMU_RESV_DIRECT_RELAXABLE reserved memory
region type, let's report USB and GFX RMRRs as relaxable ones.

This allows to have a finer reporting at IOMMU API level of
reserved memory regions. This will be exploitable by VFIO to
define the usable IOVA range and detect potential conflicts
between the guest physical address space and host reserved
regions.

Signed-off-by: Eric Auger 
---
 drivers/iommu/intel-iommu.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a36604f4900f..af1d65fdedfc 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5493,7 +5493,9 @@ static void intel_iommu_get_resv_regions(struct device 
*device,
for_each_rmrr_units(rmrr) {
for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
  i, i_dev) {
+   struct pci_dev *pdev = to_pci_dev(device);
struct iommu_resv_region *resv;
+   enum iommu_resv_type type;
size_t length;
 
if (i_dev != device &&
@@ -5501,9 +5503,13 @@ static void intel_iommu_get_resv_regions(struct device 
*device,
continue;
 
length = rmrr->end_address - rmrr->base_address + 1;
+
+   type = (pdev &&
+   (IS_USB_DEVICE(pdev) || IS_GFX_DEVICE(pdev))) ?
+   IOMMU_RESV_DIRECT_RELAXABLE : IOMMU_RESV_DIRECT;
+
resv = iommu_alloc_resv_region(rmrr->base_address,
-  length, prot,
-  IOMMU_RESV_DIRECT,
+  length, prot, type,
   GFP_ATOMIC);
if (!resv)
break;
-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions

2019-05-16 Thread Eric Auger
Introduce a new type for reserved region. This corresponds
to directly mapped regions which are known to be relaxable
in some specific conditions, such as device assignment use
case. Well known examples are those used by USB controllers
providing PS/2 keyboard emulation for pre-boot BIOS and
early BOOT or RMRRs associated to IGD working in legacy mode.

Since commit c875d2c1b808 ("iommu/vt-d: Exclude devices using RMRRs
from IOMMU API domains") and commit 18436afdc11a ("iommu/vt-d: Allow
RMRR on graphics devices too"), those regions are currently
considered "safe" with respect to device assignment use case
which requires a non direct mapping at IOMMU physical level
(RAM GPA -> HPA mapping).

Those RMRRs currently exist and sometimes the device is
attempting to access it but this has not been considered
an issue until now.

However at the moment, iommu_get_group_resv_regions() is
not able to make any difference between directly mapped
regions: those which must be absolutely enforced and those
like above ones which are known as relaxable.

This is a blocker for reporting severe conflicts between
non relaxable RMRRs (like MSI doorbells) and guest GPA space.

With this new reserved region type we will be able to use
iommu_get_group_resv_regions() to enumerate the IOVA space
that is usable through the IOMMU API without introducing
regressions with respect to existing device assignment
use cases (USB and IGD).

Signed-off-by: Eric Auger 

---

Note: At the moment the sysfs ABI is not changed. However I wonder
whether it wouldn't be preferable to report the direct region as
"direct_relaxed" there. At the moment, in case the same direct
region is used by 2 devices, one USB/GFX and another not belonging
to the previous categories, the direct region will be output twice
with "direct" type.

This would unblock Shameer's series:
[PATCH v6 0/7] vfio/type1: Add support for valid iova list management
https://patchwork.kernel.org/patch/10425309/

which failed to get pulled for 4.18 merge window due to IGD
device assignment regression.
---
 drivers/iommu/iommu.c | 12 +++-
 include/linux/iommu.h |  6 ++
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ae4ea5c0e6f9..84dcb6af6511 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -73,10 +73,11 @@ struct iommu_group_attribute {
 };
 
 static const char * const iommu_group_resv_type_string[] = {
-   [IOMMU_RESV_DIRECT] = "direct",
-   [IOMMU_RESV_RESERVED]   = "reserved",
-   [IOMMU_RESV_MSI]= "msi",
-   [IOMMU_RESV_SW_MSI] = "msi",
+   [IOMMU_RESV_DIRECT] = "direct",
+   [IOMMU_RESV_DIRECT_RELAXABLE]   = "direct",
+   [IOMMU_RESV_RESERVED]   = "reserved",
+   [IOMMU_RESV_MSI]= "msi",
+   [IOMMU_RESV_SW_MSI] = "msi",
 };
 
 #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)  \
@@ -573,7 +574,8 @@ static int iommu_group_create_direct_mappings(struct 
iommu_group *group,
start = ALIGN(entry->start, pg_size);
end   = ALIGN(entry->start + entry->length, pg_size);
 
-   if (entry->type != IOMMU_RESV_DIRECT)
+   if (entry->type != IOMMU_RESV_DIRECT ||
+   entry->type != IOMMU_RESV_DIRECT_RELAXABLE)
continue;
 
for (addr = start; addr < end; addr += pg_size) {
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index ba91666998fb..14a521f85f14 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -135,6 +135,12 @@ enum iommu_attr {
 enum iommu_resv_type {
/* Memory regions which must be mapped 1:1 at all times */
IOMMU_RESV_DIRECT,
+   /*
+* Memory regions which are advertised to be 1:1 but are
+* commonly considered relaxable in some conditions,
+* for instance in device assignment use case (USB, Graphics)
+*/
+   IOMMU_RESV_DIRECT_RELAXABLE,
/* Arbitrary "never map this or give it to a device" address ranges */
IOMMU_RESV_RESERVED,
/* Hardware MSI region (untranslated) */
-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 5/7] iommu/vt-d: Handle PCI bridge RMRR device scopes in intel_iommu_get_resv_regions

2019-05-16 Thread Eric Auger
In the case the RMRR device scope is a PCI-PCI bridge, let's check
the device belongs to the PCI sub-hierarchy.

Fixes: 0659b8dc45a6 ("iommu/vt-d: Implement reserved region get/put callbacks")

Signed-off-by: Eric Auger 
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 7ed820e79313..a36604f4900f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5496,7 +5496,8 @@ static void intel_iommu_get_resv_regions(struct device 
*device,
struct iommu_resv_region *resv;
size_t length;
 
-   if (i_dev != device)
+   if (i_dev != device &&
+   !is_downstream_to_pci_bridge(device, i_dev))
continue;
 
length = rmrr->end_address - rmrr->base_address + 1;
-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: Linux crash when using FTDI FT232R USB to serial converter on AMD boards.

2019-05-16 Thread Oliver Neukum
On Do, 2019-05-16 at 10:20 +0200,  starost...@gmail.com  wrote:
> Dne 16.5.2019 v 9:58 Oliver Neukum napsal(a):
> > > > 2.Send a new report to the corresponding mailing list
> > > 
> > > Which mailing list is correct?
> > 
> > In that case linux-...@vger.kernel.org
> > 
> > HTH
> > Oliver
> > 
> 
> Is there some rules how to send bug report? Or I can send report with 
> "my amateur description"?

Mention

1. kernel version
2. whether this is known to be a regression
3. include the log with iommu disabled and mention that you disabled
the IOMMU
4. Include the output of lsusb

HTH
Oliver

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 4/7] iommu/vt-d: Handle RMRR with PCI bridge device scopes

2019-05-16 Thread Eric Auger
When reading the vtd specification and especially the
Reserved Memory Region Reporting Structure chapter,
it is not obvious a device scope element cannot be a
PCI-PCI bridge, in which case all downstream ports are
likely to access the reserved memory region. Let's handle
this case in device_has_rmrr.

Fixes: ea2447f700ca ("intel-iommu: Prevent devices with RMRRs from being placed 
into SI Domain")

Signed-off-by: Eric Auger 

---

v1 -> v2:
- is_downstream_to_pci_bridge helper introduced in a separate patch
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 15c2f9677491..7ed820e79313 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2910,7 +2910,8 @@ static bool device_has_rmrr(struct device *dev)
 */
for_each_active_dev_scope(rmrr->devices,
  rmrr->devices_cnt, i, tmp)
-   if (tmp == dev) {
+   if (tmp == dev ||
+   is_downstream_to_pci_bridge(dev, tmp)) {
rcu_read_unlock();
return true;
}
-- 
2.20.1



[PATCH v2 1/7] iommu: Pass a GFP flag parameter to iommu_alloc_resv_region()

2019-05-16 Thread Eric Auger
We plan to call iommu_alloc_resv_region in a non preemptible section.
Pass a GFP flag to the function and update all the call sites.

Signed-off-by: Eric Auger 
---
 drivers/acpi/arm64/iort.c   | 3 ++-
 drivers/iommu/amd_iommu.c   | 7 ---
 drivers/iommu/arm-smmu-v3.c | 2 +-
 drivers/iommu/arm-smmu.c| 2 +-
 drivers/iommu/intel-iommu.c | 4 ++--
 drivers/iommu/iommu.c   | 7 ---
 include/linux/iommu.h   | 2 +-
 7 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 9058cb084b91..8313b2d1b710 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -868,7 +868,8 @@ int iort_iommu_msi_get_resv_regions(struct device *dev, 
struct list_head *head)
struct iommu_resv_region *region;
 
region = iommu_alloc_resv_region(base + SZ_64K, SZ_64K,
-prot, IOMMU_RESV_MSI);
+prot, IOMMU_RESV_MSI,
+GFP_KERNEL);
if (region) {
list_add_tail(>list, head);
resv++;
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 09c9e45f7fa2..f2eb8e9cd8a6 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3136,7 +3136,8 @@ static void amd_iommu_get_resv_regions(struct device *dev,
type = IOMMU_RESV_RESERVED;
 
region = iommu_alloc_resv_region(entry->address_start,
-length, prot, type);
+length, prot, type,
+GFP_KERNEL);
if (!region) {
dev_err(dev, "Out of memory allocating dm-regions\n");
return;
@@ -3146,14 +3147,14 @@ static void amd_iommu_get_resv_regions(struct device 
*dev,
 
region = iommu_alloc_resv_region(MSI_RANGE_START,
 MSI_RANGE_END - MSI_RANGE_START + 1,
-0, IOMMU_RESV_MSI);
+0, IOMMU_RESV_MSI, GFP_KERNEL);
if (!region)
return;
list_add_tail(>list, head);
 
region = iommu_alloc_resv_region(HT_RANGE_START,
 HT_RANGE_END - HT_RANGE_START + 1,
-0, IOMMU_RESV_RESERVED);
+0, IOMMU_RESV_RESERVED, GFP_KERNEL);
if (!region)
return;
list_add_tail(>list, head);
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 4d5a694f02c2..f9b1279ef5bf 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2226,7 +2226,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
-prot, IOMMU_RESV_SW_MSI);
+prot, IOMMU_RESV_SW_MSI, GFP_KERNEL);
if (!region)
return;
 
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 5e54cc0a28b3..646e76813e91 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1670,7 +1670,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
-prot, IOMMU_RESV_SW_MSI);
+prot, IOMMU_RESV_SW_MSI, GFP_KERNEL);
if (!region)
return;
 
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a209199f3af6..2be36dff189a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4220,7 +4220,7 @@ int __init dmar_parse_one_rmrr(struct acpi_dmar_header 
*header, void *arg)
 
length = rmrr->end_address - rmrr->base_address + 1;
rmrru->resv = iommu_alloc_resv_region(rmrr->base_address, length, prot,
- IOMMU_RESV_DIRECT);
+ IOMMU_RESV_DIRECT, GFP_KERNEL);
if (!rmrru->resv)
goto free_rmrru;
 
@@ -5489,7 +5489,7 @@ static void intel_iommu_get_resv_regions(struct device 
*device,
 
reg = iommu_alloc_resv_region(IOAPIC_RANGE_START,
  IOAPIC_RANGE_END - IOAPIC_RANGE_START + 1,
- 0, IOMMU_RESV_MSI);
+ 0, IOMMU_RESV_MSI, GFP_KERNEL);
if (!reg)
return;
list_add_tail(>list, 

[PATCH v2 3/7] iommu/vt-d: Introduce is_downstream_to_pci_bridge helper

2019-05-16 Thread Eric Auger
Several call sites are about to check whether a device belongs
to the PCI sub-hierarchy of a candidate PCI-PCI bridge.
Introduce an helper to perform that check.

Signed-off-by: Eric Auger 
---
 drivers/iommu/intel-iommu.c | 37 +
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 590a0e78d11d..15c2f9677491 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -736,12 +736,39 @@ static int iommu_dummy(struct device *dev)
return dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO;
 }
 
+/* is_downstream_to_pci_bridge - test if a device belongs to the
+ * PCI sub-hierarchy of a candidate PCI-PCI bridge
+ *
+ * @dev: candidate PCI device belonging to @bridge PCI sub-hierarchy
+ * @bridge: the candidate PCI-PCI bridge
+ *
+ * Return: true if @dev belongs to @bridge PCI sub-hierarchy
+ */
+static bool
+is_downstream_to_pci_bridge(struct device *dev, struct device *bridge)
+{
+   struct pci_dev *pdev, *pbridge;
+
+   if (!dev_is_pci(dev) || !dev_is_pci(bridge))
+   return false;
+
+   pdev = to_pci_dev(dev);
+   pbridge = to_pci_dev(bridge);
+
+   if (pbridge->subordinate &&
+   pbridge->subordinate->number <= pdev->bus->number &&
+   pbridge->subordinate->busn_res.end >= pdev->bus->number)
+   return true;
+
+   return false;
+}
+
 static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 
*devfn)
 {
struct dmar_drhd_unit *drhd = NULL;
struct intel_iommu *iommu;
struct device *tmp;
-   struct pci_dev *ptmp, *pdev = NULL;
+   struct pci_dev *pdev = NULL;
u16 segment = 0;
int i;
 
@@ -787,13 +814,7 @@ static struct intel_iommu *device_to_iommu(struct device 
*dev, u8 *bus, u8 *devf
goto out;
}
 
-   if (!pdev || !dev_is_pci(tmp))
-   continue;
-
-   ptmp = to_pci_dev(tmp);
-   if (ptmp->subordinate &&
-   ptmp->subordinate->number <= pdev->bus->number &&
-   ptmp->subordinate->busn_res.end >= 
pdev->bus->number)
+   if (is_downstream_to_pci_bridge(dev, tmp))
goto got_pdev;
}
 
-- 
2.20.1



[PATCH v2 2/7] iommu/vt-d: Duplicate iommu_resv_region objects per device list

2019-05-16 Thread Eric Auger
intel_iommu_get_resv_regions() aims to return the list of
reserved regions accessible by a given @device. However several
devices can access the same reserved memory region and when
building the list it is not safe to use a single iommu_resv_region
object, whose container is the RMRR. This iommu_resv_region must
be duplicated per device reserved region list.

Let's remove the struct iommu_resv_region from the RMRR unit
and allocate the iommu_resv_region directly in
intel_iommu_get_resv_regions().

Fixes: 0659b8dc45a6 ("iommu/vt-d: Implement reserved region get/put callbacks")
Signed-off-by: Eric Auger 
---
 drivers/iommu/intel-iommu.c | 31 ---
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 2be36dff189a..590a0e78d11d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -322,7 +322,6 @@ struct dmar_rmrr_unit {
u64 end_address;/* reserved end address */
struct dmar_dev_scope *devices; /* target devices */
int devices_cnt;/* target device count */
-   struct iommu_resv_region *resv; /* reserved region handle */
 };
 
 struct dmar_atsr_unit {
@@ -4205,7 +4204,6 @@ static inline void init_iommu_pm_ops(void) {}
 int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg)
 {
struct acpi_dmar_reserved_memory *rmrr;
-   int prot = DMA_PTE_READ|DMA_PTE_WRITE;
struct dmar_rmrr_unit *rmrru;
size_t length;
 
@@ -4219,22 +4217,16 @@ int __init dmar_parse_one_rmrr(struct acpi_dmar_header 
*header, void *arg)
rmrru->end_address = rmrr->end_address;
 
length = rmrr->end_address - rmrr->base_address + 1;
-   rmrru->resv = iommu_alloc_resv_region(rmrr->base_address, length, prot,
- IOMMU_RESV_DIRECT, GFP_KERNEL);
-   if (!rmrru->resv)
-   goto free_rmrru;
 
rmrru->devices = dmar_alloc_dev_scope((void *)(rmrr + 1),
((void *)rmrr) + rmrr->header.length,
>devices_cnt);
if (rmrru->devices_cnt && rmrru->devices == NULL)
-   goto free_all;
+   goto free_rmrru;
 
list_add(>list, _rmrr_units);
 
return 0;
-free_all:
-   kfree(rmrru->resv);
 free_rmrru:
kfree(rmrru);
 out:
@@ -4452,7 +,6 @@ static void intel_iommu_free_dmars(void)
list_for_each_entry_safe(rmrru, rmrr_n, _rmrr_units, list) {
list_del(>list);
dmar_free_dev_scope(>devices, >devices_cnt);
-   kfree(rmrru->resv);
kfree(rmrru);
}
 
@@ -5470,6 +5461,7 @@ static void intel_iommu_remove_device(struct device *dev)
 static void intel_iommu_get_resv_regions(struct device *device,
 struct list_head *head)
 {
+   int prot = DMA_PTE_READ|DMA_PTE_WRITE;
struct iommu_resv_region *reg;
struct dmar_rmrr_unit *rmrr;
struct device *i_dev;
@@ -5479,10 +5471,21 @@ static void intel_iommu_get_resv_regions(struct device 
*device,
for_each_rmrr_units(rmrr) {
for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
  i, i_dev) {
+   struct iommu_resv_region *resv;
+   size_t length;
+
if (i_dev != device)
continue;
 
-   list_add_tail(>resv->list, head);
+   length = rmrr->end_address - rmrr->base_address + 1;
+   resv = iommu_alloc_resv_region(rmrr->base_address,
+  length, prot,
+  IOMMU_RESV_DIRECT,
+  GFP_ATOMIC);
+   if (!resv)
+   break;
+
+   list_add_tail(>list, head);
}
}
rcu_read_unlock();
@@ -5500,10 +5503,8 @@ static void intel_iommu_put_resv_regions(struct device 
*dev,
 {
struct iommu_resv_region *entry, *next;
 
-   list_for_each_entry_safe(entry, next, head, list) {
-   if (entry->type == IOMMU_RESV_MSI)
-   kfree(entry);
-   }
+   list_for_each_entry_safe(entry, next, head, list)
+   kfree(entry);
 }
 
 int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct device *dev)
-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 0/7] RMRR related fixes and enhancements

2019-05-16 Thread Eric Auger
Currently the Intel reserved region is attached to the
RMRR unit and when building the list of RMRR seen by a device
we link this unique reserved region without taking care of
potential multiple usage of this reserved region by several devices.

Also while reading the vtd spec it is unclear to me whether
the RMRR device scope referenced by an RMRR ACPI struct could
be a PCI-PCI bridge, in which case I think we also need to
check the device belongs to the PCI sub-hierarchy of the device
referenced in the scope. This would be true for device_has_rmrr()
and intel_iommu_get_resv_regions().

Last, the VFIO subsystem would need to compute the usable IOVA range
by querying the iommu_get_group_resv_regions() API. This would allow,
for instance, to report potential conflicts between the guest physical
address space and host reserved regions.

However iommu_get_group_resv_regions() currently fails to differentiate
RMRRs that are known safe for device assignment and RMRRs that must be
enforced. So we introduce a new reserved memory region type (relaxable),
reported when associated to an USB or GFX device. The last 2 patches aim
at unblocking [1] which is stuck since 4.18.

[1-5] are fixes
[6-7] are enhancements

References:
[1] [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
https://patchwork.kernel.org/patch/10425309/

The two parts can be considered separately if needed.

History

1->2:
- introduce is_downstream_to_pci_bridge() in a separate patch, change param
  names and add kerneldoc comment (Jacob)
- add 6,7


Eric Auger (7):
  iommu: Pass a GFP flag parameter to iommu_alloc_resv_region()
  iommu/vt-d: Duplicate iommu_resv_region objects per device list
  iommu/vt-d: Introduce is_downstream_to_pci_bridge helper
  iommu/vt-d: Handle RMRR with PCI bridge device scopes
  iommu/vt-d: Handle PCI bridge RMRR device scopes in
intel_iommu_get_resv_regions
  iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions
  iommu/vt-d: Differentiate relaxable and non relaxable RMRRs

 drivers/acpi/arm64/iort.c   |  3 +-
 drivers/iommu/amd_iommu.c   |  7 ++--
 drivers/iommu/arm-smmu-v3.c |  2 +-
 drivers/iommu/arm-smmu.c|  2 +-
 drivers/iommu/intel-iommu.c | 82 +
 drivers/iommu/iommu.c   | 19 +
 include/linux/iommu.h   |  8 +++-
 7 files changed, 82 insertions(+), 41 deletions(-)

-- 
2.20.1



Re: Linux crash when using FTDI FT232R USB to serial converter on AMD boards.

2019-05-16 Thread starost...@gmail.com

Dne 16.5.2019 v 9:58 Oliver Neukum napsal(a):

2.Send a new report to the corresponding mailing list

Which mailing list is correct?

In that case linux-...@vger.kernel.org

HTH
Oliver

Is there some rules how to send bug report? Or I can send report with 
"my amateur description"?


starosta
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu: io-pgtable: Support non-coherent page tables

2019-05-16 Thread Vivek Gautam
On Thu, May 16, 2019 at 5:03 AM Bjorn Andersson
 wrote:
>
> Describe the memory related to page table walks as non-cachable for iommu
> instances that are not DMA coherent.
>
> Signed-off-by: Bjorn Andersson 
> ---
>  drivers/iommu/io-pgtable-arm.c | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 4e21efbc4459..68ff22ffd2cb 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -803,9 +803,15 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, 
> void *cookie)
> return NULL;
>
> /* TCR */
> -   reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
> - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
> - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
> +   if (cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA) {
> +   reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
> + (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
> + (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
> +   } else {
> +   reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
> + (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT) |
> + (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT);
> +   }

This looks okay to me based on the discussion that we had on a similar
patch that I
posted. So,
Reviewed-by: Vivek Gautam 

[1] https://lore.kernel.org/patchwork/patch/1032939/

Thanks & regards
Vivek

>
> switch (ARM_LPAE_GRANULE(data)) {
> case SZ_4K:
> --
> 2.18.0
>
> ___
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu



-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu