RE: [PATCH 1/1] iommu/vt-d: Serialize IOMMU GCMD register modifications

2020-08-25 Thread Tian, Kevin
> From: Lu Baolu
> Sent: Wednesday, August 26, 2020 10:58 AM
> 
> The VT-d spec requires (10.4.4 Global Command Register, GCMD_REG
> General
> Description) that:
> 
> If multiple control fields in this register need to be modified, software
> must serialize the modifications through multiple writes to this register.
> 
> However, in irq_remapping.c, modifications of IRE and CFI are done in one
> write. We need to do two separate writes with STS checking after each.
> 
> Fixes: af8d102f999a4 ("x86/intel/irq_remapping: Clean up x2apic opt-out
> security warning mess")
> Cc: Andy Lutomirski 
> Cc: Jacob Pan 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel/irq_remapping.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/intel/irq_remapping.c
> b/drivers/iommu/intel/irq_remapping.c
> index 9564d23d094f..19d7e18876fe 100644
> --- a/drivers/iommu/intel/irq_remapping.c
> +++ b/drivers/iommu/intel/irq_remapping.c
> @@ -507,12 +507,16 @@ static void iommu_enable_irq_remapping(struct
> intel_iommu *iommu)
> 
>   /* Enable interrupt-remapping */
>   iommu->gcmd |= DMA_GCMD_IRE;
> - iommu->gcmd &= ~DMA_GCMD_CFI;  /* Block compatibility-format
> MSIs */
>   writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
> -
>   IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG,
> readl, (sts & DMA_GSTS_IRES), sts);
> 
> + /* Block compatibility-format MSIs */
> + iommu->gcmd &= ~DMA_GCMD_CFI;
> + writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
> + IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG,
> +   readl, !(sts & DMA_GSTS_CFIS), sts);
> +

Better do it only when CFI is actually enabled (by checking sts).

>   /*
>* With CFI clear in the Global Command register, we should be
>* protected from dangerous (i.e. compatibility) interrupts
> --
> 2.17.1
> 
> ___
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/1] iommu/vt-d: Serialize IOMMU GCMD register modifications

2020-08-25 Thread Lu Baolu
The VT-d spec requires (10.4.4 Global Command Register, GCMD_REG General
Description) that:

If multiple control fields in this register need to be modified, software
must serialize the modifications through multiple writes to this register.

However, in irq_remapping.c, modifications of IRE and CFI are done in one
write. We need to do two separate writes with STS checking after each.

Fixes: af8d102f999a4 ("x86/intel/irq_remapping: Clean up x2apic opt-out 
security warning mess")
Cc: Andy Lutomirski 
Cc: Jacob Pan 
Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel/irq_remapping.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel/irq_remapping.c 
b/drivers/iommu/intel/irq_remapping.c
index 9564d23d094f..19d7e18876fe 100644
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -507,12 +507,16 @@ static void iommu_enable_irq_remapping(struct intel_iommu 
*iommu)
 
/* Enable interrupt-remapping */
iommu->gcmd |= DMA_GCMD_IRE;
-   iommu->gcmd &= ~DMA_GCMD_CFI;  /* Block compatibility-format MSIs */
writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
-
IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG,
  readl, (sts & DMA_GSTS_IRES), sts);
 
+   /* Block compatibility-format MSIs */
+   iommu->gcmd &= ~DMA_GCMD_CFI;
+   writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
+   IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG,
+ readl, !(sts & DMA_GSTS_CFIS), sts);
+
/*
 * With CFI clear in the Global Command register, we should be
 * protected from dangerous (i.e. compatibility) interrupts
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/vt-d:Add support for detecting ACPI device in RMRR

2020-08-25 Thread Lu Baolu

Hi Felix,

On 8/18/20 10:44 AM, FelixCuioc wrote:

Some ACPI devices need to issue dma requests to access
the reserved memory area.BIOS uses the device scope type
ACPI_NAMESPACE_DEVICE in RMRR to report these ACPI devices.
This patch add support for detecting ACPI devices in RMRR.



If you are willing to submit a new version of this series, can you
please add the version tag (commit title) and a change log (under the
tear line)?

Best regards,
baolu


Signed-off-by: FelixCuioc 
---
  drivers/iommu/intel/dmar.c  | 74 -
  drivers/iommu/intel/iommu.c | 22 ++-
  include/linux/dmar.h| 12 +-
  3 files changed, 72 insertions(+), 36 deletions(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 93e6345f3414..024ca38dba12 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -215,7 +215,7 @@ static bool dmar_match_pci_path(struct dmar_pci_notify_info 
*info, int bus,
  }
  
  /* Return: > 0 if match found, 0 if no match found, < 0 if error happens */

-int dmar_insert_dev_scope(struct dmar_pci_notify_info *info,
+int dmar_pci_insert_dev_scope(struct dmar_pci_notify_info *info,
  void *start, void*end, u16 segment,
  struct dmar_dev_scope *devices,
  int devices_cnt)
@@ -304,7 +304,7 @@ static int dmar_pci_bus_add_dev(struct dmar_pci_notify_info 
*info)
  
  		drhd = container_of(dmaru->hdr,

struct acpi_dmar_hardware_unit, header);
-   ret = dmar_insert_dev_scope(info, (void *)(drhd + 1),
+   ret = dmar_pci_insert_dev_scope(info, (void *)(drhd + 1),
((void *)drhd) + drhd->header.length,
dmaru->segment,
dmaru->devices, dmaru->devices_cnt);
@@ -696,48 +696,56 @@ dmar_find_matched_drhd_unit(struct pci_dev *dev)
  
  	return dmaru;

  }
-
-static void __init dmar_acpi_insert_dev_scope(u8 device_number,
- struct acpi_device *adev)
+int dmar_acpi_insert_dev_scope(u8 device_number,
+   struct acpi_device *adev,
+   void *start, void *end,
+   struct dmar_dev_scope *devices,
+   int devices_cnt)
  {
-   struct dmar_drhd_unit *dmaru;
-   struct acpi_dmar_hardware_unit *drhd;
struct acpi_dmar_device_scope *scope;
struct device *tmp;
int i;
struct acpi_dmar_pci_path *path;
  
+	for (; start < end; start += scope->length) {

+   scope = start;
+   if (scope->entry_type != ACPI_DMAR_SCOPE_TYPE_NAMESPACE)
+   continue;
+   if (scope->enumeration_id != device_number)
+   continue;
+   path = (void *)(scope + 1);
+   for_each_dev_scope(devices, devices_cnt, i, tmp)
+   if (tmp == NULL) {
+   devices[i].bus = scope->bus;
+   devices[i].devfn = PCI_DEVFN(path->device, 
path->function);
+   rcu_assign_pointer(devices[i].dev,
+  get_device(>dev));
+   return 1;
+   }
+   WARN_ON(i >= devices_cnt);
+   }
+   return 0;
+}
+static int dmar_acpi_bus_add_dev(u8 device_number, struct acpi_device *adev)
+{
+   struct dmar_drhd_unit *dmaru;
+   struct acpi_dmar_hardware_unit *drhd;
+   int ret = 0;
+
for_each_drhd_unit(dmaru) {
drhd = container_of(dmaru->hdr,
struct acpi_dmar_hardware_unit,
header);
+   ret = dmar_acpi_insert_dev_scope(device_number, adev, (void 
*)(drhd+1),
+   ((void 
*)drhd)+drhd->header.length,
+   dmaru->devices, 
dmaru->devices_cnt);
+   if (ret)
+   break;
+   }
+   ret = dmar_rmrr_add_acpi_dev(device_number, adev);
  
-		for (scope = (void *)(drhd + 1);

-(unsigned long)scope < ((unsigned long)drhd) + 
drhd->header.length;
-scope = ((void *)scope) + scope->length) {
-   if (scope->entry_type != ACPI_DMAR_SCOPE_TYPE_NAMESPACE)
-   continue;
-   if (scope->enumeration_id != device_number)
-   continue;
+   return ret;
  
-			path = (void *)(scope + 1);

-   pr_info("ACPI device \"%s\" under DMAR at %llx as 
%02x:%02x.%d\n",
-   dev_name(>dev), dmaru->reg_base_addr,
-   scope->bus, path->device, path->function);
-   

Re: [PATCH] iommu: Add support to filter non-strict/lazy mode based on device names

2020-08-25 Thread Rob Clark
On Tue, Aug 25, 2020 at 5:24 PM Doug Anderson  wrote:
>
> Hi,
>
> On Tue, Aug 25, 2020 at 3:54 PM Rob Clark  wrote:
> >
> > On Tue, Aug 25, 2020 at 3:23 PM Doug Anderson  wrote:
> > >
> > > Hi,
> > >
> > > On Tue, Aug 25, 2020 at 12:01 PM Sai Prakash Ranjan
> > >  wrote:
> > > >
> > > > Hi,
> > > >
> > > > On 2020-08-25 21:40, Doug Anderson wrote:
> > > > > Hi,
> > > > >
> > > > > On Tue, Aug 25, 2020 at 8:43 AM Sai Prakash Ranjan
> > > > >  wrote:
> > > > >>
> > > > >> Currently the non-strict or lazy mode of TLB invalidation can only be
> > > > >> set
> > > > >> for all or no domains. This works well for development platforms 
> > > > >> where
> > > > >> setting to non-strict/lazy mode is fine for performance reasons but 
> > > > >> on
> > > > >> production devices, we need a more fine grained control to allow only
> > > > >> certain peripherals to support this mode where we can be sure that it
> > > > >> is
> > > > >> safe. So add support to filter non-strict/lazy mode based on the
> > > > >> device
> > > > >> names that are passed via cmdline parameter "iommu.nonstrict_device".
> > > > >>
> > > > >> Example:
> > > > >> iommu.nonstrict_device="7c4000.sdhci,a60.dwc3,6048000.etr"
> > >
> > > Just curious: are device names like this really guaranteed to be
> > > stable across versions?
> > >
> > >
> > > > > I have an inherent dislike of jamming things like this onto the
> > > > > command line.  IMHO the command line is the last resort for specifying
> > > > > configuration and generally should be limited to some specialized
> > > > > debug options and cases where the person running the kernel needs to
> > > > > override a config that was set by the person (or company) compiling
> > > > > the kernel.  Specifically, having a long/unwieldy command line makes
> > > > > it harder to use for the case when an end user actually wants to use
> > > > > it to override something.  It's also just another place to look for
> > > > > config.
> > > > >
> > > >
> > > > Good thing about command line parameters are that they are optional,
> > > > they do
> > > > not specify any default behaviour (I mean they are not mandatory to be
> > > > set
> > > > for the system to be functional), so I would like to view it as an
> > > > optional
> > > > config. And this command line parameter (nonstrict_device) is strictly
> > > > optional
> > > > with default being strict already set in the driver.
> > > >
> > > > They can be passed from the bootloader via chosen node for DT platforms
> > > > or choose
> > > > a new *bootconfig* as a way to pass the cmdline but finally it does boil
> > > > down to
> > > > just another config.
> > >
> > > Never looked at bootconfig.  Unfortunately it seems to require
> > > initramfs so that pretty much means it's out for my usage.  :(
> > >
> > >
> > > > I agree with general boolean or single value command line parameters
> > > > being just
> > > > more messy which could just be Kconfigs instead but for multiple value
> > > > parameters
> > > > like these do not fit in Kconfig.
> > > >
> > > > As you might already know, command line also gives an advantage to the
> > > > end user
> > > > to configure system without building kernel, for this specific command
> > > > line its
> > > > very useful because the performance bump is quite noticeable when the
> > > > iommu.strict
> > > > is off. Now for end user who would not be interested in building entire
> > > > kernel(majority)
> > > > and just cares about good speeds or throughput can find this very
> > > > beneficial.
> > > > I am not talking about one specific OS usecase here but more in general
> > > > term.
> > > >
> > > > > The other problem is that this doesn't necessarily scale very well.
> > > > > While it works OK for embedded cases it doesn't work terribly well for
> > > > > distributions.  I know that in an out-of-band thread you indicated
> > > > > that it doesn't break anything that's not already broken (AKA this
> > > > > doesn't fix the distro case but it doesn't make it worse), it would be
> > > > > better to come up with a more universal solution.
> > > > >
> > > >
> > > > Is the universal solution here referring to fix all the command line
> > > > parameters
> > > > in the kernel or this specific command line? Are we going to remove any
> > > > more
> > > > addition to the cmdline ;)
> > >
> > > There are very few cases where a kernel command line parameter is the
> > > only way to configure something.  Most of the time it's just there to
> > > override a config.  I wouldn't suggest removing those.  I just don't
> > > want a kernel command line parameter to be the primary way to enable
> > > something.
> > >
> > >
> > > > So possible other solution is the *bootconfig* which is again just
> > > > another place
> > > > to look for a config. So thing is that this universal solution would
> > > > result in
> > > > just more new fancy ways of passing configs or adding such configs to
> > > > the drivers
> > > > or subsystems in kernel 

Re: [PATCH] iommu: Add support to filter non-strict/lazy mode based on device names

2020-08-25 Thread Doug Anderson
Hi,

On Tue, Aug 25, 2020 at 3:54 PM Rob Clark  wrote:
>
> On Tue, Aug 25, 2020 at 3:23 PM Doug Anderson  wrote:
> >
> > Hi,
> >
> > On Tue, Aug 25, 2020 at 12:01 PM Sai Prakash Ranjan
> >  wrote:
> > >
> > > Hi,
> > >
> > > On 2020-08-25 21:40, Doug Anderson wrote:
> > > > Hi,
> > > >
> > > > On Tue, Aug 25, 2020 at 8:43 AM Sai Prakash Ranjan
> > > >  wrote:
> > > >>
> > > >> Currently the non-strict or lazy mode of TLB invalidation can only be
> > > >> set
> > > >> for all or no domains. This works well for development platforms where
> > > >> setting to non-strict/lazy mode is fine for performance reasons but on
> > > >> production devices, we need a more fine grained control to allow only
> > > >> certain peripherals to support this mode where we can be sure that it
> > > >> is
> > > >> safe. So add support to filter non-strict/lazy mode based on the
> > > >> device
> > > >> names that are passed via cmdline parameter "iommu.nonstrict_device".
> > > >>
> > > >> Example:
> > > >> iommu.nonstrict_device="7c4000.sdhci,a60.dwc3,6048000.etr"
> >
> > Just curious: are device names like this really guaranteed to be
> > stable across versions?
> >
> >
> > > > I have an inherent dislike of jamming things like this onto the
> > > > command line.  IMHO the command line is the last resort for specifying
> > > > configuration and generally should be limited to some specialized
> > > > debug options and cases where the person running the kernel needs to
> > > > override a config that was set by the person (or company) compiling
> > > > the kernel.  Specifically, having a long/unwieldy command line makes
> > > > it harder to use for the case when an end user actually wants to use
> > > > it to override something.  It's also just another place to look for
> > > > config.
> > > >
> > >
> > > Good thing about command line parameters are that they are optional,
> > > they do
> > > not specify any default behaviour (I mean they are not mandatory to be
> > > set
> > > for the system to be functional), so I would like to view it as an
> > > optional
> > > config. And this command line parameter (nonstrict_device) is strictly
> > > optional
> > > with default being strict already set in the driver.
> > >
> > > They can be passed from the bootloader via chosen node for DT platforms
> > > or choose
> > > a new *bootconfig* as a way to pass the cmdline but finally it does boil
> > > down to
> > > just another config.
> >
> > Never looked at bootconfig.  Unfortunately it seems to require
> > initramfs so that pretty much means it's out for my usage.  :(
> >
> >
> > > I agree with general boolean or single value command line parameters
> > > being just
> > > more messy which could just be Kconfigs instead but for multiple value
> > > parameters
> > > like these do not fit in Kconfig.
> > >
> > > As you might already know, command line also gives an advantage to the
> > > end user
> > > to configure system without building kernel, for this specific command
> > > line its
> > > very useful because the performance bump is quite noticeable when the
> > > iommu.strict
> > > is off. Now for end user who would not be interested in building entire
> > > kernel(majority)
> > > and just cares about good speeds or throughput can find this very
> > > beneficial.
> > > I am not talking about one specific OS usecase here but more in general
> > > term.
> > >
> > > > The other problem is that this doesn't necessarily scale very well.
> > > > While it works OK for embedded cases it doesn't work terribly well for
> > > > distributions.  I know that in an out-of-band thread you indicated
> > > > that it doesn't break anything that's not already broken (AKA this
> > > > doesn't fix the distro case but it doesn't make it worse), it would be
> > > > better to come up with a more universal solution.
> > > >
> > >
> > > Is the universal solution here referring to fix all the command line
> > > parameters
> > > in the kernel or this specific command line? Are we going to remove any
> > > more
> > > addition to the cmdline ;)
> >
> > There are very few cases where a kernel command line parameter is the
> > only way to configure something.  Most of the time it's just there to
> > override a config.  I wouldn't suggest removing those.  I just don't
> > want a kernel command line parameter to be the primary way to enable
> > something.
> >
> >
> > > So possible other solution is the *bootconfig* which is again just
> > > another place
> > > to look for a config. So thing is that this universal solution would
> > > result in
> > > just more new fancy ways of passing configs or adding such configs to
> > > the drivers
> > > or subsystems in kernel which is pretty much similar to implementing
> > > policy in
> > > kernel which I think is frowned upon and mentioned in the other thread.
> > >
> > > > Ideally it feels like we should figure out how to tag devices in a
> > > > generic manner automatically (hardcode at the driver or in the device
> > > > tree). 

Re: [PATCH] iommu: Add support to filter non-strict/lazy mode based on device names

2020-08-25 Thread Rob Clark
On Tue, Aug 25, 2020 at 3:23 PM Doug Anderson  wrote:
>
> Hi,
>
> On Tue, Aug 25, 2020 at 12:01 PM Sai Prakash Ranjan
>  wrote:
> >
> > Hi,
> >
> > On 2020-08-25 21:40, Doug Anderson wrote:
> > > Hi,
> > >
> > > On Tue, Aug 25, 2020 at 8:43 AM Sai Prakash Ranjan
> > >  wrote:
> > >>
> > >> Currently the non-strict or lazy mode of TLB invalidation can only be
> > >> set
> > >> for all or no domains. This works well for development platforms where
> > >> setting to non-strict/lazy mode is fine for performance reasons but on
> > >> production devices, we need a more fine grained control to allow only
> > >> certain peripherals to support this mode where we can be sure that it
> > >> is
> > >> safe. So add support to filter non-strict/lazy mode based on the
> > >> device
> > >> names that are passed via cmdline parameter "iommu.nonstrict_device".
> > >>
> > >> Example:
> > >> iommu.nonstrict_device="7c4000.sdhci,a60.dwc3,6048000.etr"
>
> Just curious: are device names like this really guaranteed to be
> stable across versions?
>
>
> > > I have an inherent dislike of jamming things like this onto the
> > > command line.  IMHO the command line is the last resort for specifying
> > > configuration and generally should be limited to some specialized
> > > debug options and cases where the person running the kernel needs to
> > > override a config that was set by the person (or company) compiling
> > > the kernel.  Specifically, having a long/unwieldy command line makes
> > > it harder to use for the case when an end user actually wants to use
> > > it to override something.  It's also just another place to look for
> > > config.
> > >
> >
> > Good thing about command line parameters are that they are optional,
> > they do
> > not specify any default behaviour (I mean they are not mandatory to be
> > set
> > for the system to be functional), so I would like to view it as an
> > optional
> > config. And this command line parameter (nonstrict_device) is strictly
> > optional
> > with default being strict already set in the driver.
> >
> > They can be passed from the bootloader via chosen node for DT platforms
> > or choose
> > a new *bootconfig* as a way to pass the cmdline but finally it does boil
> > down to
> > just another config.
>
> Never looked at bootconfig.  Unfortunately it seems to require
> initramfs so that pretty much means it's out for my usage.  :(
>
>
> > I agree with general boolean or single value command line parameters
> > being just
> > more messy which could just be Kconfigs instead but for multiple value
> > parameters
> > like these do not fit in Kconfig.
> >
> > As you might already know, command line also gives an advantage to the
> > end user
> > to configure system without building kernel, for this specific command
> > line its
> > very useful because the performance bump is quite noticeable when the
> > iommu.strict
> > is off. Now for end user who would not be interested in building entire
> > kernel(majority)
> > and just cares about good speeds or throughput can find this very
> > beneficial.
> > I am not talking about one specific OS usecase here but more in general
> > term.
> >
> > > The other problem is that this doesn't necessarily scale very well.
> > > While it works OK for embedded cases it doesn't work terribly well for
> > > distributions.  I know that in an out-of-band thread you indicated
> > > that it doesn't break anything that's not already broken (AKA this
> > > doesn't fix the distro case but it doesn't make it worse), it would be
> > > better to come up with a more universal solution.
> > >
> >
> > Is the universal solution here referring to fix all the command line
> > parameters
> > in the kernel or this specific command line? Are we going to remove any
> > more
> > addition to the cmdline ;)
>
> There are very few cases where a kernel command line parameter is the
> only way to configure something.  Most of the time it's just there to
> override a config.  I wouldn't suggest removing those.  I just don't
> want a kernel command line parameter to be the primary way to enable
> something.
>
>
> > So possible other solution is the *bootconfig* which is again just
> > another place
> > to look for a config. So thing is that this universal solution would
> > result in
> > just more new fancy ways of passing configs or adding such configs to
> > the drivers
> > or subsystems in kernel which is pretty much similar to implementing
> > policy in
> > kernel which I think is frowned upon and mentioned in the other thread.
> >
> > > Ideally it feels like we should figure out how to tag devices in a
> > > generic manner automatically (hardcode at the driver or in the device
> > > tree).  I think the out-of-band discussions talked about "external
> > > facing" and the like.  We could also, perhaps, tag devices that have
> > > "binary blob" firmware if we wanted.  Then we'd have a policy (set by
> > > Kconfig, perhaps overridable via commandline) that indicated the
> > > 

Re: [PATCH] iommu: Add support to filter non-strict/lazy mode based on device names

2020-08-25 Thread Doug Anderson
Hi,

On Tue, Aug 25, 2020 at 12:01 PM Sai Prakash Ranjan
 wrote:
>
> Hi,
>
> On 2020-08-25 21:40, Doug Anderson wrote:
> > Hi,
> >
> > On Tue, Aug 25, 2020 at 8:43 AM Sai Prakash Ranjan
> >  wrote:
> >>
> >> Currently the non-strict or lazy mode of TLB invalidation can only be
> >> set
> >> for all or no domains. This works well for development platforms where
> >> setting to non-strict/lazy mode is fine for performance reasons but on
> >> production devices, we need a more fine grained control to allow only
> >> certain peripherals to support this mode where we can be sure that it
> >> is
> >> safe. So add support to filter non-strict/lazy mode based on the
> >> device
> >> names that are passed via cmdline parameter "iommu.nonstrict_device".
> >>
> >> Example:
> >> iommu.nonstrict_device="7c4000.sdhci,a60.dwc3,6048000.etr"

Just curious: are device names like this really guaranteed to be
stable across versions?


> > I have an inherent dislike of jamming things like this onto the
> > command line.  IMHO the command line is the last resort for specifying
> > configuration and generally should be limited to some specialized
> > debug options and cases where the person running the kernel needs to
> > override a config that was set by the person (or company) compiling
> > the kernel.  Specifically, having a long/unwieldy command line makes
> > it harder to use for the case when an end user actually wants to use
> > it to override something.  It's also just another place to look for
> > config.
> >
>
> Good thing about command line parameters are that they are optional,
> they do
> not specify any default behaviour (I mean they are not mandatory to be
> set
> for the system to be functional), so I would like to view it as an
> optional
> config. And this command line parameter (nonstrict_device) is strictly
> optional
> with default being strict already set in the driver.
>
> They can be passed from the bootloader via chosen node for DT platforms
> or choose
> a new *bootconfig* as a way to pass the cmdline but finally it does boil
> down to
> just another config.

Never looked at bootconfig.  Unfortunately it seems to require
initramfs so that pretty much means it's out for my usage.  :(


> I agree with general boolean or single value command line parameters
> being just
> more messy which could just be Kconfigs instead but for multiple value
> parameters
> like these do not fit in Kconfig.
>
> As you might already know, command line also gives an advantage to the
> end user
> to configure system without building kernel, for this specific command
> line its
> very useful because the performance bump is quite noticeable when the
> iommu.strict
> is off. Now for end user who would not be interested in building entire
> kernel(majority)
> and just cares about good speeds or throughput can find this very
> beneficial.
> I am not talking about one specific OS usecase here but more in general
> term.
>
> > The other problem is that this doesn't necessarily scale very well.
> > While it works OK for embedded cases it doesn't work terribly well for
> > distributions.  I know that in an out-of-band thread you indicated
> > that it doesn't break anything that's not already broken (AKA this
> > doesn't fix the distro case but it doesn't make it worse), it would be
> > better to come up with a more universal solution.
> >
>
> Is the universal solution here referring to fix all the command line
> parameters
> in the kernel or this specific command line? Are we going to remove any
> more
> addition to the cmdline ;)

There are very few cases where a kernel command line parameter is the
only way to configure something.  Most of the time it's just there to
override a config.  I wouldn't suggest removing those.  I just don't
want a kernel command line parameter to be the primary way to enable
something.


> So possible other solution is the *bootconfig* which is again just
> another place
> to look for a config. So thing is that this universal solution would
> result in
> just more new fancy ways of passing configs or adding such configs to
> the drivers
> or subsystems in kernel which is pretty much similar to implementing
> policy in
> kernel which I think is frowned upon and mentioned in the other thread.
>
> > Ideally it feels like we should figure out how to tag devices in a
> > generic manner automatically (hardcode at the driver or in the device
> > tree).  I think the out-of-band discussions talked about "external
> > facing" and the like.  We could also, perhaps, tag devices that have
> > "binary blob" firmware if we wanted.  Then we'd have a policy (set by
> > Kconfig, perhaps overridable via commandline) that indicated the
> > strictness level for the various classes of devices.  So policy would
> > be decided by KConfig and/or command line.
> >
>
> How is tagging in driver or device tree better than the simple command
> line
> approach to pass the same list of devices which otherwise you would
> hardcode
> 

Re: [patch RFC 30/38] PCI/MSI: Allow to disable arch fallbacks

2020-08-25 Thread Thomas Gleixner
On Tue, Aug 25 2020 at 23:40, Thomas Gleixner wrote:
> On Tue, Aug 25 2020 at 16:35, Bjorn Helgaas wrote:
>> On Tue, Aug 25, 2020 at 11:28:30PM +0200, Thomas Gleixner wrote:
>>> 
>>> Or did you just mean that those architectures should select
>>> CONFIG_I_WANT_THE CRUFT instead of opting out on the fully irq domain
>>> based ones?
>>
>> Yes, that was my real question -- can we confine the cruft in the
>> crufty arches?  If not, no big deal.
>
> Should be doable. Let me try.

Bah. There is more cruft.

The weak implementation has another way to go sideways via
msi_controller::setup_irq[s] and msi_controller::teardown_irq

drivers/pci/controller/pci-tegra.c
drivers/pci/controller/pcie-rcar-host.c
drivers/pci/controller/pcie-xilinx.c

Groan

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch RFC 34/38] x86/msi: Let pci_msi_prepare() handle non-PCI MSI

2020-08-25 Thread Bjorn Helgaas
On Tue, Aug 25, 2020 at 11:30:41PM +0200, Thomas Gleixner wrote:
> On Tue, Aug 25 2020 at 15:24, Bjorn Helgaas wrote:
> > On Fri, Aug 21, 2020 at 02:24:58AM +0200, Thomas Gleixner wrote:
> >> Rename it to x86_msi_prepare() and handle the allocation type setup
> >> depending on the device type.
> >
> > I see what you're doing, but the subject reads a little strangely
> 
> Yes :(
> 
> > ("pci_msi_prepare() handling non-PCI" stuff) since it doesn't mention
> > the rename.  Maybe not practical or worthwhile to split into a rename
> > + make generic, I dunno.
> 
> What about
> 
> x86/msi: Rename and rework pci_msi_prepare() to cover non-PCI MSI

Perfect!
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch RFC 30/38] PCI/MSI: Allow to disable arch fallbacks

2020-08-25 Thread Thomas Gleixner
On Tue, Aug 25 2020 at 16:35, Bjorn Helgaas wrote:
> On Tue, Aug 25, 2020 at 11:28:30PM +0200, Thomas Gleixner wrote:
>> 
>> Or did you just mean that those architectures should select
>> CONFIG_I_WANT_THE CRUFT instead of opting out on the fully irq domain
>> based ones?
>
> Yes, that was my real question -- can we confine the cruft in the
> crufty arches?  If not, no big deal.

Should be doable. Let me try.

Thanks,

tglx
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch RFC 30/38] PCI/MSI: Allow to disable arch fallbacks

2020-08-25 Thread Bjorn Helgaas
On Tue, Aug 25, 2020 at 11:28:30PM +0200, Thomas Gleixner wrote:
> On Tue, Aug 25 2020 at 15:07, Bjorn Helgaas wrote:
> >> + * The arch hooks to setup up msi irqs. Default functions are implemented
> >> + * as weak symbols so that they /can/ be overriden by architecture 
> >> specific
> >> + * code if needed.
> >> + *
> >> + * They can be replaced by stubs with warnings via
> >> + * CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS when the architecture fully
> >> + * utilizes direct irqdomain based setup.

> > If not, it seems like it'd be nicer to have the burden on the arches
> > that need/want to use arch-specific code instead of on the arches that
> > do things generically.
> 
> Right, but they still share the common code there and some of them
> provide only parts of the weak callbacks. I'm not sure whether it's a
> good idea to copy all of this into each affected architecture.
> 
> Or did you just mean that those architectures should select
> CONFIG_I_WANT_THE CRUFT instead of opting out on the fully irq domain
> based ones?

Yes, that was my real question -- can we confine the cruft in the
crufty arches?  If not, no big deal.

Bjorn
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch RFC 34/38] x86/msi: Let pci_msi_prepare() handle non-PCI MSI

2020-08-25 Thread Thomas Gleixner
On Tue, Aug 25 2020 at 15:24, Bjorn Helgaas wrote:
> On Fri, Aug 21, 2020 at 02:24:58AM +0200, Thomas Gleixner wrote:
>> Rename it to x86_msi_prepare() and handle the allocation type setup
>> depending on the device type.
>
> I see what you're doing, but the subject reads a little strangely

Yes :(

> ("pci_msi_prepare() handling non-PCI" stuff) since it doesn't mention
> the rename.  Maybe not practical or worthwhile to split into a rename
> + make generic, I dunno.

What about

x86/msi: Rename and rework pci_msi_prepare() to cover non-PCI MSI

Thanks,

tglx
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch RFC 30/38] PCI/MSI: Allow to disable arch fallbacks

2020-08-25 Thread Thomas Gleixner
On Tue, Aug 25 2020 at 15:07, Bjorn Helgaas wrote:
>> + * The arch hooks to setup up msi irqs. Default functions are implemented
>> + * as weak symbols so that they /can/ be overriden by architecture specific
>> + * code if needed.
>> + *
>> + * They can be replaced by stubs with warnings via
>> + * CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS when the architecture fully
>> + * utilizes direct irqdomain based setup.
>
> Do you expect *all* arches to eventually use direct irqdomain setup?

Ideally that happens some day. We have five left when x86 is converted:

IA64, MIPS, POWERPC, S390, SPARC

IA64 is unlikely to be fixed, but might be solved naturally by removal.

For the others I don't know, but it's not on the horizon anytime soon I
fear.

> And in that case, to remove the config option?

Yes, and all the code which depends on it.

> If not, it seems like it'd be nicer to have the burden on the arches
> that need/want to use arch-specific code instead of on the arches that
> do things generically.

Right, but they still share the common code there and some of them
provide only parts of the weak callbacks. I'm not sure whether it's a
good idea to copy all of this into each affected architecture.

Or did you just mean that those architectures should select
CONFIG_I_WANT_THE CRUFT instead of opting out on the fully irq domain
based ones?

Thanks,

tglx


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch RFC 13/38] PCI: MSI: Rework pci_msi_domain_calc_hwirq()

2020-08-25 Thread Thomas Gleixner
On Tue, Aug 25 2020 at 15:03, Bjorn Helgaas wrote:
> On Fri, Aug 21, 2020 at 02:24:37AM +0200, Thomas Gleixner wrote:
>> Retrieve the PCI device from the msi descriptor instead of doing so at the
>> call sites.
>
> I'd like it *better* with "PCI/MSI: " in the subject (to match history

Duh, yes.

> and other patches in this series) and "MSI" here in the commit log,
> but nice cleanup and:
>> --- a/arch/x86/kernel/apic/msi.c
>> +++ b/arch/x86/kernel/apic/msi.c
>> @@ -232,7 +232,7 @@ EXPORT_SYMBOL_GPL(pci_msi_prepare);
>>  
>>  void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
>>  {
>> -arg->msi_hwirq = pci_msi_domain_calc_hwirq(arg->msi_dev, desc);
>> +arg->msi_hwirq = pci_msi_domain_calc_hwirq(desc);
>
> I guess it's safe to assume that "arg->msi_dev ==
> msi_desc_to_pci_dev(desc)"?  I didn't try to verify that.

It is.

>> +irq_hw_number_t pci_msi_domain_calc_hwirq(struct msi_desc *desc)
>>  {
>> +struct pci_dev *pdev = msi_desc_to_pci_dev(desc);
>
> If you named this "struct pci_dev *dev" (not "pdev"), the diff would
> be a little smaller and it would match other usage in the file.

Ok. I'm always happy to see pdev because that doesn't make me wonder
which type of dev it is :) But, yeah lets keep it consistent.

Thanks,

tglx
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch RFC 34/38] x86/msi: Let pci_msi_prepare() handle non-PCI MSI

2020-08-25 Thread Bjorn Helgaas
On Fri, Aug 21, 2020 at 02:24:58AM +0200, Thomas Gleixner wrote:
> Rename it to x86_msi_prepare() and handle the allocation type setup
> depending on the device type.

I see what you're doing, but the subject reads a little strangely
("pci_msi_prepare() handling non-PCI" stuff) since it doesn't mention
the rename.  Maybe not practical or worthwhile to split into a rename
+ make generic, I dunno.

> Add a new arch_msi_prepare define which will be utilized by the upcoming
> device MSI support. Define it to NULL if not provided by an architecture in
> the generic MSI header.
> 
> One arch specific function for MSI support is truly enough.
> 
> Signed-off-by: Thomas Gleixner 
> Cc: linux-...@vger.kernel.org
> Cc: linux-hyp...@vger.kernel.org
> ---
>  arch/x86/include/asm/msi.h  |4 +++-
>  arch/x86/kernel/apic/msi.c  |   27 ---
>  drivers/pci/controller/pci-hyperv.c |2 +-
>  include/linux/msi.h |4 
>  4 files changed, 28 insertions(+), 9 deletions(-)
> 
> --- a/arch/x86/include/asm/msi.h
> +++ b/arch/x86/include/asm/msi.h
> @@ -6,7 +6,9 @@
>  
>  typedef struct irq_alloc_info msi_alloc_info_t;
>  
> -int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
> +int x86_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
>   msi_alloc_info_t *arg);
>  
> +#define arch_msi_prepare x86_msi_prepare
> +
>  #endif /* _ASM_X86_MSI_H */
> --- a/arch/x86/kernel/apic/msi.c
> +++ b/arch/x86/kernel/apic/msi.c
> @@ -182,26 +182,39 @@ static struct irq_chip pci_msi_controlle
>   .flags  = IRQCHIP_SKIP_SET_WAKE,
>  };
>  
> -int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
> - msi_alloc_info_t *arg)
> +static void pci_msi_prepare(struct device *dev, msi_alloc_info_t *arg)
>  {
> - struct pci_dev *pdev = to_pci_dev(dev);
> - struct msi_desc *desc = first_pci_msi_entry(pdev);
> + struct msi_desc *desc = first_msi_entry(dev);
>  
> - init_irq_alloc_info(arg, NULL);
>   if (desc->msi_attrib.is_msix) {
>   arg->type = X86_IRQ_ALLOC_TYPE_PCI_MSIX;
>   } else {
>   arg->type = X86_IRQ_ALLOC_TYPE_PCI_MSI;
>   arg->flags |= X86_IRQ_ALLOC_CONTIGUOUS_VECTORS;
>   }
> +}
> +
> +static void dev_msi_prepare(struct device *dev, msi_alloc_info_t *arg)
> +{
> + arg->type = X86_IRQ_ALLOC_TYPE_DEV_MSI;
> +}
> +
> +int x86_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
> + msi_alloc_info_t *arg)
> +{
> + init_irq_alloc_info(arg, NULL);
> +
> + if (dev_is_pci(dev))
> + pci_msi_prepare(dev, arg);
> + else
> + dev_msi_prepare(dev, arg);
>  
>   return 0;
>  }
> -EXPORT_SYMBOL_GPL(pci_msi_prepare);
> +EXPORT_SYMBOL_GPL(x86_msi_prepare);
>  
>  static struct msi_domain_ops pci_msi_domain_ops = {
> - .msi_prepare= pci_msi_prepare,
> + .msi_prepare= x86_msi_prepare,
>  };
>  
>  static struct msi_domain_info pci_msi_domain_info = {
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -1532,7 +1532,7 @@ static struct irq_chip hv_msi_irq_chip =
>  };
>  
>  static struct msi_domain_ops hv_msi_ops = {
> - .msi_prepare= pci_msi_prepare,
> + .msi_prepare= arch_msi_prepare,
>   .msi_free   = hv_msi_free,
>  };
>  
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -430,4 +430,8 @@ static inline struct irq_domain *pci_msi
>  }
>  #endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
>  
> +#ifndef arch_msi_prepare
> +# define arch_msi_prepareNULL
> +#endif
> +
>  #endif /* LINUX_MSI_H */
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch RFC 17/38] x86/pci: Reducde #ifdeffery in PCI init code

2020-08-25 Thread Bjorn Helgaas
s/Reducde/Reduce/ (in subject)

On Fri, Aug 21, 2020 at 02:24:41AM +0200, Thomas Gleixner wrote:
> Adding a function call before the first #ifdef in arch_pci_init() triggers
> a 'mixed declarations and code' warning if PCI_DIRECT is enabled.
> 
> Use stub functions and move the #ifdeffery to the header file where it is
> not in the way.
> 
> Signed-off-by: Thomas Gleixner 
> Cc: linux-...@vger.kernel.org

Nice cleanup, thanks.  Glad to get rid of the useless initializer,
too.

Acked-by: Bjorn Helgaas 

> ---
>  arch/x86/include/asm/pci_x86.h |   11 +++
>  arch/x86/pci/init.c|   10 +++---
>  2 files changed, 14 insertions(+), 7 deletions(-)
> 
> --- a/arch/x86/include/asm/pci_x86.h
> +++ b/arch/x86/include/asm/pci_x86.h
> @@ -114,9 +114,20 @@ extern const struct pci_raw_ops pci_dire
>  extern bool port_cf9_safe;
>  
>  /* arch_initcall level */
> +#ifdef CONFIG_PCI_DIRECT
>  extern int pci_direct_probe(void);
>  extern void pci_direct_init(int type);
> +#else
> +static inline int pci_direct_probe(void) { return -1; }
> +static inline  void pci_direct_init(int type) { }
> +#endif
> +
> +#ifdef CONFIG_PCI_BIOS
>  extern void pci_pcbios_init(void);
> +#else
> +static inline void pci_pcbios_init(void) { }
> +#endif
> +
>  extern void __init dmi_check_pciprobe(void);
>  extern void __init dmi_check_skip_isa_align(void);
>  
> --- a/arch/x86/pci/init.c
> +++ b/arch/x86/pci/init.c
> @@ -8,11 +8,9 @@
> in the right sequence from here. */
>  static __init int pci_arch_init(void)
>  {
> -#ifdef CONFIG_PCI_DIRECT
> - int type = 0;
> + int type;
>  
>   type = pci_direct_probe();
> -#endif
>  
>   if (!(pci_probe & PCI_PROBE_NOEARLY))
>   pci_mmcfg_early_init();
> @@ -20,18 +18,16 @@ static __init int pci_arch_init(void)
>   if (x86_init.pci.arch_init && !x86_init.pci.arch_init())
>   return 0;
>  
> -#ifdef CONFIG_PCI_BIOS
>   pci_pcbios_init();
> -#endif
> +
>   /*
>* don't check for raw_pci_ops here because we want pcbios as last
>* fallback, yet it's needed to run first to set pcibios_last_bus
>* in case legacy PCI probing is used. otherwise detecting peer busses
>* fails.
>*/
> -#ifdef CONFIG_PCI_DIRECT
>   pci_direct_init(type);
> -#endif
> +
>   if (!raw_pci_ops && !raw_pci_ext_ops)
>   printk(KERN_ERR
>   "PCI: Fatal: No config space access function found\n");
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch RFC 21/38] PCI: MSI: Provide pci_dev_has_special_msi_domain() helper

2020-08-25 Thread Bjorn Helgaas
On Fri, Aug 21, 2020 at 02:24:45AM +0200, Thomas Gleixner wrote:
> Provide a helper function to check whether a PCI device is handled by a
> non-standard PCI/MSI domain. This will be used to exclude such devices
> which hang of a special bus, e.g. VMD, to be excluded from the irq domain
> override in irq remapping.
> 
> Signed-off-by: Thomas Gleixner 
> Cc: Bjorn Helgaas 
> Cc: linux-...@vger.kernel.org

Acked-by: Bjorn Helgaas 

s|PCI: MSI:|PCI/MSI:| in the subject if feasible.

> ---
>  drivers/pci/msi.c   |   22 ++
>  include/linux/msi.h |1 +
>  2 files changed, 23 insertions(+)
> 
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -1553,4 +1553,26 @@ struct irq_domain *pci_msi_get_device_do
>DOMAIN_BUS_PCI_MSI);
>   return dom;
>  }
> +
> +/**
> + * pci_dev_has_special_msi_domain - Check whether the device is handled by
> + *   a non-standard PCI-MSI domain
> + * @pdev:The PCI device to check.
> + *
> + * Returns: True if the device irqdomain or the bus irqdomain is
> + * non-standard PCI/MSI.
> + */
> +bool pci_dev_has_special_msi_domain(struct pci_dev *pdev)
> +{
> + struct irq_domain *dom = dev_get_msi_domain(>dev);
> +
> + if (!dom)
> + dom = dev_get_msi_domain(>bus->dev);
> +
> + if (!dom)
> + return true;
> +
> + return dom->bus_token != DOMAIN_BUS_PCI_MSI;
> +}
> +
>  #endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -374,6 +374,7 @@ int pci_msi_domain_check_cap(struct irq_
>struct msi_domain_info *info, struct device *dev);
>  u32 pci_msi_domain_get_msi_rid(struct irq_domain *domain, struct pci_dev 
> *pdev);
>  struct irq_domain *pci_msi_get_device_domain(struct pci_dev *pdev);
> +bool pci_dev_has_special_msi_domain(struct pci_dev *pdev);
>  #else
>  static inline struct irq_domain *pci_msi_get_device_domain(struct pci_dev 
> *pdev)
>  {
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch RFC 30/38] PCI/MSI: Allow to disable arch fallbacks

2020-08-25 Thread Bjorn Helgaas
On Fri, Aug 21, 2020 at 02:24:54AM +0200, Thomas Gleixner wrote:
> If an architecture does not require the MSI setup/teardown fallback
> functions, then allow them to be replaced by stub functions which emit a
> warning.
> 
> Signed-off-by: Thomas Gleixner 
> Cc: Bjorn Helgaas 
> Cc: linux-...@vger.kernel.org

Acked-by: Bjorn Helgaas 

Question/comment below.

> ---
>  drivers/pci/Kconfig |3 +++
>  drivers/pci/msi.c   |3 ++-
>  include/linux/msi.h |   31 ++-
>  3 files changed, 31 insertions(+), 6 deletions(-)
> 
> --- a/drivers/pci/Kconfig
> +++ b/drivers/pci/Kconfig
> @@ -56,6 +56,9 @@ config PCI_MSI_IRQ_DOMAIN
>   depends on PCI_MSI
>   select GENERIC_MSI_IRQ_DOMAIN
>  
> +config PCI_MSI_DISABLE_ARCH_FALLBACKS
> + bool
> +
>  config PCI_QUIRKS
>   default y
>   bool "Enable PCI quirk workarounds" if EXPERT
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -58,8 +58,8 @@ static void pci_msi_teardown_msi_irqs(st
>  #define pci_msi_teardown_msi_irqsarch_teardown_msi_irqs
>  #endif
>  
> +#ifndef CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS
>  /* Arch hooks */
> -
>  int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
>  {
>   struct msi_controller *chip = dev->bus->msi;
> @@ -132,6 +132,7 @@ void __weak arch_teardown_msi_irqs(struc
>  {
>   return default_teardown_msi_irqs(dev);
>  }
> +#endif /* !CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS */
>  
>  static void default_restore_msi_irq(struct pci_dev *dev, int irq)
>  {
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -193,17 +193,38 @@ void pci_msi_mask_irq(struct irq_data *d
>  void pci_msi_unmask_irq(struct irq_data *data);
>  
>  /*
> - * The arch hooks to setup up msi irqs. Those functions are
> - * implemented as weak symbols so that they /can/ be overriden by
> - * architecture specific code if needed.
> + * The arch hooks to setup up msi irqs. Default functions are implemented
> + * as weak symbols so that they /can/ be overriden by architecture specific
> + * code if needed.
> + *
> + * They can be replaced by stubs with warnings via
> + * CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS when the architecture fully
> + * utilizes direct irqdomain based setup.

Do you expect *all* arches to eventually use direct irqdomain setup?
And in that case, to remove the config option?

If not, it seems like it'd be nicer to have the burden on the arches
that need/want to use arch-specific code instead of on the arches that
do things generically.

>   */
> +#ifndef CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS
>  int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc);
>  void arch_teardown_msi_irq(unsigned int irq);
>  int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
>  void arch_teardown_msi_irqs(struct pci_dev *dev);
> -void arch_restore_msi_irqs(struct pci_dev *dev);
> -
>  void default_teardown_msi_irqs(struct pci_dev *dev);
> +#else
> +static inline int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int 
> type)
> +{
> + WARN_ON_ONCE(1);
> + return -ENODEV;
> +}
> +
> +static inline void arch_teardown_msi_irqs(struct pci_dev *dev)
> +{
> + WARN_ON_ONCE(1);
> +}
> +#endif
> +
> +/*
> + * The restore hooks are still available as they are useful even
> + * for fully irq domain based setups. Courtesy to XEN/X86.
> + */
> +void arch_restore_msi_irqs(struct pci_dev *dev);
>  void default_restore_msi_irqs(struct pci_dev *dev);
>  
>  struct msi_controller {
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch RFC 20/38] PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI

2020-08-25 Thread Bjorn Helgaas
On Fri, Aug 21, 2020 at 02:24:44AM +0200, Thomas Gleixner wrote:
> Devices on the VMD bus use their own MSI irq domain, but it is not
> distinguishable from regular PCI/MSI irq domains. This is required
> to exclude VMD devices from getting the irq domain pointer set by
> interrupt remapping.
> 
> Override the default bus token.
> 
> Signed-off-by: Thomas Gleixner 
> Cc: Bjorn Helgaas 
> Cc: Lorenzo Pieralisi 
> Cc: Jonathan Derrick 
> Cc: linux-...@vger.kernel.org

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/controller/vmd.c |6 ++
>  1 file changed, 6 insertions(+)
> 
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -579,6 +579,12 @@ static int vmd_enable_domain(struct vmd_
>   return -ENODEV;
>   }
>  
> + /*
> +  * Override the irq domain bus token so the domain can be distinguished
> +  * from a regular PCI/MSI domain.
> +  */
> + irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
> +
>   pci_add_resource(, >resources[0]);
>   pci_add_resource_offset(, >resources[1], offset[0]);
>   pci_add_resource_offset(, >resources[2], offset[1]);
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch RFC 13/38] PCI: MSI: Rework pci_msi_domain_calc_hwirq()

2020-08-25 Thread Bjorn Helgaas
On Fri, Aug 21, 2020 at 02:24:37AM +0200, Thomas Gleixner wrote:
> Retrieve the PCI device from the msi descriptor instead of doing so at the
> call sites.

I'd like it *better* with "PCI/MSI: " in the subject (to match history
and other patches in this series) and "MSI" here in the commit log,
but nice cleanup and:

Acked-by: Bjorn Helgaas 

Minor comments below.

> Signed-off-by: Thomas Gleixner 
> Cc: linux-...@vger.kernel.org
> ---
>  arch/x86/kernel/apic/msi.c |2 +-
>  drivers/pci/msi.c  |   13 ++---
>  include/linux/msi.h|3 +--
>  3 files changed, 8 insertions(+), 10 deletions(-)
> 
> --- a/arch/x86/kernel/apic/msi.c
> +++ b/arch/x86/kernel/apic/msi.c
> @@ -232,7 +232,7 @@ EXPORT_SYMBOL_GPL(pci_msi_prepare);
>  
>  void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
>  {
> - arg->msi_hwirq = pci_msi_domain_calc_hwirq(arg->msi_dev, desc);
> + arg->msi_hwirq = pci_msi_domain_calc_hwirq(desc);

I guess it's safe to assume that "arg->msi_dev ==
msi_desc_to_pci_dev(desc)"?  I didn't try to verify that.

>  }
>  EXPORT_SYMBOL_GPL(pci_msi_set_desc);
>  
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -1346,17 +1346,17 @@ void pci_msi_domain_write_msg(struct irq
>  
>  /**
>   * pci_msi_domain_calc_hwirq - Generate a unique ID for an MSI source
> - * @dev: Pointer to the PCI device
>   * @desc:Pointer to the MSI descriptor
>   *
>   * The ID number is only used within the irqdomain.
>   */
> -irq_hw_number_t pci_msi_domain_calc_hwirq(struct pci_dev *dev,
> -   struct msi_desc *desc)
> +irq_hw_number_t pci_msi_domain_calc_hwirq(struct msi_desc *desc)
>  {
> + struct pci_dev *pdev = msi_desc_to_pci_dev(desc);

If you named this "struct pci_dev *dev" (not "pdev"), the diff would
be a little smaller and it would match other usage in the file.

>   return (irq_hw_number_t)desc->msi_attrib.entry_nr |
> - pci_dev_id(dev) << 11 |
> - (pci_domain_nr(dev->bus) & 0x) << 27;
> + pci_dev_id(pdev) << 11 |
> + (pci_domain_nr(pdev->bus) & 0x) << 27;
>  }
>  
>  static inline bool pci_msi_desc_is_multi_msi(struct msi_desc *desc)
> @@ -1406,8 +1406,7 @@ static void pci_msi_domain_set_desc(msi_
>   struct msi_desc *desc)
>  {
>   arg->desc = desc;
> - arg->hwirq = pci_msi_domain_calc_hwirq(msi_desc_to_pci_dev(desc),
> -desc);
> + arg->hwirq = pci_msi_domain_calc_hwirq(desc);
>  }
>  #else
>  #define pci_msi_domain_set_desc  NULL
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -369,8 +369,7 @@ void pci_msi_domain_write_msg(struct irq
>  struct irq_domain *pci_msi_create_irq_domain(struct fwnode_handle *fwnode,
>struct msi_domain_info *info,
>struct irq_domain *parent);
> -irq_hw_number_t pci_msi_domain_calc_hwirq(struct pci_dev *dev,
> -   struct msi_desc *desc);
> +irq_hw_number_t pci_msi_domain_calc_hwirq(struct msi_desc *desc);
>  int pci_msi_domain_check_cap(struct irq_domain *domain,
>struct msi_domain_info *info, struct device *dev);
>  u32 pci_msi_domain_get_msi_rid(struct irq_domain *domain, struct pci_dev 
> *pdev);
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu: Add support to filter non-strict/lazy mode based on device names

2020-08-25 Thread Sai Prakash Ranjan

Hi,

On 2020-08-25 21:40, Doug Anderson wrote:

Hi,

On Tue, Aug 25, 2020 at 8:43 AM Sai Prakash Ranjan
 wrote:


Currently the non-strict or lazy mode of TLB invalidation can only be 
set

for all or no domains. This works well for development platforms where
setting to non-strict/lazy mode is fine for performance reasons but on
production devices, we need a more fine grained control to allow only
certain peripherals to support this mode where we can be sure that it 
is
safe. So add support to filter non-strict/lazy mode based on the 
device

names that are passed via cmdline parameter "iommu.nonstrict_device".

Example: 
iommu.nonstrict_device="7c4000.sdhci,a60.dwc3,6048000.etr"


I have an inherent dislike of jamming things like this onto the
command line.  IMHO the command line is the last resort for specifying
configuration and generally should be limited to some specialized
debug options and cases where the person running the kernel needs to
override a config that was set by the person (or company) compiling
the kernel.  Specifically, having a long/unwieldy command line makes
it harder to use for the case when an end user actually wants to use
it to override something.  It's also just another place to look for
config.



Good thing about command line parameters are that they are optional, 
they do
not specify any default behaviour (I mean they are not mandatory to be 
set
for the system to be functional), so I would like to view it as an 
optional
config. And this command line parameter (nonstrict_device) is strictly 
optional

with default being strict already set in the driver.

They can be passed from the bootloader via chosen node for DT platforms 
or choose
a new *bootconfig* as a way to pass the cmdline but finally it does boil 
down to

just another config.

I agree with general boolean or single value command line parameters 
being just
more messy which could just be Kconfigs instead but for multiple value 
parameters

like these do not fit in Kconfig.

As you might already know, command line also gives an advantage to the 
end user
to configure system without building kernel, for this specific command 
line its
very useful because the performance bump is quite noticeable when the 
iommu.strict
is off. Now for end user who would not be interested in building entire 
kernel(majority)
and just cares about good speeds or throughput can find this very 
beneficial.
I am not talking about one specific OS usecase here but more in general 
term.



The other problem is that this doesn't necessarily scale very well.
While it works OK for embedded cases it doesn't work terribly well for
distributions.  I know that in an out-of-band thread you indicated
that it doesn't break anything that's not already broken (AKA this
doesn't fix the distro case but it doesn't make it worse), it would be
better to come up with a more universal solution.



Is the universal solution here referring to fix all the command line 
parameters
in the kernel or this specific command line? Are we going to remove any 
more

addition to the cmdline ;)

So possible other solution is the *bootconfig* which is again just 
another place
to look for a config. So thing is that this universal solution would 
result in
just more new fancy ways of passing configs or adding such configs to 
the drivers
or subsystems in kernel which is pretty much similar to implementing 
policy in

kernel which I think is frowned upon and mentioned in the other thread.


Ideally it feels like we should figure out how to tag devices in a
generic manner automatically (hardcode at the driver or in the device
tree).  I think the out-of-band discussions talked about "external
facing" and the like.  We could also, perhaps, tag devices that have
"binary blob" firmware if we wanted.  Then we'd have a policy (set by
Kconfig, perhaps overridable via commandline) that indicated the
strictness level for the various classes of devices.  So policy would
be decided by KConfig and/or command line.



How is tagging in driver or device tree better than the simple command 
line
approach to pass the same list of devices which otherwise you would 
hardcode
in the corresponding drivers and device tree all over the kernel other 
than

the scalability part for command line? IMHO it is too much churn.

Device tree could be used but then we have a problem with it being for 
only

describing hardware and it doesn't work for ACPI based systems.

Command line approach works for all systems (both DT and ACPI) without 
having
to add too much churn to drivers. Lastly, I think we can have both 
options, it

doesn't hurt to add command line parameter since it is optional.

Thanks,
Sai
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v11 00/11] PCI: brcmstb: enable PCIe for STB chips

2020-08-25 Thread Florian Fainelli

Hi,

On 8/24/2020 12:30 PM, Jim Quinlan wrote:


Patchset Summary:
   Enhance a PCIe host controller driver.  Because of its unusual design
   we are foced to change dev->dma_pfn_offset into a more general role
   allowing multiple offsets.  See the 'v1' notes below for more info.


We are version 11 and counting, and it is not clear to me whether there 
is any chance of getting these patches reviewed and hopefully merged for 
the 5.10 merge window.


There are a lot of different files being touched, so what would be the 
ideal way of routing those changes towards inclusion?


Thanks!
--
Florian
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu: Add support to filter non-strict/lazy mode based on device names

2020-08-25 Thread Doug Anderson
Hi,

On Tue, Aug 25, 2020 at 8:43 AM Sai Prakash Ranjan
 wrote:
>
> Currently the non-strict or lazy mode of TLB invalidation can only be set
> for all or no domains. This works well for development platforms where
> setting to non-strict/lazy mode is fine for performance reasons but on
> production devices, we need a more fine grained control to allow only
> certain peripherals to support this mode where we can be sure that it is
> safe. So add support to filter non-strict/lazy mode based on the device
> names that are passed via cmdline parameter "iommu.nonstrict_device".
>
> Example: iommu.nonstrict_device="7c4000.sdhci,a60.dwc3,6048000.etr"

I have an inherent dislike of jamming things like this onto the
command line.  IMHO the command line is the last resort for specifying
configuration and generally should be limited to some specialized
debug options and cases where the person running the kernel needs to
override a config that was set by the person (or company) compiling
the kernel.  Specifically, having a long/unwieldy command line makes
it harder to use for the case when an end user actually wants to use
it to override something.  It's also just another place to look for
config.

The other problem is that this doesn't necessarily scale very well.
While it works OK for embedded cases it doesn't work terribly well for
distributions.  I know that in an out-of-band thread you indicated
that it doesn't break anything that's not already broken (AKA this
doesn't fix the distro case but it doesn't make it worse), it would be
better to come up with a more universal solution.

Ideally it feels like we should figure out how to tag devices in a
generic manner automatically (hardcode at the driver or in the device
tree).  I think the out-of-band discussions talked about "external
facing" and the like.  We could also, perhaps, tag devices that have
"binary blob" firmware if we wanted.  Then we'd have a policy (set by
Kconfig, perhaps overridable via commandline) that indicated the
strictness level for the various classes of devices.  So policy would
be decided by KConfig and/or command line.

-Doug
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu: Add support to filter non-strict/lazy mode based on device names

2020-08-25 Thread Sai Prakash Ranjan
Currently the non-strict or lazy mode of TLB invalidation can only be set
for all or no domains. This works well for development platforms where
setting to non-strict/lazy mode is fine for performance reasons but on
production devices, we need a more fine grained control to allow only
certain peripherals to support this mode where we can be sure that it is
safe. So add support to filter non-strict/lazy mode based on the device
names that are passed via cmdline parameter "iommu.nonstrict_device".

Example: iommu.nonstrict_device="7c4000.sdhci,a60.dwc3,6048000.etr"

Signed-off-by: Sai Prakash Ranjan 
---
 drivers/iommu/iommu.c | 37 +
 1 file changed, 33 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 609bd25bf154..fd10a073f557 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -32,6 +32,9 @@ static unsigned int iommu_def_domain_type __read_mostly;
 static bool iommu_dma_strict __read_mostly = true;
 static u32 iommu_cmd_line __read_mostly;
 
+#define DEVICE_NAME_LEN1024
+static char nonstrict_device[DEVICE_NAME_LEN] __read_mostly;
+
 struct iommu_group {
struct kobject kobj;
struct kobject *devices_kobj;
@@ -327,6 +330,32 @@ static int __init iommu_dma_setup(char *str)
 }
 early_param("iommu.strict", iommu_dma_setup);
 
+static int __init iommu_nonstrict_filter_setup(char *str)
+{
+   strlcpy(nonstrict_device, str, DEVICE_NAME_LEN);
+   return 1;
+}
+__setup("iommu.nonstrict_device=", iommu_nonstrict_filter_setup);
+
+static bool iommu_nonstrict_device(struct device *dev)
+{
+   char *filter, *device;
+
+   if (!dev)
+   return false;
+
+   filter = kstrdup(nonstrict_device, GFP_KERNEL);
+   if (!filter)
+   return false;
+
+   while ((device = strsep(, ","))) {
+   if (!strcmp(device, dev_name(dev)))
+   return true;
+   }
+
+   return false;
+}
+
 static ssize_t iommu_group_attr_show(struct kobject *kobj,
 struct attribute *__attr, char *buf)
 {
@@ -1470,7 +1499,7 @@ static int iommu_get_def_domain_type(struct device *dev)
 
 static int iommu_group_alloc_default_domain(struct bus_type *bus,
struct iommu_group *group,
-   unsigned int type)
+   unsigned int type, struct device 
*dev)
 {
struct iommu_domain *dom;
 
@@ -1489,7 +1518,7 @@ static int iommu_group_alloc_default_domain(struct 
bus_type *bus,
if (!group->domain)
group->domain = dom;
 
-   if (!iommu_dma_strict) {
+   if (!iommu_dma_strict || iommu_nonstrict_device(dev)) {
int attr = 1;
iommu_domain_set_attr(dom,
  DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
@@ -1509,7 +1538,7 @@ static int iommu_alloc_default_domain(struct iommu_group 
*group,
 
type = iommu_get_def_domain_type(dev);
 
-   return iommu_group_alloc_default_domain(dev->bus, group, type);
+   return iommu_group_alloc_default_domain(dev->bus, group, type, dev);
 }
 
 /**
@@ -1684,7 +1713,7 @@ static void probe_alloc_default_domain(struct bus_type 
*bus,
if (!gtype.type)
gtype.type = iommu_def_domain_type;
 
-   iommu_group_alloc_default_domain(bus, group, gtype.type);
+   iommu_group_alloc_default_domain(bus, group, gtype.type, NULL);
 
 }
 

base-commit: e46b3c0d011eab9933c183d5b47569db8e377281
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v11 07/11] device-mapping: Introduce DMA range map, supplanting dma_pfn_offset

2020-08-25 Thread Jim Quinlan via iommu
Hi Andy,


On Tue, Aug 25, 2020 at 5:54 AM Andy Shevchenko
 wrote:
>
> On Mon, Aug 24, 2020 at 03:30:20PM -0400, Jim Quinlan wrote:
> > The new field 'dma_range_map' in struct device is used to facilitate the
> > use of single or multiple offsets between mapping regions of cpu addrs and
> > dma addrs.  It subsumes the role of "dev->dma_pfn_offset" which was only
> > capable of holding a single uniform offset and had no region bounds
> > checking.
> >
> > The function of_dma_get_range() has been modified so that it takes a single
> > argument -- the device node -- and returns a map, NULL, or an error code.
> > The map is an array that holds the information regarding the DMA regions.
> > Each range entry contains the address offset, the cpu_start address, the
> > dma_start address, and the size of the region.
> >
> > of_dma_configure() is the typical manner to set range offsets but there are
> > a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel
> > driver code.  These cases now invoke the function
> > dma_attach_offset_range(dev, cpu_addr, dma_addr, size).
>
> ...
>
> > + /*
> > +  * Record all info in the generic DMA ranges array for struct device.
> > +  */
> > + *map = r;
> > + for_each_of_range(, ) {
> > + pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
> > +  range.bus_addr, range.cpu_addr, range.size);
> > + r->cpu_start = range.cpu_addr;
> > + r->dma_start = range.bus_addr;
> > + r->size = range.size;
>
> > + r->offset = (u64)range.cpu_addr - (u64)range.bus_addr;
>
> What's the point in explicit castings to the same type?
No point.  If I have to send out another version I will fix this.

>
> > + r++;
> > + }
>
> ...
>
> > + phys_addr_t paddr;
> > + dma_addr_t  dma_addr;
> > + struct device   dev_bogus;
>
> >   unittest(paddr == expect_paddr,
> > -  "of_dma_get_range wrong phys addr (%llx) on node 
> > %pOF", paddr, np);
> > +  "of_dma_get_range: wrong phys addr %llx (expecting 
> > %llx) on node %pOF\n",
> > +  (u64)paddr, expect_paddr, np);
>
> %llx -> %pap
This was intentional -- I'm aware of %pap and %pad.  The problem is
that %pa[pd]  print out a zero-filled 16 character number regardless
of what the number is.  For example, 1 is "0x0001",
whereas using %llx yields "1".

>
> >   unittest(dma_addr == expect_dma_addr,
> > -  "of_dma_get_range wrong DMA addr (%llx) on node 
> > %pOF", dma_addr, np);
> > +  "of_dma_get_range: wrong DMA addr %llx (expecting 
> > %llx) on node %pOF\n",
> > +  (u64)dma_addr, expect_dma_addr, np);
>
> %llx -> %pad
>
> ...
>
> > + if (mem->use_dev_dma_pfn_offset) {
> > + u64 base_addr = PFN_PHYS((u64)mem->pfn_base);
>
> Do we need explicit casting here?
I don't think it is needed.  However, the "(u64)" is useless though
since the macro recasts it to a phys_addr_t.

If there is another version of this submission I will change this.
>
> > +
> > + return base_addr - dma_offset_from_phys_addr(dev, base_addr);
> > + }
>
> ...
>
> > +int dma_set_offset_range(struct device *dev, phys_addr_t cpu_start,
> > +  dma_addr_t dma_start, u64 size)
> > +{
> > + struct bus_dma_region *map;
> > + u64 offset = (u64)cpu_start - (u64)dma_start;
> > +
> > + if (dev->dma_range_map) {
> > + dev_err(dev, "attempt to add DMA range to existing map\n");
> > + return -EINVAL;
> > + }
>
> Wouldn't be better to do an assignment of offset here?
IIRC this was what Christoph requested.  It has actually gone back and
forth over the versions of this submission.
>
> > + if (!offset)
> > + return 0;
> > +
> > + map = kcalloc(2, sizeof(*map), GFP_KERNEL);
> > + if (!map)
> > + return -ENOMEM;
> > + map[0].cpu_start = cpu_start;
> > + map[0].dma_start = dma_start;
> > + map[0].offset = offset;
> > + map[0].size = size;
> > + dev->dma_range_map = map;
> > +
> > + return 0;
> > +}
>
> --
> With Best Regards,
> Andy Shevchenko
Thanks again,
Jim
>
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/2] dt-bindings: iommu: renesas, ipmmu-vmsa: Add r8a7742 support

2020-08-25 Thread Lad Prabhakar
Document RZ/G1H (R8A7742) SoC bindings.

No driver change is needed due to the fallback compatible value
"renesas,ipmmu-vmsa".

Signed-off-by: Lad Prabhakar 
Reviewed-by: Chris Paterson 
---
 Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml 
b/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml
index 6bfa090fd73a..a79b172b75cb 100644
--- a/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml
+++ b/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml
@@ -20,6 +20,7 @@ properties:
   - items:
   - enum:
   - renesas,ipmmu-r8a73a4  # R-Mobile APE6
+  - renesas,ipmmu-r8a7742  # RZ/G1H
   - renesas,ipmmu-r8a7743  # RZ/G1M
   - renesas,ipmmu-r8a7744  # RZ/G1N
   - renesas,ipmmu-r8a7745  # RZ/G1E
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 2/2] ARM: dts: r8a7742: Add IPMMU DT nodes

2020-08-25 Thread Lad Prabhakar
Add the five IPMMU instances found in the r8a7742 to DT with a disabled
status.

Signed-off-by: Lad Prabhakar 
Reviewed-by: Chris Paterson 
---
 arch/arm/boot/dts/r8a7742.dtsi | 48 ++
 1 file changed, 48 insertions(+)

diff --git a/arch/arm/boot/dts/r8a7742.dtsi b/arch/arm/boot/dts/r8a7742.dtsi
index 0fc52b27ae64..c62e26876f95 100644
--- a/arch/arm/boot/dts/r8a7742.dtsi
+++ b/arch/arm/boot/dts/r8a7742.dtsi
@@ -412,6 +412,54 @@
#thermal-sensor-cells = <0>;
};
 
+   ipmmu_sy0: iommu@e628 {
+   compatible = "renesas,ipmmu-r8a7742",
+"renesas,ipmmu-vmsa";
+   reg = <0 0xe628 0 0x1000>;
+   interrupts = ,
+;
+   #iommu-cells = <1>;
+   status = "disabled";
+   };
+
+   ipmmu_sy1: iommu@e629 {
+   compatible = "renesas,ipmmu-r8a7742",
+"renesas,ipmmu-vmsa";
+   reg = <0 0xe629 0 0x1000>;
+   interrupts = ;
+   #iommu-cells = <1>;
+   status = "disabled";
+   };
+
+   ipmmu_ds: iommu@e674 {
+   compatible = "renesas,ipmmu-r8a7742",
+"renesas,ipmmu-vmsa";
+   reg = <0 0xe674 0 0x1000>;
+   interrupts = ,
+;
+   #iommu-cells = <1>;
+   status = "disabled";
+   };
+
+   ipmmu_mp: iommu@ec68 {
+   compatible = "renesas,ipmmu-r8a7742",
+"renesas,ipmmu-vmsa";
+   reg = <0 0xec68 0 0x1000>;
+   interrupts = ;
+   #iommu-cells = <1>;
+   status = "disabled";
+   };
+
+   ipmmu_mx: iommu@fe951000 {
+   compatible = "renesas,ipmmu-r8a7742",
+"renesas,ipmmu-vmsa";
+   reg = <0 0xfe951000 0 0x1000>;
+   interrupts = ,
+;
+   #iommu-cells = <1>;
+   status = "disabled";
+   };
+
icram0: sram@e63a {
compatible = "mmio-sram";
reg = <0 0xe63a 0 0x12000>;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 0/2] r8a7742 SoC add IPMMU support

2020-08-25 Thread Lad Prabhakar
Hi All,

This patch series adds IPMMU support to R8A7742 (RZ/G1H)
SoC dtsi.

Cheers,
Prabhakar

Lad Prabhakar (2):
  dt-bindings: iommu: renesas,ipmmu-vmsa: Add r8a7742 support
  ARM: dts: r8a7742: Add IPMMU DT nodes

 .../bindings/iommu/renesas,ipmmu-vmsa.yaml|  1 +
 arch/arm/boot/dts/r8a7742.dtsi| 48 +++
 2 files changed, 49 insertions(+)

-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: a saner API for allocating DMA addressable pages

2020-08-25 Thread Christoph Hellwig
On Tue, Aug 25, 2020 at 01:30:41PM +0200, Marek Szyprowski wrote:
> I really wonder what is the difference between this new API and 
> alloc_pages(GFP_DMA, n). Is this API really needed? I thought that this 
> is legacy thing to be removed one day...

The difference is that the pages returned are guranteed to be addressable
by the devie.  This is a very important difference that matters for
a lot of use cases.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: a saner API for allocating DMA addressable pages

2020-08-25 Thread Marek Szyprowski
Hi Christoph,

On 19.08.2020 08:55, Christoph Hellwig wrote:
> this series replaced the DMA_ATTR_NON_CONSISTENT flag to dma_alloc_attrs
> with a separate new dma_alloc_pages API, which is available on all
> platforms.  In addition to cleaning up the convoluted code path, this
> ensures that other drivers that have asked for better support for
> non-coherent DMA to pages with incurring bounce buffering over can finally
> be properly supported.
>
> I'm still a little unsure about the API naming, as alloc_pages sort of
> implies a struct page return value, but we return a kernel virtual
> address.  The other alternative would be to name the API
> dma_alloc_noncoherent, but the whole non-coherent naming seems to put
> people off.  As a follow up I plan to move the implementation of the
> DMA_ATTR_NO_KERNEL_MAPPING flag over to this framework as well, given
> that is also is a fundamentally non coherent allocation.  The replacement
> for that flag would then return a struct page, as it is allowed to
> actually return pages without a kernel mapping as the name suggested
> (although most of the time they will actually have a kernel mapping..)
>
> In addition to the conversions of the existing non-coherent DMA users
> the last three patches also convert the DMA coherent allocations in
> the NVMe driver to use this new framework through a dmapool addition.
> This was both to give me a good testing vehicle, but also because it
> should speed up the NVMe driver on platforms with non-coherent DMA
> nicely, without a downside on platforms with cache coherent DMA.

I really wonder what is the difference between this new API and 
alloc_pages(GFP_DMA, n). Is this API really needed? I thought that this 
is legacy thing to be removed one day...

Maybe it would make more sense to convert the few remaining drivers to 
regular dma_map_page()/dma_sync_*()/dma_unmap_page() or have I missed 
something?

Best regards
-- 
Marek Szyprowski, PhD
Samsung R Institute Poland

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 6/9] iommu/ioasid: Introduce notification APIs

2020-08-25 Thread Jean-Philippe Brucker
On Fri, Aug 21, 2020 at 09:35:15PM -0700, Jacob Pan wrote:
> Relations among IOASID users largely follow a publisher-subscriber
> pattern. E.g. to support guest SVA on Intel Scalable I/O Virtualization
> (SIOV) enabled platforms, VFIO, IOMMU, device drivers, KVM are all users
> of IOASIDs. When a state change occurs, VFIO publishes the change event
> that needs to be processed by other users/subscribers.
> 
> This patch introduced two types of notifications: global and per
> ioasid_set. The latter is intended for users who only needs to handle
> events related to the IOASID of a given set.
> For more information, refer to the kernel documentation at
> Documentation/ioasid.rst.
> 
> Signed-off-by: Liu Yi L 
> Signed-off-by: Wu Hao 
> Signed-off-by: Jacob Pan 
> ---
>  drivers/iommu/ioasid.c | 280 
> -
>  include/linux/ioasid.h |  70 +
>  2 files changed, 348 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
> index c0aef38a4fde..6ddc09a7fe74 100644
> --- a/drivers/iommu/ioasid.c
> +++ b/drivers/iommu/ioasid.c
> @@ -9,8 +9,35 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  static DEFINE_XARRAY_ALLOC(ioasid_sets);
> +/*
> + * An IOASID could have multiple consumers where each consumeer may have

consumer

> + * hardware contexts associated with IOASIDs.
> + * When a status change occurs, such as IOASID is being freed, notifier 
> chains
> + * are used to keep the consumers in sync.
> + * This is a publisher-subscriber pattern where publisher can change the
> + * state of each IOASID, e.g. alloc/free, bind IOASID to a device and mm.
> + * On the other hand, subscribers gets notified for the state change and
> + * keep local states in sync.
> + *
> + * Currently, the notifier is global. A further optimization could be per
> + * IOASID set notifier chain.

The patch adds both

> + */
> +static ATOMIC_NOTIFIER_HEAD(ioasid_chain);

"ioasid_notifier" may be clearer

> +
> +/* List to hold pending notification block registrations */
> +static LIST_HEAD(ioasid_nb_pending_list);
> +static DEFINE_SPINLOCK(ioasid_nb_lock);
> +struct ioasid_set_nb {
> + struct list_headlist;
> + struct notifier_block   *nb;
> + void*token;
> + struct ioasid_set   *set;
> + boolactive;
> +};
> +
>  enum ioasid_state {
>   IOASID_STATE_INACTIVE,
>   IOASID_STATE_ACTIVE,
> @@ -394,6 +421,7 @@ EXPORT_SYMBOL_GPL(ioasid_find_by_spid);
>  ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
> void *private)
>  {
> + struct ioasid_nb_args args;
>   struct ioasid_data *data;
>   void *adata;
>   ioasid_t id = INVALID_IOASID;
> @@ -445,8 +473,14 @@ ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t 
> min, ioasid_t max,
>   goto exit_free;
>   }
>   set->nr_ioasids++;
> - goto done_unlock;
> + args.id = id;
> + /* Set private ID is not attached during allocation */
> + args.spid = INVALID_IOASID;
> + args.set = set;

args.pdata is uninitialized

> + atomic_notifier_call_chain(>nh, IOASID_ALLOC, );

No global notification?

>  
> + spin_unlock(_allocator_lock);
> + return id;
>  exit_free:
>   kfree(data);
>  done_unlock:
> @@ -479,6 +513,7 @@ static void ioasid_do_free(struct ioasid_data *data)
>  
>  static void ioasid_free_locked(struct ioasid_set *set, ioasid_t ioasid)
>  {
> + struct ioasid_nb_args args;
>   struct ioasid_data *data;
>  
>   data = xa_load(_allocator->xa, ioasid);
> @@ -491,7 +526,16 @@ static void ioasid_free_locked(struct ioasid_set *set, 
> ioasid_t ioasid)
>   pr_warn("Cannot free IOASID %u due to set ownership\n", ioasid);
>   return;
>   }
> +
>   data->state = IOASID_STATE_FREE_PENDING;
> + /* Notify all users that this IOASID is being freed */
> + args.id = ioasid;
> + args.spid = data->spid;
> + args.pdata = data->private;
> + args.set = data->set;
> + atomic_notifier_call_chain(_chain, IOASID_FREE, );
> + /* Notify the ioasid_set for per set users */
> + atomic_notifier_call_chain(>nh, IOASID_FREE, );
>  
>   if (!refcount_dec_and_test(>users))
>   return;
> @@ -514,6 +558,28 @@ void ioasid_free(struct ioasid_set *set, ioasid_t ioasid)
>  }
>  EXPORT_SYMBOL_GPL(ioasid_free);
>  
> +static void ioasid_add_pending_nb(struct ioasid_set *set)
> +{
> + struct ioasid_set_nb *curr;
> +
> + if (set->type != IOASID_SET_TYPE_MM)
> + return;
> +
> + /*
> +  * Check if there are any pending nb requests for the given token, if so
> +  * add them to the notifier chain.
> +  */
> + spin_lock(_nb_lock);
> + list_for_each_entry(curr, _nb_pending_list, list) {
> + if (curr->token == set->token && !curr->active) {
> + atomic_notifier_chain_register(>nh, 

Re: [PATCH v2 5/9] iommu/ioasid: Introduce ioasid_set private ID

2020-08-25 Thread Jean-Philippe Brucker
On Fri, Aug 21, 2020 at 09:35:14PM -0700, Jacob Pan wrote:
> When an IOASID set is used for guest SVA, each VM will acquire its
> ioasid_set for IOASID allocations. IOASIDs within the VM must have a
> host/physical IOASID backing, mapping between guest and host IOASIDs can
> be non-identical. IOASID set private ID (SPID) is introduced in this
> patch to be used as guest IOASID. However, the concept of ioasid_set
> specific namespace is generic, thus named SPID.
> 
> As SPID namespace is within the IOASID set, the IOASID core can provide
> lookup services at both directions. SPIDs may not be allocated when its
> IOASID is allocated, the mapping between SPID and IOASID is usually
> established when a guest page table is bound to a host PASID.
> 
> Signed-off-by: Jacob Pan 
> ---
>  drivers/iommu/ioasid.c | 54 
> ++
>  include/linux/ioasid.h | 12 +++
>  2 files changed, 66 insertions(+)
> 
> diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
> index 5f31d63c75b1..c0aef38a4fde 100644
> --- a/drivers/iommu/ioasid.c
> +++ b/drivers/iommu/ioasid.c
> @@ -21,6 +21,7 @@ enum ioasid_state {
>   * struct ioasid_data - Meta data about ioasid
>   *
>   * @id:  Unique ID
> + * @spid:Private ID unique within a set
>   * @usersNumber of active users
>   * @stateTrack state of the IOASID
>   * @set  Meta data of the set this IOASID belongs to
> @@ -29,6 +30,7 @@ enum ioasid_state {
>   */
>  struct ioasid_data {
>   ioasid_t id;
> + ioasid_t spid;
>   struct ioasid_set *set;
>   refcount_t users;
>   enum ioasid_state state;
> @@ -326,6 +328,58 @@ int ioasid_attach_data(ioasid_t ioasid, void *data)
>  EXPORT_SYMBOL_GPL(ioasid_attach_data);
>  
>  /**
> + * ioasid_attach_spid - Attach ioasid_set private ID to an IOASID
> + *
> + * @ioasid: the ID to attach
> + * @spid:   the ioasid_set private ID of @ioasid
> + *
> + * For IOASID that is already allocated, private ID within the set can be
> + * attached via this API. Future lookup can be done via ioasid_find.

via ioasid_find_by_spid()?

> + */
> +int ioasid_attach_spid(ioasid_t ioasid, ioasid_t spid)
> +{
> + struct ioasid_data *ioasid_data;
> + int ret = 0;
> +
> + spin_lock(_allocator_lock);
> + ioasid_data = xa_load(_allocator->xa, ioasid);
> +
> + if (!ioasid_data) {
> + pr_err("No IOASID entry %d to attach SPID %d\n",
> + ioasid, spid);
> + ret = -ENOENT;
> + goto done_unlock;
> + }
> + ioasid_data->spid = spid;
> +
> +done_unlock:
> + spin_unlock(_allocator_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(ioasid_attach_spid);
> +
> +ioasid_t ioasid_find_by_spid(struct ioasid_set *set, ioasid_t spid)

Maybe add a bit of documentation as this is public-facing.

> +{
> + struct ioasid_data *entry;
> + unsigned long index;
> +
> + if (!xa_load(_sets, set->sid)) {
> + pr_warn("Invalid set\n");
> + return INVALID_IOASID;
> + }
> +
> + xa_for_each(>xa, index, entry) {
> + if (spid == entry->spid) {
> + pr_debug("Found ioasid %lu by spid %u\n", index, spid);
> + refcount_inc(>users);

Nothing prevents ioasid_free() from concurrently dropping the refcount to
zero and calling ioasid_do_free(). The caller will later call ioasid_put()
on a stale/reallocated index.

> + return index;
> + }
> + }
> + return INVALID_IOASID;
> +}
> +EXPORT_SYMBOL_GPL(ioasid_find_by_spid);
> +
> +/**
>   * ioasid_alloc - Allocate an IOASID
>   * @set: the IOASID set
>   * @min: the minimum ID (inclusive)
> diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h
> index 310abe4187a3..d4b3e83672f6 100644
> --- a/include/linux/ioasid.h
> +++ b/include/linux/ioasid.h
> @@ -73,6 +73,8 @@ bool ioasid_is_active(ioasid_t ioasid);
>  
>  void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid, bool 
> (*getter)(void *));
>  int ioasid_attach_data(ioasid_t ioasid, void *data);
> +int ioasid_attach_spid(ioasid_t ioasid, ioasid_t spid);
> +ioasid_t ioasid_find_by_spid(struct ioasid_set *set, ioasid_t spid);
>  int ioasid_register_allocator(struct ioasid_allocator_ops *allocator);
>  void ioasid_unregister_allocator(struct ioasid_allocator_ops *allocator);
>  void ioasid_is_in_set(struct ioasid_set *set, ioasid_t ioasid);
> @@ -136,5 +138,15 @@ static inline int ioasid_attach_data(ioasid_t ioasid, 
> void *data)
>   return -ENOTSUPP;
>  }
>  
> +staic inline int ioasid_attach_spid(ioasid_t ioasid, ioasid_t spid)
> +{
> + return -ENOTSUPP;
> +}
> +
> +static inline ioasid_t ioasid_find_by_spid(struct ioasid_set *set, ioasid_t 
> spid)
> +{
> + return -ENOTSUPP;

INVALID_IOASID

Thanks,
Jean

> +}
> +
>  #endif /* CONFIG_IOASID */
>  #endif /* __LINUX_IOASID_H */
> -- 
> 2.7.4
> 
___
iommu mailing list

Re: [PATCH v2 4/9] iommu/ioasid: Add reference couting functions

2020-08-25 Thread Jean-Philippe Brucker
On Mon, Aug 24, 2020 at 10:26:55AM +0800, Lu Baolu wrote:
> Hi Jacob,
> 
> On 8/22/20 12:35 PM, Jacob Pan wrote:
> > There can be multiple users of an IOASID, each user could have hardware
> > contexts associated with the IOASID. In order to align lifecycles,
> > reference counting is introduced in this patch. It is expected that when
> > an IOASID is being freed, each user will drop a reference only after its
> > context is cleared.
> > 
> > Signed-off-by: Jacob Pan 
[...]
> > +/**
> >* ioasid_find - Find IOASID data
> >* @set: the IOASID set
> >* @ioasid: the IOASID to find
> 
> Do you need to increase the refcount of the found ioasid and ask the
> caller to drop it after use? Otherwise, the ioasid might be freed
> elsewhere.

ioasid_find() takes a getter function as parameter, which ensures that the
returned data is valid. It fetches the IOASID data under rcu_read_lock()
and calls the getter on the private data (for example mmget_not_zero() for
bare-metal SVA). Given that, I don't think returning with a reference to
the IOASID is necessary. The IOASID may be freed once ioasid_find()
returns but not the returned data.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 4/9] iommu/ioasid: Add reference couting functions

2020-08-25 Thread Jean-Philippe Brucker
On Fri, Aug 21, 2020 at 09:35:13PM -0700, Jacob Pan wrote:
> There can be multiple users of an IOASID, each user could have hardware
> contexts associated with the IOASID. In order to align lifecycles,
> reference counting is introduced in this patch. It is expected that when
> an IOASID is being freed, each user will drop a reference only after its
> context is cleared.
> 
> Signed-off-by: Jacob Pan 
> ---
>  drivers/iommu/ioasid.c | 113 
> +
>  include/linux/ioasid.h |   4 ++
>  2 files changed, 117 insertions(+)
> 
> diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
> index f73b3dbfc37a..5f31d63c75b1 100644
> --- a/drivers/iommu/ioasid.c
> +++ b/drivers/iommu/ioasid.c
> @@ -717,6 +717,119 @@ int ioasid_set_for_each_ioasid(struct ioasid_set *set,
>  EXPORT_SYMBOL_GPL(ioasid_set_for_each_ioasid);
>  
>  /**
> + * IOASID refcounting rules
> + * - ioasid_alloc() set initial refcount to 1
> + *
> + * - ioasid_free() decrement and test refcount.
> + * If refcount is 0, ioasid will be freed. Deleted from the system-wide
> + * xarray as well as per set xarray. The IOASID will be returned to the
> + * pool and available for new allocations.
> + *
> + * If recount is non-zero, mark IOASID as IOASID_STATE_FREE_PENDING.
> + * No new reference can be added. The IOASID is not returned to the pool
> + * for reuse.
> + * After free, ioasid_get() will return error but ioasid_find() and other
> + * non refcount adding APIs will continue to work until the last 
> reference
> + * is dropped
> + *
> + * - ioasid_get() get a reference on an active IOASID
> + *
> + * - ioasid_put() decrement and test refcount of the IOASID.
> + * If refcount is 0, ioasid will be freed. Deleted from the system-wide
> + * xarray as well as per set xarray. The IOASID will be returned to the
> + * pool and available for new allocations.
> + * Do nothing if refcount is non-zero.
> + *
> + * - ioasid_find() does not take reference, caller must hold reference
> + *
> + * ioasid_free() can be called multiple times without error until all refs 
> are
> + * dropped.
> + */

Since you already document this in ioasid.rst, I'm not sure the comment
is necessary. Maybe the doc for _free/_put would be better in the
function's documentation.

> +
> +int ioasid_get_locked(struct ioasid_set *set, ioasid_t ioasid)
> +{
> + struct ioasid_data *data;
> +
> + data = xa_load(_allocator->xa, ioasid);
> + if (!data) {
> + pr_err("Trying to get unknown IOASID %u\n", ioasid);
> + return -EINVAL;
> + }
> + if (data->state == IOASID_STATE_FREE_PENDING) {
> + pr_err("Trying to get IOASID being freed%u\n", ioasid);
> + return -EBUSY;
> + }
> +
> + if (set && data->set != set) {
> + pr_err("Trying to get IOASID not in set%u\n", ioasid);
> + /* data found but does not belong to the set */
> + return -EACCES;
> + }
> + refcount_inc(>users);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(ioasid_get_locked);

Is it necessary to export the *_locked variant?  Who'd call them and how
would they acquire the lock?

> +
> +/**
> + * ioasid_get - Obtain a reference of an ioasid
> + * @set
> + * @ioasid

Can be dropped. The doc checker will throw a warning, though.

> + *
> + * Check set ownership if @set is non-null.
> + */
> +int ioasid_get(struct ioasid_set *set, ioasid_t ioasid)
> +{
> + int ret = 0;

No need to initialize ret

> +
> + spin_lock(_allocator_lock);
> + ret = ioasid_get_locked(set, ioasid);
> + spin_unlock(_allocator_lock);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(ioasid_get);
> +
> +void ioasid_put_locked(struct ioasid_set *set, ioasid_t ioasid)
> +{
> + struct ioasid_data *data;
> +
> + data = xa_load(_allocator->xa, ioasid);
> + if (!data) {
> + pr_err("Trying to put unknown IOASID %u\n", ioasid);
> + return;
> + }
> +
> + if (set && data->set != set) {
> + pr_err("Trying to drop IOASID not in the set %u\n", ioasid);
> + return;
> + }
> +
> + if (!refcount_dec_and_test(>users)) {
> + pr_debug("%s: IOASID %d has %d remainning users\n",
> + __func__, ioasid, refcount_read(>users));
> + return;
> + }
> + ioasid_do_free(data);
> +}
> +EXPORT_SYMBOL_GPL(ioasid_put_locked);
> +
> +/**
> + * ioasid_put - Drop a reference of an ioasid
> + * @set
> + * @ioasid
> + *
> + * Check set ownership if @set is non-null.
> + */
> +void ioasid_put(struct ioasid_set *set, ioasid_t ioasid)
> +{
> + spin_lock(_allocator_lock);
> + ioasid_put_locked(set, ioasid);
> + spin_unlock(_allocator_lock);
> +}
> +EXPORT_SYMBOL_GPL(ioasid_put);
> +
> +/**
>   * ioasid_find - Find IOASID data
>   * @set: the IOASID set
>   * @ioasid: the IOASID to find
> diff --git a/include/linux/ioasid.h 

Re: [PATCH v11 07/11] device-mapping: Introduce DMA range map, supplanting dma_pfn_offset

2020-08-25 Thread Andy Shevchenko
On Mon, Aug 24, 2020 at 03:30:20PM -0400, Jim Quinlan wrote:
> The new field 'dma_range_map' in struct device is used to facilitate the
> use of single or multiple offsets between mapping regions of cpu addrs and
> dma addrs.  It subsumes the role of "dev->dma_pfn_offset" which was only
> capable of holding a single uniform offset and had no region bounds
> checking.
> 
> The function of_dma_get_range() has been modified so that it takes a single
> argument -- the device node -- and returns a map, NULL, or an error code.
> The map is an array that holds the information regarding the DMA regions.
> Each range entry contains the address offset, the cpu_start address, the
> dma_start address, and the size of the region.
> 
> of_dma_configure() is the typical manner to set range offsets but there are
> a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel
> driver code.  These cases now invoke the function
> dma_attach_offset_range(dev, cpu_addr, dma_addr, size).

...

> + /*
> +  * Record all info in the generic DMA ranges array for struct device.
> +  */
> + *map = r;
> + for_each_of_range(, ) {
> + pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
> +  range.bus_addr, range.cpu_addr, range.size);
> + r->cpu_start = range.cpu_addr;
> + r->dma_start = range.bus_addr;
> + r->size = range.size;

> + r->offset = (u64)range.cpu_addr - (u64)range.bus_addr;

What's the point in explicit castings to the same type?

> + r++;
> + }

...

> + phys_addr_t paddr;
> + dma_addr_t  dma_addr;
> + struct device   dev_bogus;

>   unittest(paddr == expect_paddr,
> -  "of_dma_get_range wrong phys addr (%llx) on node 
> %pOF", paddr, np);
> +  "of_dma_get_range: wrong phys addr %llx (expecting 
> %llx) on node %pOF\n",
> +  (u64)paddr, expect_paddr, np);

%llx -> %pap

>   unittest(dma_addr == expect_dma_addr,
> -  "of_dma_get_range wrong DMA addr (%llx) on node %pOF", 
> dma_addr, np);
> +  "of_dma_get_range: wrong DMA addr %llx (expecting 
> %llx) on node %pOF\n",
> +  (u64)dma_addr, expect_dma_addr, np);

%llx -> %pad

...

> + if (mem->use_dev_dma_pfn_offset) {
> + u64 base_addr = PFN_PHYS((u64)mem->pfn_base);

Do we need explicit casting here?

> +
> + return base_addr - dma_offset_from_phys_addr(dev, base_addr);
> + }

...

> +int dma_set_offset_range(struct device *dev, phys_addr_t cpu_start,
> +  dma_addr_t dma_start, u64 size)
> +{
> + struct bus_dma_region *map;
> + u64 offset = (u64)cpu_start - (u64)dma_start;
> +
> + if (dev->dma_range_map) {
> + dev_err(dev, "attempt to add DMA range to existing map\n");
> + return -EINVAL;
> + }

Wouldn't be better to do an assignment of offset here?

> + if (!offset)
> + return 0;
> +
> + map = kcalloc(2, sizeof(*map), GFP_KERNEL);
> + if (!map)
> + return -ENOMEM;
> + map[0].cpu_start = cpu_start;
> + map[0].dma_start = dma_start;
> + map[0].offset = offset;
> + map[0].size = size;
> + dev->dma_range_map = map;
> +
> + return 0;
> +}

-- 
With Best Regards,
Andy Shevchenko


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch RFC 24/38] x86/xen: Consolidate XEN-MSI init

2020-08-25 Thread Thomas Gleixner
On Tue, Aug 25 2020 at 06:21, Jürgen Groß wrote:
> On 24.08.20 23:21, Thomas Gleixner wrote:
>> I still think it does the right thing depending on the place it is
>> called from, but even if so, it's completely unreadable gunk. I'll fix
>> that proper.
>
> The main issue is that xen_initial_domain() and xen_pv_domain() are
> orthogonal to each other. So xen_initial_domain() can either be true
> for xen_pv_domain() (the "classic" pv dom0) or for xen_hvm_domain()
> (the new PVH dom0).

Fair enough. My limited XENology striked again.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [patch RFC 26/38] x86/xen: Wrap XEN MSI management into irqdomain

2020-08-25 Thread Thomas Gleixner
On Mon, Aug 24 2020 at 08:21, Jürgen Groß wrote:
> On 21.08.20 02:24, Thomas Gleixner wrote:
>
> Looking into https://www.kernel.org/doc/Documentation/IRQ-domain.txt (is
> this still valid?)

It's halfways correct. Emphasis on halfways.

> I believe Xen should be able to use the "No Map" approach, as Xen only
> ever uses software IRQs (at least those are the only ones visible to
> any driver). The (virtualized) hardware interrupts are Xen events
> after all.
>
> So maybe morphing Xen into supporting irqdomains in a sane way isn't
> that complicated. Maybe I'm missing the main complexities, though.

The wrapper domain I did is pretty much that, but with the extra
functionality required by hierarchical irq domains. So, yes it's
functionally correct, but it's only utilizing the alloc/free interface
and not any of the other mechanisms provided by irqdomains. The latter
should make the overall code simpler but that obviously needs some
thought.

Thanks,

tglx



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 2/2] iommu/vt-d:Add support for probing ACPI device in RMRR

2020-08-25 Thread FelixCuioc
In the probe_acpi_namespace_devices function,when the physical
node of the acpi device is NULL,the unlock function is missing.
Add mutex_unlock(>physical_node_lock).

Reported-by: Dan Carpenter 
Signed-off-by: FelixCuioc 
---
 drivers/iommu/intel/iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index b31f02f41c96..25e9853cba1b 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4851,6 +4851,7 @@ static int __init probe_acpi_namespace_devices(void)
if (pn_dev == NULL) {
dev->bus->iommu_ops = _iommu_ops;
ret = iommu_probe_device(dev);
+   mutex_unlock(>physical_node_lock);
if (ret) {
pr_err("acpi_device probe fail! 
ret:%d\n", ret);
return ret;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] MAINTAINERS: update QUALCOMM IOMMU after Arm SSMU drivers move

2020-08-25 Thread Lukas Bulwahn



On Fri, 21 Aug 2020, Will Deacon wrote:

> On Sun, Aug 02, 2020 at 08:53:20AM +0200, Lukas Bulwahn wrote:
> > Commit e86d1aa8b60f ("iommu/arm-smmu: Move Arm SMMU drivers into their own
> > subdirectory") moved drivers/iommu/qcom_iommu.c to
> > drivers/iommu/arm/arm-smmu/qcom_iommu.c amongst other moves, adjusted some
> > sections in MAINTAINERS, but missed adjusting the QUALCOMM IOMMU section.
> > 
> > Hence, ./scripts/get_maintainer.pl --self-test=patterns complains:
> > 
> >   warning: no file matchesF:drivers/iommu/qcom_iommu.c
> > 
> > Update the file entry in MAINTAINERS to the new location.
> > 
> > Signed-off-by: Lukas Bulwahn 
> > ---
> > Will, please ack.
> 
> Typo in subject: s/SSMU/SMMU/
> 
> With that:
> 
> Acked-by: Will Deacon 
> 
> > Joerg, please pick this minor non-urgent patch for your -next branch.
> 
> Joerg -- can you queue this as a fix for 5.9-rc, please?
>
Will, Joerg, I addressed the typo.

Ignore PATCH v1; and take v2 instead:

https://lore.kernel.org/lkml/20200825053828.4166-1-lukas.bulw...@gmail.com/

Thanks,

Lukas
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu