Re: [trivial PATCH] treewide: Convert switch/case fallthrough; to break;

2020-09-10 Thread Joe Perches
On Thu, 2020-09-10 at 15:21 +0100, Robin Murphy wrote:
> On 2020-09-09 21:06, Joe Perches wrote:
> > A fallthrough; to a separate case/default label whose only statement is break; isn't very readable.
> > 
> > Convert pseudo-keyword fallthrough; statements to a simple break; when
> > the next label is case or default and the only statement in the next
> > label block is break;
> > 
> > Found using:
> > 
> > $ grep-2.5.4 -rP --include=*.[ch] -n "fallthrough;(\s*(case\s+\w+|default)\s*:\s*){1,7}break;" *
> > 
> > Miscellanea:
> > 
> > o Move or coalesce a couple label blocks above a default: block.
> > 
> > Signed-off-by: Joe Perches 
> > ---
> > 
> > Compiled allyesconfig x86-64 only.
> > A few files for other arches were not compiled.
> > 
> 
> [...]
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> > b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index c192544e874b..743db1abec40 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -3777,7 +3777,7 @@ static int arm_smmu_device_hw_probe(struct 
> > arm_smmu_device *smmu)
> > switch (FIELD_GET(IDR0_TTF, reg)) {
> > case IDR0_TTF_AARCH32_64:
> > smmu->ias = 40;
> > -   fallthrough;
> > +   break;
> > case IDR0_TTF_AARCH64:
> > break;
> > default:
> 
> I have to say I don't really agree with the readability argument for 
> this one - a fallthrough is semantically correct here, since the first 
> case is a superset of the second. It just happens that anything we would 
> do for the common subset is implicitly assumed (there are other 
> potential cases we simply haven't added support for at the moment), thus 
> the second case is currently empty.
> This change actively obfuscates that distinction.

Then perhaps comments should be added to usefully
describe the mechanisms.

	case IDR0_TTF_AARCH32_64:
		smmu->ias = 40;
		fallthrough;	/* and still do the 64 bit processing */
	case IDR0_TTF_AARCH64:
		/* Nothing specific yet */
		break;

> Robin.



Re: [PATCH] iommu/amd: Fix potential @entry null deref

2020-09-10 Thread Suravee Suthikulpanit

Thanks for catching this.

On 9/11/20 12:16 AM, Joao Martins wrote:

After commit 26e495f34107 ("iommu/amd: Restore IRTE.RemapEn bit after
programming IRTE"), smatch warns:

drivers/iommu/amd/iommu.c:3870 amd_iommu_deactivate_guest_mode()
 warn: variable dereferenced before check 'entry' (see line 3867)

Fix this by moving the @valid assignment to after @entry has been checked
for NULL.

Cc: Suravee Suthikulpanit 
Fixes: 26e495f34107 ("iommu/amd: Restore IRTE.RemapEn bit after programming IRTE")
Reported-by: Dan Carpenter 
Signed-off-by: Joao Martins 
---
  drivers/iommu/amd/iommu.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 07ae8b93887e..8abe1c7ad45b 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3864,12 +3864,14 @@ int amd_iommu_deactivate_guest_mode(void *data)
 	struct amd_ir_data *ir_data = (struct amd_ir_data *)data;
 	struct irte_ga *entry = (struct irte_ga *) ir_data->entry;
 	struct irq_cfg *cfg = ir_data->cfg;
-	u64 valid = entry->lo.fields_remap.valid;
+	u64 valid;
 
 	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) ||
 	    !entry || !entry->lo.fields_vapic.guest_mode)
 		return 0;
 
+	valid = entry->lo.fields_remap.valid;
+
 	entry->lo.val = 0;
 	entry->hi.val = 0;
  



Reviewed-by: Suravee Suthikulpanit 

Suravee


Re: [PATCH v3 0/8] iommu/arm-smmu: Support maintaining bootloader mappings

2020-09-10 Thread John Stultz
On Fri, Sep 4, 2020 at 8:56 AM Bjorn Andersson
 wrote:
>
> Based on previous attempts and discussions this is the latest attempt at
> inheriting stream mappings set up by the bootloader, for e.g. boot splash or
> efifb.
>
> Per Will's request this builds on the work by Jordan and Rob for the Adreno
> SMMU support. It applies cleanly ontop of v16 of their series, which can be
> found at
> https://lore.kernel.org/linux-arm-msm/20200901164707.2645413-1-robdcl...@gmail.com/
>

Apologies, I just found this today. I've pulled your patches and Rob's
into my own tree here:
  
https://git.linaro.org/people/john.stultz/android-dev.git/log/?h=dev/db845c-mainline-WIP

And they all work fine on the db845c.

So for your whole series:
Tested-by: John Stultz 

thanks
-john


Re: [PATCH v4 5/5] iommu/vt-d: Add is_aux_domain support

2020-09-10 Thread Alex Williamson
On Tue,  1 Sep 2020 11:34:22 +0800
Lu Baolu  wrote:

> With subdevice information opted in through the iommu_ops.aux_at(de)tach_dev()
> interfaces, the vendor iommu driver can learn the relationships between
> the subdevices and the aux-domains. Implement is_aux_domain() support
> based on that knowledge.
> 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel/iommu.c | 125 ++--
>  include/linux/intel-iommu.h |  17 +++--
>  2 files changed, 103 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 3c12fd06856c..50431c7b2e71 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -334,6 +334,8 @@ static int intel_iommu_attach_device(struct iommu_domain 
> *domain,
>struct device *dev);
>  static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
>   dma_addr_t iova);
> +static bool intel_iommu_dev_feat_enabled(struct device *dev,
> +  enum iommu_dev_features feat);
>  
>  #ifdef CONFIG_INTEL_IOMMU_DEFAULT_ON
>  int dmar_disabled = 0;
> @@ -1832,6 +1834,7 @@ static struct dmar_domain *alloc_domain(int flags)
>   domain->flags |= DOMAIN_FLAG_USE_FIRST_LEVEL;
>   domain->has_iotlb_device = false;
> >  	INIT_LIST_HEAD(&domain->devices);
> > +	INIT_LIST_HEAD(&domain->subdevices);
>  
>   return domain;
>  }
> @@ -2580,7 +2583,7 @@ static struct dmar_domain 
> *dmar_insert_one_dev_info(struct intel_iommu *iommu,
>   info->iommu = iommu;
>   info->pasid_table = NULL;
>   info->auxd_enabled = 0;
> > -	INIT_LIST_HEAD(&info->auxiliary_domains);
> > +	INIT_LIST_HEAD(&info->subdevices);
>  
>   if (dev && dev_is_pci(dev)) {
>   struct pci_dev *pdev = to_pci_dev(info->dev);
> @@ -5137,21 +5140,28 @@ static void intel_iommu_domain_free(struct 
> iommu_domain *domain)
>   domain_exit(to_dmar_domain(domain));
>  }
>  
> -/*
> - * Check whether a @domain could be attached to the @dev through the
> - * aux-domain attach/detach APIs.
> - */
> -static inline bool
> -is_aux_domain(struct device *dev, struct iommu_domain *domain)
> +/* Lookup subdev_info in the domain's subdevice siblings. */
> +static struct subdev_info *
> +subdev_lookup_domain(struct dmar_domain *domain, struct device *dev,
> +  struct device *subdev)
>  {
> - struct device_domain_info *info = get_domain_info(dev);
> + struct subdev_info *sinfo = NULL, *tmp;
>  
> - return info && info->auxd_enabled &&
> - domain->type == IOMMU_DOMAIN_UNMANAGED;
> > +	assert_spin_locked(&device_domain_lock);
> +
> > +	list_for_each_entry(tmp, &domain->subdevices, link_domain) {
> + if ((!dev || tmp->pdev == dev) && tmp->dev == subdev) {
> + sinfo = tmp;
> + break;
> + }
> + }
> +
> + return sinfo;
>  }
>  
> -static void auxiliary_link_device(struct dmar_domain *domain,
> -   struct device *dev)
> +static void
> +subdev_link_device(struct dmar_domain *domain, struct device *dev,
> +struct subdev_info *sinfo)
>  {
>   struct device_domain_info *info = get_domain_info(dev);
>  
> @@ -5159,12 +5169,13 @@ static void auxiliary_link_device(struct dmar_domain 
> *domain,
>   if (WARN_ON(!info))
>   return;
>  
> > -	domain->auxd_refcnt++;
> > -	list_add(&domain->auxd, &info->auxiliary_domains);
> > +	list_add(&sinfo->link_phys, &info->subdevices);
> > +	list_add(&sinfo->link_domain, &domain->subdevices);
>  }
>  
> -static void auxiliary_unlink_device(struct dmar_domain *domain,
> - struct device *dev)
> +static void
> +subdev_unlink_device(struct dmar_domain *domain, struct device *dev,
> +  struct subdev_info *sinfo)
>  {
>   struct device_domain_info *info = get_domain_info(dev);
>  
> @@ -5172,24 +5183,30 @@ static void auxiliary_unlink_device(struct 
> dmar_domain *domain,
>   if (WARN_ON(!info))
>   return;
>  
> > -	list_del(&domain->auxd);
> > -	domain->auxd_refcnt--;
> > +	list_del(&sinfo->link_phys);
> > +	list_del(&sinfo->link_domain);
> > +	kfree(sinfo);
> >  
> > -	if (!domain->auxd_refcnt && domain->default_pasid > 0)
> > +	if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
>   ioasid_free(domain->default_pasid);
>  }
>  
> -static int aux_domain_add_dev(struct dmar_domain *domain,
> -   struct device *dev)
> +static int aux_domain_add_dev(struct dmar_domain *domain, struct device *dev,
> +   struct device *subdev)
>  {
>   int ret;
>   unsigned long flags;
>   struct intel_iommu *iommu;
> + struct subdev_info *sinfo;
>  
>   iommu = device_to_iommu(dev, NULL, NULL);
>   if (!iommu)
>   return -ENODEV;
>  
> + sinfo = kzalloc(sizeof(*sinfo), 

Re: [PATCH v4 1/5] iommu: Add optional subdev in aux_at(de)tach ops

2020-09-10 Thread Alex Williamson
On Tue,  1 Sep 2020 11:34:18 +0800
Lu Baolu  wrote:

> In the vfio/mdev use case of aux-domain, the subdevices are created from
> the physical devices with IOMMU_DEV_FEAT_AUX enabled and the aux-domains
> are attached to the subdevices through the iommu_ops.aux_attach_dev()
> interface.
> 
> The current iommu_ops.aux_at(de)tach_dev() design takes only the aux-domain
> and the physical device as parameters; this is insufficient if we want
> the vendor iommu drivers to learn about the relationships between the
> aux-domains and the subdevices. Add a @subdev parameter to the
> iommu_ops.aux_at(de)tach_dev() interfaces so that a subdevice can be
> opted in.
> 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel/iommu.c | 10 ++
>  drivers/iommu/iommu.c   |  4 ++--
>  include/linux/iommu.h   |  6 --
>  3 files changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index bce158468abf..3c12fd06856c 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -5338,8 +5338,9 @@ static int intel_iommu_attach_device(struct 
> iommu_domain *domain,
>   return domain_add_dev_info(to_dmar_domain(domain), dev);
>  }
>  
> -static int intel_iommu_aux_attach_device(struct iommu_domain *domain,
> -  struct device *dev)
> +static int
> +intel_iommu_aux_attach_device(struct iommu_domain *domain,
> +   struct device *dev, struct device *subdev)
>  {
>   int ret;
>  
> @@ -5359,8 +5360,9 @@ static void intel_iommu_detach_device(struct 
> iommu_domain *domain,
>   dmar_remove_one_dev_info(dev);
>  }
>  
> -static void intel_iommu_aux_detach_device(struct iommu_domain *domain,
> -   struct device *dev)
> +static void
> +intel_iommu_aux_detach_device(struct iommu_domain *domain, struct device 
> *dev,
> +   struct device *subdev)
>  {
>   aux_domain_remove_dev(to_dmar_domain(domain), dev);
>  }
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 609bd25bf154..38cdfeb887e1 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -2728,7 +2728,7 @@ int iommu_aux_attach_device(struct iommu_domain 
> *domain, struct device *dev)
>   int ret = -ENODEV;
>  
>   if (domain->ops->aux_attach_dev)
> - ret = domain->ops->aux_attach_dev(domain, dev);
> + ret = domain->ops->aux_attach_dev(domain, dev, NULL);
>  
>   if (!ret)
>   trace_attach_device_to_domain(dev);
> @@ -2740,7 +2740,7 @@ EXPORT_SYMBOL_GPL(iommu_aux_attach_device);
>  void iommu_aux_detach_device(struct iommu_domain *domain, struct device *dev)
>  {
>   if (domain->ops->aux_detach_dev) {
> - domain->ops->aux_detach_dev(domain, dev);
> + domain->ops->aux_detach_dev(domain, dev, NULL);
>   trace_detach_device_from_domain(dev);
>   }
>  }
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index fee209efb756..871267104915 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -279,8 +279,10 @@ struct iommu_ops {
>   int (*dev_disable_feat)(struct device *dev, enum iommu_dev_features f);
>  
>   /* Aux-domain specific attach/detach entries */
> - int (*aux_attach_dev)(struct iommu_domain *domain, struct device *dev);
> - void (*aux_detach_dev)(struct iommu_domain *domain, struct device *dev);
> + int (*aux_attach_dev)(struct iommu_domain *domain, struct device *dev,
> +   struct device *subdev);
> + void (*aux_detach_dev)(struct iommu_domain *domain, struct device *dev,
> +struct device *subdev);
>   int (*aux_get_pasid)(struct iommu_domain *domain, struct device *dev);
>  
>   struct iommu_sva *(*sva_bind)(struct device *dev, struct mm_struct *mm,

Would this be a good spot in the code to provide comments more formally
defining this subdevice concept?  For example, what exactly is the
relationship between the device and the subdevice, and which device do
we use for the remaining aux domain functions, e.g. aux_get_pasid()?
Thanks,

Alex
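
(For illustration, the requested clarification might take roughly this
shape -- the wording is a sketch, not from the posted series, and the
question of which device the remaining ops expect would still need an
authoritative answer:)

	/*
	 * A subdevice (e.g. a vfio/mdev instance) is a software-defined
	 * slice of a physical device.  @dev is the physical device that
	 * owns the IOMMU resources; @subdev identifies the slice being
	 * attached to the aux domain.
	 */
	int (*aux_attach_dev)(struct iommu_domain *domain, struct device *dev,
			      struct device *subdev);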



Re: [PATCH v4 2/5] iommu: Add iommu_at(de)tach_subdev_group()

2020-09-10 Thread Alex Williamson
On Tue,  1 Sep 2020 11:34:19 +0800
Lu Baolu  wrote:

> This adds two new APIs for the use cases like vfio/mdev where subdevices
> derived from physical devices are created and put in an iommu_group. The
> new IOMMU API interfaces mimic the vfio_mdev_at(de)tach_domain() directly,
> testing whether the resulting device supports IOMMU_DEV_FEAT_AUX and using
> an aux vs non-aux at(de)tach.
> 
> By doing this we could
> 
> - Set the iommu_group.domain. The iommu_group.domain is private to iommu
>   core (therefore vfio code cannot set it), but we need it set in order
>   for iommu_get_domain_for_dev() to work with a group attached to an aux
>   domain.
> 
> - Prefer to use the _attach_group() interfaces while the _attach_device()
>   interfaces are relegated to special cases.
> 
> Link: https://lore.kernel.org/linux-iommu/20200730134658.44c57...@x1.home/
> Link: https://lore.kernel.org/linux-iommu/20200730151703.5daf8...@x1.home/
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/iommu.c | 136 ++
>  include/linux/iommu.h |  20 +++
>  2 files changed, 156 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 38cdfeb887e1..fb21c2ff4861 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -2757,6 +2757,142 @@ int iommu_aux_get_pasid(struct iommu_domain *domain, 
> struct device *dev)
>  }
>  EXPORT_SYMBOL_GPL(iommu_aux_get_pasid);
>  
> +static int __iommu_aux_attach_device(struct iommu_domain *domain,
> +  struct device *phys_dev,
> +  struct device *sub_dev)
> +{
> + int ret;
> +
> + if (unlikely(!domain->ops->aux_attach_dev))
> + return -ENODEV;
> +
> + ret = domain->ops->aux_attach_dev(domain, phys_dev, sub_dev);
> + if (!ret)
> + trace_attach_device_to_domain(sub_dev);
> +
> + return ret;
> +}
> +
> +static void __iommu_aux_detach_device(struct iommu_domain *domain,
> +   struct device *phys_dev,
> +   struct device *sub_dev)
> +{
> + if (unlikely(!domain->ops->aux_detach_dev))
> + return;
> +
> + domain->ops->aux_detach_dev(domain, phys_dev, sub_dev);
> + trace_detach_device_from_domain(sub_dev);
> +}
> +
> +static int __iommu_attach_subdev_group(struct iommu_domain *domain,
> +struct iommu_group *group,
> +iommu_device_lookup_t fn)
> +{
> + struct group_device *device;
> + struct device *phys_dev;
> + int ret = -ENODEV;
> +
> > +	list_for_each_entry(device, &group->devices, list) {
> + phys_dev = fn(device->dev);
> + if (!phys_dev) {
> + ret = -ENODEV;
> + break;
> + }
> +
> + if (iommu_dev_feature_enabled(phys_dev, IOMMU_DEV_FEAT_AUX))
> + ret = __iommu_aux_attach_device(domain, phys_dev,
> + device->dev);
> + else
> + ret = __iommu_attach_device(domain, phys_dev);
> +
> + if (ret)
> + break;
> + }
> +
> + return ret;
> +}
> +
> +static void __iommu_detach_subdev_group(struct iommu_domain *domain,
> + struct iommu_group *group,
> + iommu_device_lookup_t fn)
> +{
> + struct group_device *device;
> + struct device *phys_dev;
> +
> > +	list_for_each_entry(device, &group->devices, list) {
> + phys_dev = fn(device->dev);
> + if (!phys_dev)
> + break;


Seems like this should be a continue rather than a break.  On the
unwind path maybe we're relying on holding the group mutex and
deterministic behavior from the fn() callback to unwind to the same
point, but we still have an entirely separate detach interface and I'm
not sure we couldn't end up with an inconsistent state if we don't
iterate each group device here.  Thanks,

Alex
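
(For illustration, a sketch of the detach loop with the suggested
continue, reusing the helpers from the posted patch:)

	list_for_each_entry(device, &group->devices, list) {
		phys_dev = fn(device->dev);
		if (!phys_dev)
			continue;	/* keep walking the rest of the group */

		if (iommu_dev_feature_enabled(phys_dev, IOMMU_DEV_FEAT_AUX))
			__iommu_aux_detach_device(domain, phys_dev,
						  device->dev);
		else
			__iommu_detach_device(domain, phys_dev);
	}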

> +
> + if (iommu_dev_feature_enabled(phys_dev, IOMMU_DEV_FEAT_AUX))
> + __iommu_aux_detach_device(domain, phys_dev, 
> device->dev);
> + else
> + __iommu_detach_device(domain, phys_dev);
> + }
> +}
> +
> +/**
> + * iommu_attach_subdev_group - attach domain to an iommu_group which
> + *  contains subdevices.
> + *
> + * @domain: domain
> + * @group:  iommu_group which contains subdevices
> + * @fn: callback for each subdevice in the @iommu_group to retrieve the
> + *  physical device where the subdevice was created from.
> + *
> + * Returns 0 on success, or an error value.
> + */
> +int iommu_attach_subdev_group(struct iommu_domain *domain,
> +   struct iommu_group *group,
> +   iommu_device_lookup_t fn)
> +{
> + int ret = 

Re: [PATCH v5] PCI/ACS: Enable PCI_ACS_TB and disable only when needed for ATS

2020-09-10 Thread Rajat Jain via iommu
Hello Bjorn,


On Mon, Aug 17, 2020 at 3:50 PM Rajat Jain  wrote:
>
> Hello Bjorn,
>
>
> On Sat, Aug 1, 2020 at 5:30 PM Rajat Jain  wrote:
> >
> > Hi Bjorn,
> >
> >
> > On Tue, Jul 14, 2020 at 1:24 PM Rajat Jain  wrote:
> > >
> > > On Tue, Jul 14, 2020 at 1:15 PM Rajat Jain  wrote:
> > > >
> > > > The ACS "Translation Blocking" bit blocks translated addresses from
> > > > the devices. We don't expect such traffic from devices unless ATS is
> > > > enabled on them. A device sending such traffic without ATS enabled
> > > > indicates malicious intent and should thus be blocked.
> > > >
> > > > Enable PCI_ACS_TB by default for all devices; it stays enabled until
> > > > at least one downstream device wants to enable ATS. It gets disabled
> > > > to enable ATS on a downstream device, and is enabled again once no
> > > > downstream device needs ATS.
> > > >
> > > > Signed-off-by: Rajat Jain 
> >
> > Just checking to see if you got a chance to look at this V5 patch.
>
> Any feedback on this patch?

Gentle reminder?

Thanks & Best Regards,

Rajat


>
> Thanks & Best Regards,
>
> Rajat
>
> >
> > Thanks & Best Regards,
> >
> > Rajat
> >
> > > > ---
> > > > Note that I'm ignoring the devices that require quirks to enable or
> > > > disable ACS, instead of using the standard way for ACS configuration.
> > > > The reason is that it would require adding yet another quirk table or
> > > > quirk function pointer, which I don't know how to implement for those
> > > > devices, and I don't have the devices to test that code either.
> > > >
> > > > v5: Enable TB and disable ATS for all devices on boot. Disable TB later
> > > > only if needed to enable ATS on downstream devices.
> > > > v4: Add braces to avoid warning from kernel robot
> > > > print warning for only external-facing devices.
> > > > v3: print warning if ACS_TB not supported on external-facing/untrusted 
> > > > ports.
> > > > Minor code comments fixes.
> > > > v2: Commit log change
> > > >
> > > >  drivers/pci/ats.c   |  5 
> > > >  drivers/pci/pci.c   | 57 +
> > > >  drivers/pci/pci.h   |  2 ++
> > > >  drivers/pci/probe.c |  2 +-
> > > >  include/linux/pci.h |  2 ++
> > > >  5 files changed, 67 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> > > > index b761c1f72f67..e2ea9083f30f 100644
> > > > --- a/drivers/pci/ats.c
> > > > +++ b/drivers/pci/ats.c
> > > > @@ -28,6 +28,9 @@ void pci_ats_init(struct pci_dev *dev)
> > > > return;
> > > >
> > > > dev->ats_cap = pos;
> > > > +
> > > > +	dev->ats_enabled = 1; /* To avoid WARN_ON from pci_disable_ats() */
> > > > +   pci_disable_ats(dev);
> > > >  }
> > > >
> > > >  /**
> > > > @@ -82,6 +85,7 @@ int pci_enable_ats(struct pci_dev *dev, int ps)
> > > > }
> > > > pci_write_config_word(dev, dev->ats_cap + PCI_ATS_CTRL, ctrl);
> > > >
> > > > +   pci_disable_acs_trans_blocking(dev);
> > > > dev->ats_enabled = 1;
> > > > return 0;
> > > >  }
> > > > @@ -102,6 +106,7 @@ void pci_disable_ats(struct pci_dev *dev)
> > > > ctrl &= ~PCI_ATS_CTRL_ENABLE;
> > > > pci_write_config_word(dev, dev->ats_cap + PCI_ATS_CTRL, ctrl);
> > > >
> > > > +   pci_enable_acs_trans_blocking(dev);
> > > > dev->ats_enabled = 0;
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(pci_disable_ats);
> > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > > index 73a862782214..614e3c1e8c56 100644
> > > > --- a/drivers/pci/pci.c
> > > > +++ b/drivers/pci/pci.c
> > > > @@ -876,6 +876,9 @@ static void pci_std_enable_acs(struct pci_dev *dev)
> > > > /* Upstream Forwarding */
> > > > ctrl |= (cap & PCI_ACS_UF);
> > > >
> > > > +   /* Translation Blocking */
> > > > +   ctrl |= (cap & PCI_ACS_TB);
> > > > +
> > > > pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
> > > >  }
> > > >
> > > > @@ -904,6 +907,60 @@ static void pci_enable_acs(struct pci_dev *dev)
> > > > pci_disable_acs_redir(dev);
> > > >  }
> > > >
> > > > +void pci_disable_acs_trans_blocking(struct pci_dev *pdev)
> > > > +{
> > > > +   u16 cap, ctrl, pos;
> > > > +   struct pci_dev *dev;
> > > > +
> > > > +   if (!pci_acs_enable)
> > > > +   return;
> > > > +
> > > > +	for (dev = pdev; dev; dev = pci_upstream_bridge(dev)) {
> > > > +
> > > > +   pos = dev->acs_cap;
> > > > +   if (!pos)
> > > > +   continue;
> > > > +
> > > > +   /*
> > > > +* Disable translation blocking when first downstream
> > > > +* device that needs it (for ATS) wants to enable ATS
> > > > +*/
> > > > +   if (++dev->ats_dependencies == 1) {
> > >
> > > I am a little worried about a potential race condition here. I know
> > > that 2 PCI devices cannot be enumerating at the same time. Do 

Re: [PATCH 1/3] ARM/dma-mapping: move various helpers from dma-mapping.h to dma-direct.h

2020-09-10 Thread Robin Murphy

On 2020-09-10 06:40, Christoph Hellwig wrote:

Move the helpers to translate to and from direct mapping DMA addresses
to dma-direct.h.  This not only is the most logical place, but the new
placement also avoids dependency loops with pending commits.


For the straightforward move as it should be,

Reviewed-by: Robin Murphy 

However I do wonder how much of this could be cleaned up further...


Signed-off-by: Christoph Hellwig 
---
  arch/arm/common/dmabounce.c|  2 +-
  arch/arm/include/asm/dma-direct.h  | 70 ++
  arch/arm/include/asm/dma-mapping.h | 70 --
  3 files changed, 71 insertions(+), 71 deletions(-)

diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index f4b719bde76367..d3e00ea9208834 100644
--- a/arch/arm/common/dmabounce.c
+++ b/arch/arm/common/dmabounce.c
@@ -24,7 +24,7 @@
  #include <linux/slab.h>
  #include <linux/page-flags.h>
  #include <linux/device.h>
-#include <linux/dma-mapping.h>
+#include <linux/dma-direct.h>
  #include <linux/dmapool.h>
  #include <linux/list.h>
  #include <linux/scatterlist.h>
diff --git a/arch/arm/include/asm/dma-direct.h 
b/arch/arm/include/asm/dma-direct.h
index 7c3001a6a775bf..de0f4ff9279615 100644
--- a/arch/arm/include/asm/dma-direct.h
+++ b/arch/arm/include/asm/dma-direct.h
@@ -2,6 +2,76 @@
  #ifndef ASM_ARM_DMA_DIRECT_H
  #define ASM_ARM_DMA_DIRECT_H 1
  
+#include <asm/memory.h>
+
+#ifdef __arch_page_to_dma
+#error Please update to __arch_pfn_to_dma
+#endif


This must be long, long dead by now.


+
+/*
+ * dma_to_pfn/pfn_to_dma/dma_to_virt/virt_to_dma are architecture private
+ * functions used internally by the DMA-mapping API to provide DMA
+ * addresses. They must not be used by drivers.
+ */
+#ifndef __arch_pfn_to_dma
+static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
+{
+   if (dev)
+   pfn -= dev->dma_pfn_offset;
+   return (dma_addr_t)__pfn_to_bus(pfn);
+}
+
+static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
+{
+   unsigned long pfn = __bus_to_pfn(addr);
+
+   if (dev)
+   pfn += dev->dma_pfn_offset;
+
+   return pfn;
+}


These are only overridden for OMAP1510, and it looks like it wouldn't 
take much for the platform code or ohci-omap driver to set up a generic 
DMA offset for the relevant device.



+
+static inline void *dma_to_virt(struct device *dev, dma_addr_t addr)
+{
+   if (dev) {
+   unsigned long pfn = dma_to_pfn(dev, addr);
+
+   return phys_to_virt(__pfn_to_phys(pfn));
+   }
+
+   return (void *)__bus_to_virt((unsigned long)addr);
+}


This appears entirely unused.


+
+static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
+{
+   if (dev)
+   return pfn_to_dma(dev, virt_to_pfn(addr));
+
+   return (dma_addr_t)__virt_to_bus((unsigned long)(addr));
+}


And this is only used for some debug prints in dmabounce.

Similarly the __bus_to_*()/__*_to_bus() calls themselves only appear 
significant to mach-footbridge any more, and could probably also be 
evolved into regular DMA offsets now that all API calls must have a 
non-NULL device. I think I might come back and take a closer look at all 
this at some point in future... :)
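
(As a rough sketch of that idea -- phys_base and bus_base here are
hypothetical placeholders for the platform's fixed addresses:)

	/* Hypothetical platform fixup: express the fixed bus translation
	 * as a generic per-device DMA offset, so the __arch_pfn_to_dma()
	 * override can go away.  pfn_to_dma() above subtracts this. */
	dev->dma_pfn_offset = PFN_DOWN(phys_base - bus_base);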


Robin.


+
+#else
+static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
+{
+   return __arch_pfn_to_dma(dev, pfn);
+}
+
+static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
+{
+   return __arch_dma_to_pfn(dev, addr);
+}
+
+static inline void *dma_to_virt(struct device *dev, dma_addr_t addr)
+{
+   return __arch_dma_to_virt(dev, addr);
+}
+
+static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
+{
+   return __arch_virt_to_dma(dev, addr);
+}
+#endif
+
  static inline dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
  {
unsigned int offset = paddr & ~PAGE_MASK;
diff --git a/arch/arm/include/asm/dma-mapping.h 
b/arch/arm/include/asm/dma-mapping.h
index bdd80ddbca3451..0a1a536368c3a4 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -8,8 +8,6 @@
  #include <linux/scatterlist.h>
  #include <linux/dma-debug.h>
  
-#include <asm/memory.h>
-
  #include <xen/xen.h>
  #include <asm/xen/hypervisor.h>
  
@@ -23,74 +21,6 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)

return NULL;
  }
  
-#ifdef __arch_page_to_dma
-#error Please update to __arch_pfn_to_dma
-#endif
-
-/*
- * dma_to_pfn/pfn_to_dma/dma_to_virt/virt_to_dma are architecture private
- * functions used internally by the DMA-mapping API to provide DMA
- * addresses. They must not be used by drivers.
- */
-#ifndef __arch_pfn_to_dma
-static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
-{
-   if (dev)
-   pfn -= dev->dma_pfn_offset;
-   return (dma_addr_t)__pfn_to_bus(pfn);
-}
-
-static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
-{
-   unsigned long pfn = __bus_to_pfn(addr);
-
-   if (dev)
-   pfn += dev->dma_pfn_offset;
-
-   return pfn;
-}
-
-static inline void *dma_to_virt(struct device 

[PATCH] iommu/amd: Fix potential @entry null deref

2020-09-10 Thread Joao Martins
After commit 26e495f34107 ("iommu/amd: Restore IRTE.RemapEn bit after
programming IRTE"), smatch warns:

drivers/iommu/amd/iommu.c:3870 amd_iommu_deactivate_guest_mode()
warn: variable dereferenced before check 'entry' (see line 3867)

Fix this by moving the @valid assignment to after @entry has been checked
for NULL.

Cc: Suravee Suthikulpanit 
Fixes: 26e495f34107 ("iommu/amd: Restore IRTE.RemapEn bit after programming IRTE")
Reported-by: Dan Carpenter 
Signed-off-by: Joao Martins 
---
 drivers/iommu/amd/iommu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 07ae8b93887e..8abe1c7ad45b 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3864,12 +3864,14 @@ int amd_iommu_deactivate_guest_mode(void *data)
struct amd_ir_data *ir_data = (struct amd_ir_data *)data;
struct irte_ga *entry = (struct irte_ga *) ir_data->entry;
struct irq_cfg *cfg = ir_data->cfg;
-   u64 valid = entry->lo.fields_remap.valid;
+   u64 valid;
 
if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) ||
!entry || !entry->lo.fields_vapic.guest_mode)
return 0;
 
+   valid = entry->lo.fields_remap.valid;
+
entry->lo.val = 0;
entry->hi.val = 0;
 
-- 
2.17.1



Re: [PATCH v2 9/9] iommu/vt-d: Store guest PASID during bind

2020-09-10 Thread Jacob Pan
On Tue, 1 Sep 2020 19:08:44 +0200
Auger Eric  wrote:

> Hi Jacob,
> On 8/22/20 6:35 AM, Jacob Pan wrote:
> > IOASID core maintains the guest-host mapping in the form of SPID and
> > IOASID. This patch assigns the guest PASID (if valid) as SPID while
> > binding guest page table with a host PASID. This mapping will be
> > used for lookup and notifications.
> > 
> > Signed-off-by: Jacob Pan 
> > ---
> >  drivers/iommu/intel/svm.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
> > index d8a5efa75095..4c958b1aec4c 100644
> > --- a/drivers/iommu/intel/svm.c
> > +++ b/drivers/iommu/intel/svm.c
> > @@ -406,6 +406,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
> >  		if (data->flags & IOMMU_SVA_GPASID_VAL) {
> >  			svm->gpasid = data->gpasid;
> >  			svm->flags |= SVM_FLAG_GUEST_PASID;
> > +			ioasid_attach_spid(data->hpasid, data->gpasid);
> don't you want to handle the returned value?
Yes, I also need to add a check for duplicated SPID within a set.

> > }
> > svm->iommu = iommu;
> > /*
> > @@ -517,6 +518,7 @@ int intel_svm_unbind_gpasid(struct device *dev, int pasid)
> >  			ioasid_attach_data(pasid, NULL);
> >  			ioasid_notify(pasid, IOASID_UNBIND, IOASID_NOTIFY_SET);
> > +			ioasid_attach_spid(pasid, INVALID_IOASID);
> So this answers my previous question ;-) but won't it enter the if
> (!ioasid_data) path and fail to reset the spid?
> 
Sorry, I am not following. If there is no ioasid_data then there is no
SPID to be stored.

BTW, I plan to separate the APIs into two.
ioasid_attach_spid
ioasid_detach_spid

Only ioasid_detach_spid will call synchronize_rcu(); ioasid_attach_spid
can then be called under a spinlock.
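
(Sketched signatures for that split -- assumed for illustration, not
from a posted patch:)

	/* callable under a spinlock; no synchronize_rcu() inside */
	int ioasid_attach_spid(ioasid_t ioasid, ioasid_t spid);

	/* may sleep: calls synchronize_rcu() before dropping the SPID */
	void ioasid_detach_spid(ioasid_t ioasid);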

Thanks,

> Eric
> > kfree(svm);
> > }
> > }
> >   
> 

[Jacob Pan]


Re: [PATCH 3/3] dma-mapping: introduce DMA range map, supplanting dma_pfn_offset

2020-09-10 Thread Greg KH
On Thu, Sep 10, 2020 at 11:13:51AM +0200, Christoph Hellwig wrote:
> On Thu, Sep 10, 2020 at 09:53:51AM +0200, Greg KH wrote:
> > >   /*
> > >* Please refer to usb_alloc_dev() to see why we set
> > > -  * dma_mask and dma_pfn_offset.
> > > +  * dma_mask and dma_range_map.
> > >*/
> > >   intf->dev.dma_mask = dev->dev.dma_mask;
> > > - intf->dev.dma_pfn_offset = dev->dev.dma_pfn_offset;
> > > +	if (dma_direct_copy_range_map(&intf->dev, &dev->dev))
> > > +		dev_err(&intf->dev, "failed to copy DMA map\n");
> > 
> > We tell the user, but then just keep on running?  Is there anything that
> > we can do here?
> > 
> > If not, why not have dma_direct_copy_range_map() print out the error?
> 
> At least for USB I'm pretty sure this isn't required at all.  I've been
> running with the patch below on my desktop for two days now trying all
> the usb toys I have (in addition to grepping for obvious abuses in
> the drivers).  remoteproc is a different story, but the DMA handling
> there seems sketchy to start with..
> 
> ---
> >From 8bae3e6833f2ca431dcfcbc8f9cced7d5e972a01 Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig 
> Date: Wed, 9 Sep 2020 08:28:59 +0200
> Subject: usb: don't inherit DMA properties for USB devices
> 
> As the comment in usb_alloc_dev correctly states, drivers can't use
> the DMA API on usb devices, and at least calling dma_set_mask on them
> is highly dangerous.  Unlike what the comment states, upper level drivers
> also can't really use the presence of a dma mask to check for DMA
> support, as the dma_mask is set by default for most busses.
> 
> Remove the copying over of DMA information, and remove the now unused
> dma_direct_copy_range_map export.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/usb/core/message.c |  7 ---
>  drivers/usb/core/usb.c | 13 -
>  kernel/dma/direct.c|  1 -
>  3 files changed, 21 deletions(-)
> 
> diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
> index 935ee98e049f65..9e45732dc1d1d1 100644
> --- a/drivers/usb/core/message.c
> +++ b/drivers/usb/core/message.c
> @@ -1954,13 +1954,6 @@ int usb_set_configuration(struct usb_device *dev, int 
> configuration)
> >  	intf->dev.bus = &usb_bus_type;
> >  	intf->dev.type = &usb_if_device_type;
> >  	intf->dev.groups = usb_interface_groups;
> > -	/*
> > -	 * Please refer to usb_alloc_dev() to see why we set
> > -	 * dma_mask and dma_range_map.
> > -	 */
> > -	intf->dev.dma_mask = dev->dev.dma_mask;
> > -	if (dma_direct_copy_range_map(&intf->dev, &dev->dev))
> > -		dev_err(&intf->dev, "failed to copy DMA map\n");
> >  	INIT_WORK(&intf->reset_ws, __usb_queue_reset_device);
> >  	intf->minor = -1;
> >  	device_initialize(&intf->dev);
> diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
> index 23d451f6894d70..9b4ac4415f1a47 100644
> --- a/drivers/usb/core/usb.c
> +++ b/drivers/usb/core/usb.c
> @@ -599,19 +599,6 @@ struct usb_device *usb_alloc_dev(struct usb_device 
> *parent,
> >  	dev->dev.bus = &usb_bus_type;
> >  	dev->dev.type = &usb_device_type;
> >  	dev->dev.groups = usb_device_groups;
> > -	/*
> > -	 * Fake a dma_mask/offset for the USB device:
> > -	 * We cannot really use the dma-mapping API (dma_alloc_* and
> > -	 * dma_map_*) for USB devices but instead need to use
> > -	 * usb_alloc_coherent and pass data in 'urb's, but some subsystems
> > -	 * manually look into the mask/offset pair to determine whether
> > -	 * they need bounce buffers.
> > -	 * Note: calling dma_set_mask() on a USB device would set the
> > -	 * mask for the entire HCD, so don't do that.
> > -	 */
> > -	dev->dev.dma_mask = bus->sysdev->dma_mask;
> > -	if (dma_direct_copy_range_map(&dev->dev, bus->sysdev))
> > -		dev_err(&dev->dev, "failed to copy DMA map\n");
> >  	set_dev_node(&dev->dev, dev_to_node(bus->sysdev));
>   dev->state = USB_STATE_ATTACHED;
>   dev->lpm_disable_count = 1;
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index fc815f7375e282..3af257571a3b42 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -552,4 +552,3 @@ int dma_direct_copy_range_map(struct device *to, struct 
> device *from)
>   to->dma_range_map = new_map;
>   return 0;
>  }
> -EXPORT_SYMBOL_GPL(dma_direct_copy_range_map);

If you think this is safe to do, great, but for some reason I thought
host controllers wanted this information, and that the scsi layer was
the offending layer that also wanted this type of thing.  But it's been
a really long time so I don't remember for sure, sorry.

thanks,

greg k-h


Re: [PATCH] intel-iommu: don't disable ATS for device without page aligned request

2020-09-10 Thread Raj, Ashok
On Wed, Sep 09, 2020 at 10:17:35PM -0400, Jason Wang wrote:
> 
> 
> - Original Message -
> > Hi Jason
> > 
> > On Wed, Sep 09, 2020 at 04:34:32PM +0800, Jason Wang wrote:
> > > Commit 61363c1474b1 ("iommu/vt-d: Enable ATS only if the device uses
> > > page aligned address.") disables ATS for device that can do unaligned
> > > page request.
> > 
> > Did you take a look at the PCI specification?
> > Page Aligned Request is in the ATS capability Register.
> > 
> > ATS Capability Register (Offset 0x04h)
> > 
> > bit (5):
> > Page Aligned Request - If Set, indicates the Untranslated address is always
> > aligned to 4096 byte boundary. Setting this field is recommended. This
> > field permits software to distinguish between implementations compatible
> > with this specification and those compatible with an earlier version of
> > this specification in which a Requester was permitted to supply anything in
> > bits [11:2].
> 
> Yes, my understanding is that this is optional not mandatory.

Correct, but optional on the device side. An IOMMU might *require* this for
proper normal operation. Our IOMMUs do not get the low 12 bits, which is
why the spec gives SW a way to detect whether the device is compatible with
this IOMMU implementation.
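
(For reference, the detection boils down to one capability-register
read; the kernel's pci_ats_page_aligned() does essentially this:)

static bool ats_page_aligned(struct pci_dev *pdev)
{
	int pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ATS);
	u16 cap;

	if (!pos)
		return false;

	pci_read_config_word(pdev, pos + PCI_ATS_CAP, &cap);
	return cap & PCI_ATS_CAP_PAGE_ALIGNED;
}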

> 
> > 
> > > 
> > > This looks wrong, since the commit log said it's because the page
> > > request descriptor doesn't support reporting unaligned request.
> > 
> > I don't think you can change the definition from ATS to PRI. Both are
> > orthogonal feature.
> 
> I may miss something, here's my understanding is that:
> 
> - page request descriptor will only be used when PRS is enabled
> - ATS spec allows unaligned request
> 
> So any reason for disabling ATS for unaligned request even if PRS is
> not enabled?

I think you are getting confused between the 2 different PCIe features.

ATS - Address Translation Services. Used by device to simply request the
Host Physical Address for some DMA operation. 

When ATS response indicates failed, then the device can request a
page-request (PRS this is like a device page-fault), and then IOMMU driver
would work with the kernel to fault a page then respond with
(Page-response) success/failure. Then the device will send a new ATS 
to get the new translation. 


> 
> > 
> > > 
> > > A victim is QEMU's virtio-pci, which doesn't advertise page aligned
> > > addresses. Fix by disabling PRI instead of ATS if the device doesn't
> > > support page aligned requests.
> > 
> > This is a requirement for the Intel IOMMU's.
> > 
> > You say virtio, so is it all emulated device or you talking about some
> > hardware that implemented virtio-pci compliant hw? If you are sure the
> > device actually does comply with the requirement, but just not enumerating
> > the capability, you can maybe work a quirk to overcome that?
> 
> So far only emulated devices. But we are helping some vendor to
> implement virtio hardware so  we need to understand the connection
> between ATS alignment and page request descriptor.

ATS and PRS are 2 separate orthogonal features. 
PRS requires ATS, but not the other way around. 

> 
> > 
> > Now PRI also has an alignment requirement, and Intel IOMMU's requires that
> > as well. If your device supports SRIOV as well, PASID and PRI are
> > enumerated just on the PF and not the VF. You might want to pay attension
> > to that. We are still working on a solution for that problem.
> 
> Thanks for the reminding, but it looks to me according to the ATS
> spec, all PRI message is 4096 byte aligned? E.g lower bites were used
> for group index etc.

Right, I should have been clear. The issue with PRI is we require responses
to have PASID field set. There is another capability on the device that
exposes that. pci_prg_resp_pasid_required(). This is required to enable PRI
for a device.
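
(Roughly the gating the VT-d driver applies before turning PRI on --
paraphrased from iommu_enable_dev_iotlb(), with PRQ_DEPTH being the
driver's page-request queue depth constant:)

	/* If PASID is in use, the device must be able to set PASID in
	 * its PRG responses before PRI may be enabled. */
	if (info->pri_supported &&
	    (info->pasid_enabled ? pci_prg_resp_pasid_required(pdev) : 1) &&
	    !pci_reset_pri(pdev) && !pci_enable_pri(pdev, PRQ_DEPTH))
		info->pri_enabled = 1;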

Cheers,
Ashok


Re: [trivial PATCH] treewide: Convert switch/case fallthrough; to break;

2020-09-10 Thread Robin Murphy

On 2020-09-09 21:06, Joe Perches wrote:

A fallthrough; to a separate case/default label whose only statement is break; isn't very readable.

Convert pseudo-keyword fallthrough; statements to a simple break; when
the next label is case or default and the only statement in the next
label block is break;

Found using:

$ grep-2.5.4 -rP --include=*.[ch] -n "fallthrough;(\s*(case\s+\w+|default)\s*:\s*){1,7}break;" *

Miscellanea:

o Move or coalesce a couple label blocks above a default: block.

Signed-off-by: Joe Perches 
---

Compiled allyesconfig x86-64 only.
A few files for other arches were not compiled.



[...]

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index c192544e874b..743db1abec40 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3777,7 +3777,7 @@ static int arm_smmu_device_hw_probe(struct 
arm_smmu_device *smmu)
switch (FIELD_GET(IDR0_TTF, reg)) {
case IDR0_TTF_AARCH32_64:
smmu->ias = 40;
-   fallthrough;
+   break;
case IDR0_TTF_AARCH64:
break;
default:


I have to say I don't really agree with the readability argument for 
this one - a fallthrough is semantically correct here, since the first 
case is a superset of the second. It just happens that anything we would 
do for the common subset is implicitly assumed (there are other 
potential cases we simply haven't added support for at the moment), thus 
the second case is currently empty.


This change actively obfuscates that distinction.

Robin.


Re: [PATCH 12/12] dma-mapping: move the dma_declare_coherent_memory documentation

2020-09-10 Thread Robin Murphy

On 2020-09-08 17:47, Christoph Hellwig wrote:

dma_declare_coherent_memory should not be in a DMA API guide aimed
at driver writers (that is consumers of the API).  Move it to a comment
near the function instead.


I still think there might be an occasional valid use for device-local 
memory outside the scope of platform code without the driver having to 
go full ZONE_DEVICE/HMM/TTM, e.g. with stuff like PCIe-based FPGA 
prototyping cards, but the kind of driver I'm imagining for that case 
would never be upstream anyway (if it were even written, rather than 
just using hard-coded hacks), so meh.


Reviewed-by: Robin Murphy 


Signed-off-by: Christoph Hellwig 
---
  Documentation/core-api/dma-api.rst | 24 
  kernel/dma/coherent.c  | 17 +
  2 files changed, 17 insertions(+), 24 deletions(-)

diff --git a/Documentation/core-api/dma-api.rst 
b/Documentation/core-api/dma-api.rst
index 3b3a4b9a6f..90239348b30f6f 100644
--- a/Documentation/core-api/dma-api.rst
+++ b/Documentation/core-api/dma-api.rst
@@ -586,30 +586,6 @@ the DMA_ATTR_NON_CONSISTENT flag starting at virtual 
address vaddr and
  continuing on for size.  Again, you *must* observe the cache line
  boundaries when doing this.
  
-::
-
-   int
-   dma_declare_coherent_memory(struct device *dev, phys_addr_t phys_addr,
-   dma_addr_t device_addr, size_t size);
-
-Declare region of memory to be handed out by dma_alloc_coherent() when
-it's asked for coherent memory for this device.
-
-phys_addr is the CPU physical address to which the memory is currently
-assigned (this will be ioremapped so the CPU can access the region).
-
-device_addr is the DMA address the device needs to be programmed
-with to actually address this memory (this will be handed out as the
-dma_addr_t in dma_alloc_coherent()).
-
-size is the size of the area (must be multiples of PAGE_SIZE).
-
-As a simplification for the platforms, only *one* such region of
-memory may be declared per device.
-
-For reasons of efficiency, most platforms choose to track the declared
-region only at the granularity of a page.  For smaller allocations,
-you should use the dma_pool() API.
  
  Part III - Debug drivers use of the DMA-API

  ---
diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
index 2a0c4985f38e41..f85d14bbfcbe03 100644
--- a/kernel/dma/coherent.c
+++ b/kernel/dma/coherent.c
@@ -107,6 +107,23 @@ static int dma_assign_coherent_memory(struct device *dev,
return 0;
  }
  
+/*

+ * Declare a region of memory to be handed out by dma_alloc_coherent() when it
+ * is asked for coherent memory for this device.  This shall only be used
+ * from platform code, usually based on the device tree description.
+ *
+ * phys_addr is the CPU physical address to which the memory is currently
+ * assigned (this will be ioremapped so the CPU can access the region).
+ *
+ * device_addr is the DMA address the device needs to be programmed with to
+ * actually address this memory (this will be handed out as the dma_addr_t in
+ * dma_alloc_coherent()).
+ *
+ * size is the size of the area (must be a multiple of PAGE_SIZE).
+ *
+ * As a simplification for the platforms, only *one* such region of memory may
+ * be declared per device.
+ */
  int dma_declare_coherent_memory(struct device *dev, phys_addr_t phys_addr,
dma_addr_t device_addr, size_t size)
  {
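
(A hypothetical platform-code usage, with made-up addresses:)

	/* Hand out 1 MiB of device-local SRAM via dma_alloc_coherent();
	 * 0x90000000 is a made-up example address. */
	ret = dma_declare_coherent_memory(&pdev->dev, 0x90000000,
					  0x90000000, SZ_1M);
	if (ret)
		dev_warn(&pdev->dev, "no dedicated coherent memory\n");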




Re: [Intel-gfx] [PATCH 0/8] Convert the intel iommu driver to the dma-iommu api

2020-09-10 Thread Tom Murphy
On Thu, 10 Sep 2020 at 14:33, Tom Murphy  wrote:
>
> On Wed, 9 Sep 2020 at 13:56, Tvrtko Ursulin
>  wrote:
> >
> >
> > On 09/09/2020 10:16, Tvrtko Ursulin wrote:
> > > On 08/09/2020 23:43, Tom Murphy wrote:
> > >> On Tue, 8 Sep 2020 at 16:56, Tvrtko Ursulin
> > >>  wrote:
> > >>> On 08/09/2020 16:44, Logan Gunthorpe wrote:
> >  On 2020-09-08 9:28 a.m., Tvrtko Ursulin wrote:
> > >>
> > >> diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h
> > >> b/drivers/gpu/drm/i915/i915
> > >> index b7b59328cb76..9367ac801f0c 100644
> > >> --- a/drivers/gpu/drm/i915/i915_scatterlist.h
> > >> +++ b/drivers/gpu/drm/i915/i915_scatterlist.h
> > >> @@ -27,13 +27,19 @@ static __always_inline struct sgt_iter {
> > >> } __sgt_iter(struct scatterlist *sgl, bool dma) {
> > >>struct sgt_iter s = { .sgp = sgl };
> > >>
> > >> +   if (sgl && !sg_dma_len(s.sgp))
> > >
> > > I'd extend the condition to be, just to be safe:
> > >   if (dma && sgl && !sg_dma_len(s.sgp))
> > >
> > 
> >  Right, good catch, that's definitely necessary.
> > 
> > >> +   s.sgp = NULL;
> > >> +
> > >>if (s.sgp) {
> > >>s.max = s.curr = s.sgp->offset;
> > >> -   s.max += s.sgp->length;
> > >> -   if (dma)
> > >> +
> > >> +   if (dma) {
> > >> +   s.max += sg_dma_len(s.sgp);
> > >>s.dma = sg_dma_address(s.sgp);
> > >> -   else
> > >> +   } else {
> > >> +   s.max += s.sgp->length;
> > >>s.pfn = page_to_pfn(sg_page(s.sgp));
> > >> +   }
> > >
> > > Otherwise has this been tested or alternatively how to test it?
> > > (How to
> > > repro the issue.)
> > 
> >  It has not been tested. To test it, you need Tom's patch set without
> >  the
> >  last "DO NOT MERGE" patch:
> > 
> >  https://lkml.kernel.org/lkml/20200907070035.ga25...@infradead.org/T/
> > >>>
> > >>> Tom, do you have a branch somewhere I could pull from? (Just being lazy
> > >>> about downloading a bunch of messages from the archives.)
> > >>
> > >> I don't unfortunately. I'm working locally with poor internet.
> > >>
> > >>>
> > >>> What GPU is in your Lenovo x1 carbon 5th generation and what
> > >>> graphical/desktop setup I need to repro?
> > >>
> > >>
> > >> Is this enough info?:
> > >>
> > >> $ lspci -vnn | grep VGA -A 12
> > >> 00:02.0 VGA compatible controller [0300]: Intel Corporation HD
> > >> Graphics 620 [8086:5916] (rev 02) (prog-if 00 [VGA controller])
> > >>  Subsystem: Lenovo ThinkPad X1 Carbon 5th Gen [17aa:224f]
> > >>  Flags: bus master, fast devsel, latency 0, IRQ 148
> > >>  Memory at eb00 (64-bit, non-prefetchable) [size=16M]
> > >>  Memory at 6000 (64-bit, prefetchable) [size=256M]
> > >>  I/O ports at e000 [size=64]
> > >>  [virtual] Expansion ROM at 000c [disabled] [size=128K]
> > >>  Capabilities: [40] Vendor Specific Information: Len=0c 
> > >>  Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
> > >>  Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
> > >>  Capabilities: [d0] Power Management version 2
> > >>  Capabilities: [100] Process Address Space ID (PASID)
> > >>  Capabilities: [200] Address Translation Service (ATS)
> > >
> > > Works for a start. What about the steps to repro? Any desktop
> > > environment and it is just visual corruption, no hangs/stalls or such?
> > >
> > > I've submitted a series consisting of what I understood are the patches
> > > needed to repro the issue to our automated CI here:
> > >
> > > https://patchwork.freedesktop.org/series/81489/
> > >
> > > So will see if it will catch something, or more targeted testing will be
> > > required. Hopefully it does trip over in which case I can add the patch
> > > suggested by Logan on top and see if that fixes it. Or I'll need to
> > > write a new test case.
> > >
> > > If you could glance over my series to check I identified the patches
> > > correctly it would be appreciated.
> >
> > Our CI was more than capable at catching the breakage so I've copied you
> > on a patch (https://patchwork.freedesktop.org/series/81497/) which has a
> > good potential to fix this. (Or improve the robustness of our sg walks,
> > depends how you look at it.)
> >
> > Would you be able to test it in your environment by any chance? If it
> > works I understand it unblocks your IOMMU work, right?

And yes this does unblock the iommu work

>
> I tested your latest patch set ([PATCH 1/2] drm/i915: Fix DMA mapped
> scatterlist walks) and it fixes the issue. great work!
>
> >
> > Regards,
> >
> > Tvrtko

Re: [PATCH 11/12] dma-mapping: move dma_common_{mmap, get_sgtable} out of mapping.c

2020-09-10 Thread Robin Murphy

On 2020-09-08 17:47, Christoph Hellwig wrote:

Add a new file that contains helpera for misc DMA ops, which is only


The Latin plural of the singular "helperum", I guess? :P


built when CONFIG_DMA_OPS is set.


Reviewed-by: Robin Murphy 


Signed-off-by: Christoph Hellwig 
---
  kernel/dma/Makefile  |  1 +
  kernel/dma/mapping.c | 47 +---
  kernel/dma/ops_helpers.c | 51 
  3 files changed, 53 insertions(+), 46 deletions(-)
  create mode 100644 kernel/dma/ops_helpers.c

diff --git a/kernel/dma/Makefile b/kernel/dma/Makefile
index 32c7c1942bbd6c..dc755ab68aabf9 100644
--- a/kernel/dma/Makefile
+++ b/kernel/dma/Makefile
@@ -1,6 +1,7 @@
  # SPDX-License-Identifier: GPL-2.0
  
  obj-$(CONFIG_HAS_DMA)			+= mapping.o direct.o
+obj-$(CONFIG_DMA_OPS)			+= ops_helpers.o
  obj-$(CONFIG_DMA_OPS) += dummy.o
  obj-$(CONFIG_DMA_CMA) += contiguous.o
  obj-$(CONFIG_DMA_DECLARE_COHERENT)+= coherent.o
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 0d129421e75fc8..848c95c27d79ff 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -8,7 +8,7 @@
  #include  /* for max_pfn */
  #include 
  #include 
-#include 
+#include 
  #include 
  #include 
  #include 
@@ -295,22 +295,6 @@ void dma_sync_sg_for_device(struct device *dev, struct 
scatterlist *sg,
  }
  EXPORT_SYMBOL(dma_sync_sg_for_device);
  
-/*
- * Create scatter-list for the already allocated DMA buffer.
- */
-int dma_common_get_sgtable(struct device *dev, struct sg_table *sgt,
-void *cpu_addr, dma_addr_t dma_addr, size_t size,
-unsigned long attrs)
-{
-   struct page *page = virt_to_page(cpu_addr);
-   int ret;
-
-   ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
-   if (!ret)
-   sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
-   return ret;
-}
-
  /*
   * The whole dma_get_sgtable() idea is fundamentally unsafe - it seems
   * that the intention is to allow exporting memory allocated via the
@@ -358,35 +342,6 @@ pgprot_t dma_pgprot(struct device *dev, pgprot_t prot, 
unsigned long attrs)
  }
  #endif /* CONFIG_MMU */
  
-/*
- * Create userspace mapping for the DMA-coherent memory.
- */
-int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
-   void *cpu_addr, dma_addr_t dma_addr, size_t size,
-   unsigned long attrs)
-{
-#ifdef CONFIG_MMU
-   unsigned long user_count = vma_pages(vma);
-   unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-   unsigned long off = vma->vm_pgoff;
-   int ret = -ENXIO;
-
-   vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);
-
-	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
-   return ret;
-
-   if (off >= count || user_count > count - off)
-   return -ENXIO;
-
-   return remap_pfn_range(vma, vma->vm_start,
-   page_to_pfn(virt_to_page(cpu_addr)) + vma->vm_pgoff,
-   user_count << PAGE_SHIFT, vma->vm_page_prot);
-#else
-   return -ENXIO;
-#endif /* CONFIG_MMU */
-}
-
  /**
   * dma_can_mmap - check if a given device supports dma_mmap_*
   * @dev: device to check
diff --git a/kernel/dma/ops_helpers.c b/kernel/dma/ops_helpers.c
new file mode 100644
index 00..e443c69be4299f
--- /dev/null
+++ b/kernel/dma/ops_helpers.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Helpers for DMA ops implementations.  These generally rely on the fact that
+ * the allocated memory contains normal pages in the direct kernel mapping.
+ */
+#include 
+
+/*
+ * Create scatter-list for the already allocated DMA buffer.
+ */
+int dma_common_get_sgtable(struct device *dev, struct sg_table *sgt,
+void *cpu_addr, dma_addr_t dma_addr, size_t size,
+unsigned long attrs)
+{
+   struct page *page = virt_to_page(cpu_addr);
+   int ret;
+
+   ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
+   if (!ret)
+   sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
+   return ret;
+}
+
+/*
+ * Create userspace mapping for the DMA-coherent memory.
+ */
+int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
+   void *cpu_addr, dma_addr_t dma_addr, size_t size,
+   unsigned long attrs)
+{
+#ifdef CONFIG_MMU
+   unsigned long user_count = vma_pages(vma);
+   unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+   unsigned long off = vma->vm_pgoff;
+   int ret = -ENXIO;
+
+   vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);
+
+	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
+   return ret;
+
+   if (off >= count || user_count > count - off)
+   return -ENXIO;
+
+   return remap_pfn_range(vma, vma->vm_start,
+   page_to_pfn(virt_to_page(cpu_addr)) + vma->vm_pgoff,
+

Re: [Intel-gfx] [PATCH 0/8] Convert the intel iommu driver to the dma-iommu api

2020-09-10 Thread Tom Murphy
On Wed, 9 Sep 2020 at 13:56, Tvrtko Ursulin
 wrote:
>
>
> On 09/09/2020 10:16, Tvrtko Ursulin wrote:
> > On 08/09/2020 23:43, Tom Murphy wrote:
> >> On Tue, 8 Sep 2020 at 16:56, Tvrtko Ursulin
> >>  wrote:
> >>> On 08/09/2020 16:44, Logan Gunthorpe wrote:
>  On 2020-09-08 9:28 a.m., Tvrtko Ursulin wrote:
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h
> >> b/drivers/gpu/drm/i915/i915
> >> index b7b59328cb76..9367ac801f0c 100644
> >> --- a/drivers/gpu/drm/i915/i915_scatterlist.h
> >> +++ b/drivers/gpu/drm/i915/i915_scatterlist.h
> >> @@ -27,13 +27,19 @@ static __always_inline struct sgt_iter {
> >> } __sgt_iter(struct scatterlist *sgl, bool dma) {
> >>struct sgt_iter s = { .sgp = sgl };
> >>
> >> +   if (sgl && !sg_dma_len(s.sgp))
> >
> > I'd extend the condition to be, just to be safe:
> >   if (dma && sgl && !sg_dma_len(s.sgp))
> >
> 
>  Right, good catch, that's definitely necessary.
> 
> >> +   s.sgp = NULL;
> >> +
> >>if (s.sgp) {
> >>s.max = s.curr = s.sgp->offset;
> >> -   s.max += s.sgp->length;
> >> -   if (dma)
> >> +
> >> +   if (dma) {
> >> +   s.max += sg_dma_len(s.sgp);
> >>s.dma = sg_dma_address(s.sgp);
> >> -   else
> >> +   } else {
> >> +   s.max += s.sgp->length;
> >>s.pfn = page_to_pfn(sg_page(s.sgp));
> >> +   }
> >
> > Otherwise has this been tested or alternatively how to test it?
> > (How to
> > repro the issue.)
> 
>  It has not been tested. To test it, you need Tom's patch set without
>  the
>  last "DO NOT MERGE" patch:
> 
>  https://lkml.kernel.org/lkml/20200907070035.ga25...@infradead.org/T/
> >>>
> >>> Tom, do you have a branch somewhere I could pull from? (Just being lazy
> >>> about downloading a bunch of messages from the archives.)
> >>
> >> I don't unfortunately. I'm working locally with poor internet.
> >>
> >>>
> >>> What GPU is in your Lenovo x1 carbon 5th generation and what
> >>> graphical/desktop setup I need to repro?
> >>
> >>
> >> Is this enough info?:
> >>
> >> $ lspci -vnn | grep VGA -A 12
> >> 00:02.0 VGA compatible controller [0300]: Intel Corporation HD
> >> Graphics 620 [8086:5916] (rev 02) (prog-if 00 [VGA controller])
> >>  Subsystem: Lenovo ThinkPad X1 Carbon 5th Gen [17aa:224f]
> >>  Flags: bus master, fast devsel, latency 0, IRQ 148
> >>  Memory at eb00 (64-bit, non-prefetchable) [size=16M]
> >>  Memory at 6000 (64-bit, prefetchable) [size=256M]
> >>  I/O ports at e000 [size=64]
> >>  [virtual] Expansion ROM at 000c [disabled] [size=128K]
> >>  Capabilities: [40] Vendor Specific Information: Len=0c 
> >>  Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
> >>  Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
> >>  Capabilities: [d0] Power Management version 2
> >>  Capabilities: [100] Process Address Space ID (PASID)
> >>  Capabilities: [200] Address Translation Service (ATS)
> >
> > Works for a start. What about the steps to repro? Any desktop
> > environment and it is just visual corruption, no hangs/stalls or such?
> >
> > I've submitted a series consisting of what I understood are the patches
> > needed to repro the issue to our automated CI here:
> >
> > https://patchwork.freedesktop.org/series/81489/
> >
> > So will see if it will catch something, or more targeted testing will be
> > required. Hopefully it does trip over in which case I can add the patch
> > suggested by Logan on top and see if that fixes it. Or I'll need to
> > write a new test case.
> >
> > If you could glance over my series to check I identified the patches
> > correctly it would be appreciated.
>
> Our CI was more than capable at catching the breakage so I've copied you
> on a patch (https://patchwork.freedesktop.org/series/81497/) which has a
> good potential to fix this. (Or improve the robustness of our sg walks,
> depends how you look at it.)
>
> Would you be able to test it in your environment by any chance? If it
> works I understand it unblocks your IOMMU work, right?

I tested your latest patch set ([PATCH 1/2] drm/i915: Fix DMA mapped
scatterlist walks) and it fixes the issue. Great work!

>
> Regards,
>
> Tvrtko
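
For readers following the fix: combining the hunk quoted above with Logan's
"dma &&" guard gives roughly the following, a self-contained sketch of the
discussed change rather than the patch as merged:

#include <linux/scatterlist.h>

struct sgt_iter {
	struct scatterlist *sgp;
	union {
		unsigned long pfn;
		dma_addr_t dma;
	};
	unsigned int curr;
	unsigned int max;
};

static inline struct sgt_iter __sgt_iter(struct scatterlist *sgl, bool dma)
{
	struct sgt_iter s = { .sgp = sgl };

	/*
	 * IOMMU coalescing can leave trailing entries with a zero DMA
	 * length; treat the first such entry as the end of the DMA walk.
	 */
	if (dma && sgl && !sg_dma_len(s.sgp))
		s.sgp = NULL;

	if (s.sgp) {
		s.max = s.curr = s.sgp->offset;
		if (dma) {
			s.max += sg_dma_len(s.sgp);
			s.dma = sg_dma_address(s.sgp);
		} else {
			s.max += s.sgp->length;
			s.pfn = page_to_pfn(sg_page(s.sgp));
		}
	}
	return s;
}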


Re: [PATCH 10/12] dma-direct: rename and cleanup __phys_to_dma

2020-09-10 Thread Robin Murphy

On 2020-09-08 17:47, Christoph Hellwig wrote:

The __phys_to_dma vs phys_to_dma distinction isn't exactly obvious.  Try
to improve the situation by renaming __phys_to_dma to
phys_to_dma_unencrypted, and not forcing architectures that want to
override phys_to_dma to actually provide __phys_to_dma.
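
For illustration, the relationship the rename establishes can be sketched as
below (reconstructed from the description above; treat it as a sketch, not
the literal kernel code). Architectures that provide their own phys_to_dma
get phys_to_dma_unencrypted defaulted to it; the generic case is:

	static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
	{
		/* layer the memory-encryption bit on top of the raw translation */
		return __sme_set(phys_to_dma_unencrypted(dev, paddr));
	}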


Reviewed-by: Robin Murphy 


Signed-off-by: Christoph Hellwig 
---
  arch/arm/include/asm/dma-direct.h  |  2 +-
  arch/mips/bmips/dma.c  |  2 +-
  arch/mips/cavium-octeon/dma-octeon.c   |  2 +-
  arch/mips/include/asm/dma-direct.h |  2 +-
  arch/mips/loongson2ef/fuloong-2e/dma.c |  2 +-
  arch/mips/loongson2ef/lemote-2f/dma.c  |  2 +-
  arch/mips/loongson64/dma.c |  2 +-
  arch/mips/pci/pci-ar2315.c |  2 +-
  arch/mips/pci/pci-xtalk-bridge.c   |  2 +-
  arch/mips/sgi-ip32/ip32-dma.c  |  2 +-
  arch/powerpc/include/asm/dma-direct.h  |  2 +-
  drivers/iommu/intel/iommu.c|  2 +-
  include/linux/dma-direct.h | 28 +++---
  kernel/dma/direct.c|  8 
  kernel/dma/swiotlb.c   |  4 ++--
  15 files changed, 34 insertions(+), 30 deletions(-)

diff --git a/arch/arm/include/asm/dma-direct.h 
b/arch/arm/include/asm/dma-direct.h
index a8cee87a93e8ab..bca0de56753439 100644
--- a/arch/arm/include/asm/dma-direct.h
+++ b/arch/arm/include/asm/dma-direct.h
@@ -2,7 +2,7 @@
  #ifndef ASM_ARM_DMA_DIRECT_H
  #define ASM_ARM_DMA_DIRECT_H 1
  
-static inline dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)

+static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
  {
unsigned int offset = paddr & ~PAGE_MASK;
return pfn_to_dma(dev, __phys_to_pfn(paddr)) + offset;
diff --git a/arch/mips/bmips/dma.c b/arch/mips/bmips/dma.c
index ba2a5d33dfd3fa..49061b870680b9 100644
--- a/arch/mips/bmips/dma.c
+++ b/arch/mips/bmips/dma.c
@@ -40,7 +40,7 @@ static struct bmips_dma_range *bmips_dma_ranges;
  
  #define FLUSH_RAC		0x100
  
-dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t pa)

+dma_addr_t phys_to_dma(struct device *dev, phys_addr_t pa)
  {
struct bmips_dma_range *r;
  
diff --git a/arch/mips/cavium-octeon/dma-octeon.c b/arch/mips/cavium-octeon/dma-octeon.c

index 388b13ba2558c2..232fa1017b1ec9 100644
--- a/arch/mips/cavium-octeon/dma-octeon.c
+++ b/arch/mips/cavium-octeon/dma-octeon.c
@@ -168,7 +168,7 @@ void __init octeon_pci_dma_init(void)
  }
  #endif /* CONFIG_PCI */
  
-dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)

+dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
  {
  #ifdef CONFIG_PCI
if (dev && dev_is_pci(dev))
diff --git a/arch/mips/include/asm/dma-direct.h 
b/arch/mips/include/asm/dma-direct.h
index 8e178651c638c2..9a640118316c9d 100644
--- a/arch/mips/include/asm/dma-direct.h
+++ b/arch/mips/include/asm/dma-direct.h
@@ -2,7 +2,7 @@
  #ifndef _MIPS_DMA_DIRECT_H
  #define _MIPS_DMA_DIRECT_H 1
  
-dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr);

+dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr);
  phys_addr_t dma_to_phys(struct device *dev, dma_addr_t daddr);
  
  #endif /* _MIPS_DMA_DIRECT_H */

diff --git a/arch/mips/loongson2ef/fuloong-2e/dma.c 
b/arch/mips/loongson2ef/fuloong-2e/dma.c
index 83fadeb3fd7d56..cea167d8aba8db 100644
--- a/arch/mips/loongson2ef/fuloong-2e/dma.c
+++ b/arch/mips/loongson2ef/fuloong-2e/dma.c
@@ -1,7 +1,7 @@
  // SPDX-License-Identifier: GPL-2.0
  #include 
  
-dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)

+dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
  {
return paddr | 0x8000;
  }
diff --git a/arch/mips/loongson2ef/lemote-2f/dma.c 
b/arch/mips/loongson2ef/lemote-2f/dma.c
index 302b43a14eee74..3c9e994563578c 100644
--- a/arch/mips/loongson2ef/lemote-2f/dma.c
+++ b/arch/mips/loongson2ef/lemote-2f/dma.c
@@ -1,7 +1,7 @@
  // SPDX-License-Identifier: GPL-2.0
  #include 
  
-dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)

+dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
  {
return paddr | 0x8000;
  }
diff --git a/arch/mips/loongson64/dma.c b/arch/mips/loongson64/dma.c
index b3dc5d0bd2b113..364f2f27c8723f 100644
--- a/arch/mips/loongson64/dma.c
+++ b/arch/mips/loongson64/dma.c
@@ -4,7 +4,7 @@
  #include 
  #include 
  
-dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)

+dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
  {
/* We extract 2bit node id (bit 44~47, only bit 44~45 used now) from
 * Loongson-3's 48bit address space and embed it into 40bit */
diff --git a/arch/mips/pci/pci-ar2315.c b/arch/mips/pci/pci-ar2315.c
index d88395684f487d..cef4a47ab06311 100644
--- a/arch/mips/pci/pci-ar2315.c
+++ b/arch/mips/pci/pci-ar2315.c
@@ -170,7 +170,7 @@ static inline dma_addr_t ar2315_dev_offset(struct device 
*dev)
return 0;
  }
  
-dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)


Re: [PATCH 09/12] dma-direct: remove __dma_to_phys

2020-09-10 Thread Robin Murphy

On 2020-09-08 17:47, Christoph Hellwig wrote:

There is no harm in just always clearing the SME encryption bit, while
significantly simplifying the interface.


After a 10-minute diversion into "but hang on, force_dma_unencrypted() 
is meaningful on PPC and S390 too..." before realising that it all does 
just come back to __sme_clr(), which is indeed a no-op for everyone 
other than AMD, any simplification of this mess is indeed welcome :)


Unless I've massively misunderstood how SME is supposed to work,

Reviewed-by: Robin Murphy 
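
For illustration, the simplified result looks roughly like the following,
where arch_dma_to_phys() is only a placeholder for whatever arch or offset
translation applies (not a real kernel function):

	static inline phys_addr_t dma_to_phys(struct device *dev, dma_addr_t dma_addr)
	{
		phys_addr_t paddr = arch_dma_to_phys(dev, dma_addr);

		/* __sme_clr() is a no-op everywhere except AMD SME */
		return __sme_clr(paddr);
	}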


Signed-off-by: Christoph Hellwig 
---
  arch/arm/include/asm/dma-direct.h  |  2 +-
  arch/mips/bmips/dma.c  |  2 +-
  arch/mips/cavium-octeon/dma-octeon.c   |  2 +-
  arch/mips/include/asm/dma-direct.h |  2 +-
  arch/mips/loongson2ef/fuloong-2e/dma.c |  2 +-
  arch/mips/loongson2ef/lemote-2f/dma.c  |  2 +-
  arch/mips/loongson64/dma.c |  2 +-
  arch/mips/pci/pci-ar2315.c |  2 +-
  arch/mips/pci/pci-xtalk-bridge.c   |  2 +-
  arch/mips/sgi-ip32/ip32-dma.c  |  2 +-
  arch/powerpc/include/asm/dma-direct.h  |  2 +-
  include/linux/dma-direct.h | 14 +-
  kernel/dma/direct.c|  6 +-
  13 files changed, 17 insertions(+), 25 deletions(-)

diff --git a/arch/arm/include/asm/dma-direct.h 
b/arch/arm/include/asm/dma-direct.h
index 7c3001a6a775bf..a8cee87a93e8ab 100644
--- a/arch/arm/include/asm/dma-direct.h
+++ b/arch/arm/include/asm/dma-direct.h
@@ -8,7 +8,7 @@ static inline dma_addr_t __phys_to_dma(struct device *dev, 
phys_addr_t paddr)
return pfn_to_dma(dev, __phys_to_pfn(paddr)) + offset;
  }
  
-static inline phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dev_addr)

+static inline phys_addr_t dma_to_phys(struct device *dev, dma_addr_t dev_addr)
  {
unsigned int offset = dev_addr & ~PAGE_MASK;
return __pfn_to_phys(dma_to_pfn(dev, dev_addr)) + offset;
diff --git a/arch/mips/bmips/dma.c b/arch/mips/bmips/dma.c
index df56bf4179e347..ba2a5d33dfd3fa 100644
--- a/arch/mips/bmips/dma.c
+++ b/arch/mips/bmips/dma.c
@@ -52,7 +52,7 @@ dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t pa)
return pa;
  }
  
-phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dma_addr)

+phys_addr_t dma_to_phys(struct device *dev, dma_addr_t dma_addr)
  {
struct bmips_dma_range *r;
  
diff --git a/arch/mips/cavium-octeon/dma-octeon.c b/arch/mips/cavium-octeon/dma-octeon.c

index 14ea680d180e07..388b13ba2558c2 100644
--- a/arch/mips/cavium-octeon/dma-octeon.c
+++ b/arch/mips/cavium-octeon/dma-octeon.c
@@ -177,7 +177,7 @@ dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t 
paddr)
return paddr;
  }
  
-phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t daddr)

+phys_addr_t dma_to_phys(struct device *dev, dma_addr_t daddr)
  {
  #ifdef CONFIG_PCI
if (dev && dev_is_pci(dev))
diff --git a/arch/mips/include/asm/dma-direct.h 
b/arch/mips/include/asm/dma-direct.h
index 14e352651ce946..8e178651c638c2 100644
--- a/arch/mips/include/asm/dma-direct.h
+++ b/arch/mips/include/asm/dma-direct.h
@@ -3,6 +3,6 @@
  #define _MIPS_DMA_DIRECT_H 1
  
  dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr);

-phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t daddr);
+phys_addr_t dma_to_phys(struct device *dev, dma_addr_t daddr);
  
  #endif /* _MIPS_DMA_DIRECT_H */

diff --git a/arch/mips/loongson2ef/fuloong-2e/dma.c 
b/arch/mips/loongson2ef/fuloong-2e/dma.c
index e122292bf6660a..83fadeb3fd7d56 100644
--- a/arch/mips/loongson2ef/fuloong-2e/dma.c
+++ b/arch/mips/loongson2ef/fuloong-2e/dma.c
@@ -6,7 +6,7 @@ dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
return paddr | 0x8000;
  }
  
-phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dma_addr)

+phys_addr_t dma_to_phys(struct device *dev, dma_addr_t dma_addr)
  {
return dma_addr & 0x7fff;
  }
diff --git a/arch/mips/loongson2ef/lemote-2f/dma.c 
b/arch/mips/loongson2ef/lemote-2f/dma.c
index abf0e39d7e4696..302b43a14eee74 100644
--- a/arch/mips/loongson2ef/lemote-2f/dma.c
+++ b/arch/mips/loongson2ef/lemote-2f/dma.c
@@ -6,7 +6,7 @@ dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
return paddr | 0x8000;
  }
  
-phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dma_addr)

+phys_addr_t dma_to_phys(struct device *dev, dma_addr_t dma_addr)
  {
if (dma_addr > 0x8fff)
return dma_addr;
diff --git a/arch/mips/loongson64/dma.c b/arch/mips/loongson64/dma.c
index dbfe6e82fddd1c..b3dc5d0bd2b113 100644
--- a/arch/mips/loongson64/dma.c
+++ b/arch/mips/loongson64/dma.c
@@ -13,7 +13,7 @@ dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t 
paddr)
return ((nid << 44) ^ paddr) | (nid << node_id_offset);
  }
  
-phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t daddr)

+phys_addr_t dma_to_phys(struct device *dev, dma_addr_t daddr)
  {
/* We extract 2bit node id (bit 

Re: [PATCH 08/12] dma-direct: use phys_to_dma_direct in dma_direct_alloc

2020-09-10 Thread Robin Murphy

On 2020-09-08 17:47, Christoph Hellwig wrote:

Replace the currently open-coded copy.


Reviewed-by: Robin Murphy 


Signed-off-by: Christoph Hellwig 
---
  kernel/dma/direct.c | 5 +
  1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 12e9f5f75dfe4b..57a6e7d7cf8f16 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -240,10 +240,7 @@ void *dma_direct_alloc(struct device *dev, size_t size,
goto out_encrypt_pages;
}
  done:
-   if (force_dma_unencrypted(dev))
-   *dma_handle = __phys_to_dma(dev, page_to_phys(page));
-   else
-   *dma_handle = phys_to_dma(dev, page_to_phys(page));
+   *dma_handle = phys_to_dma_direct(dev, page_to_phys(page));
return ret;
  
  out_encrypt_pages:
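
For reference, the helper being reused mirrors the open-coded copy removed
above; reconstructed from that copy, it is roughly:

	static inline dma_addr_t phys_to_dma_direct(struct device *dev,
			phys_addr_t phys)
	{
		if (force_dma_unencrypted(dev))
			return __phys_to_dma(dev, phys);	/* no SME C-bit */
		return phys_to_dma(dev, phys);
	}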





Re: [PATCH 07/12] dma-direct: lift gfp_t manipulation out of__dma_direct_alloc_pages

2020-09-10 Thread Robin Murphy

On 2020-09-08 17:47, Christoph Hellwig wrote:

Move the detailed gfp_t setup from __dma_direct_alloc_pages into the
caller to clean things up a little.


Other than a mild nitpick that it might be nicer to spend one extra line 
to keep both gfp adjustments next to each other,


Reviewed-by: Robin Murphy 


Signed-off-by: Christoph Hellwig 
---
  kernel/dma/direct.c | 12 +---
  1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 1d564bea58571b..12e9f5f75dfe4b 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -108,7 +108,7 @@ static inline bool dma_should_free_from_pool(struct device 
*dev,
  }
  
  static struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,

-   gfp_t gfp, unsigned long attrs)
+   gfp_t gfp)
  {
int node = dev_to_node(dev);
struct page *page = NULL;
@@ -116,11 +116,6 @@ static struct page *__dma_direct_alloc_pages(struct device 
*dev, size_t size,
  
  	WARN_ON_ONCE(!PAGE_ALIGNED(size));
  
-	if (attrs & DMA_ATTR_NO_WARN)

-   gfp |= __GFP_NOWARN;
-
-   /* we always manually zero the memory once we are done: */
-   gfp &= ~__GFP_ZERO;
gfp |= dma_direct_optimal_gfp_mask(dev, dev->coherent_dma_mask,
   &phys_limit);
page = dma_alloc_contiguous(dev, size, gfp);
@@ -164,6 +159,8 @@ void *dma_direct_alloc(struct device *dev, size_t size,
return arch_dma_alloc(dev, size, dma_handle, gfp, attrs);
  
  	size = PAGE_ALIGN(size);

+   if (attrs & DMA_ATTR_NO_WARN)
+   gfp |= __GFP_NOWARN;
  
  	if (dma_should_alloc_from_pool(dev, gfp, attrs)) {

u64 phys_mask;
@@ -177,7 +174,8 @@ void *dma_direct_alloc(struct device *dev, size_t size,
goto done;
}
  
-	page = __dma_direct_alloc_pages(dev, size, gfp, attrs);

+   /* we always manually zero the memory once we are done */
+   page = __dma_direct_alloc_pages(dev, size, gfp & ~__GFP_ZERO);
if (!page)
return NULL;
  




Re: [PATCH 06/12] dma-direct: remove dma_direct_{alloc,free}_pages

2020-09-10 Thread Robin Murphy

On 2020-09-08 17:47, Christoph Hellwig wrote:

Just merge these helpers into the main dma_direct_{alloc,free} routines,
as the additional checks are always false for the two callers.


Reviewed-by: Robin Murphy 


Signed-off-by: Christoph Hellwig 
---
  arch/x86/kernel/amd_gart_64.c |  6 +++---
  include/linux/dma-direct.h|  4 
  kernel/dma/direct.c   | 39 ++-
  kernel/dma/pool.c |  2 +-
  4 files changed, 19 insertions(+), 32 deletions(-)

diff --git a/arch/x86/kernel/amd_gart_64.c b/arch/x86/kernel/amd_gart_64.c
index bccc5357bffd6c..153374b996a279 100644
--- a/arch/x86/kernel/amd_gart_64.c
+++ b/arch/x86/kernel/amd_gart_64.c
@@ -467,7 +467,7 @@ gart_alloc_coherent(struct device *dev, size_t size, 
dma_addr_t *dma_addr,
  {
void *vaddr;
  
-	vaddr = dma_direct_alloc_pages(dev, size, dma_addr, flag, attrs);

+   vaddr = dma_direct_alloc(dev, size, dma_addr, flag, attrs);
if (!vaddr ||
!force_iommu || dev->coherent_dma_mask <= DMA_BIT_MASK(24))
return vaddr;
@@ -479,7 +479,7 @@ gart_alloc_coherent(struct device *dev, size_t size, 
dma_addr_t *dma_addr,
goto out_free;
return vaddr;
  out_free:
-   dma_direct_free_pages(dev, size, vaddr, *dma_addr, attrs);
+   dma_direct_free(dev, size, vaddr, *dma_addr, attrs);
return NULL;
  }
  
@@ -489,7 +489,7 @@ gart_free_coherent(struct device *dev, size_t size, void *vaddr,

   dma_addr_t dma_addr, unsigned long attrs)
  {
gart_unmap_page(dev, dma_addr, size, DMA_BIDIRECTIONAL, 0);
-   dma_direct_free_pages(dev, size, vaddr, dma_addr, attrs);
+   dma_direct_free(dev, size, vaddr, dma_addr, attrs);
  }
  
  static int no_agp;

diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 95e3e28bd93f47..20eceb2e4f91f8 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -77,10 +77,6 @@ void *dma_direct_alloc(struct device *dev, size_t size, 
dma_addr_t *dma_handle,
gfp_t gfp, unsigned long attrs);
  void dma_direct_free(struct device *dev, size_t size, void *cpu_addr,
dma_addr_t dma_addr, unsigned long attrs);
-void *dma_direct_alloc_pages(struct device *dev, size_t size,
-   dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs);
-void dma_direct_free_pages(struct device *dev, size_t size, void *cpu_addr,
-   dma_addr_t dma_addr, unsigned long attrs);
  int dma_direct_get_sgtable(struct device *dev, struct sg_table *sgt,
void *cpu_addr, dma_addr_t dma_addr, size_t size,
unsigned long attrs);
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 949c1cbf08b2d5..1d564bea58571b 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -151,13 +151,18 @@ static struct page *__dma_direct_alloc_pages(struct 
device *dev, size_t size,
return page;
  }
  
-void *dma_direct_alloc_pages(struct device *dev, size_t size,

+void *dma_direct_alloc(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
  {
struct page *page;
void *ret;
int err;
  
+	if (!IS_ENABLED(CONFIG_ARCH_HAS_DMA_SET_UNCACHED) &&

+   !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
+   dma_alloc_need_uncached(dev, attrs))
+   return arch_dma_alloc(dev, size, dma_handle, gfp, attrs);
+
size = PAGE_ALIGN(size);
  
  	if (dma_should_alloc_from_pool(dev, gfp, attrs)) {

@@ -256,11 +261,18 @@ void *dma_direct_alloc_pages(struct device *dev, size_t 
size,
return NULL;
  }
  
-void dma_direct_free_pages(struct device *dev, size_t size, void *cpu_addr,

-   dma_addr_t dma_addr, unsigned long attrs)
+void dma_direct_free(struct device *dev, size_t size,
+   void *cpu_addr, dma_addr_t dma_addr, unsigned long attrs)
  {
unsigned int page_order = get_order(size);
  
+	if (!IS_ENABLED(CONFIG_ARCH_HAS_DMA_SET_UNCACHED) &&

+   !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
+   dma_alloc_need_uncached(dev, attrs)) {
+   arch_dma_free(dev, size, cpu_addr, dma_addr, attrs);
+   return;
+   }
+
/* If cpu_addr is not from an atomic pool, dma_free_from_pool() fails */
if (dma_should_free_from_pool(dev, attrs) &&
dma_free_from_pool(dev, cpu_addr, PAGE_ALIGN(size)))
@@ -284,27 +296,6 @@ void dma_direct_free_pages(struct device *dev, size_t 
size, void *cpu_addr,
dma_free_contiguous(dev, dma_direct_to_page(dev, dma_addr), size);
  }
  
-void *dma_direct_alloc(struct device *dev, size_t size,

-   dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
-{
-   if (!IS_ENABLED(CONFIG_ARCH_HAS_DMA_SET_UNCACHED) &&
-   !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
-   dma_alloc_need_uncached(dev, attrs))
-   return arch_dma_alloc(dev, size, dma_handle, gfp, attrs);
-

Re: [PATCH 04/12] dma-mapping: fix DMA_OPS dependencies

2020-09-10 Thread Robin Murphy

On 2020-09-08 17:47, Christoph Hellwig wrote:

Driver that select DMA_OPS need to depend on HAS_DMA support to
work.  The vop driver was missing that dependency, so add it, and also
add a nother depends in DMA_OPS itself.  That won't fix the issue due
to how the Kconfig dependencies work, but at least produce a warning
about unmet dependencies.

Signed-off-by: Christoph Hellwig 
---
  drivers/misc/mic/Kconfig | 1 +
  kernel/dma/Kconfig   | 1 +
  2 files changed, 2 insertions(+)

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index b9bb086785db48..8a7c2c5711d5f4 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -35,6 +35,7 @@ config SCIF_BUS
  
  config VOP_BUS

tristate "VOP Bus Driver"
+   depends on HAS_DMA
select DMA_OPS


AFAICS all three of these bus drivers are only proxying a struct 
dma_map_ops * pointer around, so if they used the set_dma_ops() helper 
they shouldn't even need these selects at all. Only INTEL_MIC_HOST 
appears to have a logical dependency on DMA_OPS for actual functionality.
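
A minimal sketch of that suggestion (the function and parameters here are
hypothetical), for a bus device that merely inherits its parent's DMA ops
instead of selecting DMA_OPS:

	#include <linux/dma-mapping.h>

	static void bus_inherit_dma_ops(struct device *dev, struct device *parent)
	{
		/* proxy the parent's ops; no direct dependency on DMA_OPS */
		set_dma_ops(dev, get_dma_ops(parent));
	}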


However, I have a vague feeling you might not be fond of those dma_ops 
helpers, and I have no great objection to this one-liner as-is, so 
(modulo the couple of commit message typos),


Reviewed-by: Robin Murphy 

(of course the hunk below is unquestionably OK)

Robin.


help
  This option is selected by any driver which registers a
diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index 0ddfb5510fe45f..e7b801649f6574 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -9,6 +9,7 @@ config HAS_DMA
default y
  
  config DMA_OPS

+   depends on HAS_DMA
bool
  
  #





[PATCH] Handle init_iova_flush_queue failure in dma-iommu path

2020-09-10 Thread Tom Murphy
init_iova_flush_queue can fail if we run out of memory. Fall back to the
noflush queue if it fails.

Signed-off-by: Tom Murphy 
---
 drivers/iommu/dma-iommu.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 4959f5df21bd..5f69126f3e91 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -343,8 +343,11 @@ static int iommu_dma_init_domain(struct iommu_domain 
*domain, dma_addr_t base,
 
if (!cookie->fq_domain && !iommu_domain_get_attr(domain,
DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE, &attr) && attr) {
-   cookie->fq_domain = domain;
-   init_iova_flush_queue(iovad, iommu_dma_flush_iotlb_all, NULL);
+   if (init_iova_flush_queue(iovad, iommu_dma_flush_iotlb_all,
+   NULL))
+   pr_warn("iova flush queue initialization failed\n");
+   else
+   cookie->fq_domain = domain;
}
 
if (!dev)
-- 
2.20.1
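
For context, the fallback this patch arranges: with cookie->fq_domain left
NULL, iommu_dma_free_iova() takes the strict/synchronous path rather than
the deferred flush queue. Roughly (a sketch, not the exact kernel code):

	if (cookie->fq_domain)	/* deferred invalidation via the flush queue */
		queue_iova(iovad, iova_pfn(iovad, iova), size >> iova_shift(iovad), 0);
	else			/* strict: the IOTLB was already invalidated synchronously */
		free_iova_fast(iovad, iova_pfn(iovad, iova), size >> iova_shift(iovad));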



Re: [PATCH 2/2] dma-debug: remove all exports

2020-09-10 Thread Christoph Hellwig
On Thu, Sep 10, 2020 at 12:45:09PM +0100, Robin Murphy wrote:
> FWIW I did briefly look at how much of dma-debug we could drop from 
> dma-mapping.h and make entirely private to kernel/dma/, but couldn't 
> convince myself that an awkward split with a few calls still needing to be 
> public would be worthwhile.

I've started to look into a major dma header reshuffle: basically
have dma-mapping.h only contain the API needed by consumers of the
DMA API, a new linux/dma-map-ops.h for providers, and everything else
under kernel/dma/.  The initial prototype looks nice, but I need to
finish off a few loose ends.
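
Roughly, the intended split would be (an illustrative layout, not final
contents):

	include/linux/dma-mapping.h  - consumer API: dma_map_page(), dma_alloc_coherent(), ...
	include/linux/dma-map-ops.h  - provider API: struct dma_map_ops, set_dma_ops(), ...
	kernel/dma/*                 - everything else, private to the DMA core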

>> -EXPORT_SYMBOL(debug_dma_map_single);
>
> This is still called inline via dma_map_single_attrs(), no?
>
>>   }
>> -EXPORT_SYMBOL(debug_dma_mapping_error);
>
> Ditto this for dma_mapping_error(). We hardly want to discourage modules 
> from calling that ;)
>
> With those fixed (unless I've missed some other preceding change),

Also yes.  I actually fixed it up locally this morning, this is what
I have now:

---
From 8a310506789611f7a9426e270a8bac647d6f4b1b Mon Sep 17 00:00:00 2001
From: Christoph Hellwig 
Date: Tue, 8 Sep 2020 18:35:24 +0200
Subject: dma-debug: remove most exports

Now that the main dma mapping entry points are out of line, most of the
symbols in dma-debug.c can only be called from built-in code.  Remove
the unused exports.

Signed-off-by: Christoph Hellwig 
---
 kernel/dma/debug.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 8e9f7b301c6d39..6f53c2e03aa4f8 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -1235,7 +1235,6 @@ void debug_dma_map_page(struct device *dev, struct page 
*page, size_t offset,
 
add_dma_entry(entry);
 }
-EXPORT_SYMBOL(debug_dma_map_page);
 
 void debug_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
 {
@@ -1290,7 +1289,6 @@ void debug_dma_unmap_page(struct device *dev, dma_addr_t 
addr,
return;
	check_unmap(&ref);
 }
-EXPORT_SYMBOL(debug_dma_unmap_page);
 
 void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
  int nents, int mapped_ents, int direction)
@@ -1328,7 +1326,6 @@ void debug_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
add_dma_entry(entry);
}
 }
-EXPORT_SYMBOL(debug_dma_map_sg);
 
 static int get_nr_mapped_entries(struct device *dev,
 struct dma_debug_entry *ref)
@@ -1380,7 +1377,6 @@ void debug_dma_unmap_sg(struct device *dev, struct 
scatterlist *sglist,
	check_unmap(&ref);
}
 }
-EXPORT_SYMBOL(debug_dma_unmap_sg);
 
 void debug_dma_alloc_coherent(struct device *dev, size_t size,
  dma_addr_t dma_addr, void *virt)
@@ -1466,7 +1462,6 @@ void debug_dma_map_resource(struct device *dev, 
phys_addr_t addr, size_t size,
 
add_dma_entry(entry);
 }
-EXPORT_SYMBOL(debug_dma_map_resource);
 
 void debug_dma_unmap_resource(struct device *dev, dma_addr_t dma_addr,
  size_t size, int direction)
@@ -1484,7 +1479,6 @@ void debug_dma_unmap_resource(struct device *dev, 
dma_addr_t dma_addr,
 
	check_unmap(&ref);
 }
-EXPORT_SYMBOL(debug_dma_unmap_resource);
 
 void debug_dma_sync_single_for_cpu(struct device *dev, dma_addr_t dma_handle,
   size_t size, int direction)
@@ -1503,7 +1497,6 @@ void debug_dma_sync_single_for_cpu(struct device *dev, 
dma_addr_t dma_handle,
 
	check_sync(dev, &ref, true);
 }
-EXPORT_SYMBOL(debug_dma_sync_single_for_cpu);
 
 void debug_dma_sync_single_for_device(struct device *dev,
  dma_addr_t dma_handle, size_t size,
@@ -1523,7 +1516,6 @@ void debug_dma_sync_single_for_device(struct device *dev,
 
	check_sync(dev, &ref, false);
 }
-EXPORT_SYMBOL(debug_dma_sync_single_for_device);
 
 void debug_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
   int nelems, int direction)
@@ -1556,7 +1548,6 @@ void debug_dma_sync_sg_for_cpu(struct device *dev, struct 
scatterlist *sg,
	check_sync(dev, &ref, true);
}
 }
-EXPORT_SYMBOL(debug_dma_sync_sg_for_cpu);
 
 void debug_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
  int nelems, int direction)
@@ -1588,7 +1579,6 @@ void debug_dma_sync_sg_for_device(struct device *dev, 
struct scatterlist *sg,
	check_sync(dev, &ref, false);
}
 }
-EXPORT_SYMBOL(debug_dma_sync_sg_for_device);
 
 static int __init dma_debug_driver_setup(char *str)
 {
-- 
2.28.0



Re: [PATCH 2/2] dma-debug: remove all exports

2020-09-10 Thread Robin Murphy

On 2020-09-08 17:39, Christoph Hellwig wrote:

Now that the main dma mapping entry points are out of line, none
of these functions can be called from modular code.


FWIW I did briefly look at how much of dma-debug we could drop from 
dma-mapping.h and make entirely private to kernel/dma/, but couldn't 
convince myself that an awkward split with a few calls still needing to 
be public would be worthwhile.



Signed-off-by: Christoph Hellwig 
---
  kernel/dma/debug.c | 12 
  1 file changed, 12 deletions(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 8e9f7b301c6d39..9e5370e3700c08 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -1199,7 +1199,6 @@ void debug_dma_map_single(struct device *dev, const void 
*addr,
err_printk(dev, NULL, "device driver maps memory from vmalloc area 
[addr=%p] [len=%lu]\n",
   addr, len);
  }
-EXPORT_SYMBOL(debug_dma_map_single);


This is still called inline via dma_map_single_attrs(), no?
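
(For context, dma_map_single_attrs() is a static inline in the public header,
so the call expands into modules as well; its shape is roughly the following,
reconstructed from memory and to be treated as a sketch:

	static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
			size_t size, enum dma_data_direction dir, unsigned long attrs)
	{
		debug_dma_map_single(dev, ptr, size);	/* called from module code */
		return dma_map_page_attrs(dev, virt_to_page(ptr), offset_in_page(ptr),
					  size, dir, attrs);
	}

hence the export needs to stay.)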


  void debug_dma_map_page(struct device *dev, struct page *page, size_t offset,
size_t size, int direction, dma_addr_t dma_addr)
@@ -1235,7 +1234,6 @@ void debug_dma_map_page(struct device *dev, struct page 
*page, size_t offset,
  
  	add_dma_entry(entry);

  }
-EXPORT_SYMBOL(debug_dma_map_page);
  
  void debug_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)

  {
@@ -1273,7 +1271,6 @@ void debug_dma_mapping_error(struct device *dev, 
dma_addr_t dma_addr)
  
  	put_hash_bucket(bucket, &flags);

  }
-EXPORT_SYMBOL(debug_dma_mapping_error);


Ditto this for dma_mapping_error(). We hardly want to discourage modules 
from calling that ;)


With those fixed (unless I've missed some other preceding change),

Reviewed-by: Robin Murphy 

Cheers,
Robin.


  void debug_dma_unmap_page(struct device *dev, dma_addr_t addr,
  size_t size, int direction)
@@ -1290,7 +1287,6 @@ void debug_dma_unmap_page(struct device *dev, dma_addr_t 
addr,
return;
	check_unmap(&ref);
  }
-EXPORT_SYMBOL(debug_dma_unmap_page);
  
  void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,

  int nents, int mapped_ents, int direction)
@@ -1328,7 +1324,6 @@ void debug_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
add_dma_entry(entry);
}
  }
-EXPORT_SYMBOL(debug_dma_map_sg);
  
  static int get_nr_mapped_entries(struct device *dev,

 struct dma_debug_entry *ref)
@@ -1380,7 +1375,6 @@ void debug_dma_unmap_sg(struct device *dev, struct 
scatterlist *sglist,
	check_unmap(&ref);
}
  }
-EXPORT_SYMBOL(debug_dma_unmap_sg);
  
  void debug_dma_alloc_coherent(struct device *dev, size_t size,

  dma_addr_t dma_addr, void *virt)
@@ -1466,7 +1460,6 @@ void debug_dma_map_resource(struct device *dev, 
phys_addr_t addr, size_t size,
  
  	add_dma_entry(entry);

  }
-EXPORT_SYMBOL(debug_dma_map_resource);
  
  void debug_dma_unmap_resource(struct device *dev, dma_addr_t dma_addr,

  size_t size, int direction)
@@ -1484,7 +1477,6 @@ void debug_dma_unmap_resource(struct device *dev, 
dma_addr_t dma_addr,
  
  	check_unmap(&ref);

  }
-EXPORT_SYMBOL(debug_dma_unmap_resource);
  
  void debug_dma_sync_single_for_cpu(struct device *dev, dma_addr_t dma_handle,

   size_t size, int direction)
@@ -1503,7 +1495,6 @@ void debug_dma_sync_single_for_cpu(struct device *dev, 
dma_addr_t dma_handle,
  
  	check_sync(dev, &ref, true);

  }
-EXPORT_SYMBOL(debug_dma_sync_single_for_cpu);
  
  void debug_dma_sync_single_for_device(struct device *dev,

  dma_addr_t dma_handle, size_t size,
@@ -1523,7 +1514,6 @@ void debug_dma_sync_single_for_device(struct device *dev,
  
  	check_sync(dev, &ref, false);

  }
-EXPORT_SYMBOL(debug_dma_sync_single_for_device);
  
  void debug_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,

   int nelems, int direction)
@@ -1556,7 +1546,6 @@ void debug_dma_sync_sg_for_cpu(struct device *dev, struct 
scatterlist *sg,
	check_sync(dev, &ref, true);
}
  }
-EXPORT_SYMBOL(debug_dma_sync_sg_for_cpu);
  
  void debug_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,

  int nelems, int direction)
@@ -1588,7 +1577,6 @@ void debug_dma_sync_sg_for_device(struct device *dev, 
struct scatterlist *sg,
	check_sync(dev, &ref, false);
}
  }
-EXPORT_SYMBOL(debug_dma_sync_sg_for_device);
  
  static int __init dma_debug_driver_setup(char *str)

  {




Re: [PATCH 1/2] dma-mapping: remove the dma_dummy_ops export

2020-09-10 Thread Robin Murphy

On 2020-09-08 17:39, Christoph Hellwig wrote:

dma_dummy_ops is only used by the ACPI code, which can't be modular.


Reviewed-by: Robin Murphy 


Signed-off-by: Christoph Hellwig 
---
  kernel/dma/dummy.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/kernel/dma/dummy.c b/kernel/dma/dummy.c
index 05607642c888d9..6974b1bd7d0b88 100644
--- a/kernel/dma/dummy.c
+++ b/kernel/dma/dummy.c
@@ -36,4 +36,3 @@ const struct dma_map_ops dma_dummy_ops = {
.map_sg = dma_dummy_map_sg,
.dma_supported  = dma_dummy_supported,
  };
-EXPORT_SYMBOL(dma_dummy_ops);




[PATCH v7 10/16] vfio/type1: Support binding guest page tables to PASID

2020-09-10 Thread Liu Yi L
Nesting translation allows two levels/stages of page tables, with the 1st
level for guest translations (e.g. GVA->GPA) and the 2nd level for host
translations (e.g. GPA->HPA). This patch adds an interface for binding
guest page tables to a PASID. This PASID must have been allocated by the
userspace before the binding request.

Cc: Kevin Tian 
CC: Jacob Pan 
Cc: Alex Williamson 
Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Jean-Philippe Brucker 
Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
v6 -> v7:
*) introduced @user in struct domain_capsule to simplify the code per Eric's
   suggestion.
*) introduced VFIO_IOMMU_NESTING_OP_NUM for sanitizing op from userspace.
*) corrected the @argsz value of unbind_data in vfio_group_unbind_gpasid_fn().

v5 -> v6:
*) dropped vfio_find_nesting_group() and add vfio_get_nesting_domain_capsule().
   per comment from Eric.
*) use iommu_uapi_sva_bind/unbind_gpasid() and iommu_sva_unbind_gpasid() in
   linux/iommu.h for userspace operation and in-kernel operation.

v3 -> v4:
*) address comments from Alex on v3

v2 -> v3:
*) use __iommu_sva_unbind_gpasid() for unbind call issued by VFIO
   
https://lore.kernel.org/linux-iommu/1592931837-58223-6-git-send-email-jacob.jun@linux.intel.com/

v1 -> v2:
*) rename subject from "vfio/type1: Bind guest page tables to host"
*) remove VFIO_IOMMU_BIND, introduce VFIO_IOMMU_NESTING_OP to support bind/
   unbind guet page table
*) replaced vfio_iommu_for_each_dev() with a group level loop since this
   series enforces one group per container w/ nesting type as start.
*) rename vfio_bind/unbind_gpasid_fn() to vfio_dev_bind/unbind_gpasid_fn()
*) vfio_dev_unbind_gpasid() always successful
*) use vfio_mm->pasid_lock to avoid race between PASID free and page table
   bind/unbind
---
 drivers/vfio/vfio_iommu_type1.c | 163 
 drivers/vfio/vfio_pasid.c   |  26 +++
 include/linux/vfio.h|  20 +
 include/uapi/linux/vfio.h   |  36 +
 4 files changed, 245 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index bd4b668..11f1156 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -149,6 +149,39 @@ struct vfio_regions {
 #define DIRTY_BITMAP_PAGES_MAX  ((u64)INT_MAX)
 #define DIRTY_BITMAP_SIZE_MAX   DIRTY_BITMAP_BYTES(DIRTY_BITMAP_PAGES_MAX)
 
+struct domain_capsule {
+   struct vfio_group   *group;
+   struct iommu_domain *domain;
+   /* set if @data contains a user pointer */
+   booluser;
+   void*data;
+};
+
+/* iommu->lock must be held */
+static int vfio_prepare_nesting_domain_capsule(struct vfio_iommu *iommu,
+  struct domain_capsule *dc)
+{
+   struct vfio_domain *domain = NULL;
+   struct vfio_group *group = NULL;
+
+   if (!iommu->nesting_info)
+   return -EINVAL;
+
+   /*
+* Only support singleton container with nesting type. If
+* nesting_info is non-NULL, the container is non-empty.
+* Also domain is non-empty.
+*/
+   domain = list_first_entry(&iommu->domain_list,
+ struct vfio_domain, next);
+   group = list_first_entry(&domain->group_list,
+struct vfio_group, next);
+   dc->group = group;
+   dc->domain = domain->domain;
+   dc->user = true;
+   return 0;
+}
+
 static int put_pfn(unsigned long pfn, int prot);
 
 static struct vfio_group *vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
@@ -2405,6 +2438,49 @@ static int vfio_iommu_resv_refresh(struct vfio_iommu 
*iommu,
return ret;
 }
 
+static int vfio_dev_bind_gpasid_fn(struct device *dev, void *data)
+{
+   struct domain_capsule *dc = (struct domain_capsule *)data;
+   unsigned long arg = *(unsigned long *)dc->data;
+
+   return iommu_uapi_sva_bind_gpasid(dc->domain, dev,
+ (void __user *)arg);
+}
+
+static int vfio_dev_unbind_gpasid_fn(struct device *dev, void *data)
+{
+   struct domain_capsule *dc = (struct domain_capsule *)data;
+
+   if (dc->user) {
+   unsigned long arg = *(unsigned long *)dc->data;
+
+   iommu_uapi_sva_unbind_gpasid(dc->domain,
+dev, (void __user *)arg);
+   } else {
+   struct iommu_gpasid_bind_data *unbind_data =
+   (struct iommu_gpasid_bind_data *)dc->data;
+
+   iommu_sva_unbind_gpasid(dc->domain, dev, unbind_data);
+   }
+   return 0;
+}
+
+static void vfio_group_unbind_gpasid_fn(ioasid_t pasid, void *data)
+{
+   struct domain_capsule *dc = (struct domain_capsule *)data;
+   struct iommu_gpasid_bind_data unbind_data;
+
+   unbind_data.argsz = sizeof(struct iommu_gpasid_bind_data);
+   unbind_data.flags = 0;
+ 

[PATCH v7 14/16] vfio: Document dual stage control

2020-09-10 Thread Liu Yi L
From: Eric Auger 

The VFIO API was enhanced to support nested stage control: a bunch of
new ioctls and usage guideline.

Let's document the process to follow to set up nested mode.

Cc: Kevin Tian 
CC: Jacob Pan 
Cc: Alex Williamson 
Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Eric Auger 
Signed-off-by: Liu Yi L 
Reviewed-by: Stefan Hajnoczi 
---
v6 -> v7:
*) tweak per Eric's comments.

v5 -> v6:
*) tweak per Eric's comments.

v3 -> v4:
*) add review-by from Stefan Hajnoczi

v2 -> v3:
*) address comments from Stefan Hajnoczi

v1 -> v2:
*) new in v2, compared with Eric's original version, pasid table bind
   and fault reporting is removed as this series doesn't cover them.
   Original version from Eric.
   https://lore.kernel.org/kvm/20200320161911.27494-12-eric.au...@redhat.com/
---
 Documentation/driver-api/vfio.rst | 76 +++
 1 file changed, 76 insertions(+)

diff --git a/Documentation/driver-api/vfio.rst 
b/Documentation/driver-api/vfio.rst
index f1a4d3c..10851dd 100644
--- a/Documentation/driver-api/vfio.rst
+++ b/Documentation/driver-api/vfio.rst
@@ -239,6 +239,82 @@ group and can access them as follows::
/* Gratuitous device reset and go... */
ioctl(device, VFIO_DEVICE_RESET);
 
+IOMMU Dual Stage Control
+
+
+Some IOMMUs support 2 stages/levels of translation. Stage corresponds
+to the ARM terminology while level corresponds to Intel's terminology.
+In the following text we use either without distinction.
+
+This is useful when a virtual IOMMU is exposed to the guest and some
+devices are assigned to the guest through VFIO. Then the guest OS can
+use stage-1 (GIOVA -> GPA or GVA -> GPA), while the hypervisor uses
+stage-2 for VM isolation (GPA -> HPA).
+
+Under dual-stage translation, the guest gets ownership of either the
+stage-1 page tables alone or both the stage-1 configuration structures
+and page tables. This depends on the vendor: e.g. on Intel platforms the
+guest owns the stage-1 page tables under nesting, while on ARM the guest
+owns both the stage-1 configuration structures and the page tables. The
+hypervisor owns the root configuration structure (for security reasons),
+including the stage-2 configuration. This works as long as the
+configuration structures and page table formats are compatible between
+the virtual IOMMU and the physical IOMMU.
+
+Assuming the HW supports it, this nested mode is selected by choosing the
+VFIO_TYPE1_NESTING_IOMMU type through:
+
+ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_NESTING_IOMMU);
+
+This forces the hypervisor to use stage-2, leaving stage-1 available
+for guest usage.
+The stage-1 format and binding method are reported in nesting capability.
+(VFIO_IOMMU_TYPE1_INFO_CAP_NESTING) through VFIO_IOMMU_GET_INFO:
+
+ioctl(container->fd, VFIO_IOMMU_GET_INFO, &nesting_info);
+
+The nesting cap info is available only after NESTING_IOMMU is selected.
+If the underlying IOMMU doesn't support nesting, VFIO_SET_IOMMU fails and
+userspace should try other IOMMU types. Details of the nesting cap info
+can be found in Documentation/userspace-api/iommu.rst.
+
+Binding the stage-1 page table to the IOMMU differs per platform. On Intel,
+the stage-1 page table info is mediated by the userspace for each PASID.
+On ARM, the userspace directly passes the GPA of the whole PASID table.
+Currently only Intel's binding (IOMMU_NESTING_FEAT_BIND_PGTBL) is
+supported:
+
+nesting_op->flags = VFIO_IOMMU_NESTING_OP_BIND_PGTBL;
+memcpy(&nesting_op->data, &bind_data, sizeof(bind_data));
+ioctl(container->fd, VFIO_IOMMU_NESTING_OP, nesting_op);
+
+When multiple stage-1 page tables are supported on a device, each page
+table is associated with a PASID (Process Address Space ID) to distinguish
+it from the others. In that case, userspace should include the PASID in the
+bind_data when issuing a direct binding request.
+
+PASIDs may be managed per-device or system-wide which, again, depends on
+the IOMMU vendor and is reported in the nesting cap info. When a system-wide
+policy is reported (IOMMU_NESTING_FEAT_SYSWIDE_PASID), e.g. on Intel
+platforms, userspace *must* allocate the PASID from VFIO before attempting
+to bind a stage-1 page table:
+
+req.flags = VFIO_IOMMU_ALLOC_PASID;
+ioctl(container, VFIO_IOMMU_PASID_REQUEST, &req);
+
+Once the stage-1 page table is bound to the IOMMU, the guest is allowed to
+fully manage its mappings at its own disposal. The IOMMU walks the nested
+stage-1 and stage-2 page tables when serving DMA requests from the assigned
+device, and may cache the stage-1 mapping in the IOTLB. When required
+(IOMMU_NESTING_FEAT_CACHE_INVLD), userspace *must* forward guest stage-1
+invalidations to the host, so the IOTLB is invalidated:
+
+nesting_op->flags = VFIO_IOMMU_NESTING_OP_CACHE_INVLD;
+memcpy(&nesting_op->data, &cache_inv_data, sizeof(cache_inv_data));
+ioctl(container->fd, VFIO_IOMMU_NESTING_OP, nesting_op);
+
+Forwarded invalidations can happen at various granularity levels 

[PATCH v7 06/16] iommu/vt-d: Remove get_task_mm() in bind_gpasid()

2020-09-10 Thread Liu Yi L
This patch addresses a REVISIT comment. As an ioasid_set is now added to
the domain, the upper layer/VFIO can pass the ioasid_set to the iommu
driver and track the PASID ownership, so there is no need for
get_task_mm() in intel_svm_bind_gpasid().

Cc: Kevin Tian 
CC: Jacob Pan 
Cc: Alex Williamson 
Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Liu Yi L 
---
 drivers/iommu/intel/svm.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 519eabb..d3cf52b 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -400,12 +400,6 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, 
struct device *dev,
ret = -ENOMEM;
goto out;
}
-   /* REVISIT: upper layer/VFIO can track host process that bind
-* the PASID. ioasid_set = mm might be sufficient for vfio to
-* check pasid VMM ownership. We can drop the following line
-* once VFIO and IOASID set check is in place.
-*/
-   svm->mm = get_task_mm(current);
svm->pasid = data->hpasid;
if (data->flags & IOMMU_SVA_GPASID_VAL) {
svm->gpasid = data->gpasid;
@@ -420,7 +414,6 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, 
struct device *dev,
INIT_WORK(&svm->work, intel_svm_free_async_fn);
ioasid_attach_data(data->hpasid, svm);
INIT_LIST_HEAD_RCU(&svm->devs);
-   mmput(svm->mm);
}
sdev = kzalloc(sizeof(*sdev), GFP_KERNEL);
if (!sdev) {
-- 
2.7.4



[PATCH v7 08/16] iommu: Pass domain to sva_unbind_gpasid()

2020-09-10 Thread Liu Yi L
From: Yi Sun 

The current interface is good enough for SVA virtualization on an assigned
physical PCI device, but when it comes to mediated devices, a physical
device may be attached with multiple aux-domains. Also, for guest unbind,
the PASID to be unbound should be allocated to the VM. This check requires
knowing the ioasid_set which is associated with the domain.

So this interface needs to pass in domain info. Then the iommu driver is
able to know which domain will be used for the 2nd stage translation of
the nesting mode and also be able to do the PASID ownership check. This
patch passes @domain for the above reason. Also, the prototype of @pasid
is changed from "int" to "u32" as in the below link.

[PATCH v6 01/12] iommu: Change type of pasid to u32
https://lore.kernel.org/linux-iommu/1594684087-61184-2-git-send-email-fenghua...@intel.com/

Cc: Kevin Tian 
CC: Jacob Pan 
Cc: Alex Williamson 
Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Yi Sun 
Signed-off-by: Liu Yi L 
---
v6 -> v7:
*) correct the link for the details of modifying pasid prototype to bve "u32".
*) hold off r-b from Eric Auger as there is modification in this patch, will
   seek r-b in this version.

v5 -> v6:
*) use "u32" prototype for @pasid.
*) add review-by from Eric Auger.

v2 -> v3:
*) pass in domain info only
*) use u32 for pasid instead of int type

v1 -> v2:
*) added in v2.
---
 drivers/iommu/intel/svm.c   | 3 ++-
 drivers/iommu/iommu.c   | 2 +-
 include/linux/intel-iommu.h | 3 ++-
 include/linux/iommu.h   | 3 ++-
 4 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index d3cf52b..d39fafb 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -476,7 +476,8 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, 
struct device *dev,
return ret;
 }
 
-int intel_svm_unbind_gpasid(struct device *dev, int pasid)
+int intel_svm_unbind_gpasid(struct iommu_domain *domain,
+   struct device *dev, u32 pasid)
 {
struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL);
struct intel_svm_dev *sdev;
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3bc263a..52aabb64 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2155,7 +2155,7 @@ int iommu_sva_unbind_gpasid(struct iommu_domain *domain, 
struct device *dev,
if (unlikely(!domain->ops->sva_unbind_gpasid))
return -ENODEV;
 
-   return domain->ops->sva_unbind_gpasid(dev, data->hpasid);
+   return domain->ops->sva_unbind_gpasid(domain, dev, data->hpasid);
 }
 EXPORT_SYMBOL_GPL(iommu_sva_unbind_gpasid);
 
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 3345391..ce0b33b 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -741,7 +741,8 @@ extern int intel_svm_enable_prq(struct intel_iommu *iommu);
 extern int intel_svm_finish_prq(struct intel_iommu *iommu);
 int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
  struct iommu_gpasid_bind_data *data);
-int intel_svm_unbind_gpasid(struct device *dev, int pasid);
+int intel_svm_unbind_gpasid(struct iommu_domain *domain,
+   struct device *dev, u32 pasid);
 struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm,
 void *drvdata);
 void intel_svm_unbind(struct iommu_sva *handle);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 5b9f630..d561448 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -297,7 +297,8 @@ struct iommu_ops {
int (*sva_bind_gpasid)(struct iommu_domain *domain,
struct device *dev, struct iommu_gpasid_bind_data 
*data);
 
-   int (*sva_unbind_gpasid)(struct device *dev, int pasid);
+   int (*sva_unbind_gpasid)(struct iommu_domain *domain,
+struct device *dev, u32 pasid);
 
int (*def_domain_type)(struct device *dev);
 
-- 
2.7.4



[PATCH v7 11/16] vfio/type1: Allow invalidating first-level/stage IOMMU cache

2020-09-10 Thread Liu Yi L
This patch provides an interface allowing userspace to invalidate the
IOMMU cache for the first-level page table. It is required when the
first-level IOMMU page table is not managed by the host kernel in the
nested translation setup.

Cc: Kevin Tian 
CC: Jacob Pan 
Cc: Alex Williamson 
Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Liu Yi L 
Signed-off-by: Eric Auger 
Signed-off-by: Jacob Pan 
---
v1 -> v2:
*) rename from "vfio/type1: Flush stage-1 IOMMU cache for nesting type"
*) rename vfio_cache_inv_fn() to vfio_dev_cache_invalidate_fn()
*) vfio_dev_cache_inv_fn() always successful
*) remove VFIO_IOMMU_CACHE_INVALIDATE, and reuse VFIO_IOMMU_NESTING_OP
---
 drivers/vfio/vfio_iommu_type1.c | 38 ++
 include/uapi/linux/vfio.h   |  3 +++
 2 files changed, 41 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 11f1156..b67ce2d 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -3112,6 +3112,41 @@ static long vfio_iommu_handle_pgtbl_op(struct vfio_iommu 
*iommu,
return ret;
 }
 
+static int vfio_dev_cache_invalidate_fn(struct device *dev, void *data)
+{
+   struct domain_capsule *dc = (struct domain_capsule *)data;
+   unsigned long arg = *(unsigned long *)dc->data;
+
+   iommu_uapi_cache_invalidate(dc->domain, dev, (void __user *)arg);
+   return 0;
+}
+
+static long vfio_iommu_invalidate_cache(struct vfio_iommu *iommu,
+   unsigned long arg)
+{
+   struct domain_capsule dc = { .data = &arg };
+   struct iommu_nesting_info *info;
+   int ret;
+
+   mutex_lock(&iommu->lock);
+   info = iommu->nesting_info;
+   if (!info || !(info->features & IOMMU_NESTING_FEAT_CACHE_INVLD)) {
+   ret = -EOPNOTSUPP;
+   goto out_unlock;
+   }
+
+   ret = vfio_prepare_nesting_domain_capsule(iommu, &dc);
+   if (ret)
+   goto out_unlock;
+
+   iommu_group_for_each_dev(dc.group->iommu_group, &dc,
+vfio_dev_cache_invalidate_fn);
+
+out_unlock:
+   mutex_unlock(&iommu->lock);
+   return ret;
+}
+
 static long vfio_iommu_type1_nesting_op(struct vfio_iommu *iommu,
unsigned long arg)
 {
@@ -3136,6 +3171,9 @@ static long vfio_iommu_type1_nesting_op(struct vfio_iommu 
*iommu,
case VFIO_IOMMU_NESTING_OP_UNBIND_PGTBL:
ret = vfio_iommu_handle_pgtbl_op(iommu, false, arg + minsz);
break;
+   case VFIO_IOMMU_NESTING_OP_CACHE_INVLD:
+   ret = vfio_iommu_invalidate_cache(iommu, arg + minsz);
+   break;
default:
ret = -EINVAL;
}
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index a99bd71..a09a407 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -1233,6 +1233,8 @@ struct vfio_iommu_type1_pasid_request {
  * +-+---+
  * | UNBIND_PGTBL|  struct iommu_gpasid_bind_data|
  * +-+---+
+ * | CACHE_INVLD |  struct iommu_cache_invalidate_info   |
+ * +-+---+
  *
  * returns: 0 on success, -errno on failure.
  */
@@ -1246,6 +1248,7 @@ struct vfio_iommu_type1_nesting_op {
 enum {
VFIO_IOMMU_NESTING_OP_BIND_PGTBL,
VFIO_IOMMU_NESTING_OP_UNBIND_PGTBL,
+   VFIO_IOMMU_NESTING_OP_CACHE_INVLD,
VFIO_IOMMU_NESTING_OP_NUM,
 };
 
-- 
2.7.4



[PATCH v7 05/16] iommu/vt-d: Support setting ioasid set to domain

2020-09-10 Thread Liu Yi L
From the IOMMU's p.o.v., PASIDs allocated and managed by external components
(e.g. VFIO) will be passed in for gpasid bind/unbind operations. The IOMMU
needs some knowledge to check the PASID ownership, hence add an interface
for those components to tell it the PASID owner.

In the latest kernel design, PASID ownership is managed by the IOASID set
the PASID is allocated from. This patch adds support for setting the ioasid
set ID on the domains used for nesting/vSVA. Subsequent SVA operations
will check the PASID against its IOASID set for proper ownership.
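
As a usage sketch (the caller and the vmm_pasid_set variable are
hypothetical), an upper layer such as VFIO would convey PASID ownership
through the new attribute:

	ret = iommu_domain_set_attr(domain, DOMAIN_ATTR_IOASID_SET, vmm_pasid_set);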

Cc: Kevin Tian 
CC: Jacob Pan 
Cc: Alex Williamson 
Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
v6 -> v7:
*) add a helper function __domain_config_ioasid_set() per Eric's comment.
*) rename @ioasid_sid field of struct dmar_domain to be @pasid_set.
*) Eric gave r-b against v6, but since there is change, so will seek for his
   r-b again on this version.

v5 -> v6:
*) address comments against v5 from Eric Auger.

v4 -> v5:
*) address comments from Eric Auger.
---
 drivers/iommu/intel/iommu.c | 26 ++
 include/linux/intel-iommu.h |  4 
 include/linux/iommu.h   |  1 +
 3 files changed, 31 insertions(+)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 5813eea..d1c77fc 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1806,6 +1806,7 @@ static struct dmar_domain *alloc_domain(int flags)
if (first_level_by_default())
domain->flags |= DOMAIN_FLAG_USE_FIRST_LEVEL;
domain->has_iotlb_device = false;
+   domain->pasid_set = host_pasid_set;
INIT_LIST_HEAD(>devices);
 
return domain;
@@ -6007,6 +6008,22 @@ static bool intel_iommu_is_attach_deferred(struct 
iommu_domain *domain,
return attach_deferred(dev);
 }
 
+static int __domain_config_ioasid_set(struct dmar_domain *domain,
+ struct ioasid_set *set)
+{
+   if (!(domain->flags & DOMAIN_FLAG_NESTING_MODE))
+   return -ENODEV;
+
+   if (domain->pasid_set != host_pasid_set &&
+   domain->pasid_set != set) {
+   pr_warn_ratelimited("multi ioasid_set setting to domain");
+   return -EBUSY;
+   }
+
+   domain->pasid_set = set;
+   return 0;
+}
+
 static int
 intel_iommu_domain_set_attr(struct iommu_domain *domain,
enum iommu_attr attr, void *data)
@@ -6030,6 +6047,15 @@ intel_iommu_domain_set_attr(struct iommu_domain *domain,
}
spin_unlock_irqrestore(&device_domain_lock, flags);
break;
+   case DOMAIN_ATTR_IOASID_SET:
+   {
+   struct ioasid_set *set = (struct ioasid_set *)data;
+
+   spin_lock_irqsave(&device_domain_lock, flags);
+   ret = __domain_config_ioasid_set(dmar_domain, set);
+   spin_unlock_irqrestore(&device_domain_lock, flags);
+   break;
+   }
default:
ret = -EINVAL;
break;
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index d36038e..3345391 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -549,6 +549,10 @@ struct dmar_domain {
   2 == 1GiB, 3 == 512GiB, 4 == 1TiB */
u64 max_addr;   /* maximum mapped address */
 
+   struct ioasid_set *pasid_set;   /*
+* the ioasid set which tracks all
+* PASIDs used by the domain.
+*/
int default_pasid;  /*
 * The default pasid used for non-SVM
 * traffic on mediated devices.
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c364b1c..5b9f630 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -118,6 +118,7 @@ enum iommu_attr {
DOMAIN_ATTR_FSL_PAMUV1,
DOMAIN_ATTR_NESTING,/* two stages of translation */
DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
+   DOMAIN_ATTR_IOASID_SET,
DOMAIN_ATTR_MAX,
 };
 
-- 
2.7.4



[PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-10 Thread Liu Yi L
Shared Virtual Addressing (SVA), a.k.a. Shared Virtual Memory (SVM) on
Intel platforms, allows address space sharing between device DMA and
applications. SVA can reduce programming complexity and enhance security.

This VFIO series is intended to expose SVA usage to VMs, i.e. sharing
guest application address space with passthrough devices. This is called
vSVA in this series. The whole vSVA enabling requires QEMU/VFIO/IOMMU
changes. The IOMMU and QEMU changes are in separate series (listed
in the "Related series").

The high-level architecture for SVA virtualization is as below. The key
design of vSVA support is to utilize the dual-stage IOMMU translation
(also known as IOMMU nesting translation) capability in the host IOMMU.


    .-------------.  .---------------------------.
    |   vIOMMU    |  | Guest process CR3, FL only|
    |             |  '---------------------------'
    .----------------/
    | PASID Entry |--- PASID cache flush --+
    '-------------'                        |
    |             |                        V
    |             |                CR3 in GPA
    '-------------'
Guest
------| Shadow |--------------------------|--------
      v        v                          v
Host
    .-------------.  .----------------------.
    |   pIOMMU    |  | Bind FL for GVA-GPA  |
    |             |  '----------------------'
    .----------------/  |
    | PASID Entry |     V (Nested xlate)
    '----------------\.------------------------------.
    |             |   |SL for GPA-HPA, default domain|
    |             |   '------------------------------'
    '-------------'
Where:
 - FL = First level/stage one page tables
 - SL = Second level/stage two page tables

Patch Overview:
 1. reports IOMMU nesting info to userspace (patch 0001, 0002, 0003, 0015, 0016)
 2. vfio support for PASID allocation and free for VMs (patch 0004, 0005, 0007)
 3. a fix to a revisit in intel iommu driver (patch 0006)
 4. vfio support for binding guest page table to host (patch 0008, 0009, 0010)
 5. vfio support for IOMMU cache invalidation from VMs (patch 0011)
 6. vfio support for vSVA usage on IOMMU-backed mdevs (patch 0012)
 7. expose PASID capability to VM (patch 0013)
 8. add doc for VFIO dual stage control (patch 0014)

The complete vSVA kernel upstream patches are divided into three phases:
1. Common APIs and PCI device direct assignment
2. IOMMU-backed Mediated Device assignment
3. Page Request Services (PRS) support

This patchset is aiming for the phase 1 and phase 2, and based on Jacob's
below series.
*) [PATCH v8 0/7] IOMMU user API enhancement - wip
   
https://lore.kernel.org/linux-iommu/1598898300-65475-1-git-send-email-jacob.jun@linux.intel.com/

*) [PATCH v2 0/9] IOASID extensions for guest SVA - wip
   
https://lore.kernel.org/linux-iommu/1598070918-21321-1-git-send-email-jacob.jun@linux.intel.com/

Complete set for current vSVA can be found in below branch.
https://github.com/luxis1999/linux-vsva.git vsva-linux-5.9-rc2-v7

The corresponding QEMU patch series is included in below branch:
https://github.com/luxis1999/qemu.git vsva_5.9_rc2_qemu_rfcv10


Regards,
Yi Liu

Changelog:
- Patch v6 -> Patch v7:
  a) drop [PATCH v6 01/15] of v6 as it's merged by Alex.
  b) rebase on Jacob's v8 IOMMU uapi enhancement and v2 IOASID 
extension patchset.
  c) Address comments against v6 from Alex and Eric.
  Patch v6: 
https://lore.kernel.org/kvm/1595917664-33276-1-git-send-email-yi.l@intel.com/

- Patch v5 -> Patch v6:
  a) Address comments against v5 from Eric.
  b) rebase on Jacob's v6 IOMMU uapi enhancement
  Patch v5: 
https://lore.kernel.org/kvm/1594552870-55687-1-git-send-email-yi.l@intel.com/

- Patch v4 -> Patch v5:
  a) Address comments against v4
  Patch v4: 
https://lore.kernel.org/kvm/1593861989-35920-1-git-send-email-yi.l@intel.com/

- Patch v3 -> Patch v4:
  a) Address comments against v3
  b) Add rb from Stefan on patch 14/15
  Patch v3: 
https://lore.kernel.org/kvm/1592988927-48009-1-git-send-email-yi.l@intel.com/

- Patch v2 -> Patch v3:
  a) Rebase on top of Jacob's v3 iommu uapi patchset
  b) Address comments from Kevin and Stefan Hajnoczi
  c) Reuse DOMAIN_ATTR_NESTING to get iommu nesting info
  d) Drop [PATCH v2 07/15] iommu/uapi: Add iommu_gpasid_unbind_data
  Patch v2: 
https://lore.kernel.org/kvm/1591877734-66527-1-git-send-email-yi.l@intel.com/

- Patch v1 -> Patch v2:
  a) Refactor vfio_iommu_type1_ioctl() per suggestion from Christoph
 Hellwig.
  b) Re-sequence the patch series for better bisect support.
  c) Report IOMMU nesting cap info in detail instead of a format in
 v1.
  d) Enforce one group per nesting type container for vfio iommu type1
 driver.
  e) Build the vfio_mm related code 

[PATCH v7 15/16] iommu/vt-d: Only support nesting when nesting caps are consistent across iommu units

2020-09-10 Thread Liu Yi L
This patch makes a change to support nesting only when all the physical
IOMMU units have the same CAP/ECAP masks for nested translation.
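
For reference, a standalone sketch of the XOR-plus-mask idea the hunk
below relies on; the mask bits are copied from the patch, while the two
sample register values are invented purely for illustration:

#include <stdint.h>
#include <stdio.h>

#define BIT_ULL(n)     (1ULL << (n))
#define VTD_CAP_FL1GP  BIT_ULL(56)
#define VTD_CAP_FL5LP  BIT_ULL(60)
#define VTD_CAP_MASK   (VTD_CAP_FL1GP | VTD_CAP_FL5LP)

int main(void)
{
	/* hypothetical CAP registers of two IOMMU units */
	uint64_t cap_a = VTD_CAP_FL1GP | VTD_CAP_FL5LP;
	uint64_t cap_b = VTD_CAP_FL1GP;	/* lacks 5-level paging */

	/* XOR exposes differing bits; the mask keeps only the relevant ones */
	if (VTD_CAP_MASK & (cap_a ^ cap_b))
		printf("nesting unsupported: units disagree on masked caps\n");
	return 0;
}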

Cc: Kevin Tian 
CC: Jacob Pan 
Cc: Alex Williamson 
Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
v6 - > v7:
*) added per comment from Eric.
   https://lore.kernel.org/kvm/7fe337fa-abbc-82be-c8e8-b9e2a6179...@redhat.com/
---
 drivers/iommu/intel/iommu.c |  8 ++--
 include/linux/intel-iommu.h | 16 
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 95740b9..38c6c9b 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5675,12 +5675,16 @@ static inline bool iommu_pasid_support(void)
 static inline bool nested_mode_support(void)
 {
struct dmar_drhd_unit *drhd;
-   struct intel_iommu *iommu;
+   struct intel_iommu *iommu, *prev = NULL;
bool ret = true;
 
rcu_read_lock();
for_each_active_iommu(iommu, drhd) {
-   if (!sm_supported(iommu) || !ecap_nest(iommu->ecap)) {
+   if (!prev)
+   prev = iommu;
+   if (!sm_supported(iommu) || !ecap_nest(iommu->ecap) ||
+   (VTD_CAP_MASK & (iommu->cap ^ prev->cap)) ||
+   (VTD_ECAP_MASK & (iommu->ecap ^ prev->ecap))) {
ret = false;
break;
}
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index db7fc59..e7b7512 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -197,6 +197,22 @@
 #define ecap_max_handle_mask(e) ((e >> 20) & 0xf)
 #define ecap_sc_support(e) ((e >> 7) & 0x1) /* Snooping Control */
 
+/* Nesting Support Capability Alignment */
+#define VTD_CAP_FL1GP  BIT_ULL(56)
+#define VTD_CAP_FL5LP  BIT_ULL(60)
+#define VTD_ECAP_PRS   BIT_ULL(29)
+#define VTD_ECAP_ERS   BIT_ULL(30)
+#define VTD_ECAP_SRS   BIT_ULL(31)
+#define VTD_ECAP_EAFS  BIT_ULL(34)
+#define VTD_ECAP_PASID BIT_ULL(40)
+
+/* Only capabilities marked in below MASKs are reported */
+#define VTD_CAP_MASK   (VTD_CAP_FL1GP | VTD_CAP_FL5LP)
+
+#define VTD_ECAP_MASK  (VTD_ECAP_PRS | VTD_ECAP_ERS | \
+VTD_ECAP_SRS | VTD_ECAP_EAFS | \
+VTD_ECAP_PASID)
+
 /* Virtual command interface capability */
 #define vccap_pasid(v) (((v) & DMA_VCS_PAS)) /* PASID allocation */
 
-- 
2.7.4



[PATCH v7 01/16] iommu: Report domain nesting info

2020-09-10 Thread Liu Yi L
IOMMUs that support nesting translation need to report the capability info
to userspace. It gives information about requirements the userspace needs
to implement, plus other features characterizing the physical implementation.

This patch introduces a new IOMMU UAPI struct that gives information about
the nesting capabilities and features. This struct is supposed to be returned
by iommu_domain_get_attr() with the DOMAIN_ATTR_NESTING attribute parameter,
on a domain whose nesting attribute has been set.

Cc: Kevin Tian 
CC: Jacob Pan 
Cc: Alex Williamson 
Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
v6 -> v7:
*) rephrase the commit message, replace the @data[] field in struct
   iommu_nesting_info with union per comments from Eric Auger.

v5 -> v6:
*) rephrase the feature notes per comments from Eric Auger.
*) rename @size of struct iommu_nesting_info to @argsz.

v4 -> v5:
*) address comments from Eric Auger.

v3 -> v4:
*) split the SMMU driver changes to be a separate patch
*) move the @addr_width and @pasid_bits from vendor specific
   part to generic part.
*) tweak the description for the @features field of struct
   iommu_nesting_info.
*) add description on the @data[] field of struct iommu_nesting_info

v2 -> v3:
*) remove cap/ecap_mask in iommu_nesting_info.
*) reuse DOMAIN_ATTR_NESTING to get nesting info.
*) return an empty iommu_nesting_info for SMMU drivers per Jean's
   suggestion.
---
 include/uapi/linux/iommu.h | 76 ++
 1 file changed, 76 insertions(+)

diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
index 1ebc23d..ff987e4 100644
--- a/include/uapi/linux/iommu.h
+++ b/include/uapi/linux/iommu.h
@@ -341,4 +341,80 @@ struct iommu_gpasid_bind_data {
} vendor;
 };
 
+/*
+ * struct iommu_nesting_info_vtd - Intel VT-d specific nesting info.
+ *
+ * @flags: VT-d specific flags. Currently reserved for future
+ * extension. must be set to 0.
+ * @cap_reg:   Describe basic capabilities as defined in VT-d capability
+ * register.
+ * @ecap_reg:  Describe the extended capabilities as defined in VT-d
+ * extended capability register.
+ */
+struct iommu_nesting_info_vtd {
+   __u32   flags;
+   __u64   cap_reg;
+   __u64   ecap_reg;
+};
+
+/*
+ * struct iommu_nesting_info - Information for nesting-capable IOMMU.
+ *userspace should check it before using
+ *nesting capability.
+ *
+ * @argsz:      size of the whole structure.
+ * @flags:      currently reserved for future extension. must set to 0.
+ * @format:     PASID table entry format, the same definition as struct
+ *              iommu_gpasid_bind_data @format.
+ * @features:   supported nesting features.
+ * @addr_width: the output addr width of first level/stage translation
+ * @pasid_bits: maximum supported PASID bits, 0 represents no PASID
+ *              support.
+ * @vendor:     vendor specific data, structure type can be deduced from
+ *              @format field.
+ *
+ * +===+==+
+ * | feature   |  Notes   |
+ * +===+==+
+ * | SYSWIDE_PASID |  IOMMU vendor driver sets it to mandate userspace|
+ * |   |  to allocate PASID from kernel. All PASID allocation |
+ * |   |  free must be mediated through the IOMMU UAPI.   |
+ * +---+--+
+ * | BIND_PGTBL|  IOMMU vendor driver sets it to mandate userspace to |
+ * |   |  bind the first level/stage page table to associated |
+ * |   |  PASID (either the one specified in bind request or  |
+ * |   |  the default PASID of iommu domain), through IOMMU   |
+ * |   |  UAPI.   |
+ * +---+--+
+ * | CACHE_INVLD   |  IOMMU vendor driver sets it to mandate userspace to |
+ * |   |  explicitly invalidate the IOMMU cache through IOMMU |
+ * |   |  UAPI according to vendor-specific requirement when  |
+ * |   |  changing the 1st level/stage page table.|
+ * +---+--+
+ *
+ * data struct types defined for @format:
+ * ++=+
+ * | @format| data struct |
+ * ++=+
+ * | IOMMU_PASID_FORMAT_INTEL_VTD   | struct iommu_nesting_info_vtd   |
+ * ++-+
+ *
+ */
+struct 

[PATCH v7 04/16] vfio: Add PASID allocation/free support

2020-09-10 Thread Liu Yi L
Shared Virtual Addressing (a.k.a. Shared Virtual Memory) allows sharing
multiple process virtual address spaces with the device for a simplified
programming model. PASID is used to tag a virtual address space in DMA
requests and to identify the related translation structure in the IOMMU. When
a PASID-capable device is assigned to a VM, we want the same capability
of using PASID to tag guest process virtual address spaces to achieve
virtual SVA (vSVA).

PASID management for guests is vendor specific. Some vendors (e.g. Intel
VT-d) require system-wide managed PASIDs across all devices, regardless
of whether a device is used by the host or assigned to a guest. Other vendors
(e.g. ARM SMMU) may allow PASIDs to be managed per-device, which could be
fully delegated to the guest for assigned devices.

For system-wide managed PASIDs, this patch introduces a vfio module to
handle explicit PASID alloc/free requests from guest. Allocated PASIDs
are associated to a process (or, mm_struct) in IOASID core. A vfio_mm
object is introduced to track mm_struct. Multiple VFIO containers within
a process share the same vfio_mm object.

A quota mechanism is provided to prevent a malicious user from exhausting
the available PASIDs. Currently the quota is a global parameter applied to
all VFIO devices. In the future per-device quota might be supported too.

Cc: Kevin Tian 
CC: Jacob Pan 
Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Cc: Joerg Roedel 
Cc: Lu Baolu 
Suggested-by: Alex Williamson 
Signed-off-by: Liu Yi L 
Reviewed-by: Eric Auger 
---
v6 -> v7:
*) remove "#include " and add r-b from Eric Auger.

v5 -> v6:
*) address comments from Eric. Add vfio_unlink_pasid() to be consistent
   with vfio_unlink_dma(). Add a comment in vfio_pasid_exit().

v4 -> v5:
*) address comments from Eric Auger.
*) address the comments from Alex on the pasid free range support. Added
   per vfio_mm pasid r-b tree.
   https://lore.kernel.org/kvm/20200709082751.32074...@x1.home/

v3 -> v4:
*) fix lock leak in vfio_mm_get_from_task()
*) drop pasid_quota field in struct vfio_mm
*) vfio_mm_get_from_task() returns ERR_PTR(-ENOTTY) when !CONFIG_VFIO_PASID

v1 -> v2:
*) added in v2, split from the pasid alloc/free support of v1
---
 drivers/vfio/Kconfig  |   5 +
 drivers/vfio/Makefile |   1 +
 drivers/vfio/vfio_pasid.c | 247 ++
 include/linux/vfio.h  |  28 ++
 4 files changed, 281 insertions(+)
 create mode 100644 drivers/vfio/vfio_pasid.c

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index fd17db9..3d8a108 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -19,6 +19,11 @@ config VFIO_VIRQFD
depends on VFIO && EVENTFD
default n
 
+config VFIO_PASID
+   tristate
+   depends on IOASID && VFIO
+   default n
+
 menuconfig VFIO
tristate "VFIO Non-Privileged userspace driver framework"
depends on IOMMU_API
diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index de67c47..bb836a3 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -3,6 +3,7 @@ vfio_virqfd-y := virqfd.o
 
 obj-$(CONFIG_VFIO) += vfio.o
 obj-$(CONFIG_VFIO_VIRQFD) += vfio_virqfd.o
+obj-$(CONFIG_VFIO_PASID) += vfio_pasid.o
 obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
 obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
 obj-$(CONFIG_VFIO_SPAPR_EEH) += vfio_spapr_eeh.o
diff --git a/drivers/vfio/vfio_pasid.c b/drivers/vfio/vfio_pasid.c
new file mode 100644
index 000..44ecdd5
--- /dev/null
+++ b/drivers/vfio/vfio_pasid.c
@@ -0,0 +1,247 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 Intel Corporation.
+ * Author: Liu Yi L 
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DRIVER_VERSION  "0.1"
+#define DRIVER_AUTHOR   "Liu Yi L "
+#define DRIVER_DESC "PASID management for VFIO bus drivers"
+
+#define VFIO_DEFAULT_PASID_QUOTA   1000
+static int pasid_quota = VFIO_DEFAULT_PASID_QUOTA;
+module_param_named(pasid_quota, pasid_quota, uint, 0444);
+MODULE_PARM_DESC(pasid_quota,
+"Set the quota for max number of PASIDs that an application is 
allowed to request (default 1000)");
+
+struct vfio_mm_token {
+   unsigned long long val;
+};
+
+struct vfio_mm {
+   struct kref kref;
+   struct ioasid_set   *ioasid_set;
+   struct mutex            pasid_lock;
+   struct rb_root          pasid_list;
+   struct list_head        next;
+   struct vfio_mm_token    token;
+};
+
+static struct mutex        vfio_mm_lock;
+static struct list_head    vfio_mm_list;
+
+struct vfio_pasid {
+   struct rb_node  node;
+   ioasid_tpasid;
+};
+
+static void vfio_remove_all_pasids(struct vfio_mm *vmm);
+
+/* called with vfio.vfio_mm_lock held */
+static void vfio_mm_release(struct kref *kref)
+{
+   struct vfio_mm *vmm = container_of(kref, struct vfio_mm, kref);
+
+   list_del(&vmm->next);
+   mutex_unlock(&vfio_mm_lock);

[PATCH v7 02/16] iommu/smmu: Report empty domain nesting info

2020-09-10 Thread Liu Yi L
This patch is added because, instead of returning a boolean for
DOMAIN_ATTR_NESTING, iommu_domain_get_attr() should return an
iommu_nesting_info handle. For now, return an empty nesting info struct,
as true nesting is not yet supported by the SMMUs.

Cc: Will Deacon 
Cc: Robin Murphy 
Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Suggested-by: Jean-Philippe Brucker 
Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
Reviewed-by: Eric Auger 
---
v5 -> v6:
*) add review-by from Eric Auger.

v4 -> v5:
*) address comments from Eric Auger.
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 29 +++--
 drivers/iommu/arm/arm-smmu/arm-smmu.c   | 29 +++--
 2 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 7196207..016e2e5 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3019,6 +3019,32 @@ static struct iommu_group *arm_smmu_device_group(struct 
device *dev)
return group;
 }
 
+static int arm_smmu_domain_nesting_info(struct arm_smmu_domain *smmu_domain,
+   void *data)
+{
+   struct iommu_nesting_info *info = (struct iommu_nesting_info *)data;
+   unsigned int size;
+
+   if (!info || smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
+   return -ENODEV;
+
+   size = sizeof(struct iommu_nesting_info);
+
+   /*
+* if provided buffer size is smaller than expected, should
+* return 0 and also the expected buffer size to caller.
+*/
+   if (info->argsz < size) {
+   info->argsz = size;
+   return 0;
+   }
+
+   /* report an empty iommu_nesting_info for now */
+   memset(info, 0x0, size);
+   info->argsz = size;
+   return 0;
+}
+
 static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
enum iommu_attr attr, void *data)
 {
@@ -3028,8 +3054,7 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
*domain,
case IOMMU_DOMAIN_UNMANAGED:
switch (attr) {
case DOMAIN_ATTR_NESTING:
-   *(int *)data = (smmu_domain->stage == 
ARM_SMMU_DOMAIN_NESTED);
-   return 0;
+   return arm_smmu_domain_nesting_info(smmu_domain, data);
default:
return -ENODEV;
}
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 09c42af9..368486f 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1510,6 +1510,32 @@ static struct iommu_group *arm_smmu_device_group(struct 
device *dev)
return group;
 }
 
+static int arm_smmu_domain_nesting_info(struct arm_smmu_domain *smmu_domain,
+   void *data)
+{
+   struct iommu_nesting_info *info = (struct iommu_nesting_info *)data;
+   unsigned int size;
+
+   if (!info || smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
+   return -ENODEV;
+
+   size = sizeof(struct iommu_nesting_info);
+
+   /*
+* if provided buffer size is smaller than expected, should
+* return 0 and also the expected buffer size to caller.
+*/
+   if (info->argsz < size) {
+   info->argsz = size;
+   return 0;
+   }
+
+   /* report an empty iommu_nesting_info for now */
+   memset(info, 0x0, size);
+   info->argsz = size;
+   return 0;
+}
+
 static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
enum iommu_attr attr, void *data)
 {
@@ -1519,8 +1545,7 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
*domain,
case IOMMU_DOMAIN_UNMANAGED:
switch (attr) {
case DOMAIN_ATTR_NESTING:
-   *(int *)data = (smmu_domain->stage == 
ARM_SMMU_DOMAIN_NESTED);
-   return 0;
+   return arm_smmu_domain_nesting_info(smmu_domain, data);
default:
return -ENODEV;
}
-- 
2.7.4



[PATCH v7 12/16] vfio/type1: Add vSVA support for IOMMU-backed mdevs

2020-09-10 Thread Liu Yi L
In recent years, the mediated device pass-through framework (e.g.
vfio-mdev) has been used to achieve flexible device sharing across
domains (e.g. VMs). There are also hardware-assisted mediated
pass-through solutions from platform vendors, e.g. Intel VT-d scalable
mode, which supports the Intel Scalable I/O Virtualization technology.
Such mdevs are called IOMMU-backed mdevs, as there is IOMMU-enforced DMA
isolation for them. In the kernel, IOMMU-backed mdevs are exposed to the
IOMMU layer via the aux-domain concept, which means mdevs are protected
by an iommu domain that is auxiliary to the domain the kernel driver
primarily uses for the DMA API. Details can be found in the KVM presentation below:

https://events19.linuxfoundation.org/wp-content/uploads/2017/12/Hardware-Assisted-Mediated-Pass-Through-with-VFIO-Kevin-Tian-Intel.pdf

This patch extends NESTING_IOMMU ops to IOMMU-backed mdev devices. The
main requirement is to use the auxiliary domain associated with mdev.

Cc: Kevin Tian 
CC: Jacob Pan 
CC: Jun Tian 
Cc: Alex Williamson 
Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Liu Yi L 
Reviewed-by: Eric Auger 
---
v5 -> v6:
*) add review-by from Eric Auger.

v1 -> v2:
*) check the iommu_device to ensure the handling mdev is IOMMU-backed
---
 drivers/vfio/vfio_iommu_type1.c | 36 +++-
 1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index b67ce2d..5cef732 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2438,29 +2438,49 @@ static int vfio_iommu_resv_refresh(struct vfio_iommu 
*iommu,
return ret;
 }
 
+static struct device *vfio_get_iommu_device(struct vfio_group *group,
+   struct device *dev)
+{
+   if (group->mdev_group)
+   return vfio_mdev_get_iommu_device(dev);
+   else
+   return dev;
+}
+
 static int vfio_dev_bind_gpasid_fn(struct device *dev, void *data)
 {
struct domain_capsule *dc = (struct domain_capsule *)data;
unsigned long arg = *(unsigned long *)dc->data;
+   struct device *iommu_device;
+
+   iommu_device = vfio_get_iommu_device(dc->group, dev);
+   if (!iommu_device)
+   return -EINVAL;
 
-   return iommu_uapi_sva_bind_gpasid(dc->domain, dev,
+   return iommu_uapi_sva_bind_gpasid(dc->domain, iommu_device,
  (void __user *)arg);
 }
 
 static int vfio_dev_unbind_gpasid_fn(struct device *dev, void *data)
 {
struct domain_capsule *dc = (struct domain_capsule *)data;
+   struct device *iommu_device;
+
+   iommu_device = vfio_get_iommu_device(dc->group, dev);
+   if (!iommu_device)
+   return -EINVAL;
 
if (dc->user) {
unsigned long arg = *(unsigned long *)dc->data;
 
-   iommu_uapi_sva_unbind_gpasid(dc->domain,
-dev, (void __user *)arg);
+   iommu_uapi_sva_unbind_gpasid(dc->domain, iommu_device,
+(void __user *)arg);
} else {
struct iommu_gpasid_bind_data *unbind_data =
(struct iommu_gpasid_bind_data *)dc->data;
 
-   iommu_sva_unbind_gpasid(dc->domain, dev, unbind_data);
+   iommu_sva_unbind_gpasid(dc->domain,
+   iommu_device, unbind_data);
}
return 0;
 }
@@ -3116,8 +3136,14 @@ static int vfio_dev_cache_invalidate_fn(struct device 
*dev, void *data)
 {
struct domain_capsule *dc = (struct domain_capsule *)data;
unsigned long arg = *(unsigned long *)dc->data;
+   struct device *iommu_device;
+
+   iommu_device = vfio_get_iommu_device(dc->group, dev);
+   if (!iommu_device)
+   return -EINVAL;
 
-   iommu_uapi_cache_invalidate(dc->domain, dev, (void __user *)arg);
+   iommu_uapi_cache_invalidate(dc->domain, iommu_device,
+   (void __user *)arg);
return 0;
 }
 
-- 
2.7.4



[PATCH v7 16/16] iommu/vt-d: Support reporting nesting capability info

2020-09-10 Thread Liu Yi L
This patch reports nesting info when iommu_domain_get_attr() is called with
DOMAIN_ATTR_NESTING on a domain with the nesting attribute set.

Cc: Kevin Tian 
CC: Jacob Pan 
Cc: Alex Williamson 
Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
v6 -> v7:
*) split the patch in v6 into two patches:
   [PATCH v7 15/16] iommu/vt-d: Only support nesting when nesting caps are 
consistent across iommu units
   [PATCH v7 16/16] iommu/vt-d: Support reporting nesting capability info

v2 -> v3:
*) remove cap/ecap_mask in iommu_nesting_info.
---
 drivers/iommu/intel/iommu.c | 74 +
 1 file changed, 74 insertions(+)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 38c6c9b..e46214e 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -6089,6 +6089,79 @@ intel_iommu_domain_set_attr(struct iommu_domain *domain,
return ret;
 }
 
+static int intel_iommu_get_nesting_info(struct iommu_domain *domain,
+   struct iommu_nesting_info *info)
+{
+   struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+   u64 cap = VTD_CAP_MASK, ecap = VTD_ECAP_MASK;
+   struct device_domain_info *domain_info;
+   struct iommu_nesting_info_vtd vtd;
+   unsigned int size;
+
+   if (!info)
+   return -EINVAL;
+
+   if (domain->type != IOMMU_DOMAIN_UNMANAGED ||
+   !(dmar_domain->flags & DOMAIN_FLAG_NESTING_MODE))
+   return -ENODEV;
+
+   size = sizeof(struct iommu_nesting_info);
+   /*
+* if provided buffer size is smaller than expected, should
+* return 0 and also the expected buffer size to caller.
+*/
+   if (info->argsz < size) {
+   info->argsz = size;
+   return 0;
+   }
+
+   /*
+* arbitrary select the first domain_info as all nesting
+* related capabilities should be consistent across iommu
+* units.
+*/
+   domain_info = list_first_entry(&dmar_domain->devices,
+  struct device_domain_info, link);
+   cap &= domain_info->iommu->cap;
+   ecap &= domain_info->iommu->ecap;
+
+   info->addr_width = dmar_domain->gaw;
+   info->format = IOMMU_PASID_FORMAT_INTEL_VTD;
+   info->features = IOMMU_NESTING_FEAT_SYSWIDE_PASID |
+IOMMU_NESTING_FEAT_BIND_PGTBL |
+IOMMU_NESTING_FEAT_CACHE_INVLD;
+   info->pasid_bits = ilog2(intel_pasid_max_id);
+   memset(&info->padding, 0x0, 12);
+
+   vtd.flags = 0;
+   vtd.cap_reg = cap;
+   vtd.ecap_reg = ecap;
+
+   memcpy(&info->vendor.vtd, &vtd, sizeof(vtd));
+   return 0;
+}
+
+static int intel_iommu_domain_get_attr(struct iommu_domain *domain,
+  enum iommu_attr attr, void *data)
+{
+   switch (attr) {
+   case DOMAIN_ATTR_NESTING:
+   {
+   struct iommu_nesting_info *info =
+   (struct iommu_nesting_info *)data;
+   unsigned long flags;
+   int ret;
+
+   spin_lock_irqsave(&device_domain_lock, flags);
+   ret = intel_iommu_get_nesting_info(domain, info);
+   spin_unlock_irqrestore(&device_domain_lock, flags);
+   return ret;
+   }
+   default:
+   return -ENOENT;
+   }
+}
+
 /*
  * Check that the device does not live on an external facing PCI port that is
  * marked as untrusted. Such devices should not be able to apply quirks and
@@ -6111,6 +6184,7 @@ const struct iommu_ops intel_iommu_ops = {
.domain_alloc   = intel_iommu_domain_alloc,
.domain_free= intel_iommu_domain_free,
.domain_set_attr= intel_iommu_domain_set_attr,
+   .domain_get_attr= intel_iommu_domain_get_attr,
.attach_dev = intel_iommu_attach_device,
.detach_dev = intel_iommu_detach_device,
.aux_attach_dev = intel_iommu_aux_attach_device,
-- 
2.7.4



[PATCH v7 13/16] vfio/pci: Expose PCIe PASID capability to guest

2020-09-10 Thread Liu Yi L
This patch exposes PCIe PASID capability to guest for assigned devices.
Existing vfio_pci driver hides it from guest by setting the capability
length as 0 in pci_ext_cap_length[].

This patch only exposes the PASID capability for devices which have a PCIe
PASID extended structure in their configuration space. VFs will not expose
the PASID capability as they do not implement the PASID extended structure
in their config space; that remains a TODO for the future. Related
discussion can be found in the link below:

https://lore.kernel.org/kvm/20200407095801.648b1...@w520.home/

Cc: Kevin Tian 
CC: Jacob Pan 
Cc: Alex Williamson 
Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Liu Yi L 
Reviewed-by: Eric Auger 
---
v5 -> v6:
*) add review-by from Eric Auger.

v1 -> v2:
*) added in v2, but it was sent in a separate patchseries before
---
 drivers/vfio/pci/vfio_pci_config.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci_config.c 
b/drivers/vfio/pci/vfio_pci_config.c
index d98843f..07ff2e6 100644
--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -95,7 +95,7 @@ static const u16 pci_ext_cap_length[PCI_EXT_CAP_ID_MAX + 1] = 
{
[PCI_EXT_CAP_ID_LTR]=   PCI_EXT_CAP_LTR_SIZEOF,
[PCI_EXT_CAP_ID_SECPCI] =   0,  /* not yet */
[PCI_EXT_CAP_ID_PMUX]   =   0,  /* not yet */
-   [PCI_EXT_CAP_ID_PASID]  =   0,  /* not yet */
+   [PCI_EXT_CAP_ID_PASID]  =   PCI_EXT_CAP_PASID_SIZEOF,
 };
 
 /*
-- 
2.7.4



[PATCH v7 09/16] iommu/vt-d: Check ownership for PASIDs from user-space

2020-09-10 Thread Liu Yi L
When an IOMMU domain with the nesting attribute is used for guest SVA, a
system-wide PASID is allocated for binding with the device and the domain.
For security reasons, we need to check the PASID passed from user-space,
e.g. on page table bind/unbind and PASID-related cache invalidation.

Cc: Kevin Tian 
CC: Jacob Pan 
Cc: Alex Williamson 
Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
v6 -> v7:
*) acquire device_domain_lock in bind/unbind_gpasid() to ensure dmar_domain
   is not modified during bind/unbind_gpasid().
*) the change to svm.c varies from previous version as Jacob refactored the
   svm.c code.
---
 drivers/iommu/intel/iommu.c | 29 +
 drivers/iommu/intel/svm.c   | 33 -
 include/linux/intel-iommu.h |  2 ++
 3 files changed, 51 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index d1c77fc..95740b9 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5451,6 +5451,7 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, 
struct device *dev,
int granu = 0;
u64 pasid = 0;
u64 addr = 0;
+   void *pdata;
 
granu = to_vtd_granularity(cache_type, inv_info->granularity);
if (granu == -EINVAL) {
@@ -5470,6 +5471,15 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, 
struct device *dev,
 (inv_info->granu.addr_info.flags & 
IOMMU_INV_ADDR_FLAGS_PASID))
pasid = inv_info->granu.addr_info.pasid;
 
+   pdata = ioasid_find(dmar_domain->pasid_set, pasid, NULL);
+   if (!pdata) {
+   ret = -EINVAL;
+   goto out_unlock;
+   } else if (IS_ERR(pdata)) {
+   ret = PTR_ERR(pdata);
+   goto out_unlock;
+   }
+
switch (BIT(cache_type)) {
case IOMMU_CACHE_INV_TYPE_IOTLB:
/* HW will ignore LSB bits based on address mask */
@@ -5787,12 +5797,14 @@ static void intel_iommu_get_resv_regions(struct device 
*device,
list_add_tail(&reg->list, head);
 }
 
-int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct device *dev)
+/*
+ * Caller should have held device_domain_lock
+ */
+int intel_iommu_enable_pasid_locked(struct intel_iommu *iommu, struct device 
*dev)
 {
struct device_domain_info *info;
struct context_entry *context;
struct dmar_domain *domain;
-   unsigned long flags;
u64 ctx_lo;
int ret;
 
@@ -5800,7 +5812,6 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, 
struct device *dev)
if (!domain)
return -EINVAL;
 
-   spin_lock_irqsave(&device_domain_lock, flags);
spin_lock(&iommu->lock);
 
ret = -EINVAL;
@@ -5833,11 +5844,21 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, 
struct device *dev)
 
  out:
spin_unlock(&iommu->lock);
-   spin_unlock_irqrestore(&device_domain_lock, flags);
 
return ret;
 }
 
+int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct device *dev)
+{
+   unsigned long flags;
+   int ret;
+
+   spin_lock_irqsave(&device_domain_lock, flags);
+   ret = intel_iommu_enable_pasid_locked(iommu, dev);
+   spin_unlock_irqrestore(&device_domain_lock, flags);
+   return ret;
+}
+
 static void intel_iommu_apply_resv_region(struct device *dev,
  struct iommu_domain *domain,
  struct iommu_resv_region *region)
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index d39fafb..80f58ab 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -293,7 +293,9 @@ static LIST_HEAD(global_svm_list);
list_for_each_entry((sdev), &(svm)->devs, list) \
if ((d) != (sdev)->dev) {} else
 
-static int pasid_to_svm_sdev(struct device *dev, unsigned int pasid,
+static int pasid_to_svm_sdev(struct device *dev,
+struct ioasid_set *set,
+unsigned int pasid,
 struct intel_svm **rsvm,
 struct intel_svm_dev **rsdev)
 {
@@ -307,7 +309,7 @@ static int pasid_to_svm_sdev(struct device *dev, unsigned 
int pasid,
if (pasid == INVALID_IOASID || pasid >= PASID_MAX)
return -EINVAL;
 
-   svm = ioasid_find(NULL, pasid, NULL);
+   svm = ioasid_find(set, pasid, NULL);
if (IS_ERR(svm))
return PTR_ERR(svm);
 
@@ -344,6 +346,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, 
struct device *dev,
struct intel_svm_dev *sdev = NULL;
struct dmar_domain *dmar_domain;
struct intel_svm *svm = NULL;
+   unsigned long flags;
int ret = 0;
 
if 

[PATCH v7 07/16] vfio/type1: Add VFIO_IOMMU_PASID_REQUEST (alloc/free)

2020-09-10 Thread Liu Yi L
This patch allows userspace to request PASID allocation/free, e.g. when
serving the request from the guest.

PASIDs that are not freed by userspace are automatically freed when the
IOASID set is destroyed when process exits.
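
For illustration, a hedged userspace sketch of driving the new ioctl from
a VMM. The ioctl name and the vfio_iommu_type1_pasid_request layout come
from this patch; the ALLOC flag name is assumed from context, so check
the patched uapi header for the exact spelling:

#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

static int request_pasid(int container_fd)
{
	struct vfio_iommu_type1_pasid_request req = {
		.argsz = sizeof(req),
		.flags = VFIO_IOMMU_FLAG_ALLOC_PASID,	/* assumed flag name */
		.range = { .min = 1, .max = (1U << 20) - 1 },
	};
	/* returns the allocated PASID (31-bit, via signed int) or < 0 */
	int pasid = ioctl(container_fd, VFIO_IOMMU_PASID_REQUEST, &req);

	if (pasid < 0)
		perror("VFIO_IOMMU_PASID_REQUEST");
	return pasid;
}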

Cc: Kevin Tian 
CC: Jacob Pan 
Cc: Alex Williamson 
Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Liu Yi L 
Signed-off-by: Yi Sun 
Signed-off-by: Jacob Pan 
---
v6 -> v7:
*) current VFIO returns the allocated pasid via a signed int, thus the VFIO
   UAPI can only support 31-bit pasids. If user space gives a min,max range
   wider than 31 bits, the allocation or free request should fail.

v5 -> v6:
*) address comments from Eric against v5. remove the alloc/free helper.

v4 -> v5:
*) address comments from Eric Auger.
*) the comments for the PASID_FREE request is addressed in patch 5/15 of
   this series.

v3 -> v4:
*) address comments from v3, except the below comment against the range
   of PASID_FREE request. needs more help on it.
"> +if (req.range.min > req.range.max)

 Is it exploitable that a user can spin the kernel for a long time in
 the case of a free by calling this with [0, MAX_UINT] regardless of
 their actual allocations?"
https://lore.kernel.org/linux-iommu/20200702151832.048b4...@x1.home/

v1 -> v2:
*) move the vfio_mm related code to be a seprate module
*) use a single structure for alloc/free, could support a range of PASIDs
*) fetch vfio_mm at group_attach time instead of at iommu driver open time
---
 drivers/vfio/Kconfig|  1 +
 drivers/vfio/vfio_iommu_type1.c | 76 +
 drivers/vfio/vfio_pasid.c   | 10 ++
 include/linux/vfio.h|  6 
 include/uapi/linux/vfio.h   | 43 +++
 5 files changed, 136 insertions(+)

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index 3d8a108..95d90c6 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -2,6 +2,7 @@
 config VFIO_IOMMU_TYPE1
tristate
depends on VFIO
+   select VFIO_PASID if (X86)
default n
 
 config VFIO_IOMMU_SPAPR_TCE
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 3c0048b..bd4b668 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -76,6 +76,7 @@ struct vfio_iommu {
bool    dirty_page_tracking;
bool    pinned_page_dirty_scope;
struct iommu_nesting_info   *nesting_info;
+   struct vfio_mm  *vmm;
 };
 
 struct vfio_domain {
@@ -2000,6 +2001,11 @@ static void vfio_iommu_iova_insert_copy(struct 
vfio_iommu *iommu,
 
 static void vfio_iommu_release_nesting_info(struct vfio_iommu *iommu)
 {
+   if (iommu->vmm) {
+   vfio_mm_put(iommu->vmm);
+   iommu->vmm = NULL;
+   }
+
kfree(iommu->nesting_info);
iommu->nesting_info = NULL;
 }
@@ -2127,6 +2133,26 @@ static int vfio_iommu_type1_attach_group(void 
*iommu_data,
iommu->nesting_info);
if (ret)
goto out_detach;
+
+   if (iommu->nesting_info->features &
+   IOMMU_NESTING_FEAT_SYSWIDE_PASID) {
+   struct vfio_mm *vmm;
+   struct ioasid_set *set;
+
+   vmm = vfio_mm_get_from_task(current);
+   if (IS_ERR(vmm)) {
+   ret = PTR_ERR(vmm);
+   goto out_detach;
+   }
+   iommu->vmm = vmm;
+
+   set = vfio_mm_ioasid_set(vmm);
+   ret = iommu_domain_set_attr(domain->domain,
+   DOMAIN_ATTR_IOASID_SET,
+   set);
+   if (ret)
+   goto out_detach;
+   }
}
 
/* Get aperture info */
@@ -2908,6 +2934,54 @@ static int vfio_iommu_type1_dirty_pages(struct 
vfio_iommu *iommu,
return -EINVAL;
 }
 
+static int vfio_iommu_type1_pasid_request(struct vfio_iommu *iommu,
+ unsigned long arg)
+{
+   struct vfio_iommu_type1_pasid_request req;
+   unsigned long minsz;
+   int ret;
+
+   minsz = offsetofend(struct vfio_iommu_type1_pasid_request, range);
+
+   if (copy_from_user(&req, (void __user *)arg, minsz))
+   return -EFAULT;
+
+   if (req.argsz < minsz || (req.flags & ~VFIO_PASID_REQUEST_MASK))
+   return -EINVAL;
+
+   /*
+* Current VFIO_IOMMU_PASID_REQUEST only supports at most
+* 31 bits PASID. The min,max value from userspace should
+* not exceed 31 bits.
+*/
+   if (req.range.min > req.range.max ||
+   req.range.min > (1 << VFIO_IOMMU_PASID_BITS) 

[PATCH v7 03/16] vfio/type1: Report iommu nesting info to userspace

2020-09-10 Thread Liu Yi L
This patch exports iommu nesting capability info to user space through
VFIO. Userspace is expected to check this info for supported uAPIs (e.g.
PASID alloc/free, bind page table, and cache invalidation) and the vendor
specific format information for first level/stage page table that will be
bound to.

The nesting info is available only after the container is set to NESTED type.
The current implementation imposes one limitation - one nesting container
should include at most one iommu group. The philosophy of vfio container
is having all groups/devices within the container share the same IOMMU
context. When vSVA is enabled, one IOMMU context could include one 2nd-
level address space and multiple 1st-level address spaces. While the
2nd-level address space is reasonably sharable by multiple groups, blindly
sharing 1st-level address spaces across all groups within the container
might instead break the guest expectation. In the future sub/super container
concept might be introduced to allow partial address space sharing within
an IOMMU context. But for now let's go with this restriction by requiring
singleton container for using nesting iommu features. Below link has the
related discussion about this decision.

https://lore.kernel.org/kvm/20200515115924.37e69...@w520.home/

This patch also changes the NESTING type container behaviour. Something
that would have succeeded before will now fail: Before this series, if
user asked for a VFIO_IOMMU_TYPE1_NESTING, it would have succeeded even
if the SMMU didn't support stage-2, as the driver would have silently
fallen back on stage-1 mappings (which work exactly the same as stage-2
only since there was no nesting supported). After the series, we do check
for DOMAIN_ATTR_NESTING so if user asks for VFIO_IOMMU_TYPE1_NESTING and
the SMMU doesn't support stage-2, the ioctl fails. But it should be a good
fix and completely harmless. Detail can be found in below link as well.

https://lore.kernel.org/kvm/20200717090900.GC4850@myrica/
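
For illustration, a hedged userspace sketch of discovering the new nesting
capability via the standard VFIO_IOMMU_GET_INFO capability chain. The
two-call argsz dance and the vfio_info_cap_header walk are the usual VFIO
pattern; VFIO_IOMMU_TYPE1_INFO_CAP_NESTING comes from this patch, and
error handling is omitted for brevity:

#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

static struct vfio_info_cap_header *
find_nesting_cap(int container_fd, struct vfio_iommu_type1_info **infop)
{
	struct vfio_iommu_type1_info *info;
	struct vfio_info_cap_header *hdr;

	info = calloc(1, sizeof(*info));
	info->argsz = sizeof(*info);
	ioctl(container_fd, VFIO_IOMMU_GET_INFO, info);	/* learn full size */

	info = realloc(info, info->argsz);
	ioctl(container_fd, VFIO_IOMMU_GET_INFO, info);	/* fetch cap chain */

	if ((info->flags & VFIO_IOMMU_INFO_CAPS) && info->cap_offset) {
		hdr = (struct vfio_info_cap_header *)
			((char *)info + info->cap_offset);
		for (;;) {
			if (hdr->id == VFIO_IOMMU_TYPE1_INFO_CAP_NESTING) {
				*infop = info;	/* caller frees */
				return hdr;	/* nesting info follows header */
			}
			if (!hdr->next)
				break;
			hdr = (struct vfio_info_cap_header *)
				((char *)info + hdr->next);
		}
	}
	free(info);
	return NULL;
}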

Cc: Kevin Tian 
CC: Jacob Pan 
Cc: Alex Williamson 
Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Liu Yi L 
---
v6 -> v7:
*) using vfio_info_add_capability() for adding nesting cap per suggestion
   from Eric.

v5 -> v6:
*) address comments against v5 from Eric Auger.
*) don't report nesting cap to userspace if the nesting_info->format is
   invalid.

v4 -> v5:
*) address comments from Eric Auger.
*) return struct iommu_nesting_info for VFIO_IOMMU_TYPE1_INFO_CAP_NESTING as
   the cap is quite "cheap"; if it needs extension in future, just define
   another cap.
   https://lore.kernel.org/kvm/20200708132947.5b7ee...@x1.home/

v3 -> v4:
*) address comments against v3.

v1 -> v2:
*) added in v2
---
 drivers/vfio/vfio_iommu_type1.c | 92 +++--
 include/uapi/linux/vfio.h   | 19 +
 2 files changed, 99 insertions(+), 12 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index c992973..3c0048b 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -62,18 +62,20 @@ MODULE_PARM_DESC(dma_entry_limit,
 "Maximum number of user DMA mappings per container (65535).");
 
 struct vfio_iommu {
-   struct list_head        domain_list;
-   struct list_head        iova_list;
-   struct vfio_domain      *external_domain; /* domain for external user */
-   struct mutex            lock;
-   struct rb_root          dma_list;
-   struct blocking_notifier_head notifier;
-   unsigned int            dma_avail;
-   uint64_t                pgsize_bitmap;
-   bool                    v2;
-   bool                    nesting;
-   bool                    dirty_page_tracking;
-   bool                    pinned_page_dirty_scope;
+   struct list_head        domain_list;
+   struct list_head        iova_list;
+   /* domain for external user */
+   struct vfio_domain      *external_domain;
+   struct mutex            lock;
+   struct rb_root          dma_list;
+   struct blocking_notifier_head   notifier;
+   unsigned int            dma_avail;
+   uint64_t                pgsize_bitmap;
+   bool                    v2;
+   bool                    nesting;
+   bool                    dirty_page_tracking;
+   bool                    pinned_page_dirty_scope;
+   struct iommu_nesting_info   *nesting_info;
 };
 
 struct vfio_domain {
@@ -130,6 +132,9 @@ struct vfio_regions {
 #define IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)\
(!list_empty(>domain_list))
 
+#define CONTAINER_HAS_DOMAIN(iommu)(((iommu)->external_domain) || \
+(!list_empty(&(iommu)->domain_list)))
+
 #define DIRTY_BITMAP_BYTES(n)  (ALIGN(n, BITS_PER_TYPE(u64)) / BITS_PER_BYTE)
 
 /*
@@ 

Re: [trivial PATCH] treewide: Convert switch/case fallthrough; to break;

2020-09-10 Thread Steffen Maier

On 9/9/20 10:06 PM, Joe Perches wrote:

fallthrough to a separate case/default label break; isn't very readable.

Convert pseudo-keyword fallthrough; statements to a simple break; when
the next label is case or default and the only statement in the next
label block is break;

Found using:

$ grep-2.5.4 -rP --include=*.[ch] -n 
"fallthrough;(\s*(case\s+\w+|default)\s*:\s*){1,7}break;" *

Miscellanea:

o Move or coalesce a couple label blocks above a default: block.

Signed-off-by: Joe Perches 
---

Compiled allyesconfig x86-64 only.
A few files for other arches were not compiled.



  drivers/s390/scsi/zfcp_fsf.c  |  2 +-



  82 files changed, 109 insertions(+), 112 deletions(-)



diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 140186fe1d1e..2741a07df692 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -2105,7 +2105,7 @@ static void zfcp_fsf_open_lun_handler(struct zfcp_fsf_req 
*req)
  
  	case FSF_PORT_HANDLE_NOT_VALID:

zfcp_erp_adapter_reopen(adapter, 0, "fsouh_1");
-   fallthrough;
+   break;
case FSF_LUN_ALREADY_OPEN:
break;
case FSF_PORT_BOXED:


Acked-by: Steffen Maier  # for zfcp


--
Mit freundlichen Gruessen / Kind regards
Steffen Maier

Linux on IBM Z Development

https://www.ibm.com/privacy/us/en/
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Matthias Hartmann
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


Re: [trivial PATCH] treewide: Convert switch/case fallthrough; to break;

2020-09-10 Thread Ilya Dryomov
On Wed, Sep 9, 2020 at 10:10 PM Joe Perches  wrote:
>
> fallthrough to a separate case/default label break; isn't very readable.
>
> Convert pseudo-keyword fallthrough; statements to a simple break; when
> the next label is case or default and the only statement in the next
> label block is break;
>
> Found using:
>
> $ grep-2.5.4 -rP --include=*.[ch] -n 
> "fallthrough;(\s*(case\s+\w+|default)\s*:\s*){1,7}break;" *
>
> Miscellanea:
>
> o Move or coalesce a couple label blocks above a default: block.
>
> Signed-off-by: Joe Perches 
> ---
>
> Compiled allyesconfig x86-64 only.
> A few files for other arches were not compiled.
>
>  arch/arm/mach-mmp/pm-pxa910.c |  2 +-
>  arch/arm64/kvm/handle_exit.c  |  2 +-
>  arch/mips/kernel/cpu-probe.c  |  2 +-
>  arch/mips/math-emu/cp1emu.c   |  2 +-
>  arch/s390/pci/pci.c   |  2 +-
>  crypto/tcrypt.c   |  4 ++--
>  drivers/ata/sata_mv.c |  2 +-
>  drivers/atm/lanai.c   |  2 +-
>  drivers/gpu/drm/i915/display/intel_sprite.c   |  2 +-
>  drivers/gpu/drm/nouveau/nvkm/engine/disp/hdmi.c   |  2 +-
>  drivers/hid/wacom_wac.c   |  2 +-
>  drivers/i2c/busses/i2c-i801.c |  2 +-
>  drivers/infiniband/ulp/rtrs/rtrs-clt.c| 14 +++---
>  drivers/infiniband/ulp/rtrs/rtrs-srv.c|  6 +++---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  2 +-
>  drivers/irqchip/irq-vic.c |  4 ++--
>  drivers/md/dm.c   |  2 +-
>  drivers/media/dvb-frontends/drxd_hard.c   |  2 +-
>  drivers/media/i2c/ov5640.c|  2 +-
>  drivers/media/i2c/ov6650.c|  5 ++---
>  drivers/media/i2c/smiapp/smiapp-core.c|  2 +-
>  drivers/media/i2c/tvp5150.c   |  2 +-
>  drivers/media/pci/ddbridge/ddbridge-core.c|  2 +-
>  drivers/media/usb/cpia2/cpia2_core.c  |  2 +-
>  drivers/mfd/iqs62x.c  |  3 +--
>  drivers/mmc/host/atmel-mci.c  |  2 +-
>  drivers/mtd/nand/raw/nandsim.c|  2 +-
>  drivers/net/ethernet/intel/e1000e/phy.c   |  2 +-
>  drivers/net/ethernet/intel/fm10k/fm10k_pf.c   |  2 +-
>  drivers/net/ethernet/intel/i40e/i40e_adminq.c |  2 +-
>  drivers/net/ethernet/intel/i40e/i40e_txrx.c   |  2 +-
>  drivers/net/ethernet/intel/iavf/iavf_txrx.c   |  2 +-
>  drivers/net/ethernet/intel/igb/e1000_phy.c|  2 +-
>  drivers/net/ethernet/intel/ixgbe/ixgbe_82599.c|  2 +-
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  2 +-
>  drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c|  2 +-
>  drivers/net/ethernet/intel/ixgbevf/vf.c   |  2 +-
>  drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c |  2 +-
>  drivers/net/ethernet/qlogic/qed/qed_mcp.c |  2 +-
>  drivers/net/ethernet/sfc/falcon/farch.c   |  2 +-
>  drivers/net/ethernet/sfc/farch.c  |  2 +-
>  drivers/net/phy/adin.c|  3 +--
>  drivers/net/usb/pegasus.c |  4 ++--
>  drivers/net/usb/usbnet.c  |  2 +-
>  drivers/net/wireless/ath/ath5k/eeprom.c   |  2 +-
>  drivers/net/wireless/mediatek/mt7601u/dma.c   |  8 
>  drivers/nvme/host/core.c  | 12 ++--
>  drivers/pcmcia/db1xxx_ss.c|  4 ++--
>  drivers/power/supply/abx500_chargalg.c|  2 +-
>  drivers/power/supply/charger-manager.c|  2 +-
>  drivers/rtc/rtc-pcf85063.c|  2 +-
>  drivers/s390/scsi/zfcp_fsf.c  |  2 +-
>  drivers/scsi/aic7xxx/aic79xx_core.c   |  4 ++--
>  drivers/scsi/aic94xx/aic94xx_tmf.c|  2 +-
>  drivers/scsi/lpfc/lpfc_sli.c  |  2 +-
>  drivers/scsi/smartpqi/smartpqi_init.c |  2 +-
>  drivers/scsi/sr.c |  2 +-
>  drivers/tty/serial/sunsu.c|  2 +-
>  drivers/tty/serial/sunzilog.c |  2 +-
>  drivers/tty/vt/vt_ioctl.c |  2 +-
>  drivers/usb/dwc3/core.c   |  2 +-
>  drivers/usb/gadget/legacy/inode.c |  2 +-
>  drivers/usb/gadget/udc/pxa25x_udc.c 

Re: [trivial PATCH] treewide: Convert switch/case fallthrough; to break;

2020-09-10 Thread Matthias Brugger




On 09/09/2020 22:06, Joe Perches wrote:

diff --git a/drivers/net/wireless/mediatek/mt7601u/dma.c 
b/drivers/net/wireless/mediatek/mt7601u/dma.c
index 09f931d4598c..778be26d329f 100644
--- a/drivers/net/wireless/mediatek/mt7601u/dma.c
+++ b/drivers/net/wireless/mediatek/mt7601u/dma.c
@@ -193,11 +193,11 @@ static void mt7601u_complete_rx(struct urb *urb)
case -ESHUTDOWN:
case -ENOENT:
return;
+   case 0:
+   break;
default:
dev_err_ratelimited(dev->dev, "rx urb failed: %d\n",
urb->status);
-   fallthrough;
-   case 0:
break;
}
  
@@ -238,11 +238,11 @@ static void mt7601u_complete_tx(struct urb *urb)

case -ESHUTDOWN:
case -ENOENT:
return;
+   case 0:
+   break;
default:
dev_err_ratelimited(dev->dev, "tx urb failed: %d\n",
urb->status);
-   fallthrough;
-   case 0:
break;
}


Reviewed-by: Matthias Brugger 


Re: [PATCH v10 30/30] videobuf2: use sgtable-based scatterlist wrappers

2020-09-10 Thread Tomasz Figa
On Thu, Sep 10, 2020 at 11:17 AM Hans Verkuil  wrote:
>
> On 04/09/2020 15:17, Marek Szyprowski wrote:
> > Use recently introduced common wrappers operating directly on the struct
> > sg_table objects and scatterlist page iterators to make the code a bit
> > more compact, robust, easier to follow and copy/paste safe.
> >
> > No functional change, because the code already properly did all the
> > scatterlist related calls.
> >
> > Signed-off-by: Marek Szyprowski 
> > Reviewed-by: Robin Murphy 
>
> Acked-by: Hans Verkuil 
>
> Note that I agree with Marek to keep returning -EIO. If we want to propagate
> low-level errors, then that should be done in a separate patch. But I think 
> EIO
> is fine.

As I mentioned, there are 2 different cases here - UAPI and kAPI. I
agree that we should keep -EIO for UAPI, but kAPI is another story.
But if we're convinced that -EIO is also fine for the latter, I'm fine
with that.

Best regards,
Tomasz

>
> Regards,
>
> Hans
>
> > ---
> >  .../common/videobuf2/videobuf2-dma-contig.c   | 34 ---
> >  .../media/common/videobuf2/videobuf2-dma-sg.c | 32 +++--
> >  .../common/videobuf2/videobuf2-vmalloc.c  | 12 +++
> >  3 files changed, 31 insertions(+), 47 deletions(-)
> >
> > diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c 
> > b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> > index ec3446cc45b8..1b242d844dde 100644
> > --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> > +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> > @@ -58,10 +58,10 @@ static unsigned long vb2_dc_get_contiguous_size(struct 
> > sg_table *sgt)
> >   unsigned int i;
> >   unsigned long size = 0;
> >
> > - for_each_sg(sgt->sgl, s, sgt->nents, i) {
> > + for_each_sgtable_dma_sg(sgt, s, i) {
> >   if (sg_dma_address(s) != expected)
> >   break;
> > - expected = sg_dma_address(s) + sg_dma_len(s);
> > + expected += sg_dma_len(s);
> >   size += sg_dma_len(s);
> >   }
> >   return size;
> > @@ -103,8 +103,7 @@ static void vb2_dc_prepare(void *buf_priv)
> >   if (!sgt)
> >   return;
> >
> > - dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
> > -buf->dma_dir);
> > + dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
> >  }
> >
> >  static void vb2_dc_finish(void *buf_priv)
> > @@ -115,7 +114,7 @@ static void vb2_dc_finish(void *buf_priv)
> >   if (!sgt)
> >   return;
> >
> > - dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, 
> > buf->dma_dir);
> > + dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
> >  }
> >
> >  /*/
> > @@ -275,8 +274,8 @@ static void vb2_dc_dmabuf_ops_detach(struct dma_buf 
> > *dbuf,
> >* memory locations do not require any explicit cache
> >* maintenance prior or after being used by the device.
> >*/
> > - dma_unmap_sg_attrs(db_attach->dev, sgt->sgl, sgt->orig_nents,
> > -attach->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
> > + dma_unmap_sgtable(db_attach->dev, sgt, attach->dma_dir,
> > +   DMA_ATTR_SKIP_CPU_SYNC);
> >   sg_free_table(sgt);
> >   kfree(attach);
> >   db_attach->priv = NULL;
> > @@ -301,8 +300,8 @@ static struct sg_table *vb2_dc_dmabuf_ops_map(
> >
> >   /* release any previous cache */
> >   if (attach->dma_dir != DMA_NONE) {
> > - dma_unmap_sg_attrs(db_attach->dev, sgt->sgl, sgt->orig_nents,
> > -attach->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
> > + dma_unmap_sgtable(db_attach->dev, sgt, attach->dma_dir,
> > +   DMA_ATTR_SKIP_CPU_SYNC);
> >   attach->dma_dir = DMA_NONE;
> >   }
> >
> > @@ -310,9 +309,8 @@ static struct sg_table *vb2_dc_dmabuf_ops_map(
> >* mapping to the client with new direction, no cache sync
> >* required see comment in vb2_dc_dmabuf_ops_detach()
> >*/
> > - sgt->nents = dma_map_sg_attrs(db_attach->dev, sgt->sgl, 
> > sgt->orig_nents,
> > -   dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
> > - if (!sgt->nents) {
> > + if (dma_map_sgtable(db_attach->dev, sgt, dma_dir,
> > + DMA_ATTR_SKIP_CPU_SYNC)) {
> >   pr_err("failed to map scatterlist\n");
> >   mutex_unlock(lock);
> >   return ERR_PTR(-EIO);
> > @@ -455,8 +453,8 @@ static void vb2_dc_put_userptr(void *buf_priv)
> >* No need to sync to CPU, it's already synced to the CPU
> >* since the finish() memop will have been called before this.
> >*/
> > - dma_unmap_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
> > -buf->dma_dir, 

Re: [PATCH v10 29/30] media: pci: fix common ALSA DMA-mapping related codes

2020-09-10 Thread Hans Verkuil
On 04/09/2020 15:17, Marek Szyprowski wrote:
> The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
> numer of the created entries in the DMA address space. However the

numer -> number

> subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
> called with the original number of entries passed to dma_map_sg. The
> sg_table->nents in turn holds the result of the dma_map_sg call as stated
> in include/linux/scatterlist.h. Adapt the code to obey those rules.
> 
> While touching this code, update it to use the modern DMA_FROM_DEVICE
> definitions.
> 
> Signed-off-by: Marek Szyprowski 

Acked-by: Hans Verkuil 

Thanks!

Hans
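
As an aside, a minimal kernel-style sketch of the rule the commit message
describes - dma_map_sg() may return fewer entries than it was given, but
the sync/unmap calls must be passed the original entry count (generic
pseudo-driver code, not tied to any of the touched drivers):

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

static int example_map(struct device *dev, struct scatterlist *sgl,
		       int nr_pages, int *mapped)
{
	*mapped = dma_map_sg(dev, sgl, nr_pages, DMA_FROM_DEVICE);
	if (*mapped == 0)
		return -EIO;	/* mapping failed */
	/* walk DMA addresses over the *mapped* entries only */
	return 0;
}

static void example_unmap(struct device *dev, struct scatterlist *sgl,
			  int nr_pages)
{
	/* note: original nr_pages, not what dma_map_sg() returned */
	dma_unmap_sg(dev, sgl, nr_pages, DMA_FROM_DEVICE);
}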

> ---
>  drivers/media/pci/cx23885/cx23885-alsa.c | 4 ++--
>  drivers/media/pci/cx25821/cx25821-alsa.c | 4 ++--
>  drivers/media/pci/cx88/cx88-alsa.c   | 6 +++---
>  drivers/media/pci/saa7134/saa7134-alsa.c | 4 ++--
>  4 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/media/pci/cx23885/cx23885-alsa.c 
> b/drivers/media/pci/cx23885/cx23885-alsa.c
> index df44ed7393a0..c797bff6eebb 100644
> --- a/drivers/media/pci/cx23885/cx23885-alsa.c
> +++ b/drivers/media/pci/cx23885/cx23885-alsa.c
> @@ -113,7 +113,7 @@ static int cx23885_alsa_dma_map(struct cx23885_audio_dev 
> *dev)
>   struct cx23885_audio_buffer *buf = dev->buf;
>  
>   buf->sglen = dma_map_sg(&dev->pci->dev, buf->sglist,
> - buf->nr_pages, PCI_DMA_FROMDEVICE);
> + buf->nr_pages, DMA_FROM_DEVICE);
>  
>   if (0 == buf->sglen) {
>   pr_warn("%s: cx23885_alsa_map_sg failed\n", __func__);
> @@ -129,7 +129,7 @@ static int cx23885_alsa_dma_unmap(struct 
> cx23885_audio_dev *dev)
>   if (!buf->sglen)
>   return 0;
>  
> - dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->sglen, 
> PCI_DMA_FROMDEVICE);
> + dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->nr_pages, 
> DMA_FROM_DEVICE);
>   buf->sglen = 0;
>   return 0;
>  }
> diff --git a/drivers/media/pci/cx25821/cx25821-alsa.c 
> b/drivers/media/pci/cx25821/cx25821-alsa.c
> index 301616426d8a..8da31c953b02 100644
> --- a/drivers/media/pci/cx25821/cx25821-alsa.c
> +++ b/drivers/media/pci/cx25821/cx25821-alsa.c
> @@ -177,7 +177,7 @@ static int cx25821_alsa_dma_map(struct cx25821_audio_dev 
> *dev)
>   struct cx25821_audio_buffer *buf = dev->buf;
>  
>   buf->sglen = dma_map_sg(&dev->pci->dev, buf->sglist,
> - buf->nr_pages, PCI_DMA_FROMDEVICE);
> + buf->nr_pages, DMA_FROM_DEVICE);
>  
>   if (0 == buf->sglen) {
>   pr_warn("%s: cx25821_alsa_map_sg failed\n", __func__);
> @@ -193,7 +193,7 @@ static int cx25821_alsa_dma_unmap(struct 
> cx25821_audio_dev *dev)
>   if (!buf->sglen)
>   return 0;
>  
> - dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->sglen, 
> PCI_DMA_FROMDEVICE);
> + dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->nr_pages, 
> DMA_FROM_DEVICE);
>   buf->sglen = 0;
>   return 0;
>  }
> diff --git a/drivers/media/pci/cx88/cx88-alsa.c 
> b/drivers/media/pci/cx88/cx88-alsa.c
> index 7d7aceecc985..d38633bc1330 100644
> --- a/drivers/media/pci/cx88/cx88-alsa.c
> +++ b/drivers/media/pci/cx88/cx88-alsa.c
> @@ -316,7 +316,7 @@ static int cx88_alsa_dma_map(struct cx88_audio_dev *dev)
>   struct cx88_audio_buffer *buf = dev->buf;
>  
>   buf->sglen = dma_map_sg(&dev->pci->dev, buf->sglist,
> - buf->nr_pages, PCI_DMA_FROMDEVICE);
> + buf->nr_pages, DMA_FROM_DEVICE);
>  
>   if (buf->sglen == 0) {
>   pr_warn("%s: cx88_alsa_map_sg failed\n", __func__);
> @@ -332,8 +332,8 @@ static int cx88_alsa_dma_unmap(struct cx88_audio_dev *dev)
>   if (!buf->sglen)
>   return 0;
>  
> - dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->sglen,
> -  PCI_DMA_FROMDEVICE);
> + dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->nr_pages,
> +  DMA_FROM_DEVICE);
>   buf->sglen = 0;
>   return 0;
>  }
> diff --git a/drivers/media/pci/saa7134/saa7134-alsa.c 
> b/drivers/media/pci/saa7134/saa7134-alsa.c
> index 544ca57eee75..707ca77221dc 100644
> --- a/drivers/media/pci/saa7134/saa7134-alsa.c
> +++ b/drivers/media/pci/saa7134/saa7134-alsa.c
> @@ -297,7 +297,7 @@ static int saa7134_alsa_dma_map(struct saa7134_dev *dev)
>   struct saa7134_dmasound *dma = >dmasound;
>  
>   dma->sglen = dma_map_sg(&dev->pci->dev, dma->sglist,
> - dma->nr_pages, PCI_DMA_FROMDEVICE);
> + dma->nr_pages, DMA_FROM_DEVICE);
>  
>   if (0 == dma->sglen) {
>   pr_warn("%s: saa7134_alsa_map_sg failed\n", __func__);
> @@ -313,7 +313,7 @@ static int saa7134_alsa_dma_unmap(struct saa7134_dev *dev)
>   if (!dma->sglen)
>   return 0;
>  
> - dma_unmap_sg(&dev->pci->dev, dma->sglist, dma->sglen, 
> PCI_DMA_FROMDEVICE);
> + dma_unmap_sg(&dev->pci->dev, dma->sglist, dma->nr_pages, 
> DMA_FROM_DEVICE);
>   dma->sglen = 0;
>   return 0;
>  }
> 

Re: [PATCH v2 5/9] iommu/ioasid: Introduce ioasid_set private ID

2020-09-10 Thread Auger Eric
Hi Jacob,

On 9/9/20 12:40 AM, Jacob Pan wrote:
> On Tue, 1 Sep 2020 17:38:44 +0200
> Auger Eric  wrote:
> 
>> Hi Jacob,
>> On 8/22/20 6:35 AM, Jacob Pan wrote:
>>> When an IOASID set is used for guest SVA, each VM will acquire its
>>> ioasid_set for IOASID allocations. IOASIDs within the VM must have a
>>> host/physical IOASID backing, mapping between guest and host
>>> IOASIDs can be non-identical. IOASID set private ID (SPID) is
>>> introduced in this patch to be used as guest IOASID. However, the
>>> concept of ioasid_set specific namespace is generic, thus named
>>> SPID.
>>>
>>> As SPID namespace is within the IOASID set, the IOASID core can
>>> provide lookup services at both directions. SPIDs may not be
>>> allocated when its IOASID is allocated, the mapping between SPID
>>> and IOASID is usually established when a guest page table is bound
>>> to a host PASID.
>>>
>>> Signed-off-by: Jacob Pan 
>>> ---
>>>  drivers/iommu/ioasid.c | 54
>>> ++
>>> include/linux/ioasid.h | 12 +++ 2 files changed, 66
>>> insertions(+)
>>>
>>> diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
>>> index 5f31d63c75b1..c0aef38a4fde 100644
>>> --- a/drivers/iommu/ioasid.c
>>> +++ b/drivers/iommu/ioasid.c
>>> @@ -21,6 +21,7 @@ enum ioasid_state {
>>>   * struct ioasid_data - Meta data about ioasid
>>>   *
>>>   * @id:Unique ID
>>> + * @spid:  Private ID unique within a set
>>>   * @users  Number of active users
>>>   * @state  Track state of the IOASID
>>>   * @setMeta data of the set this IOASID belongs to
>>> @@ -29,6 +30,7 @@ enum ioasid_state {
>>>   */
>>>  struct ioasid_data {
>>> ioasid_t id;
>>> +   ioasid_t spid;
>>> struct ioasid_set *set;
>>> refcount_t users;
>>> enum ioasid_state state;
>>> @@ -326,6 +328,58 @@ int ioasid_attach_data(ioasid_t ioasid, void
>>> *data) EXPORT_SYMBOL_GPL(ioasid_attach_data);
>>>  
>>>  /**
>>> + * ioasid_attach_spid - Attach ioasid_set private ID to an IOASID
>>> + *
>>> + * @ioasid: the ID to attach
>>> + * @spid:   the ioasid_set private ID of @ioasid
>>> + *
>>> + * For IOASID that is already allocated, private ID within the set
>>> can be
>>> + * attached via this API. Future lookup can be done via
>>> ioasid_find.  
>> I would remove "For IOASID that is already allocated, private ID
>> within the set can be attached via this API"
> I guess it is implied. Will remove.
> 
>>> + */
>>> +int ioasid_attach_spid(ioasid_t ioasid, ioasid_t spid)
>>> +{
>>> +   struct ioasid_data *ioasid_data;
>>> +   int ret = 0;
>>> +
>>> +   spin_lock(&ioasid_allocator_lock);  
>> We keep on saying the SPID is local to an IOASID set but we don't
>> check any IOASID set contains this ioasid. It looks a bit weird to me.
> We store ioasid_set inside ioasid_data when an IOASID is allocated, so
> we don't need to search all the ioasid_sets. Perhaps I missed your
> point?
No I think it became clearer ;-)
> 
>>> +   ioasid_data = xa_load(&active_allocator->xa, ioasid);
>>> +
>>> +   if (!ioasid_data) {
>>> +   pr_err("No IOASID entry %d to attach SPID %d\n",
>>> +   ioasid, spid);
>>> +   ret = -ENOENT;
>>> +   goto done_unlock;
>>> +   }
>>> +   ioasid_data->spid = spid;  
>> is there any way/need to remove an spid binding?
> For guest SVA, we attach SPID as a guest PASID during bind guest page
> table. Unbind does the opposite, ioasid_attach_spid() with
> spid=INVALID_IOASID clears the binding.
> 
> Perhaps add more symmetric functions? i.e.
> ioasid_detach_spid(ioasid_t ioasid)
> ioasid_attach_spid(struct ioasid_set *set, ioasid_t ioasid)
yep make sense

Thanks

Eric
> 
>>> +
>>> +done_unlock:
>>> +   spin_unlock(&ioasid_allocator_lock);
>>> +   return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(ioasid_attach_spid);
>>> +
>>> +ioasid_t ioasid_find_by_spid(struct ioasid_set *set, ioasid_t spid)
>>> +{
>>> +   struct ioasid_data *entry;
>>> +   unsigned long index;
>>> +
>>> +   if (!xa_load(&ioasid_sets, set->sid)) {
>>> +   pr_warn("Invalid set\n");
>>> +   return INVALID_IOASID;
>>> +   }
>>> +
>>> +   xa_for_each(&set->xa, index, entry) {
>>> +   if (spid == entry->spid) {
>>> +   pr_debug("Found ioasid %lu by spid %u\n",
>>> index, spid);
>>> +   refcount_inc(&entry->users);
>>> +   return index;
>>> +   }
>>> +   }
>>> +   return INVALID_IOASID;
>>> +}
>>> +EXPORT_SYMBOL_GPL(ioasid_find_by_spid);
>>> +
>>> +/**
>>>   * ioasid_alloc - Allocate an IOASID
>>>   * @set: the IOASID set
>>>   * @min: the minimum ID (inclusive)
>>> diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h
>>> index 310abe4187a3..d4b3e83672f6 100644
>>> --- a/include/linux/ioasid.h
>>> +++ b/include/linux/ioasid.h
>>> @@ -73,6 +73,8 @@ bool ioasid_is_active(ioasid_t ioasid);
>>>  
>>>  void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid, bool
>>> (*getter)(void *)); int ioasid_attach_data(ioasid_t ioasid, void
>>> *data); +int 
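
To make the symmetric pair discussed above concrete, here is a minimal
sketch; it assumes the attach variant also takes the SPID as a third
parameter (Jacob's prototype omits it) and reuses the locking and lookup
pattern from the patch. An illustration only, not the final API:

int ioasid_attach_spid(struct ioasid_set *set, ioasid_t ioasid, ioasid_t spid)
{
	struct ioasid_data *data;
	int ret = 0;

	spin_lock(&ioasid_allocator_lock);
	data = xa_load(&active_allocator->xa, ioasid);
	if (!data || data->set != set) {
		/* reject IOASIDs that are unknown or outside the given set */
		ret = -ENOENT;
		goto done_unlock;
	}
	data->spid = spid;
done_unlock:
	spin_unlock(&ioasid_allocator_lock);
	return ret;
}

void ioasid_detach_spid(ioasid_t ioasid)
{
	struct ioasid_data *data;

	spin_lock(&ioasid_allocator_lock);
	data = xa_load(&active_allocator->xa, ioasid);
	if (data)
		data->spid = INVALID_IOASID;	/* clears the guest PASID binding */
	spin_unlock(&ioasid_allocator_lock);
}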

Re: [PATCH v10 30/30] videobuf2: use sgtable-based scatterlist wrappers

2020-09-10 Thread Hans Verkuil
On 04/09/2020 15:17, Marek Szyprowski wrote:
> Use recently introduced common wrappers operating directly on the struct
> sg_table objects and scatterlist page iterators to make the code a bit
> more compact, robust, easier to follow and copy/paste safe.
> 
> No functional change, because the code already properly did all the
> scatterlist related calls.
> 
> Signed-off-by: Marek Szyprowski 
> Reviewed-by: Robin Murphy 

Acked-by: Hans Verkuil 

Note that I agree with Marek to keep returning -EIO. If we want to propagate
low-level errors, then that should be done in a separate patch. But I think EIO
is fine.

Regards,

Hans

> ---
>  .../common/videobuf2/videobuf2-dma-contig.c   | 34 ---
>  .../media/common/videobuf2/videobuf2-dma-sg.c | 32 +++--
>  .../common/videobuf2/videobuf2-vmalloc.c  | 12 +++
>  3 files changed, 31 insertions(+), 47 deletions(-)
> 
> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c 
> b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> index ec3446cc45b8..1b242d844dde 100644
> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> @@ -58,10 +58,10 @@ static unsigned long vb2_dc_get_contiguous_size(struct 
> sg_table *sgt)
>   unsigned int i;
>   unsigned long size = 0;
>  
> - for_each_sg(sgt->sgl, s, sgt->nents, i) {
> + for_each_sgtable_dma_sg(sgt, s, i) {
>   if (sg_dma_address(s) != expected)
>   break;
> - expected = sg_dma_address(s) + sg_dma_len(s);
> + expected += sg_dma_len(s);
>   size += sg_dma_len(s);
>   }
>   return size;
> @@ -103,8 +103,7 @@ static void vb2_dc_prepare(void *buf_priv)
>   if (!sgt)
>   return;
>  
> - dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
> -buf->dma_dir);
> + dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
>  }
>  
>  static void vb2_dc_finish(void *buf_priv)
> @@ -115,7 +114,7 @@ static void vb2_dc_finish(void *buf_priv)
>   if (!sgt)
>   return;
>  
> - dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);
> + dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
>  }
>  
>  /*/
> @@ -275,8 +274,8 @@ static void vb2_dc_dmabuf_ops_detach(struct dma_buf *dbuf,
>* memory locations do not require any explicit cache
>* maintenance prior or after being used by the device.
>*/
> - dma_unmap_sg_attrs(db_attach->dev, sgt->sgl, sgt->orig_nents,
> -attach->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
> + dma_unmap_sgtable(db_attach->dev, sgt, attach->dma_dir,
> +   DMA_ATTR_SKIP_CPU_SYNC);
>   sg_free_table(sgt);
>   kfree(attach);
>   db_attach->priv = NULL;
> @@ -301,8 +300,8 @@ static struct sg_table *vb2_dc_dmabuf_ops_map(
>  
>   /* release any previous cache */
>   if (attach->dma_dir != DMA_NONE) {
> - dma_unmap_sg_attrs(db_attach->dev, sgt->sgl, sgt->orig_nents,
> -attach->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
> + dma_unmap_sgtable(db_attach->dev, sgt, attach->dma_dir,
> +   DMA_ATTR_SKIP_CPU_SYNC);
>   attach->dma_dir = DMA_NONE;
>   }
>  
> @@ -310,9 +309,8 @@ static struct sg_table *vb2_dc_dmabuf_ops_map(
>* mapping to the client with new direction, no cache sync
>* required see comment in vb2_dc_dmabuf_ops_detach()
>*/
> - sgt->nents = dma_map_sg_attrs(db_attach->dev, sgt->sgl, sgt->orig_nents,
> -   dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
> - if (!sgt->nents) {
> + if (dma_map_sgtable(db_attach->dev, sgt, dma_dir,
> + DMA_ATTR_SKIP_CPU_SYNC)) {
>   pr_err("failed to map scatterlist\n");
>   mutex_unlock(lock);
>   return ERR_PTR(-EIO);
> @@ -455,8 +453,8 @@ static void vb2_dc_put_userptr(void *buf_priv)
>* No need to sync to CPU, it's already synced to the CPU
>* since the finish() memop will have been called before this.
>*/
> - dma_unmap_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
> -buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
> + dma_unmap_sgtable(buf->dev, sgt, buf->dma_dir,
> +   DMA_ATTR_SKIP_CPU_SYNC);
>   pages = frame_vector_pages(buf->vec);
>   /* sgt should exist only if vector contains pages... */
>   BUG_ON(IS_ERR(pages));
> @@ -553,9 +551,8 @@ static void *vb2_dc_get_userptr(struct device *dev, 
> unsigned long vaddr,
>* No need to sync to the device, this will happen later when the
>* prepare() memop is 
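
For reference, the core idiom the series converts is shown below as a
minimal before/after sketch; dev, sgt, dir and attrs stand in for the
driver's own variables:

	/* Before: open-coded calls, where nents vs. orig_nents is easy to mix up. */
	sgt->nents = dma_map_sg_attrs(dev, sgt->sgl, sgt->orig_nents, dir, attrs);
	if (!sgt->nents)
		return -EIO;
	/* ... use the mapping, iterating with for_each_sg() ... */
	dma_unmap_sg_attrs(dev, sgt->sgl, sgt->orig_nents, dir, attrs);

	/* After: the sgtable wrappers take the whole sg_table and keep the
	 * nents/orig_nents bookkeeping internal. */
	if (dma_map_sgtable(dev, sgt, dir, attrs))
		return -EIO;
	/* ... use the mapping, iterating with for_each_sgtable_dma_sg() ... */
	dma_unmap_sgtable(dev, sgt, dir, attrs);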

Re: [PATCH 3/3] dma-mapping: introduce DMA range map, supplanting dma_pfn_offset

2020-09-10 Thread Christoph Hellwig
On Thu, Sep 10, 2020 at 09:53:51AM +0200, Greg KH wrote:
> > /*
> >  * Please refer to usb_alloc_dev() to see why we set
> > -* dma_mask and dma_pfn_offset.
> > +* dma_mask and dma_range_map.
> >  */
> > intf->dev.dma_mask = dev->dev.dma_mask;
> > -   intf->dev.dma_pfn_offset = dev->dev.dma_pfn_offset;
> +   if (dma_direct_copy_range_map(&intf->dev, &dev->dev))
> +   dev_err(&intf->dev, "failed to copy DMA map\n");
> 
> We tell the user, but then just keep on running?  Is there anything that
> we can do here?
> 
> If not, why not have dma_direct_copy_range_map() print out the error?

At least for USB I'm pretty sure this isn't required at all.  I've been
running with the patch below on my desktop for two days now trying all
the usb toys I have (in addition to grepping for obvious abuses in
the drivers).  remoteproc is a different story, but the DMA handling
seems there is sketchy to start with..

---
From 8bae3e6833f2ca431dcfcbc8f9cced7d5e972a01 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig 
Date: Wed, 9 Sep 2020 08:28:59 +0200
Subject: usb: don't inherit DMA properties for USB devices

As the comment in usb_alloc_dev correctly states, drivers can't use
the DMA API on USB devices, and at least calling dma_set_mask on them
is highly dangerous.  Unlike what the comment states, upper level drivers
also can't really use the presence of a dma mask to check for DMA
support, as the dma_mask is set by default for most busses.

Remove the copying over of DMA information, and remove the now unused
dma_direct_copy_range_map export.

Signed-off-by: Christoph Hellwig 
---
 drivers/usb/core/message.c |  7 ---
 drivers/usb/core/usb.c | 13 -
 kernel/dma/direct.c|  1 -
 3 files changed, 21 deletions(-)

diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
index 935ee98e049f65..9e45732dc1d1d1 100644
--- a/drivers/usb/core/message.c
+++ b/drivers/usb/core/message.c
@@ -1954,13 +1954,6 @@ int usb_set_configuration(struct usb_device *dev, int 
configuration)
intf->dev.bus = &usb_bus_type;
intf->dev.type = &usb_if_device_type;
intf->dev.groups = usb_interface_groups;
-   /*
-* Please refer to usb_alloc_dev() to see why we set
-* dma_mask and dma_range_map.
-*/
-   intf->dev.dma_mask = dev->dev.dma_mask;
-   if (dma_direct_copy_range_map(&intf->dev, &dev->dev))
-   dev_err(&intf->dev, "failed to copy DMA map\n");
INIT_WORK(&intf->reset_ws, __usb_queue_reset_device);
intf->minor = -1;
device_initialize(&intf->dev);
diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
index 23d451f6894d70..9b4ac4415f1a47 100644
--- a/drivers/usb/core/usb.c
+++ b/drivers/usb/core/usb.c
@@ -599,19 +599,6 @@ struct usb_device *usb_alloc_dev(struct usb_device *parent,
dev->dev.bus = &usb_bus_type;
dev->dev.type = &usb_device_type;
dev->dev.groups = usb_device_groups;
-   /*
-* Fake a dma_mask/offset for the USB device:
-* We cannot really use the dma-mapping API (dma_alloc_* and
-* dma_map_*) for USB devices but instead need to use
-* usb_alloc_coherent and pass data in 'urb's, but some subsystems
-* manually look into the mask/offset pair to determine whether
-* they need bounce buffers.
-* Note: calling dma_set_mask() on a USB device would set the
-* mask for the entire HCD, so don't do that.
-*/
-   dev->dev.dma_mask = bus->sysdev->dma_mask;
-   if (dma_direct_copy_range_map(&dev->dev, bus->sysdev))
-   dev_err(&dev->dev, "failed to copy DMA map\n");
set_dev_node(&dev->dev, dev_to_node(bus->sysdev));
dev->state = USB_STATE_ATTACHED;
dev->lpm_disable_count = 1;
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index fc815f7375e282..3af257571a3b42 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -552,4 +552,3 @@ int dma_direct_copy_range_map(struct device *to, struct 
device *from)
to->dma_range_map = new_map;
return 0;
 }
-EXPORT_SYMBOL_GPL(dma_direct_copy_range_map);
-- 
2.28.0
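
For drivers, the supported alternative that the removed comment alludes to
is usb_alloc_coherent() plus URBs. A rough, hypothetical sketch (udev, ep,
len, my_complete and ctx are placeholders; error handling omitted):

	struct urb *urb = usb_alloc_urb(0, GFP_KERNEL);
	dma_addr_t dma;
	void *buf = usb_alloc_coherent(udev, len, GFP_KERNEL, &dma);

	usb_fill_bulk_urb(urb, udev, usb_sndbulkpipe(udev, ep),
			  buf, len, my_complete, ctx);
	urb->transfer_dma = dma;
	urb->transfer_flags |= URB_NO_TRANSFER_DMA_MAP;	/* buffer is pre-mapped */
	ret = usb_submit_urb(urb, GFP_KERNEL);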



Re: [PATCH v2 6/9] iommu/ioasid: Introduce notification APIs

2020-09-10 Thread Auger Eric
Hi Jacob,

On 9/10/20 12:58 AM, Jacob Pan wrote:
> On Tue, 1 Sep 2020 18:49:38 +0200
> Auger Eric  wrote:
> 
>> Hi Jacob,
>>
>> On 8/22/20 6:35 AM, Jacob Pan wrote:
>>> Relations among IOASID users largely follow a publisher-subscriber
>>> pattern. E.g. to support guest SVA on Intel Scalable I/O
>>> Virtualization (SIOV) enabled platforms, VFIO, IOMMU, device
>>> drivers, KVM are all users of IOASIDs. When a state change occurs,
>>> VFIO publishes the change event that needs to be processed by other
>>> users/subscribers.
>>>
>>> This patch introduced two types of notifications: global and per
>>> ioasid_set. The latter is intended for users who only need to
>>> handle events related to the IOASID of a given set.
>>> For more information, refer to the kernel documentation at
>>> Documentation/ioasid.rst.
>>>
>>> Signed-off-by: Liu Yi L 
>>> Signed-off-by: Wu Hao 
>>> Signed-off-by: Jacob Pan 
>>> ---
>>>  drivers/iommu/ioasid.c | 280 ++++++++++++++++++++++++++++++++++++-
>>>  include/linux/ioasid.h |  70 ++++++
>>>  2 files changed, 348 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
>>> index c0aef38a4fde..6ddc09a7fe74 100644
>>> --- a/drivers/iommu/ioasid.c
>>> +++ b/drivers/iommu/ioasid.c
>>> @@ -9,8 +9,35 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include 
>>>  
>>>  static DEFINE_XARRAY_ALLOC(ioasid_sets);
>>> +/*
>>> + * An IOASID could have multiple consumers where each consumer
>>> may have  
>> can have multiple consumers
> Sounds good, I used past tense to describe a possibility :)
> 
>>> + * hardware contexts associated with IOASIDs.
>>> + * When a status change occurs, such as IOASID is being freed,
>>> notifier chains  
>> s/such as IOASID is being freed/, like on IOASID deallocation,
> Better, will do.
> 
>>> + * are used to keep the consumers in sync.
>>> + * This is a publisher-subscriber pattern where the publisher can
>>> change the
>>> + * state of each IOASID, e.g. alloc/free, bind IOASID to a device
>>> and mm.
>>> + * On the other hand, subscribers get notified of the state
>>> change and
>>> + * keep local states in sync.
>>> + *
>>> + * Currently, the notifier is global. A further optimization could
>>> be per
>>> + * IOASID set notifier chain.
>>> + */
>>> +static ATOMIC_NOTIFIER_HEAD(ioasid_chain);
>>> +
>>> +/* List to hold pending notification block registrations */
>>> +static LIST_HEAD(ioasid_nb_pending_list);
>>> +static DEFINE_SPINLOCK(ioasid_nb_lock);
>>> +struct ioasid_set_nb {
>>> +   struct list_headlist;
>>> +   struct notifier_block   *nb;
>>> +   void*token;
>>> +   struct ioasid_set   *set;
>>> +   boolactive;
>>> +};
>>> +
>>>  enum ioasid_state {
>>> IOASID_STATE_INACTIVE,
>>> IOASID_STATE_ACTIVE,
>>> @@ -394,6 +421,7 @@ EXPORT_SYMBOL_GPL(ioasid_find_by_spid);
>>>  ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min,
>>> ioasid_t max, void *private)
>>>  {
>>> +   struct ioasid_nb_args args;
>>> struct ioasid_data *data;
>>> void *adata;
>>> ioasid_t id = INVALID_IOASID;
>>> @@ -445,8 +473,14 @@ ioasid_t ioasid_alloc(struct ioasid_set *set,
>>> ioasid_t min, ioasid_t max, goto exit_free;
>>> }
>>> set->nr_ioasids++;
>>> -   goto done_unlock;
>>> +   args.id = id;
>>> +   /* Set private ID is not attached during allocation */
>>> +   args.spid = INVALID_IOASID;
>>> +   args.set = set;
>>> +   atomic_notifier_call_chain(&set->nh, IOASID_ALLOC, &args);
>>>  
>>> +   spin_unlock(&ioasid_allocator_lock);
>>> +   return id;  
>> spurious change
> Good catch. should just goto done_unlock.
> 
>>>  exit_free:
>>> kfree(data);
>>>  done_unlock:
>>> @@ -479,6 +513,7 @@ static void ioasid_do_free(struct ioasid_data
>>> *data) 
>>>  static void ioasid_free_locked(struct ioasid_set *set, ioasid_t
>>> ioasid) {
>>> +   struct ioasid_nb_args args;
>>> struct ioasid_data *data;
>>>  
>>> data = xa_load(&active_allocator->xa, ioasid);
>>> @@ -491,7 +526,16 @@ static void ioasid_free_locked(struct
>>> ioasid_set *set, ioasid_t ioasid) pr_warn("Cannot free IOASID %u
>>> due to set ownership\n", ioasid); return;
>>> }
>>> +  
>> spurious new line
> got it
> 
>>> data->state = IOASID_STATE_FREE_PENDING;
>>> +   /* Notify all users that this IOASID is being freed */
>>> +   args.id = ioasid;
>>> +   args.spid = data->spid;
>>> +   args.pdata = data->private;
>>> +   args.set = data->set;
>>> +   atomic_notifier_call_chain(&ioasid_chain, IOASID_FREE,
>>> &args);
>>> +   /* Notify the ioasid_set for per set users */
>>> +   atomic_notifier_call_chain(&set->nh, IOASID_FREE, &args);
>>>  
>>> if (!refcount_dec_and_test(&data->users))
>>> return;  
>> Shouldn't we call the notifier only when ref count == 0?
> Not in the current scheme. The idea is to notify all users the PASID is
> being freed, then each user can drop its reference. When refcount == 0,
> the PASID will be returned to the pool.

OK
> 
>>> @@ -514,6 +558,28 @@ void 
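
To illustrate the subscriber side of the scheme described above, a
hypothetical consumer could look like the sketch below; only the event
names and struct ioasid_nb_args come from the patch, while the handler body
and my_teardown_pasid() are invented:

	static int my_ioasid_event(struct notifier_block *nb,
				   unsigned long event, void *data)
	{
		struct ioasid_nb_args *args = data;

		switch (event) {
		case IOASID_FREE:
			/* Tear down this consumer's per-PASID state, then
			 * drop its reference so the IOASID can return to
			 * the pool once the refcount reaches zero. */
			my_teardown_pasid(args->id);
			break;
		default:
			break;
		}
		return NOTIFY_OK;
	}

	static struct notifier_block my_ioasid_nb = {
		.notifier_call = my_ioasid_event,
	};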

Re: [trivial PATCH] treewide: Convert switch/case fallthrough; to break;

2020-09-10 Thread Wolfram Sang

> diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c
> index e32ef3f01fe8..b13b1cbcac29 100644
> --- a/drivers/i2c/busses/i2c-i801.c
> +++ b/drivers/i2c/busses/i2c-i801.c
> @@ -1785,7 +1785,7 @@ static int i801_probe(struct pci_dev *dev, const struct 
> pci_device_id *id)
>   fallthrough;
>   case PCI_DEVICE_ID_INTEL_82801CA_3:
>   priv->features |= FEATURE_HOST_NOTIFY;
> - fallthrough;
> + break;
>   case PCI_DEVICE_ID_INTEL_82801BA_2:
>   case PCI_DEVICE_ID_INTEL_82801AB_3:
>   case PCI_DEVICE_ID_INTEL_82801AA_3:

I am not the maintainer (Jean is) but I suggest dropping this hunk. The
code is more complex, with multiple 'fallthrough' statements, so this change
alone actually makes the code inconsistent. A rework would need a separate
patch.
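
For context, the hunk sits in a cascade where successive controller
generations accumulate features by design, roughly like this condensed
sketch (earlier cases abbreviated; flags taken from i2c-i801.c):

	switch (dev->device) {
	default:
		priv->features |= FEATURE_I2C_BLOCK_READ | FEATURE_IRQ;
		fallthrough;	/* newer parts also get everything below */
	case PCI_DEVICE_ID_INTEL_82801CA_3:
		priv->features |= FEATURE_HOST_NOTIFY;
		fallthrough;	/* ...on top of the base feature set */
	case PCI_DEVICE_ID_INTEL_82801BA_2:
	case PCI_DEVICE_ID_INTEL_82801AB_3:
	case PCI_DEVICE_ID_INTEL_82801AA_3:
		break;
	}

Converting only the last fallthrough; to break; would leave the cascade
half-converted, which is the inconsistency noted above.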




Re: [trivial PATCH] treewide: Convert switch/case fallthrough; to break;

2020-09-10 Thread Felipe Balbi
Hi,

Joe Perches  writes:
>  drivers/usb/dwc3/core.c   |  2 +-
>  drivers/usb/gadget/legacy/inode.c |  2 +-
>  drivers/usb/gadget/udc/pxa25x_udc.c   |  4 ++--
>  drivers/usb/phy/phy-fsl-usb.c |  2 +-

for the drivers above:

Acked-by: Felipe Balbi 

-- 
balbi



Re: [PATCH] iommu/dma: Fix IOVA reserve dma ranges

2020-09-10 Thread Srinath Mannam via iommu
On Wed, Sep 9, 2020 at 5:35 PM Robin Murphy  wrote:
>
Hi Robin,
Thanks for review
> On 2020-09-09 06:32, Srinath Mannam wrote:
> > Fix IOVA reserve failure for memory regions listed in dma-ranges in the
> > following cases.
> >
> > - start address of memory region is 0x0.
>
> That's fair enough, and in fact generalises to the case of zero-sized
> gaps between regions, which is indeed an oversight.
Yes this is the main reason for the requirement of this fix.
>
> > - end address of a memory region is equal to start address of next memory
> >region.
>
> This part doesn't make much sense, however - if the regions described in
> bridge->dma_ranges overlap, that's a bug in whoever created a malformed
> list to begin with. Possibly it's just poor wording, and you're using
> "memory regions" to refer to any or all of the dma_ranges, the reserved
> IOVA ranges, and what "start" and "end" in this function represent which
> isn't quite either of those.
You are right, this case is very unlikely; nobody would list regions with a
zero gap, and in such a case they would combine both regions. The reason for
highlighting this point is that the same fix also handles this case. Here I
used "memory regions" to refer to entries of dma-ranges (the allowed IOVA
address ranges), not to reserved IOVA ranges. The start and end variables in
this function refer to the start and end addresses of the reserved IOVA
ranges, which are derived from the start and end values of the dma-ranges
resources.
>
> > Fixes: aadad097cd46f ("iommu/dma: Reserve IOVA for PCIe inaccessible DMA 
> > address")
> > Signed-off-by: Srinath Mannam 
> > ---
> >   drivers/iommu/dma-iommu.c | 15 +++
> >   1 file changed, 11 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index 5141d49a046b..0a3f67a4f9ae 100644
> > --- a/drivers/iommu/dma-iommu.c
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -213,14 +213,21 @@ static int iova_reserve_pci_windows(struct pci_dev 
> > *dev,
> >   resource_list_for_each_entry(window, &bridge->dma_ranges) {
> >   end = window->res->start - window->offset;
> >   resv_iova:
> > + if (end < start) {
> > + /* dma_ranges list should be sorted */
> > + dev_err(&dev->dev, "Failed to reserve IOVA\n");
> > + return -EINVAL;
> > + }
> > + /*
> > +  * Skip the cases when start address of first memory region is
> > +  * 0x0 and end address of one memory region and start address
> > +  * of next memory region are equal. Reserve IOVA for rest of
> > +  * addresses fall in between given memory ranges.
> > +  */
> >   if (end > start) {
> >   lo = iova_pfn(iovad, start);
> >   hi = iova_pfn(iovad, end);
> >   reserve_iova(iovad, lo, hi);
> > - } else {
>
> Surely this only needs to be a one-liner?
Yes I agree with you this one line is sufficient.
>
> -   } else {
> +   } else if (end < start) {
>
> (or possibly "end != start"; I can't quite decide which expresses the
> semantic intent better)
I think "end < start" is better choice because it tells list is not sorted
and "!=" contradicts previous condition "end > start".
>
> The rest just looks like unnecessary churn - I don't think it needs
> commenting that a sorted list may simply not have gaps between entries,
> and as above I think the wording of that comment is actively misleading.
I agree with you; these lines were added to explain the issue and the fix in
more detail.
I will send a new patch with a single-line change as you said:
" } else if (end < start) {"

Thanks & Regards,
Srinath.
>
> Robin.
>
> > - /* dma_ranges list should be sorted */
> > - dev_err(&dev->dev, "Failed to reserve IOVA\n");
> > - return -EINVAL;
> >   }
> >
> >   start = window->res->end - window->offset + 1;
> >
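
For clarity, with the agreed one-liner applied the loop reads roughly as
below (a sketch, not the final patch; the tail reservation that jumps back
to resv_iova after the loop is unchanged and not shown):

	resource_list_for_each_entry(window, &bridge->dma_ranges) {
		end = window->res->start - window->offset;
resv_iova:
		if (end > start) {
			lo = iova_pfn(iovad, start);
			hi = iova_pfn(iovad, end);
			reserve_iova(iovad, lo, hi);
		} else if (end < start) {
			/* dma_ranges list should be sorted */
			dev_err(&dev->dev, "Failed to reserve IOVA\n");
			return -EINVAL;
		}
		/* end == start: zero-sized gap (e.g. a first range starting
		 * at 0x0); nothing to reserve, move on to the next window. */

		start = window->res->end - window->offset + 1;
	}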


Re: [trivial PATCH] treewide: Convert switch/case fallthrough; to break;

2020-09-10 Thread Nicolas.Ferre--- via iommu
Joe,

Please drop this chunk: these are successive controller version numbers,
all backward compatible, with "fallthrough" on each case, so removing it
from this last one makes the code inconsistent.

In short: NACK for atmel-mci.

Best regards,
   Nicolas


On 09/09/2020 at 22:06, Joe Perches wrote:
> diff --git a/drivers/mmc/host/atmel-mci.c b/drivers/mmc/host/atmel-mci.c
> index 444bd3a0a922..8324312e4f42 100644
> --- a/drivers/mmc/host/atmel-mci.c
> +++ b/drivers/mmc/host/atmel-mci.c
> @@ -2435,7 +2435,7 @@ static void atmci_get_cap(struct atmel_mci *host)
>  case 0x100:
>  host->caps.has_bad_data_ordering = 0;
>  host->caps.need_reset_after_xfer = 0;
> -   fallthrough;
> +   break;
>  case 0x0:
>  break;
>  default:


-- 
Nicolas Ferre


Re: [PATCH 3/3] dma-mapping: introduce DMA range map, supplanting dma_pfn_offset

2020-09-10 Thread Greg KH
On Thu, Sep 10, 2020 at 07:40:38AM +0200, Christoph Hellwig wrote:
> From: Jim Quinlan 
> 
> The new field 'dma_range_map' in struct device is used to facilitate the
> use of single or multiple offsets between mapping regions of cpu addrs and
> dma addrs.  It subsumes the role of "dev->dma_pfn_offset" which was only
> capable of holding a single uniform offset and had no region bounds
> checking.
> 
> The function of_dma_get_range() has been modified so that it takes a single
> argument -- the device node -- and returns a map, NULL, or an error code.
> The map is an array that holds the information regarding the DMA regions.
> Each range entry contains the address offset, the cpu_start address, the
> dma_start address, and the size of the region.
> 
> of_dma_configure() is the typical manner to set range offsets but there are
> a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel
> driver code.  These cases now invoke the function
> dma_attach_offset_range(dev, cpu_addr, dma_addr, size).

So if an error happens, we don't do anything?

> diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
> index 6197938dcc2d8f..935ee98e049f65 100644
> --- a/drivers/usb/core/message.c
> +++ b/drivers/usb/core/message.c
> @@ -1956,10 +1956,11 @@ int usb_set_configuration(struct usb_device *dev, int 
> configuration)
>   intf->dev.groups = usb_interface_groups;
>   /*
>* Please refer to usb_alloc_dev() to see why we set
> -  * dma_mask and dma_pfn_offset.
> +  * dma_mask and dma_range_map.
>*/
>   intf->dev.dma_mask = dev->dev.dma_mask;
> - intf->dev.dma_pfn_offset = dev->dev.dma_pfn_offset;
> + if (dma_direct_copy_range_map(&intf->dev, &dev->dev))
> + dev_err(&intf->dev, "failed to copy DMA map\n");

We tell the user, but then just keep on running?  Is there anything that
we can do here?

If not, why not have dma_direct_copy_range_map() print out the error?

thanks,

greg k-h
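
To make the shape of the new map concrete: each entry pairs a CPU physical
range with a device-visible range, and the array is terminated by a
zero-sized entry. A sketch with invented addresses, field names per the
struct bus_dma_region this series introduces:

	static const struct bus_dma_region example_map[] = {
		{
			.cpu_start = 0x40000000,  /* CPU physical base (invented) */
			.dma_start = 0x00000000,  /* device-visible base (invented) */
			.size      = 0x40000000,  /* 1 GiB window */
			.offset    = 0x40000000,  /* cpu_start - dma_start */
		},
		{ /* zero-sized sentinel terminates the map */ }
	};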


Re: [trivial PATCH] treewide: Convert switch/case fallthrough; to break;

2020-09-10 Thread Mauro Carvalho Chehab
On Wed, 09 Sep 2020 13:06:39 -0700
Joe Perches  wrote:

> fallthrough to a separate case/default label break; isn't very readable.
> 
> Convert pseudo-keyword fallthrough; statements to a simple break; when
> the next label is case or default and the only statement in the next
> label block is break;
> 
> Found using:
> 
> $ grep-2.5.4 -rP --include=*.[ch] -n 
> "fallthrough;(\s*(case\s+\w+|default)\s*:\s*){1,7}break;" *
> 
> Miscellanea:
> 
> o Move or coalesce a couple label blocks above a default: block.
> 
> Signed-off-by: Joe Perches 
> ---
> 
> Compiled allyesconfig x86-64 only.
> A few files for other arches were not compiled.
> 


>  drivers/media/dvb-frontends/drxd_hard.c   |  2 +-
>  drivers/media/i2c/ov5640.c|  2 +-
>  drivers/media/i2c/ov6650.c|  5 ++---
>  drivers/media/i2c/smiapp/smiapp-core.c|  2 +-
>  drivers/media/i2c/tvp5150.c   |  2 +-
>  drivers/media/pci/ddbridge/ddbridge-core.c|  2 +-
>  drivers/media/usb/cpia2/cpia2_core.c  |  2 +-

For media drivers:

Reviewed-by: Mauro Carvalho Chehab 


Thanks,
Mauro