Re: [PATCH 02/13] mm/rmap: update to new mmu_notifier semantic

2017-08-30 Thread Jerome Glisse
On Wed, Aug 30, 2017 at 04:25:54PM -0700, Nadav Amit wrote:
> [cc’ing IOMMU people, which for some reason are not cc’d]
> 
> Andrea Arcangeli  wrote:
> 
> > On Wed, Aug 30, 2017 at 11:00:32AM -0700, Nadav Amit wrote:
> >> It is not trivial to flush TLBs (primary or secondary) without holding the
> >> page-table lock, and as we recently encountered this resulted in several
> >> bugs [1]. The main problem is that even if you perform the TLB flush
> >> immediately after the PT-lock is released, you cause a situation in which
> >> other threads may make decisions based on the updated PTE value, without
> >> being aware that a TLB flush is needed.
> >> 
> >> For example, we recently encountered a Linux bug when two threads run
> >> MADV_DONTNEED concurrently on the same address range [2]. One of the 
> >> threads
> >> may update a PTE, setting it as non-present, and then deferring the TLB
> >> flush (for batching). As a result, however, it would cause the second
> >> thread, which also changes the PTEs to assume that the PTE is already
> >> non-present and TLB flush is not necessary. As a result the second core may
> >> still hold stale PTEs in its TLB following MADV_DONTNEED.
> > 
> > The source of those complex races that requires taking into account
> > nested tlb gather to solve it, originates from skipping primary MMU
> > tlb flushes depending on the value of the pagetables (as an
> > optimization).
> > 
> > For mmu_notifier_invalidate_range_end we always ignore the value of
> > the pagetables and mmu_notifier_invalidate_range_end always runs
> > unconditionally invalidating the secondary MMU for the whole range
> > under consideration. There are no optimizations that attempts to skip
> > mmu_notifier_invalidate_range_end depending on the pagetable value and
> > there's no TLB gather for secondary MMUs either. That is to keep it
> > simple of course.
> > 
> > As long as those mmu_notifier_invalidate_range_end stay unconditional,
> > I don't see how those races you linked, can be possibly relevant in
> > evaluating if ->invalidate_range (again only for iommuv2 and
> > intel-svm) has to run inside the PT lock or not.
> 
> Thanks for the clarifications. It now makes much more sense.
> 
> > 
> >> There is a swarm of such problems, and some are not too trivial. Deferring
> >> TLB flushes needs to be done in a very careful manner.
> > 
> > I agree it's not trivial, but I don't think any complexity comes from
> > above.
> > 
> > The only complexity is about, what if the page is copied to some other
> > page and replaced, because the copy is the only case where coherency
> > could be retained by the primary MMU. What if the primary MMU starts
> > working on the new page in between PT lock release and
> > mmu_notifier_invalidate_range_end, while the secondary MMU is stuck on
> > the old page? That is the only problem we deal with here, the copy to
> > other page and replace. Any other case that doesn't involve the copy
> > seems non coherent by definition, and you couldn't measure it.
> > 
> > I can't think of a scenario that requires the explicit
> > mmu_notifier_invalidate_range call before releasing the PT lock, at
> > least for try_to_unmap_one.
> > 
> > Could you think of a scenario where calling ->invalidate_range inside
> > mmu_notifier_invalidate_range_end post PT lock breaks iommuv2 or
> > intel-svm? Those two are the only ones requiring
> > ->invalidate_range calls, all other mmu notifier users are safe
> > without running mmu_notifier_invalidate_range_end under PT lock thanks
> > to mmu_notifier_invalidate_range_start blocking the secondary MMU.
> > 
> > Could you post a tangible scenario that invalidates my theory that
> > those mmu_notifier_invalidate_range calls inside PT lock would be
> > superfluous?
> > 
> > Some of the scenarios under consideration:
> > 
> > 1) migration entry -> migration_entry_wait -> page lock, plus
> >   migrate_pages taking the lock so it can't race with try_to_unmap
> >   from other places
> > 2) swap entry -> lookup_swap_cache -> page lock (page not really replaced)
> > 3) CoW -> do_wp_page -> page lock on old page
> > 4) KSM -> replace_page -> page lock on old page
> > 5) if the pte is zapped as in MADV_DONTNEED, no coherency possible so
> >   it's not measurable that we let the guest run a bit longer on the
> >   old page past PT lock release
> 
> For both CoW and KSM, the correctness is maintained by calling
> ptep_clear_flush_notify(). If you defer the secondary MMU invalidation
> (i.e., replacing ptep_clear_flush_notify() with ptep_clear_flush() ), you
> will cause memory corruption, and page-lock would not be enough.

Just to add up, the IOMMU have its own CPU page table walker and it can
walk the page table at any time (not the page table current to current
CPU, IOMMU have an array that match a PASID with a page table and device
request translation for a given virtual address against a PASID).

So this means the following can happen with ptep_clear_flush() only:

  C

Re: [PATCH 02/13] mm/rmap: update to new mmu_notifier semantic

2017-08-30 Thread Nadav Amit
[cc’ing IOMMU people, which for some reason are not cc’d]

Andrea Arcangeli  wrote:

> On Wed, Aug 30, 2017 at 11:00:32AM -0700, Nadav Amit wrote:
>> It is not trivial to flush TLBs (primary or secondary) without holding the
>> page-table lock, and as we recently encountered this resulted in several
>> bugs [1]. The main problem is that even if you perform the TLB flush
>> immediately after the PT-lock is released, you cause a situation in which
>> other threads may make decisions based on the updated PTE value, without
>> being aware that a TLB flush is needed.
>> 
>> For example, we recently encountered a Linux bug when two threads run
>> MADV_DONTNEED concurrently on the same address range [2]. One of the threads
>> may update a PTE, setting it as non-present, and then deferring the TLB
>> flush (for batching). As a result, however, it would cause the second
>> thread, which also changes the PTEs to assume that the PTE is already
>> non-present and TLB flush is not necessary. As a result the second core may
>> still hold stale PTEs in its TLB following MADV_DONTNEED.
> 
> The source of those complex races that requires taking into account
> nested tlb gather to solve it, originates from skipping primary MMU
> tlb flushes depending on the value of the pagetables (as an
> optimization).
> 
> For mmu_notifier_invalidate_range_end we always ignore the value of
> the pagetables and mmu_notifier_invalidate_range_end always runs
> unconditionally invalidating the secondary MMU for the whole range
> under consideration. There are no optimizations that attempts to skip
> mmu_notifier_invalidate_range_end depending on the pagetable value and
> there's no TLB gather for secondary MMUs either. That is to keep it
> simple of course.
> 
> As long as those mmu_notifier_invalidate_range_end stay unconditional,
> I don't see how those races you linked, can be possibly relevant in
> evaluating if ->invalidate_range (again only for iommuv2 and
> intel-svm) has to run inside the PT lock or not.

Thanks for the clarifications. It now makes much more sense.

> 
>> There is a swarm of such problems, and some are not too trivial. Deferring
>> TLB flushes needs to be done in a very careful manner.
> 
> I agree it's not trivial, but I don't think any complexity comes from
> above.
> 
> The only complexity is about, what if the page is copied to some other
> page and replaced, because the copy is the only case where coherency
> could be retained by the primary MMU. What if the primary MMU starts
> working on the new page in between PT lock release and
> mmu_notifier_invalidate_range_end, while the secondary MMU is stuck on
> the old page? That is the only problem we deal with here, the copy to
> other page and replace. Any other case that doesn't involve the copy
> seems non coherent by definition, and you couldn't measure it.
> 
> I can't think of a scenario that requires the explicit
> mmu_notifier_invalidate_range call before releasing the PT lock, at
> least for try_to_unmap_one.
> 
> Could you think of a scenario where calling ->invalidate_range inside
> mmu_notifier_invalidate_range_end post PT lock breaks iommuv2 or
> intel-svm? Those two are the only ones requiring
> ->invalidate_range calls, all other mmu notifier users are safe
> without running mmu_notifier_invalidate_range_end under PT lock thanks
> to mmu_notifier_invalidate_range_start blocking the secondary MMU.
> 
> Could you post a tangible scenario that invalidates my theory that
> those mmu_notifier_invalidate_range calls inside PT lock would be
> superfluous?
> 
> Some of the scenarios under consideration:
> 
> 1) migration entry -> migration_entry_wait -> page lock, plus
>   migrate_pages taking the lock so it can't race with try_to_unmap
>   from other places
> 2) swap entry -> lookup_swap_cache -> page lock (page not really replaced)
> 3) CoW -> do_wp_page -> page lock on old page
> 4) KSM -> replace_page -> page lock on old page
> 5) if the pte is zapped as in MADV_DONTNEED, no coherency possible so
>   it's not measurable that we let the guest run a bit longer on the
>   old page past PT lock release

For both CoW and KSM, the correctness is maintained by calling
ptep_clear_flush_notify(). If you defer the secondary MMU invalidation
(i.e., replacing ptep_clear_flush_notify() with ptep_clear_flush() ), you
will cause memory corruption, and page-lock would not be enough.

BTW: I see some calls to ptep_clear_flush_notify() which are followed
immediately after by set_pte_at_notify(). I do not understand why it makes
sense, as both notifications end up doing the same thing - invalidating the
secondary MMU. The set_pte_at_notify() in these cases can be changed to
set_pte(). No?

> If you could post a multi CPU trace that shows how iommuv2 or
> intel-svm are broken if ->invalidate_range is run inside
> mmu_notifier_invalidate_range_end post PT lock in try_to_unmap_one it
> would help.
> 
> Of course if we run mmu_notifier_invalidate_range inside PT lock and
> we

Re: [PATCH v3] iommu: Prevent VMD child devices from being remapping targets

2017-08-30 Thread Bjorn Helgaas
On Wed, Aug 30, 2017 at 03:05:59PM -0600, Jon Derrick wrote:
> VMD child devices must use the VMD endpoint's ID as the requester.
> Because of this, there needs to be a way to link the parent VMD
> endpoint's iommu group and associated mappings to the VMD child devices
> such that attaching and detaching child devices modify the endpoint's
> mappings, while preventing early detaching on a singular device removal
> or unbinding.
> 
> The reassignment of individual VMD child devices devices to VMs is
> outside the scope of VMD, but may be implemented in the future. For now
> it is best to prevent any such attempts.
> 
> This patch prevents VMD child devices from returning an IOMMU, which
> prevents it from exposing an iommu_group sysfs directories and allowing
> subsequent binding by userspace-access drivers such as VFIO.
> 
> Signed-off-by: Jon Derrick 

Applied to pci/host-vmd for v4.14, thanks!

> ---
> v2->3, wrapped in x86 ifdef to avoid ia64 compilation errors
> 
>  drivers/iommu/intel-iommu.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 687f18f..2800a6e 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -901,6 +901,13 @@ static struct intel_iommu *device_to_iommu(struct device 
> *dev, u8 *bus, u8 *devf
>   struct pci_dev *pf_pdev;
>  
>   pdev = to_pci_dev(dev);
> +
> +#ifdef CONFIG_X86
> + /* VMD child devices currently cannot be handled individually */
> + if (is_vmd(pdev->bus))
> + return NULL;
> +#endif
> +
>   /* VFs aren't listed in scope tables; we need to look up
>* the PF instead to find the IOMMU. */
>   pf_pdev = pci_physfn(pdev);
> -- 
> 1.8.3.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3] iommu: Prevent VMD child devices from being remapping targets

2017-08-30 Thread Jon Derrick
VMD child devices must use the VMD endpoint's ID as the requester.
Because of this, there needs to be a way to link the parent VMD
endpoint's iommu group and associated mappings to the VMD child devices
such that attaching and detaching child devices modify the endpoint's
mappings, while preventing early detaching on a singular device removal
or unbinding.

The reassignment of individual VMD child devices devices to VMs is
outside the scope of VMD, but may be implemented in the future. For now
it is best to prevent any such attempts.

This patch prevents VMD child devices from returning an IOMMU, which
prevents it from exposing an iommu_group sysfs directories and allowing
subsequent binding by userspace-access drivers such as VFIO.

Signed-off-by: Jon Derrick 
---
v2->3, wrapped in x86 ifdef to avoid ia64 compilation errors

 drivers/iommu/intel-iommu.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 687f18f..2800a6e 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -901,6 +901,13 @@ static struct intel_iommu *device_to_iommu(struct device 
*dev, u8 *bus, u8 *devf
struct pci_dev *pf_pdev;
 
pdev = to_pci_dev(dev);
+
+#ifdef CONFIG_X86
+   /* VMD child devices currently cannot be handled individually */
+   if (is_vmd(pdev->bus))
+   return NULL;
+#endif
+
/* VFs aren't listed in scope tables; we need to look up
 * the PF instead to find the IOMMU. */
pf_pdev = pci_physfn(pdev);
-- 
1.8.3.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/4] VMD fixups

2017-08-30 Thread Bjorn Helgaas
[+cc Joerg]

On Thu, Aug 17, 2017 at 12:10:10PM -0600, Jon Derrick wrote:
> Mostly just cleanup in this revision, eg, trying to limit scope of vmd code to
> x86
> 
> Previous:
> https://patchwork.kernel.org/patch/9886095/
> https://patchwork.kernel.org/patch/9886097/
> https://patchwork.kernel.org/patch/9886101/
> 
> 
> Jon Derrick (4):
>   MAINTAINERS: Add Jonathan Derrick as VMD maintainer
>   pci/x86: Move VMD quirks to x86 fixups
>   x86/PCI: Use is_vmd rather than relying on the domain number
>   iommu: Prevent VMD child devices from being remapping targets
> 
>  MAINTAINERS |  1 +
>  arch/x86/pci/fixup.c| 18 ++
>  drivers/iommu/intel-iommu.c |  5 +
>  drivers/pci/quirks.c| 17 -
>  4 files changed, 24 insertions(+), 17 deletions(-)

I haven't heard anything from IOMMU folks, so I applied these to
pci/host-vmd for v4.14.  Let me know if there's any objection.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/tegra: Add support for struct iommu_device

2017-08-30 Thread Jon Hunter

On 30/08/17 16:30, Joerg Roedel wrote:
> Hi Jon,
> 
> On Wed, Aug 30, 2017 at 03:22:05PM +0100, Jon Hunter wrote:
>> Yes I can confirm that this fixes the crash. I assume that you will fix
>> the error path for bus_set_iommu() above as I believe now it needs to
>> call iommu_device_sysfs_remove().
> 
> Thanks for testing the patch. I updated the error-path and put the
> commit below into the iommu-tree:
Great! Thanks, Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] intel-iommu: Don't be too aggressive when clearing one context entry

2017-08-30 Thread Jacob Pan
On Mon, 28 Aug 2017 16:16:29 +0200
Filippo Sironi via iommu  wrote:

> Previously, we were invalidating context cache and IOTLB globally when
> clearing one context entry.  This is a tad too aggressive.
> Invalidate the context cache and IOTLB for the interested device only.
> 
> Signed-off-by: Filippo Sironi 
> Cc: David Woodhouse 
> Cc: David Woodhouse 
> Cc: Joerg Roedel 
> Cc: iommu@lists.linux-foundation.org
> Cc: linux-ker...@vger.kernel.org
> ---
>  drivers/iommu/intel-iommu.c | 25 ++---
>  1 file changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 3e8636f1220e..4bf3e59b0929 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -2351,13 +2351,32 @@ static inline int domain_pfn_mapping(struct
> dmar_domain *domain, unsigned long i 
>  static void domain_context_clear_one(struct intel_iommu *iommu, u8
> bus, u8 devfn) {
> + unsigned long flags;
> + struct context_entry *context;
> + u16 did_old;
> +
>   if (!iommu)
>   return;
>  
> + spin_lock_irqsave(&iommu->lock, flags);
> + context = iommu_context_addr(iommu, bus, devfn, 0);
> + if (!context) {
> + spin_unlock_irqrestore(&iommu->lock, flags);
> + return;
> + }
perhaps check with device_context_mapped()?

> + did_old = context_domain_id(context);
> + spin_unlock_irqrestore(&iommu->lock, flags);
>   clear_context_table(iommu, bus, devfn);
> - iommu->flush.flush_context(iommu, 0, 0, 0,
> -DMA_CCMD_GLOBAL_INVL);
> - iommu->flush.flush_iotlb(iommu, 0, 0, 0,
> DMA_TLB_GLOBAL_FLUSH);
> + iommu->flush.flush_context(iommu,
> +did_old,
> +(((u16)bus) << 8) | devfn,
> +DMA_CCMD_MASK_NOBIT,
> +DMA_CCMD_DEVICE_INVL);
> + iommu->flush.flush_iotlb(iommu,
> +  did_old,
> +  0,
> +  0,
> +  DMA_TLB_DSI_FLUSH);
>  }
>  
>  static inline void unlink_domain_info(struct device_domain_info
> *info)

[Jacob Pan]
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 00/13] mmu_notifier kill invalidate_page callback

2017-08-30 Thread Adam Borowski
On Tue, Aug 29, 2017 at 08:56:15PM -0400, Jerome Glisse wrote:
> I will wait for people to test and for result of my own test before
> reposting if need be, otherwise i will post as separate patch.
>
> > But from a _very_ quick read-through this looks fine. But it obviously
> > needs testing.
> > 
> > People - *especially* the people who saw issues under KVM - can you
> > try out Jérôme's patch-series? I aded some people to the cc, the full
> > series is on lkml. Jérôme - do you have a git branch for people to
> > test that they could easily pull and try out?
> 
> https://cgit.freedesktop.org/~glisse/linux mmu-notifier branch
> git://people.freedesktop.org/~glisse/linux

Tested your branch as of 10f07641, on a long list of guest VMs.
No earth-shattering kaboom.


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀ 
⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
⢿⡄⠘⠷⠚⠋⠀ -- Genghis Ht'rok'din
⠈⠳⣄ 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 1/2] iommu/tegra: Add support for struct iommu_device

2017-08-30 Thread Joerg Roedel
Hi Jon,

On Wed, Aug 30, 2017 at 03:22:05PM +0100, Jon Hunter wrote:
> Yes I can confirm that this fixes the crash. I assume that you will fix
> the error path for bus_set_iommu() above as I believe now it needs to
> call iommu_device_sysfs_remove().

Thanks for testing the patch. I updated the error-path and put the
commit below into the iommu-tree:

>From 96302d89a03524e04d46ec82c6730881bb755923 Mon Sep 17 00:00:00 2001
From: Joerg Roedel 
Date: Wed, 30 Aug 2017 15:06:43 +0200
Subject: [PATCH] arm/tegra: Call bus_set_iommu() after iommu_device_register()

The bus_set_iommu() function will call the add_device()
call-back which needs the iommu to be registered.

Reported-by: Jon Hunter 
Fixes: 0b480e447006 ('iommu/tegra: Add support for struct iommu_device')
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/tegra-smmu.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 2802e12e6a54..3b6449e2cbf1 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -949,10 +949,6 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
 
tegra_smmu_ahb_enable();
 
-   err = bus_set_iommu(&platform_bus_type, &tegra_smmu_ops);
-   if (err < 0)
-   return ERR_PTR(err);
-
err = iommu_device_sysfs_add(&smmu->iommu, dev, NULL, dev_name(dev));
if (err)
return ERR_PTR(err);
@@ -965,6 +961,13 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
return ERR_PTR(err);
}
 
+   err = bus_set_iommu(&platform_bus_type, &tegra_smmu_ops);
+   if (err < 0) {
+   iommu_device_unregister(&smmu->iommu);
+   iommu_device_sysfs_remove(&smmu->iommu);
+   return ERR_PTR(err);
+   }
+
if (IS_ENABLED(CONFIG_DEBUG_FS))
tegra_smmu_debugfs_init(smmu);
 
-- 
2.13.5

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/tegra: Add support for struct iommu_device

2017-08-30 Thread Jon Hunter
Hi Joerg,

On 30/08/17 13:09, Joerg Roedel wrote:
> Hi Jon,
> 
> On Wed, Aug 30, 2017 at 12:04:38PM +0100, Jon Hunter wrote:
>> With next-20170829 I am seeing several Tegra boards crashing [0][1] on
>> boot in tegra_smmu_probe() and the bisect is point to this commit. Looks
>> like there maybe a sequence problem between calls to bus_set_iommu() and
>> iommu_device_add_sysfs() which results in a NULL pointer dereference.
> 
> Thanks for the report. Can you please test whether the patch below
> fixes the problem?
> 
> Thanks,
> 
>   Joerg
> 
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index 2802e12e6a54..48ffbe95a663 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -949,10 +949,6 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
>  
>   tegra_smmu_ahb_enable();
>  
> - err = bus_set_iommu(&platform_bus_type, &tegra_smmu_ops);
> - if (err < 0)
> - return ERR_PTR(err);
> -
>   err = iommu_device_sysfs_add(&smmu->iommu, dev, NULL, dev_name(dev));
>   if (err)
>   return ERR_PTR(err);
> @@ -965,6 +961,10 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
>   return ERR_PTR(err);
>   }
>  
> + err = bus_set_iommu(&platform_bus_type, &tegra_smmu_ops);
> + if (err < 0)
> + return ERR_PTR(err);
> +

Yes I can confirm that this fixes the crash. I assume that you will fix
the error path for bus_set_iommu() above as I believe now it needs to
call iommu_device_sysfs_remove().

Cheers!
Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] intel-iommu: Don't be too aggressive when clearing one context entry

2017-08-30 Thread Joerg Roedel
Hi Filippo,

please change the subject to:

iommu/vt-d: Don't be too aggressive when clearing one context entry

to follow the convention used in the iommu-tree. Another comment below.

On Mon, Aug 28, 2017 at 04:16:29PM +0200, Filippo Sironi wrote:
>  static void domain_context_clear_one(struct intel_iommu *iommu, u8 bus, u8 
> devfn)
>  {
> + unsigned long flags;
> + struct context_entry *context;
> + u16 did_old;
> +
>   if (!iommu)
>   return;
>  
> + spin_lock_irqsave(&iommu->lock, flags);
> + context = iommu_context_addr(iommu, bus, devfn, 0);
> + if (!context) {
> + spin_unlock_irqrestore(&iommu->lock, flags);
> + return;
> + }
> + did_old = context_domain_id(context);
> + spin_unlock_irqrestore(&iommu->lock, flags);
>   clear_context_table(iommu, bus, devfn);

This function is the only caller of clear_context_table(), which does
similar things (like fetching the context-entry) as you are adding
above.

So you can either make clear_context_table() return the old domain-id
so that you don't need to do it here, or you get rid of the function
entirely and add the context_clear_entry() and __iommu_flush_cache()
calls into this code-path.

Regards,

Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 3/3] iommu: s390: constify iommu_ops

2017-08-30 Thread Joerg Roedel
On Mon, Aug 28, 2017 at 05:42:50PM +0530, Arvind Yadav wrote:
> iommu_ops are not supposed to change at runtime.
> Functions 'bus_set_iommu' working with const iommu_ops provided
> by . So mark the non-const structs as const.
> 
> Signed-off-by: Arvind Yadav 
> ---
>  drivers/iommu/s390-iommu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Applied, thanks.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/3] iommu: mtk: constify iommu_ops

2017-08-30 Thread Joerg Roedel
On Mon, Aug 28, 2017 at 05:42:26PM +0530, Arvind Yadav wrote:
> iommu_ops are not supposed to change at runtime.
> Functions 'iommu_device_set_ops' and 'bus_set_iommu' working with
> const iommu_ops provided by . So mark the non-const
> structs as const.
> 
> Signed-off-by: Arvind Yadav 
> ---
>  drivers/iommu/mtk_iommu.c| 4 ++--
>  drivers/iommu/mtk_iommu_v1.c | 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)

This doesn't apply cleanly to the arm/mediatek branch in the iommu-tree.
Please rebase it against that branch and update the subject to:

iommu/mediatek: Constify iommu_ops

to match the convention used in the iommu-tree and send it again.


Thanks,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/3] iommu: exynos: constify iommu_ops

2017-08-30 Thread Joerg Roedel
On Mon, Aug 28, 2017 at 05:42:05PM +0530, Arvind Yadav wrote:
> iommu_ops are not supposed to change at runtime.
> Functions 'iommu_device_set_ops' and 'bus_set_iommu' working with
> const iommu_ops provided by . So mark the non-const
> structs as const.
> 
> Signed-off-by: Arvind Yadav 
> ---
>  drivers/iommu/exynos-iommu.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Applied, thanks.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/ipmmu-vmsa: make ipmmu_gather_ops const

2017-08-30 Thread Joerg Roedel
On Mon, Aug 28, 2017 at 11:47:27PM +0530, Bhumika Goyal wrote:
> Make these const as they are not modified anywhere.
> 
> Signed-off-by: Bhumika Goyal 
> ---
>  drivers/iommu/ipmmu-vmsa.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Applied, thanks.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2] iommu/ipmmu-vmsa: Rereserving a free context before setting up a pagetable

2017-08-30 Thread Joerg Roedel
On Wed, Aug 23, 2017 at 05:31:42PM +0300, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko 
> 
> Reserving a free context is both quicker and more likely to fail
> (due to limited hardware resources) than setting up a pagetable.
> What is more the pagetable init/cleanup code could require
> the context to be set up.
> 
> Signed-off-by: Oleksandr Tyshchenko 
> CC: Robin Murphy 
> CC: Laurent Pinchart 
> CC: Joerg Roedel 

Applied, thanks.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 0/4] Optimise 64-bit IOVA allocations

2017-08-30 Thread Joerg Roedel
Hey Robin, Nate,

On Fri, Aug 25, 2017 at 02:52:41PM -0400, Nate Watterson wrote:
> Tested-by: Nate Watterson 

If nobody has objections here anymore I would merge these patches when
v4.14-rc1 is released. Given that these changes are a bit more intrusive
and the code is shared between a couple of drivers, I'd like to give it
more testing in linux-next.

Robin, can you rebase these patches to v4.14-rc1 when it is released,
add Nate's Tested-by, and resend them to me please?

Thanks,

Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/tegra: Add support for struct iommu_device

2017-08-30 Thread Joerg Roedel
Hi Jon,

On Wed, Aug 30, 2017 at 12:04:38PM +0100, Jon Hunter wrote:
> With next-20170829 I am seeing several Tegra boards crashing [0][1] on
> boot in tegra_smmu_probe() and the bisect is point to this commit. Looks
> like there maybe a sequence problem between calls to bus_set_iommu() and
> iommu_device_add_sysfs() which results in a NULL pointer dereference.

Thanks for the report. Can you please test whether the patch below
fixes the problem?

Thanks,

Joerg

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 2802e12e6a54..48ffbe95a663 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -949,10 +949,6 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
 
tegra_smmu_ahb_enable();
 
-   err = bus_set_iommu(&platform_bus_type, &tegra_smmu_ops);
-   if (err < 0)
-   return ERR_PTR(err);
-
err = iommu_device_sysfs_add(&smmu->iommu, dev, NULL, dev_name(dev));
if (err)
return ERR_PTR(err);
@@ -965,6 +961,10 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
return ERR_PTR(err);
}
 
+   err = bus_set_iommu(&platform_bus_type, &tegra_smmu_ops);
+   if (err < 0)
+   return ERR_PTR(err);
+
if (IS_ENABLED(CONFIG_DEBUG_FS))
tegra_smmu_debugfs_init(smmu);
 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/tegra: Add support for struct iommu_device

2017-08-30 Thread Jon Hunter

On 09/08/17 23:29, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> Add a struct iommu_device to each tegra-smmu and register it
> with the iommu-core. Also link devices added to the driver
> to their respective hardware iommus.
> 
> Signed-off-by: Joerg Roedel 
> ---
>  drivers/iommu/tegra-smmu.c | 25 +
>  1 file changed, 25 insertions(+)
> 
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index faa9c1e..2802e12 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -36,6 +36,8 @@ struct tegra_smmu {
>   struct list_head list;
>  
>   struct dentry *debugfs;
> +
> + struct iommu_device iommu;  /* IOMMU Core code handle */
>  };
>  
>  struct tegra_smmu_as {
> @@ -720,6 +722,9 @@ static int tegra_smmu_add_device(struct device *dev)
>* first match.
>*/
>   dev->archdata.iommu = smmu;
> +
> + iommu_device_link(&smmu->iommu, dev);
> +
>   break;
>   }
>  
> @@ -737,6 +742,11 @@ static int tegra_smmu_add_device(struct device *dev)
>  
>  static void tegra_smmu_remove_device(struct device *dev)
>  {
> + struct tegra_smmu *smmu = dev->archdata.iommu;
> +
> + if (smmu)
> + iommu_device_unlink(&smmu->iommu, dev);
> +
>   dev->archdata.iommu = NULL;
>   iommu_group_remove_device(dev);
>  }
> @@ -943,6 +953,18 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
>   if (err < 0)
>   return ERR_PTR(err);
>  
> + err = iommu_device_sysfs_add(&smmu->iommu, dev, NULL, dev_name(dev));
> + if (err)
> + return ERR_PTR(err);
> +
> + iommu_device_set_ops(&smmu->iommu, &tegra_smmu_ops);
> +
> + err = iommu_device_register(&smmu->iommu);
> + if (err) {
> + iommu_device_sysfs_remove(&smmu->iommu);
> + return ERR_PTR(err);
> + }
> +
>   if (IS_ENABLED(CONFIG_DEBUG_FS))
>   tegra_smmu_debugfs_init(smmu);
>  
> @@ -951,6 +973,9 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
>  
>  void tegra_smmu_remove(struct tegra_smmu *smmu)
>  {
> + iommu_device_unregister(&smmu->iommu);
> + iommu_device_sysfs_remove(&smmu->iommu);
> +
>   if (IS_ENABLED(CONFIG_DEBUG_FS))
>   tegra_smmu_debugfs_exit(smmu);
>  }
> 

With next-20170829 I am seeing several Tegra boards crashing [0][1] on
boot in tegra_smmu_probe() and the bisect is point to this commit. Looks
like there maybe a sequence problem between calls to bus_set_iommu() and
iommu_device_add_sysfs() which results in a NULL pointer dereference.

You can see the full crash log here [1].

Cheers
Jon

[0] https://nvtb.github.io//linux-next/
[1]
https://nvtb.github.io//linux-next/test_next-20170829/20170829024534/boot/tegra124-jetson-tk1/tegra124-jetson-tk1/tegra_defconfig_log.txt

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 00/13] mmu_notifier kill invalidate_page callback

2017-08-30 Thread Mike Galbraith
On Tue, 2017-08-29 at 20:56 -0400, Jerome Glisse wrote:
> On Tue, Aug 29, 2017 at 05:11:24PM -0700, Linus Torvalds wrote:
> 
> > People - *especially* the people who saw issues under KVM - can you
> > try out Jérôme's patch-series? I aded some people to the cc, the full
> > series is on lkml. Jérôme - do you have a git branch for people to
> > test that they could easily pull and try out?
> 
> https://cgit.freedesktop.org/~glisse/linux mmu-notifier branch
> git://people.freedesktop.org/~glisse/linux

Looks good here.

I reproduced fairly quickly with RT host and 1 RT guest by just having
the guest do a parallel kbuild over NFS (the guest had to be restored
afterward, was corrupted).  I'm currently flogging 2 guests as well as
the host, whimper free.  I'll let the lot broil for while longer, but
at this point, smoke/flame appearance seems comfortingly unlikely.

-Mike


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu