Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Raj, Ashok
On Mon, Jun 15, 2020 at 06:03:57PM +0200, Peter Zijlstra wrote:
> 
> I don't get why you need a rdmsr here, or why not having one would
> require a TIF flag. Is that because this MSR is XSAVE/XRSTOR managed?
> 
> > > > +*/
> > > > +   rdmsrl(MSR_IA32_PASID, pasid_msr);
> > > > +   if (pasid_msr & MSR_IA32_PASID_VALID)
> > > > +   return false;
> > > > +
> > > > +   /* Fix up the MSR if the MSR doesn't have a valid PASID. */
> > > > +   wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
> 
> How much more expensive is the wrmsr over the rdmsr? Can't we just
> unconditionally write the current PASID and call it a day?

The reason to check the rdmsr() is because we are using a hueristic taking
GP faults. If we already setup the MSR, but we get it a second time it
means the reason is something other than PASID_MSR not being set.

Ideally we should use the TIF_ to track this would be cheaper, but we were
told those bits aren't easy to give out. 

Cheers,
Ashok
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Fenghua Yu
Hi, Peter,

On Mon, Jun 15, 2020 at 08:31:16PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 15, 2020 at 11:12:59AM -0700, Fenghua Yu wrote:
> > > I don't get why you need a rdmsr here, or why not having one would
> > > require a TIF flag. Is that because this MSR is XSAVE/XRSTOR managed?
> > 
> > My concern is TIF flags are precious (only 3 slots available). Defining
> > a new TIF flag may be not worth it while rdmsr() can check if PASID
> > is valid in the MSR. And performance here might not be a big issue
> > in #GP.
> > 
> > But if you think using TIF flag is better, I can define a new TIF flag
> > and maintain it per thread (init 0 when clone()/fork(), set 1 in fixup()).
> > Then we can avoid using rdmsr() to check valid PASID in the MSR.
> 
> WHY ?!?! What do you need a TIF flag for?

We need "a way" to check if the per thread MSR has a valid PASID. If yes,
no need to fix up the MSR (wrmsr()), and let other handler to handle the #GP.
Otherwise, apply the heuristics and fix up the MSR and exit the #GP.

The way to check the valid PASID in the MSR is rdmsr() in this series.
A TIF flag will be much faster than rdmsr() and seems a sutiable way
to check valid PASID status per thread. That's why it could replace
rdmsr() to check PASID in the MSR.

Or do you suggest to add a random new flag in struct thread_info instead
of a TIF flag?

Thanks.

-Fenghua

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Fenghua Yu
Hi, Peter,

On Mon, Jun 15, 2020 at 09:09:28PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 15, 2020 at 11:55:29AM -0700, Fenghua Yu wrote:
> 
> > Or do you suggest to add a random new flag in struct thread_info instead
> > of a TIF flag?
> 
> Why thread_info? What's wrong with something simple like the below. It
> takes a bit from the 'strictly current' flags word.
> 
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index b62e6aaf28f0..fca830b97055 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -801,6 +801,9 @@ struct task_struct {
>   /* Stalled due to lack of memory */
>   unsignedin_memstall:1;
>  #endif
> +#ifdef CONFIG_PCI_PASID
> + unsignedhas_valid_pasid:1;
> +#endif
>  
>   unsigned long   atomic_flags; /* Flags requiring atomic 
> access. */
>  
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 142b23645d82..10b3891be99e 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -955,6 +955,10 @@ static struct task_struct *dup_task_struct(struct 
> task_struct *orig, int node)
>   tsk->use_memdelay = 0;
>  #endif
>  
> +#ifdef CONFIG_PCI_PASID
> + tsk->has_valid_pasid = 0;
> +#endif
> +
>  #ifdef CONFIG_MEMCG
>   tsk->active_memcg = NULL;
>  #endif

The PASID MSR is x86 specific although PASID is PCIe concept and per-mm.
Checking if the MSR has valid PASID (bit31=1) is an x86 specifc work.
The flag should be cleared in cloned()/forked() and is only set and
read in fixup() in x86 #GP for heuristic. It's not used anywhere outside
of x86.

That's why we think the flag should be in x86 struct thread_info instead
of in generice struct task_struct.

Please advice.

Thanks.

-Fenghua
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 26/29] docs: fix references for DMA*.txt files

2020-06-15 Thread Paul E. McKenney
On Mon, Jun 15, 2020 at 08:47:05AM +0200, Mauro Carvalho Chehab wrote:
> As we moved those files to core-api, fix references to point
> to their newer locations.
> 
> Signed-off-by: Mauro Carvalho Chehab 
> ---
>  Documentation/PCI/pci.rst  |  6 +++---
>  Documentation/block/biodoc.rst |  2 +-
>  Documentation/core-api/bus-virt-phys-mapping.rst   |  2 +-
>  Documentation/core-api/dma-api.rst |  6 +++---
>  Documentation/core-api/dma-isa-lpc.rst |  2 +-
>  Documentation/driver-api/usb/dma.rst   |  6 +++---
>  Documentation/memory-barriers.txt  |  6 +++---

I grabbed this Documentation/memory-barriers.txt, but can drop it if
Jon takes the whole batch, in which case:

Acked-by: Paul E. McKenney  # from LKMM perspective

>  .../translations/ko_KR/memory-barriers.txt |  6 +++---
>  arch/ia64/hp/common/sba_iommu.c| 12 ++--
>  arch/parisc/kernel/pci-dma.c   |  2 +-
>  arch/x86/include/asm/dma-mapping.h |  4 ++--
>  arch/x86/kernel/amd_gart_64.c  |  2 +-
>  drivers/parisc/sba_iommu.c | 14 +++---
>  include/linux/dma-mapping.h|  2 +-
>  include/media/videobuf-dma-sg.h|  2 +-
>  kernel/dma/debug.c |  2 +-
>  16 files changed, 38 insertions(+), 38 deletions(-)
> 
> diff --git a/Documentation/PCI/pci.rst b/Documentation/PCI/pci.rst
> index 8c016d8c9862..d10d3fe604c5 100644
> --- a/Documentation/PCI/pci.rst
> +++ b/Documentation/PCI/pci.rst
> @@ -265,7 +265,7 @@ Set the DMA mask size
>  -
>  .. note::
> If anything below doesn't make sense, please refer to
> -   Documentation/DMA-API.txt. This section is just a reminder that
> +   :doc:`/core-api/dma-api`. This section is just a reminder that
> drivers need to indicate DMA capabilities of the device and is not
> an authoritative source for DMA interfaces.
>  
> @@ -291,7 +291,7 @@ Many 64-bit "PCI" devices (before PCI-X) and some PCI-X 
> devices are
>  Setup shared control data
>  -
>  Once the DMA masks are set, the driver can allocate "consistent" (a.k.a. 
> shared)
> -memory.  See Documentation/DMA-API.txt for a full description of
> +memory.  See :doc:`/core-api/dma-api` for a full description of
>  the DMA APIs. This section is just a reminder that it needs to be done
>  before enabling DMA on the device.
>  
> @@ -421,7 +421,7 @@ owners if there is one.
>  
>  Then clean up "consistent" buffers which contain the control data.
>  
> -See Documentation/DMA-API.txt for details on unmapping interfaces.
> +See :doc:`/core-api/dma-api` for details on unmapping interfaces.
>  
>  
>  Unregister from other subsystems
> diff --git a/Documentation/block/biodoc.rst b/Documentation/block/biodoc.rst
> index b964796ec9c7..ba7f45d0271c 100644
> --- a/Documentation/block/biodoc.rst
> +++ b/Documentation/block/biodoc.rst
> @@ -196,7 +196,7 @@ a virtual address mapping (unlike the earlier scheme of 
> virtual address
>  do not have a corresponding kernel virtual address space mapping) and
>  low-memory pages.
>  
> -Note: Please refer to Documentation/DMA-API-HOWTO.txt for a discussion
> +Note: Please refer to :doc:`/core-api/dma-api-howto` for a discussion
>  on PCI high mem DMA aspects and mapping of scatter gather lists, and support
>  for 64 bit PCI.
>  
> diff --git a/Documentation/core-api/bus-virt-phys-mapping.rst 
> b/Documentation/core-api/bus-virt-phys-mapping.rst
> index 4bb07c2f3e7d..c7bc99cd2e21 100644
> --- a/Documentation/core-api/bus-virt-phys-mapping.rst
> +++ b/Documentation/core-api/bus-virt-phys-mapping.rst
> @@ -8,7 +8,7 @@ How to access I/O mapped memory from within device drivers
>  
>   The virt_to_bus() and bus_to_virt() functions have been
>   superseded by the functionality provided by the PCI DMA interface
> - (see Documentation/DMA-API-HOWTO.txt).  They continue
> + (see :doc:`/core-api/dma-api-howto`).  They continue
>   to be documented below for historical purposes, but new code
>   must not use them. --davidm 00/12/12
>  
> diff --git a/Documentation/core-api/dma-api.rst 
> b/Documentation/core-api/dma-api.rst
> index 2d8d2fed7317..63b4a2f20867 100644
> --- a/Documentation/core-api/dma-api.rst
> +++ b/Documentation/core-api/dma-api.rst
> @@ -5,7 +5,7 @@ Dynamic DMA mapping using the generic device
>  :Author: James E.J. Bottomley 
>  
>  This document describes the DMA API.  For a more gentle introduction
> -of the API (and actual examples), see Documentation/DMA-API-HOWTO.txt.
> +of the API (and actual examples), see :doc:`/core-api/dma-api-howto`.
>  
>  This API is split into two pieces.  Part I describes the basic API.
>  Part II describes extensions for supporting non-consistent memory
> @@ -471,7 +471,7 @@ without the _attrs suffixes, except that they pass an 
> 

Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Andy Lutomirski


> On Jun 15, 2020, at 1:56 PM, Luck, Tony  wrote:
> 
> 
>> 
>> Are we planning to keep PASID live once a task has used it once or are we 
>> going to swap it lazily?  If the latter, a percpu variable might be better.
> 
> Current plan is "touch it once and the task owns it until exit(2)"
> 
> Maybe someday in the future when we have data on how applications
> actually use accelerators we could look at something more complex
> if usage patterns look like it would be beneficial.
> 
> 

So what’s the RDMSR for?  Surely you
have some state somewhere that says “this task has a PASID.”  Can’t you just 
make sure that stays in sync with the MSR?  Then, on #GP, if the task already 
has a PASID, you know the MSR is set.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Fenghua Yu
On Mon, Jun 15, 2020 at 06:03:57PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 15, 2020 at 08:48:54AM -0700, Fenghua Yu wrote:
> > Hi, Peter,
> > On Mon, Jun 15, 2020 at 09:56:49AM +0200, Peter Zijlstra wrote:
> > > On Fri, Jun 12, 2020 at 05:41:33PM -0700, Fenghua Yu wrote:
> > > > +/*
> > > > + * Apply some heuristics to see if the #GP fault was caused by a thread
> > > > + * that hasn't had the IA32_PASID MSR initialized.  If it looks like 
> > > > that
> > > > + * is the problem, try initializing the IA32_PASID MSR. If the 
> > > > heuristic
> > > > + * guesses incorrectly, take one more #GP fault.
> > > 
> > > How is that going to help? Aren't we then going to run this same
> > > heuristic again and again and again?
> > 
> > The heuristic always initializes the MSR with the per mm PASID IIF the
> > mm has a valid PASID but the MSR doesn't have one. This heuristic usually
> > happens only once on the first #GP in a thread.
> 
> But it doesn't guarantee the PASID is the right one. Suppose both the mm
> has a PASID and the MSR has a VALID one, but the MSR isn't the mm one.
> Then we NO-OP. So if the exception was due to us having the wrong PASID,
> we stuck.

The MSR for each thread was cleared during fork() and clone(). The PASID
was cleared during mm_init(). The per-mm PASID was assigned when fist
bind_mm() is called and remains the same value until process exit().

The MSR is only fixed up when the first ENQCMD is executed in a thread:
bit 31 in the MSR is 0 and the PASID in the mm is non-zero.

The MSR remains the same PASID value once it's fixed up until the thread
exits.

So the work flow ensures the PASID goes from mm's PASID to the MSR.

The PASID could be unbund from the mm. In this case, iommu will generate
#PF and kernel oops instead of #GP.

> 
> > If the next #GP still comes in, the heuristic finds out the MSR already
> > has a valid PASID and thus will not fixup the MSR any more. The fixup()
> > returns "false" and lets others to handle the #GP.
> > 
> > So the heuristic will be executed once (at most) and won't be executed
> > again and again.
> 
> So I get that you want to set the PASID on-demand, but I find this all
> really weird code to make that happen.

We could keep PASID same in all threads sychronously by propogating the MSRs
when the PASID is bound to the mm via IPIs or taskworks to all
threads in the process. But the code is complex and error-prone and
overhead could be high:
1. The user can call driver to do binding and unbinding multiple times.
   The IPIs or taskworks will be sent multiple times to make sure only
   the last IPIs or taskworks take action.
2. Even if a thread never executes ENQCMD and thus never uses the MSR,
   the MSR still needs to be updated whenever bind_mm() and needs to be
   context switched each time. This could cause high overhead.

Setting the PASID on-demand is simpler and cleaner and was recommended
by Thomas.

> 
> > > > +bool __fixup_pasid_exception(void)
> > > > +{
> > > > +   u64 pasid_msr;
> > > > +   unsigned int pasid;
> > > > +
> > > > +   /*
> > > > +* This function is called only when this #GP was triggered 
> > > > from user
> > > > +* space. So the mm cannot be NULL.
> > > > +*/
> > > > +   pasid = current->mm->pasid;
> > > > +   /* If the mm doesn't have a valid PASID, then can't help. */
> > > > +   if (invalid_pasid(pasid))
> > > > +   return false;
> > > > +
> > > > +   /*
> > > > +* Since IRQ is disabled now, the current task still owns the 
> > > > FPU on
> > > 
> > > That's just weird and confusing. What you want to say is that you rely
> > > on the exception disabling the interrupt.
> > 
> > I checked SDM again. You are right. #GP can be interrupted by machine check
> > or other interrupts. So I cannot assume the current task still owns the FPU.
> > Instead of directly rdmsr() and wrmsr(), I will add helpers that can access
> > either the MSR on the processor or the PASID state in the memory.
> 
> That's not in fact what I meant, but yes, you can take exceptions while
> !IF just fine.
> 
> > > > +* this CPU and the PASID MSR can be directly accessed.
> > > > +*
> > > > +* If the MSR has a valid PASID, the #GP must be for some other 
> > > > reason.
> > > > +*
> > > > +* If rdmsr() is really a performance issue, a TIF_ flag may be
> > > > +* added to check if the thread has a valid PASID instead of 
> > > > rdmsr().
> > > 
> > > I don't understand any of this. Nobody except us writes to this MSR, we
> > > should bloody well know what's in it. What gives?
> > 
> > Patch 4 describes how to manage the MSR and patch 7 describes the format
> > of the MSR (20-bit PASID value and bit 31 is valid bit).
> > 
> > Are they sufficient to help? Or do you mean something else?
> 
> I don't get why you need a rdmsr here, or why not having one would
> require a TIF flag. Is that because this MSR is XSAVE/XRSTOR managed?

My 

Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Peter Zijlstra
On Mon, Jun 15, 2020 at 11:19:21AM -0700, Raj, Ashok wrote:
> On Mon, Jun 15, 2020 at 06:03:57PM +0200, Peter Zijlstra wrote:
> > 
> > I don't get why you need a rdmsr here, or why not having one would
> > require a TIF flag. Is that because this MSR is XSAVE/XRSTOR managed?
> > 
> > > > > +  */
> > > > > + rdmsrl(MSR_IA32_PASID, pasid_msr);
> > > > > + if (pasid_msr & MSR_IA32_PASID_VALID)
> > > > > + return false;
> > > > > +
> > > > > + /* Fix up the MSR if the MSR doesn't have a valid PASID. */
> > > > > + wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
> > 
> > How much more expensive is the wrmsr over the rdmsr? Can't we just
> > unconditionally write the current PASID and call it a day?
> 
> The reason to check the rdmsr() is because we are using a hueristic taking
> GP faults. If we already setup the MSR, but we get it a second time it
> means the reason is something other than PASID_MSR not being set.
> 
> Ideally we should use the TIF_ to track this would be cheaper, but we were
> told those bits aren't easy to give out. 

Why do you need a TIF flag? Why not any other random flag in current?
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Peter Zijlstra
On Mon, Jun 15, 2020 at 11:12:59AM -0700, Fenghua Yu wrote:
> > I don't get why you need a rdmsr here, or why not having one would
> > require a TIF flag. Is that because this MSR is XSAVE/XRSTOR managed?
> 
> My concern is TIF flags are precious (only 3 slots available). Defining
> a new TIF flag may be not worth it while rdmsr() can check if PASID
> is valid in the MSR. And performance here might not be a big issue
> in #GP.
> 
> But if you think using TIF flag is better, I can define a new TIF flag
> and maintain it per thread (init 0 when clone()/fork(), set 1 in fixup()).
> Then we can avoid using rdmsr() to check valid PASID in the MSR.

WHY ?!?! What do you need a TIF flag for?
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/amd: Fix event counter availability check

2020-06-15 Thread Alexander Monakov
On Mon, 1 Jun 2020, Suravee Suthikulpanit wrote:

> Alexander
> 
> On 6/1/20 4:01 PM, Alexander Monakov wrote:
> > On Mon, 1 Jun 2020, Suravee Suthikulpanit wrote:
> > 
> > > > Moving init_iommu_perf_ctr just after iommu_flush_all_caches resolves
> > > > the issue. This is the earliest point in amd_iommu_init_pci where the
> > > > call succeeds on my laptop.
> > > 
> > > According to your description, it should just need to be anywhere after
> > > the
> > > pci_enable_device() is called for the IOMMU device, isn't it? So, on your
> > > system, what if we just move the init_iommu_perf_ctr() here:
> > 
> > No, this doesn't work, as I already said in the paragraph you are responding
> > to. See my last sentence in the quoted part.
> > 
> > So the implication is init_device_table_dma together with subsequent cache
> > flush is also setting up something that is necessary for counters to be
> > writable.
> > 
> > Alexander
> > 
> 
> Instead of blindly moving the code around to a spot that would just work,
> I am trying to understand what might be required here. In this case,
> the init_device_table_dma()should not be needed. I suspect it's the IOMMU
> invalidate all command that's also needed here.
> 
> I'm also checking with the HW and BIOS team. Meanwhile, could you please give
> the following change a try:

Hello. Can you give any update please?

Alexander

> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> index 5b81fd16f5fa..b07458cc1b0b 100644
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -1875,6 +1875,8 @@ static int __init amd_iommu_init_pci(void)
> ret = iommu_init_pci(iommu);
> if (ret)
> break;
> +   iommu_flush_all_caches(iommu);
> +   init_iommu_perf_ctr(iommu);
> }
> 
> /*
> --
> 2.17.1
> 
> Thanks,
> Suravee
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Peter Zijlstra
On Mon, Jun 15, 2020 at 01:17:35PM -0700, Fenghua Yu wrote:
> Hi, Peter,
> 
> On Mon, Jun 15, 2020 at 09:09:28PM +0200, Peter Zijlstra wrote:
> > On Mon, Jun 15, 2020 at 11:55:29AM -0700, Fenghua Yu wrote:
> > 
> > > Or do you suggest to add a random new flag in struct thread_info instead
> > > of a TIF flag?
> > 
> > Why thread_info? What's wrong with something simple like the below. It
> > takes a bit from the 'strictly current' flags word.
> > 
> > 
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index b62e6aaf28f0..fca830b97055 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -801,6 +801,9 @@ struct task_struct {
> > /* Stalled due to lack of memory */
> > unsignedin_memstall:1;
> >  #endif
> > +#ifdef CONFIG_PCI_PASID
> > +   unsignedhas_valid_pasid:1;
> > +#endif
> >  
> > unsigned long   atomic_flags; /* Flags requiring atomic 
> > access. */
> >  
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index 142b23645d82..10b3891be99e 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -955,6 +955,10 @@ static struct task_struct *dup_task_struct(struct 
> > task_struct *orig, int node)
> > tsk->use_memdelay = 0;
> >  #endif
> >  
> > +#ifdef CONFIG_PCI_PASID
> > +   tsk->has_valid_pasid = 0;
> > +#endif
> > +
> >  #ifdef CONFIG_MEMCG
> > tsk->active_memcg = NULL;
> >  #endif
> 
> The PASID MSR is x86 specific although PASID is PCIe concept and per-mm.
> Checking if the MSR has valid PASID (bit31=1) is an x86 specifc work.
> The flag should be cleared in cloned()/forked() and is only set and
> read in fixup() in x86 #GP for heuristic. It's not used anywhere outside
> of x86.
> 
> That's why we think the flag should be in x86 struct thread_info instead
> of in generice struct task_struct.

I don't think anybody really cares, it's just one bit, we have plenty
left.

On x86_64 there's a u32 sized alignment hole in thread_info, also we
don't use the upper 32bit of thread_info::flags, however using those
would still mean you have to use atomic ops, which you really don't
need.


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Andy Lutomirski

> On Jun 15, 2020, at 1:17 PM, Fenghua Yu  wrote:
> 
> Hi, Peter,
> 
>> On Mon, Jun 15, 2020 at 09:09:28PM +0200, Peter Zijlstra wrote:
>>> On Mon, Jun 15, 2020 at 11:55:29AM -0700, Fenghua Yu wrote:
>>> 
>>> Or do you suggest to add a random new flag in struct thread_info instead
>>> of a TIF flag?
>> 
>> Why thread_info? What's wrong with something simple like the below. It
>> takes a bit from the 'strictly current' flags word.
>> 
>> 
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index b62e6aaf28f0..fca830b97055 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -801,6 +801,9 @@ struct task_struct {
>>/* Stalled due to lack of memory */
>>unsignedin_memstall:1;
>> #endif
>> +#ifdef CONFIG_PCI_PASID
>> +unsignedhas_valid_pasid:1;
>> +#endif
>> 
>>unsigned longatomic_flags; /* Flags requiring atomic access. 
>> */
>> 
>> diff --git a/kernel/fork.c b/kernel/fork.c
>> index 142b23645d82..10b3891be99e 100644
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
>> @@ -955,6 +955,10 @@ static struct task_struct *dup_task_struct(struct 
>> task_struct *orig, int node)
>>tsk->use_memdelay = 0;
>> #endif
>> 
>> +#ifdef CONFIG_PCI_PASID
>> +tsk->has_valid_pasid = 0;
>> +#endif
>> +
>> #ifdef CONFIG_MEMCG
>>tsk->active_memcg = NULL;
>> #endif
> 
> The PASID MSR is x86 specific although PASID is PCIe concept and per-mm.
> Checking if the MSR has valid PASID (bit31=1) is an x86 specifc work.
> The flag should be cleared in cloned()/forked() and is only set and
> read in fixup() in x86 #GP for heuristic. It's not used anywhere outside
> of x86.
> 
> That's why we think the flag should be in x86 struct thread_info instead
> of in generice struct task_struct.
> 

Are we planning to keep PASID live once a task has used it once or are we going 
to swap it lazily?  If the latter, a percpu variable might be better.

> Please advice.
> 
> Thanks.
> 
> -Fenghua
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Peter Zijlstra
On Mon, Jun 15, 2020 at 11:55:29AM -0700, Fenghua Yu wrote:

> Or do you suggest to add a random new flag in struct thread_info instead
> of a TIF flag?

Why thread_info? What's wrong with something simple like the below. It
takes a bit from the 'strictly current' flags word.


diff --git a/include/linux/sched.h b/include/linux/sched.h
index b62e6aaf28f0..fca830b97055 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -801,6 +801,9 @@ struct task_struct {
/* Stalled due to lack of memory */
unsignedin_memstall:1;
 #endif
+#ifdef CONFIG_PCI_PASID
+   unsignedhas_valid_pasid:1;
+#endif
 
unsigned long   atomic_flags; /* Flags requiring atomic 
access. */
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 142b23645d82..10b3891be99e 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -955,6 +955,10 @@ static struct task_struct *dup_task_struct(struct 
task_struct *orig, int node)
tsk->use_memdelay = 0;
 #endif
 
+#ifdef CONFIG_PCI_PASID
+   tsk->has_valid_pasid = 0;
+#endif
+
 #ifdef CONFIG_MEMCG
tsk->active_memcg = NULL;
 #endif
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Luck, Tony
> Are we planning to keep PASID live once a task has used it once or are we 
> going to swap it lazily?  If the latter, a percpu variable might be better.

Current plan is "touch it once and the task owns it until exit(2)"

Maybe someday in the future when we have data on how applications
actually use accelerators we could look at something more complex
if usage patterns look like it would be beneficial.

-Tony
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Luck, Tony
> So what’s the RDMSR for?  Surely you
> have some state somewhere that says “this task has a PASID.”
> Can’t you just make sure that stays in sync with the MSR?  Then, on #GP, if 
> the task already has a PASID, you know the MSR is set.

We have state that says the process ("mm") has been allocated a PASID. But not 
for each task.

E.g. a process may clone a bunch of tasks, then one of them opens a device that 
needs
a PASID.   We did internally have patches to go hunt down all those other tasks 
and
force a PASID onto each. But the code was big and ugly. Also maybe the wrong 
thing
to do if those threads didn't ever need to access the device.

PeterZ suggested that we can add a bit to the task structure for this purpose.

Fenghua is hesitant about adding an x86 only bit there.

-Tony
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 3/4] pci: acs: Enable PCI_ACS_TB for untrusted/external-facing devices

2020-06-15 Thread Rajat Jain via iommu
When enabling ACS, currently the bit "translation blocking" was
not getting changed at all. Set it to disable translation blocking
too for all external facing or untrusted devices. This is OK
because ATS is only allowed on internal devces.

Signed-off-by: Rajat Jain 
---
 drivers/pci/pci.c|  4 
 drivers/pci/quirks.c | 11 +++
 2 files changed, 15 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d2ff987585855..79853b52658a2 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3330,6 +3330,10 @@ static void pci_std_enable_acs(struct pci_dev *dev)
/* Upstream Forwarding */
ctrl |= (cap & PCI_ACS_UF);
 
+   if (dev->external_facing || dev->untrusted)
+   /* Translation Blocking */
+   ctrl |= (cap & PCI_ACS_TB);
+
pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
 }
 
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index b341628e47527..6294adeac4049 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4934,6 +4934,13 @@ static void pci_quirk_enable_intel_rp_mpc_acs(struct 
pci_dev *dev)
}
 }
 
+/*
+ * Currently this quirk does the equivalent of
+ * PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF | PCI_ACS_SV
+ *
+ * Currently missing, it also needs to do equivalent of PCI_ACS_TB,
+ * if dev->external_facing || dev->untrusted
+ */
 static int pci_quirk_enable_intel_pch_acs(struct pci_dev *dev)
 {
if (!pci_quirk_intel_pch_acs_match(dev))
@@ -4973,6 +4980,10 @@ static int pci_quirk_enable_intel_spt_pch_acs(struct 
pci_dev *dev)
ctrl |= (cap & PCI_ACS_CR);
ctrl |= (cap & PCI_ACS_UF);
 
+   if (dev->external_facing || dev->untrusted)
+   /* Translation Blocking */
+   ctrl |= (cap & PCI_ACS_TB);
+
pci_write_config_dword(dev, pos + INTEL_SPT_ACS_CTRL, ctrl);
 
pci_info(dev, "Intel SPT PCH root port ACS workaround enabled\n");
-- 
2.27.0.290.gba653c62da-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/4] pci: Keep the ACS capability offset in device

2020-06-15 Thread Rajat Jain via iommu
Currently this is being looked up at a number of places. Read
and store it once at bootup so that it can be used by all later.

Signed-off-by: Rajat Jain 
---
 drivers/pci/p2pdma.c |  2 +-
 drivers/pci/pci.c| 21 +
 drivers/pci/pci.h|  2 +-
 drivers/pci/probe.c  |  2 +-
 drivers/pci/quirks.c |  8 
 include/linux/pci.h  |  1 +
 6 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index e8e444eeb1cd2..f29a48f8fa594 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -253,7 +253,7 @@ static int pci_bridge_has_acs_redir(struct pci_dev *pdev)
int pos;
u16 ctrl;
 
-   pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ACS);
+   pos = pdev->acs_cap;
if (!pos)
return 0;
 
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index ce096272f52b1..d2ff987585855 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -51,6 +51,7 @@ EXPORT_SYMBOL(pci_pci_problems);
 
 unsigned int pci_pm_d3_delay;
 
+static void pci_enable_acs(struct pci_dev *dev);
 static void pci_pme_list_scan(struct work_struct *work);
 
 static LIST_HEAD(pci_pme_list);
@@ -3284,7 +3285,7 @@ static void pci_disable_acs_redir(struct pci_dev *dev)
if (!pci_dev_specific_disable_acs_redir(dev))
return;
 
-   pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
+   pos = dev->acs_cap;
if (!pos) {
pci_warn(dev, "cannot disable ACS redirect for this hardware as 
it does not have ACS capabilities\n");
return;
@@ -3310,7 +3311,7 @@ static void pci_std_enable_acs(struct pci_dev *dev)
u16 cap;
u16 ctrl;
 
-   pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
+   pos = dev->acs_cap;
if (!pos)
return;
 
@@ -3336,7 +3337,7 @@ static void pci_std_enable_acs(struct pci_dev *dev)
  * pci_enable_acs - enable ACS if hardware support it
  * @dev: the PCI device
  */
-void pci_enable_acs(struct pci_dev *dev)
+static void pci_enable_acs(struct pci_dev *dev)
 {
if (!pci_acs_enable)
goto disable_acs_redir;
@@ -3362,7 +3363,7 @@ static bool pci_acs_flags_enabled(struct pci_dev *pdev, 
u16 acs_flags)
int pos;
u16 cap, ctrl;
 
-   pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ACS);
+   pos = pdev->acs_cap;
if (!pos)
return false;
 
@@ -3487,6 +3488,18 @@ bool pci_acs_path_enabled(struct pci_dev *start,
return true;
 }
 
+/**
+ * pci_acs_init - Initialize if hardware supports it
+ * @dev: the PCI device
+ */
+void pci_acs_init(struct pci_dev *dev)
+{
+   dev->acs_cap = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
+
+   if (dev->acs_cap)
+   pci_enable_acs(dev);
+}
+
 /**
  * pci_rebar_find_pos - find position of resize ctrl reg for BAR
  * @pdev: PCI device
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 6d3f758671064..12fb79fbe29d3 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -532,7 +532,7 @@ static inline resource_size_t pci_resource_alignment(struct 
pci_dev *dev,
return resource_alignment(res);
 }
 
-void pci_enable_acs(struct pci_dev *dev);
+void pci_acs_init(struct pci_dev *dev);
 #ifdef CONFIG_PCI_QUIRKS
 int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags);
 int pci_dev_specific_enable_acs(struct pci_dev *dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 2f66988cea257..6d87066a5ecc5 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2390,7 +2390,7 @@ static void pci_init_capabilities(struct pci_dev *dev)
pci_ats_init(dev);  /* Address Translation Services */
pci_pri_init(dev);  /* Page Request Interface */
pci_pasid_init(dev);/* Process Address Space ID */
-   pci_enable_acs(dev);/* Enable ACS P2P upstream forwarding */
+   pci_acs_init(dev);  /* Access Control Services */
pci_ptm_init(dev);  /* Precision Time Measurement */
pci_aer_init(dev);  /* Advanced Error Reporting */
pci_dpc_init(dev);  /* Downstream Port Containment */
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 812bfc32ecb82..b341628e47527 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4653,7 +4653,7 @@ static int pci_quirk_intel_spt_pch_acs(struct pci_dev 
*dev, u16 acs_flags)
if (!pci_quirk_intel_spt_pch_acs_match(dev))
return -ENOTTY;
 
-   pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
+   pos = dev->acs_cap;
if (!pos)
return -ENOTTY;
 
@@ -4961,7 +4961,7 @@ static int pci_quirk_enable_intel_spt_pch_acs(struct 
pci_dev *dev)
if (!pci_quirk_intel_spt_pch_acs_match(dev))
return -ENOTTY;
 
-   pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
+   pos = 

Re: [PATCH 4/4] pci: export untrusted attribute in sysfs

2020-06-15 Thread Greg Kroah-Hartman
On Mon, Jun 15, 2020 at 06:17:42PM -0700, Rajat Jain wrote:
> This is needed to allow the userspace to determine when an untrusted
> device has been added, and thus allowing it to bind the driver manually
> to it, if it so wishes. This is being done as part of the approach
> discussed at https://lkml.org/lkml/2020/6/9/1331
> 
> Signed-off-by: Rajat Jain 
> ---
>  drivers/pci/pci-sysfs.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 6d78df981d41a..574e9c613ba26 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -50,6 +50,7 @@ pci_config_attr(subsystem_device, "0x%04x\n");
>  pci_config_attr(revision, "0x%02x\n");
>  pci_config_attr(class, "0x%06x\n");
>  pci_config_attr(irq, "%u\n");
> +pci_config_attr(untrusted, "%u\n");
>  
>  static ssize_t broken_parity_status_show(struct device *dev,
>struct device_attribute *attr,
> @@ -608,6 +609,7 @@ static struct attribute *pci_dev_attrs[] = {
>  #endif
>   _attr_driver_override.attr,
>   _attr_ari_enabled.attr,
> + _attr_untrusted.attr,
>   NULL,
>  };

You also need a Documentation/ABI/ update for this new file.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch for-5.8 2/4] dma-direct: re-encrypt memory if dma_direct_alloc_pages() fails

2020-06-15 Thread Christoph Hellwig
On Thu, Jun 11, 2020 at 12:20:29PM -0700, David Rientjes wrote:
> If arch_dma_set_uncached() fails after memory has been decrypted, it needs
> to be re-encrypted before freeing.
> 
> Fixes: fa7e2247c572 ("dma-direct: make uncached_kernel_address more
> general")
> Cc: sta...@vger.kernel.org # 5.7
> Signed-off-by: David Rientjes 

Note that this can't really happen in practice as
CONFIG_ARCH_HAS_DMA_SET_UNCACHED and memory encryption are mutally
exclusive in pracrie.  Still looks ok and useful otherwise.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 00/29] Documentation fixes

2020-06-15 Thread Mauro Carvalho Chehab
Hi Jon,

That's a bunch of files I have to be applied on the top of v5.8-rc1 fixing
documentation warnings. I already removed some duplicated stuff.

Regards,
Mauro

Mauro Carvalho Chehab (29):
  mm: vmalloc.c: remove a kernel-doc annotation from a removed parameter
  net: dev: add a missing kernel-doc annotation
  net: netdevice.h: add a description for napi_defer_hard_irqs
  scripts/kernel-doc: parse __ETHTOOL_DECLARE_LINK_MODE_MASK
  net: pylink.h: add kernel-doc descriptions for new fields at
phylink_config
  scripts/kernel-doc: handle function pointer prototypes
  fs: fs.h: fix a kernel-doc parameter description
  gpio: driver.h: fix kernel-doc markup
  kcsan: fix a kernel-doc warning
  rcu: fix some kernel-doc warnings
  fs: docs: f2fs.rst: fix a broken table
  dt: update a reference for reneases pcar file renamed to yaml
  dt: fix broken links due to txt->yaml renames
  dt: Fix broken references to renamed docs
  dt: fix reference to olpc,xo1.75-ec.txt
  selftests/vm/keys: fix a broken reference at protection_keys.c
  docs: hugetlbpage.rst: fix some warnings
  docs: powerpc: fix some issues at vas-api.rst
  docs: driver-model: remove a duplicated markup at driver.rst
  docs: watch_queue.rst: supress some Sphinx warnings and move to
core-api
  docs: device-mapper: add dm-ebs.rst to an index file
  docs: it_IT: add two missing references
  docs: ABI: fix a typo when pointing to w1-generic.rst
  docs: fs: locking.rst: fix a broken table
  docs: add bus-virt-phys-mapping.txt to core-api
  docs: fix references for DMA*.txt files
  docs: dt: minor adjustments at writing-schema.rst
  docs: fs: proc.rst: fix a warning due to a merge conflict
  docs: fs: proc.rst: convert a new chapter to ReST

 .../ABI/testing/sysfs-driver-w1_therm |   2 +-
 Documentation/PCI/pci.rst |   6 +-
 .../admin-guide/device-mapper/index.rst   |   1 +
 Documentation/admin-guide/mm/hugetlbpage.rst  |  25 ++-
 Documentation/block/biodoc.rst|   2 +-
 .../bus-virt-phys-mapping.rst}|   2 +-
 Documentation/core-api/dma-api.rst|   6 +-
 Documentation/core-api/dma-isa-lpc.rst|   2 +-
 Documentation/core-api/index.rst  |   2 +
 Documentation/{ => core-api}/watch_queue.rst  |  34 ++--
 .../bindings/arm/freescale/fsl,scu.txt|   2 +-
 .../bindings/display/bridge/sii902x.txt   |   2 +-
 .../bindings/display/imx/fsl-imx-drm.txt  |   4 +-
 .../devicetree/bindings/display/imx/ldb.txt   |   4 +-
 .../display/rockchip/rockchip-drm.yaml|   2 +-
 .../bindings/misc/olpc,xo1.75-ec.txt  |   2 +-
 .../bindings/net/mediatek-bluetooth.txt   |   2 +-
 .../bindings/pinctrl/renesas,pfc-pinctrl.txt  |   2 +-
 .../bindings/sound/audio-graph-card.txt   |   2 +-
 .../bindings/sound/st,sti-asoc-card.txt   |   2 +-
 .../bindings/spi/qcom,spi-geni-qcom.txt   |   2 +-
 Documentation/devicetree/writing-schema.rst   |   9 +-
 .../driver-api/driver-model/driver.rst|   2 -
 Documentation/driver-api/usb/dma.rst  |   6 +-
 Documentation/filesystems/f2fs.rst| 150 --
 Documentation/filesystems/locking.rst |   6 +-
 Documentation/filesystems/proc.rst|  46 +++---
 Documentation/memory-barriers.txt |   6 +-
 Documentation/mips/ingenic-tcu.rst|   2 +-
 Documentation/powerpc/vas-api.rst |  23 ++-
 Documentation/security/keys/core.rst  |   2 +-
 .../it_IT/process/management-style.rst|   2 +
 .../it_IT/process/submitting-patches.rst  |   2 +
 .../translations/ko_KR/memory-barriers.txt|   6 +-
 MAINTAINERS   |   8 +-
 arch/ia64/hp/common/sba_iommu.c   |  12 +-
 arch/parisc/kernel/pci-dma.c  |   2 +-
 arch/x86/include/asm/dma-mapping.h|   4 +-
 arch/x86/kernel/amd_gart_64.c |   2 +-
 drivers/parisc/sba_iommu.c|  14 +-
 include/linux/dma-mapping.h   |   2 +-
 include/linux/fs.h|   2 +-
 include/linux/gpio/driver.h   |   2 +-
 include/linux/kcsan-checks.h  |  10 +-
 include/linux/netdevice.h |   2 +
 include/linux/phylink.h   |   4 +
 include/linux/rculist.h   |   2 +-
 include/linux/watch_queue.h   |   2 +-
 include/media/videobuf-dma-sg.h   |   2 +-
 init/Kconfig  |   2 +-
 kernel/dma/debug.c|   2 +-
 kernel/watch_queue.c  |   2 +-
 mm/vmalloc.c  |   1 -
 net/core/dev.c|   1 +
 scripts/kernel-doc|   7 +
 tools/testing/selftests/vm/protection_keys.c  |   2 +-
 56 files changed, 282 insertions(+), 175 deletions(-)
 rename Documentation/{bus-virt-phys-mapping.txt => 

[PATCH] dma-direct: enable mmap for !CONFIG_MMU

2020-06-15 Thread Christoph Hellwig
nommu configfs can trivially map the coherent allocations to user space,
as no actual page table setup is required and the kernel and the user
space programs share the same address space.

Fixes: 62fcee9a3bd7 ("dma-mapping: remove CONFIG_ARCH_NO_COHERENT_DMA_MMAP")
Signed-off-by: Christoph Hellwig 
Reported-by: dillon min 
Reviewed-by: Vladimir Murzin 
Tested-by: dillon min 
---
 kernel/dma/Kconfig  |  1 +
 kernel/dma/direct.c | 14 --
 2 files changed, 1 insertion(+), 14 deletions(-)

diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index d006668c0027d2..e0dae570a51530 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -71,6 +71,7 @@ config SWIOTLB
 # in the pagetables
 #
 config DMA_NONCOHERENT_MMAP
+   default y if !MMU
bool
 
 config DMA_REMAP
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 0a4881e59aa7d6..9ec6a5c3fc578c 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -459,7 +459,6 @@ int dma_direct_get_sgtable(struct device *dev, struct 
sg_table *sgt,
return ret;
 }
 
-#ifdef CONFIG_MMU
 bool dma_direct_can_mmap(struct device *dev)
 {
return dev_is_dma_coherent(dev) ||
@@ -485,19 +484,6 @@ int dma_direct_mmap(struct device *dev, struct 
vm_area_struct *vma,
return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
user_count << PAGE_SHIFT, vma->vm_page_prot);
 }
-#else /* CONFIG_MMU */
-bool dma_direct_can_mmap(struct device *dev)
-{
-   return false;
-}
-
-int dma_direct_mmap(struct device *dev, struct vm_area_struct *vma,
-   void *cpu_addr, dma_addr_t dma_addr, size_t size,
-   unsigned long attrs)
-{
-   return -ENXIO;
-}
-#endif /* CONFIG_MMU */
 
 int dma_direct_supported(struct device *dev, u64 mask)
 {
-- 
2.26.2

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Issue]platform/x86: iommu: System can't shutdown because iommu driver keeps checking the status of DMA_GSTS_TES

2020-06-15 Thread Lu Baolu

Hi Koba Ko,

On 2020/6/15 11:19, Koba Ko wrote:

hi All,
I have a machine and there's only intel gpu.
the secureboot and vt-d is enabled in BIOS.
On the Ubuntu desktop, I do s2idle first and restart the machine.
The machine can't restart successfully, so I need to press the 
power button to shutdown.

I tried  each of the following and the issue can't be triggered.
1. disable secure boot in BIOS.
2. intel_iommu=off.
3. intel_iomm=igfx_off.
4. nomodeset
5. i915.modeset=0.

After I investigate further, find inte_iommu keeps checking the status 
of DMA_GSTS_TES.
During the procedure of restart, the driver would disable iommu 
translation and

check status of DMA_GSTS_TES until status of DMA_GSTS_TES is 0.

If you need more information, I can provide it.


Do you mind telling what platform is it?

Best regards,
baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: arm64 iommu groups issue

2020-06-15 Thread John Garry

On 12/06/2020 15:30, Lorenzo Pieralisi wrote:

On Mon, Feb 17, 2020 at 12:08:48PM +, John Garry wrote:


Right, and even worse is that it relies on the port driver even
existing at all.

All this iommu group assignment should be taken outside device
driver probe paths.

However we could still consider device links for sync'ing the SMMU
and each device probing.


Yes, we should get that for DT now thanks to the of_devlink stuff, but
cooking up some equivalent for IORT might be worthwhile.


It doesn't solve this problem, but at least we could remove the iommu_ops
check in iort_iommu_xlate().

We would need to carve out a path from pci_device_add() or even device_add()
to solve all cases.




Another thought that crosses my mind is that when pci_device_group()
walks up to the point of ACS isolation and doesn't find an existing
group, it can still infer that everything it walked past *should* be put
in the same group it's then eventually going to return. Unfortunately I
can't see an obvious way for it to act on that knowledge, though, since
recursive iommu_probe_device() is unlikely to end well.




[...]


And this looks to be the reason for which current
iommu_bus_init()->bus_for_each_device(..., add_iommu_group) fails
also.


Of course, just adding a 'correct' add_device replay without the
of_xlate process doesn't help at all. No wonder this looked suspiciously
simpler than where the first idea left off...

(on reflection, the core of this idea seems to be recycling the existing
iommu_bus_init walk rather than building up a separate "waiting list",
while forgetting that that wasn't the difficult part of the original
idea anyway)


We could still use a bus walk to add the group per iommu, but we would need
an additional check to ensure the device is associated with the IOMMU.




On this current code mentioned, the principle of this seems wrong to
me - we call bus_for_each_device(..., add_iommu_group) for the first
SMMU in the system which probes, but we attempt to add_iommu_group()
for all devices on the bus, even though the SMMU for that device may
yet to have probed.


Yes, iommu_bus_init() is one of the places still holding a
deeply-ingrained assumption that the ops go live for all IOMMU instances
at once, which is what warranted the further replay in
of_iommu_configure() originally. Moving that out of
of_platform_device_create() to support probe deferral is where the
trouble really started.


I'm not too familiar with the history here, but could this be reverted now
with the introduction of of_devlink stuff?


Hi John,


Hi Lorenzo,



have we managed to reach a consensus on this thread on how to solve
the issue ? 


No, not really. So Robin and I tried a couple of quick things 
previously, but they came did not come to much, as above.



Asking because this thread seems stalled - I am keen on
getting it fixed.


I haven't spent more time on this. But from what I was hearing last 
time, this issue was ticketed internally for arm, so I was waiting for 
that to be picked up to re-engage.


Thanks,
John
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch for-5.8 4/4] dma-direct: add missing set_memory_decrypted() for coherent mapping

2020-06-15 Thread Christoph Hellwig
On Thu, Jun 11, 2020 at 12:20:32PM -0700, David Rientjes wrote:
> When a coherent mapping is created in dma_direct_alloc_pages(), it needs
> to be decrypted if the device requires unencrypted DMA before returning.
> 
> Fixes: 3acac065508f ("dma-mapping: merge the generic remapping helpers
> into dma-direct")
> Cc: sta...@vger.kernel.org # 5.5+
> Signed-off-by: David Rientjes 
> ---
>  kernel/dma/direct.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -195,6 +195,12 @@ void *dma_direct_alloc_pages(struct device *dev, size_t 
> size,
>   __builtin_return_address(0));
>   if (!ret)
>   goto out_free_pages;
> + if (force_dma_unencrypted(dev)) {
> + err = set_memory_decrypted((unsigned long)ret,
> +1 << get_order(size));
> + if (err)
> + goto out_free_pages;
> + }

Note that ret is a vmalloc address here.  Does set_memory_decrypted
work for that case?  Again this should be mostly theoretical, so I'm
not too worried for now.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 26/29] docs: fix references for DMA*.txt files

2020-06-15 Thread Mauro Carvalho Chehab
As we moved those files to core-api, fix references to point
to their newer locations.

Signed-off-by: Mauro Carvalho Chehab 
---
 Documentation/PCI/pci.rst  |  6 +++---
 Documentation/block/biodoc.rst |  2 +-
 Documentation/core-api/bus-virt-phys-mapping.rst   |  2 +-
 Documentation/core-api/dma-api.rst |  6 +++---
 Documentation/core-api/dma-isa-lpc.rst |  2 +-
 Documentation/driver-api/usb/dma.rst   |  6 +++---
 Documentation/memory-barriers.txt  |  6 +++---
 .../translations/ko_KR/memory-barriers.txt |  6 +++---
 arch/ia64/hp/common/sba_iommu.c| 12 ++--
 arch/parisc/kernel/pci-dma.c   |  2 +-
 arch/x86/include/asm/dma-mapping.h |  4 ++--
 arch/x86/kernel/amd_gart_64.c  |  2 +-
 drivers/parisc/sba_iommu.c | 14 +++---
 include/linux/dma-mapping.h|  2 +-
 include/media/videobuf-dma-sg.h|  2 +-
 kernel/dma/debug.c |  2 +-
 16 files changed, 38 insertions(+), 38 deletions(-)

diff --git a/Documentation/PCI/pci.rst b/Documentation/PCI/pci.rst
index 8c016d8c9862..d10d3fe604c5 100644
--- a/Documentation/PCI/pci.rst
+++ b/Documentation/PCI/pci.rst
@@ -265,7 +265,7 @@ Set the DMA mask size
 -
 .. note::
If anything below doesn't make sense, please refer to
-   Documentation/DMA-API.txt. This section is just a reminder that
+   :doc:`/core-api/dma-api`. This section is just a reminder that
drivers need to indicate DMA capabilities of the device and is not
an authoritative source for DMA interfaces.
 
@@ -291,7 +291,7 @@ Many 64-bit "PCI" devices (before PCI-X) and some PCI-X 
devices are
 Setup shared control data
 -
 Once the DMA masks are set, the driver can allocate "consistent" (a.k.a. 
shared)
-memory.  See Documentation/DMA-API.txt for a full description of
+memory.  See :doc:`/core-api/dma-api` for a full description of
 the DMA APIs. This section is just a reminder that it needs to be done
 before enabling DMA on the device.
 
@@ -421,7 +421,7 @@ owners if there is one.
 
 Then clean up "consistent" buffers which contain the control data.
 
-See Documentation/DMA-API.txt for details on unmapping interfaces.
+See :doc:`/core-api/dma-api` for details on unmapping interfaces.
 
 
 Unregister from other subsystems
diff --git a/Documentation/block/biodoc.rst b/Documentation/block/biodoc.rst
index b964796ec9c7..ba7f45d0271c 100644
--- a/Documentation/block/biodoc.rst
+++ b/Documentation/block/biodoc.rst
@@ -196,7 +196,7 @@ a virtual address mapping (unlike the earlier scheme of 
virtual address
 do not have a corresponding kernel virtual address space mapping) and
 low-memory pages.
 
-Note: Please refer to Documentation/DMA-API-HOWTO.txt for a discussion
+Note: Please refer to :doc:`/core-api/dma-api-howto` for a discussion
 on PCI high mem DMA aspects and mapping of scatter gather lists, and support
 for 64 bit PCI.
 
diff --git a/Documentation/core-api/bus-virt-phys-mapping.rst 
b/Documentation/core-api/bus-virt-phys-mapping.rst
index 4bb07c2f3e7d..c7bc99cd2e21 100644
--- a/Documentation/core-api/bus-virt-phys-mapping.rst
+++ b/Documentation/core-api/bus-virt-phys-mapping.rst
@@ -8,7 +8,7 @@ How to access I/O mapped memory from within device drivers
 
The virt_to_bus() and bus_to_virt() functions have been
superseded by the functionality provided by the PCI DMA interface
-   (see Documentation/DMA-API-HOWTO.txt).  They continue
+   (see :doc:`/core-api/dma-api-howto`).  They continue
to be documented below for historical purposes, but new code
must not use them. --davidm 00/12/12
 
diff --git a/Documentation/core-api/dma-api.rst 
b/Documentation/core-api/dma-api.rst
index 2d8d2fed7317..63b4a2f20867 100644
--- a/Documentation/core-api/dma-api.rst
+++ b/Documentation/core-api/dma-api.rst
@@ -5,7 +5,7 @@ Dynamic DMA mapping using the generic device
 :Author: James E.J. Bottomley 
 
 This document describes the DMA API.  For a more gentle introduction
-of the API (and actual examples), see Documentation/DMA-API-HOWTO.txt.
+of the API (and actual examples), see :doc:`/core-api/dma-api-howto`.
 
 This API is split into two pieces.  Part I describes the basic API.
 Part II describes extensions for supporting non-consistent memory
@@ -471,7 +471,7 @@ without the _attrs suffixes, except that they pass an 
optional
 dma_attrs.
 
 The interpretation of DMA attributes is architecture-specific, and
-each attribute should be documented in Documentation/DMA-attributes.txt.
+each attribute should be documented in :doc:`/core-api/dma-attributes`.
 
 If dma_attrs are 0, the semantics of each of these functions
 is identical to those of the corresponding function
@@ -484,7 +484,7 @@ for DMA::
 
#include 
/* DMA_ATTR_FOO 

Re: [patch for-5.8 1/4] dma-direct: always align allocation size in dma_direct_alloc_pages()

2020-06-15 Thread Christoph Hellwig
On Thu, Jun 11, 2020 at 12:20:28PM -0700, David Rientjes wrote:
> dma_alloc_contiguous() does size >> PAGE_SHIFT and set_memory_decrypted()
> works at page granularity.  It's necessary to page align the allocation
> size in dma_direct_alloc_pages() for consistent behavior.
> 
> This also fixes an issue when arch_dma_prep_coherent() is called on an
> unaligned allocation size for dma_alloc_need_uncached() when
> CONFIG_DMA_DIRECT_REMAP is disabled but CONFIG_ARCH_HAS_DMA_SET_UNCACHED
> is enabled.
> 
> Cc: sta...@vger.kernel.org
> Signed-off-by: David Rientjes 
> ---
>  kernel/dma/direct.c | 17 ++---
>  1 file changed, 10 insertions(+), 7 deletions(-)
> 
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -112,11 +112,12 @@ static inline bool dma_should_free_from_pool(struct 
> device *dev,
>  struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
>   gfp_t gfp, unsigned long attrs)
>  {
> - size_t alloc_size = PAGE_ALIGN(size);
>   int node = dev_to_node(dev);
>   struct page *page = NULL;
>   u64 phys_limit;
>  
> + VM_BUG_ON(!PAGE_ALIGNED(size));

This really should be a WARN_ON_ONCE, but I've fixed this up before
applying.  I've also added a prep patch to mark __dma_direct_alloc_pages
static as part of auditing for other callers.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v2 02/15] iommu: Report domain nesting info

2020-06-15 Thread Liu, Yi L
Hi Kevin,

> From: Tian, Kevin 
> Sent: Monday, June 15, 2020 9:23 AM
> 
> > From: Liu, Yi L 
> > Sent: Friday, June 12, 2020 5:05 PM
> >
> > Hi Alex,
> >
> > > From: Alex Williamson 
> > > Sent: Friday, June 12, 2020 3:30 AM
> > >
> > > On Thu, 11 Jun 2020 05:15:21 -0700
> > > Liu Yi L  wrote:
> > >
> > > > IOMMUs that support nesting translation needs report the
> > > > capability info to userspace, e.g. the format of first level/stage 
> > > > paging
> structures.
> > > >
> > > > Cc: Kevin Tian 
> > > > CC: Jacob Pan 
> > > > Cc: Alex Williamson 
> > > > Cc: Eric Auger 
> > > > Cc: Jean-Philippe Brucker 
> > > > Cc: Joerg Roedel 
> > > > Cc: Lu Baolu 
> > > > Signed-off-by: Liu Yi L 
> > > > Signed-off-by: Jacob Pan 
> > > > ---
> > > > @Jean, Eric: as nesting was introduced for ARM, but looks like no
> > > > actual user of it. right? So I'm wondering if we can reuse
> > > > DOMAIN_ATTR_NESTING to retrieve nesting info? how about your
> > opinions?
> > > >
> > > >  include/linux/iommu.h  |  1 +
> > > >  include/uapi/linux/iommu.h | 34
> > ++
> > > >  2 files changed, 35 insertions(+)
> > > >
> > > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h index
> > > > 78a26ae..f6e4b49 100644
> > > > --- a/include/linux/iommu.h
> > > > +++ b/include/linux/iommu.h
> > > > @@ -126,6 +126,7 @@ enum iommu_attr {
> > > > DOMAIN_ATTR_FSL_PAMUV1,
> > > > DOMAIN_ATTR_NESTING,/* two stages of translation */
> > > > DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
> > > > +   DOMAIN_ATTR_NESTING_INFO,
> > > > DOMAIN_ATTR_MAX,
> > > >  };
> > > >
> > > > diff --git a/include/uapi/linux/iommu.h
> > > > b/include/uapi/linux/iommu.h index 303f148..02eac73 100644
> > > > --- a/include/uapi/linux/iommu.h
> > > > +++ b/include/uapi/linux/iommu.h
> > > > @@ -332,4 +332,38 @@ struct iommu_gpasid_bind_data {
> > > > };
> > > >  };
> > > >
> > > > +struct iommu_nesting_info {
> > > > +   __u32   size;
> > > > +   __u32   format;
> > > > +   __u32   features;
> > > > +#define IOMMU_NESTING_FEAT_SYSWIDE_PASID   (1 << 0)
> > > > +#define IOMMU_NESTING_FEAT_BIND_PGTBL  (1 << 1)
> > > > +#define IOMMU_NESTING_FEAT_CACHE_INVLD (1 << 2)
> > > > +   __u32   flags;
> > > > +   __u8data[];
> > > > +};
> > > > +
> > > > +/*
> > > > + * @flags: VT-d specific flags. Currently reserved for future
> > > > + * extension.
> > > > + * @addr_width:The output addr width of first level/stage
> translation
> > > > + * @pasid_bits:Maximum supported PASID bits, 0 represents no
> > PASID
> > > > + * support.
> > > > + * @cap_reg:   Describe basic capabilities as defined in VT-d
> > capability
> > > > + * register.
> > > > + * @cap_mask:  Mark valid capability bits in @cap_reg.
> > > > + * @ecap_reg:  Describe the extended capabilities as defined in VT-d
> > > > + * extended capability register.
> > > > + * @ecap_mask: Mark the valid capability bits in @ecap_reg.
> > >
> > > Please explain this a little further, why do we need to tell
> > > userspace about cap/ecap register bits that aren't valid through this 
> > > interface?
> > > Thanks,
> >
> > we only want to tell userspace about the bits marked in the cap/ecap_mask.
> > cap/ecap_mask is kind of white-list of the cap/ecap register.
> > userspace should only care about the bits in the white-list, for other
> > bits, it should ignore.
> >
> > Regards,
> > Yi Liu
> 
> For invalid bits if kernel just clears them then do we still need additional 
> mask bits
> to explicitly mark them out? I guess this might be the point that Alex 
> asked...

For invalid bits, kernel will clear them. But I think the mask bits is
still necessary. The mask bits tells user space the bits related to
nesting. Without it, user space may have no idea about it.

Maybe talk about QEMU usage of the cap/ecap bits would help. QEMU vIOMMU
decides cap/ecap bits according to QEMU cmdline. But not all of them are
compatible with hardware support. Especially, vIOMMU built on nesting.
So needs to sync the cap/ecap bits with host side. Based on the mask
bits, QEMU can compare the cap/ecap bits configured by QEMU cmdline with
the cap/ecap bits reported by this interface. This comparation is limited
to the nesting related bits in cap/ecap, the other bits are not included
and can use the configuration by QEMU cmdline.

The link below show the current Intel vIOMMU usage on the cap/ecap bits.
For each assigned device, vIOMMU will compare the nesting related bits in
cap/ecap and mask out the bits which hardware doesn't support. After the
machine is intilized, the vIOMMU cap/ecap bits are determined. If user
hot-plug devices to VM, vIOMMU will fail it if the hardware cap/ecap bits
behind hot-plug device are not compatible with determined vIOMMU cap/ecap
bits.

https://www.spinics.net/lists/kvm/msg218294.html

Regards,
Yi Liu

> >
> > > Alex
> > >
> > >
> > > > + 

[PATCH 2/4] pci: set "untrusted" flag for truly external devices only

2020-06-15 Thread Rajat Jain via iommu
The "ExternalFacing" devices (root ports) are still internal devices
that sit on the internal system fabric and thus trusted. Currently they
were being marked untrusted - likely as an unintended border case.

This patch uses the platform flag to identify the external facing devices
and then use it to mark any downstream devices as "untrusted". The
external-facing devices themselves are left as "trusted". This was
discussed here: https://lkml.org/lkml/2020/6/10/1049

Signed-off-by: Rajat Jain 
---
 drivers/iommu/intel/iommu.c |  2 +-
 drivers/pci/of.c|  2 +-
 drivers/pci/pci-acpi.c  | 13 +++--
 drivers/pci/probe.c |  2 +-
 include/linux/pci.h |  8 
 5 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 9129663a7406b..1256ca89fb519 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4735,7 +4735,7 @@ static inline bool has_untrusted_dev(void)
struct pci_dev *pdev = NULL;
 
for_each_pci_dev(pdev)
-   if (pdev->untrusted)
+   if (pdev->untrusted || pdev->external_facing)
return true;
 
return false;
diff --git a/drivers/pci/of.c b/drivers/pci/of.c
index 27839cd2459f6..22727fc9558df 100644
--- a/drivers/pci/of.c
+++ b/drivers/pci/of.c
@@ -42,7 +42,7 @@ void pci_set_bus_of_node(struct pci_bus *bus)
} else {
node = of_node_get(bus->self->dev.of_node);
if (node && of_property_read_bool(node, "external-facing"))
-   bus->self->untrusted = true;
+   bus->self->external_facing = true;
}
 
bus->dev.of_node = node;
diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 7224b1e5f2a83..492c07805caf8 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -1213,22 +1213,23 @@ static void pci_acpi_optimize_delay(struct pci_dev 
*pdev,
ACPI_FREE(obj);
 }
 
-static void pci_acpi_set_untrusted(struct pci_dev *dev)
+static void pci_acpi_set_external_facing(struct pci_dev *dev)
 {
u8 val;
 
-   if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT)
+   if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT &&
+   pci_pcie_type(dev) != PCI_EXP_TYPE_DOWNSTREAM)
return;
if (device_property_read_u8(>dev, "ExternalFacingPort", ))
return;
 
/*
-* These root ports expose PCIe (including DMA) outside of the
-* system so make sure we treat them and everything behind as
+* These root/down ports expose PCIe (including DMA) outside of the
+* system so make sure we treat everything behind them as
 * untrusted.
 */
if (val)
-   dev->untrusted = 1;
+   dev->external_facing = 1;
 }
 
 static void pci_acpi_setup(struct device *dev)
@@ -1240,7 +1241,7 @@ static void pci_acpi_setup(struct device *dev)
return;
 
pci_acpi_optimize_delay(pci_dev, adev->handle);
-   pci_acpi_set_untrusted(pci_dev);
+   pci_acpi_set_external_facing(pci_dev);
pci_acpi_add_edr_notifier(pci_dev);
 
pci_acpi_add_pm_notifier(adev, pci_dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 6d87066a5ecc5..8c40c00413e74 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1552,7 +1552,7 @@ static void set_pcie_untrusted(struct pci_dev *dev)
 * untrusted as well.
 */
parent = pci_upstream_bridge(dev);
-   if (parent && parent->untrusted)
+   if (parent && (parent->untrusted || parent->external_facing))
dev->untrusted = true;
 }
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index a26be5332bba6..fe1bc603fda40 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -432,6 +432,14 @@ struct pci_dev {
 * mappings to make sure they cannot access arbitrary memory.
 */
unsigned intuntrusted:1;
+   /*
+* Devices are marked as external-facing using info from platform
+* (ACPI / devicetree). An external-facing device is still an internal
+* trusted device, but it faces external untrusted devices. Thus any
+* devices enumerated downstream an external-facing device is marked
+* as untrusted.
+*/
+   unsigned intexternal_facing:1;
unsigned intbroken_intx_masking:1;  /* INTx masking can't be used */
unsigned intio_window_1k:1; /* Intel bridge 1K I/O windows 
*/
unsigned intirq_managed:1;
-- 
2.27.0.290.gba653c62da-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 4/4] pci: export untrusted attribute in sysfs

2020-06-15 Thread Rajat Jain via iommu
This is needed to allow the userspace to determine when an untrusted
device has been added, and thus allowing it to bind the driver manually
to it, if it so wishes. This is being done as part of the approach
discussed at https://lkml.org/lkml/2020/6/9/1331

Signed-off-by: Rajat Jain 
---
 drivers/pci/pci-sysfs.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 6d78df981d41a..574e9c613ba26 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -50,6 +50,7 @@ pci_config_attr(subsystem_device, "0x%04x\n");
 pci_config_attr(revision, "0x%02x\n");
 pci_config_attr(class, "0x%06x\n");
 pci_config_attr(irq, "%u\n");
+pci_config_attr(untrusted, "%u\n");
 
 static ssize_t broken_parity_status_show(struct device *dev,
 struct device_attribute *attr,
@@ -608,6 +609,7 @@ static struct attribute *pci_dev_attrs[] = {
 #endif
_attr_driver_override.attr,
_attr_ari_enabled.attr,
+   _attr_untrusted.attr,
NULL,
 };
 
-- 
2.27.0.290.gba653c62da-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v2 02/15] iommu: Report domain nesting info

2020-06-15 Thread Tian, Kevin
> From: Liu, Yi L 
> Sent: Monday, June 15, 2020 2:05 PM
> 
> Hi Kevin,
> 
> > From: Tian, Kevin 
> > Sent: Monday, June 15, 2020 9:23 AM
> >
> > > From: Liu, Yi L 
> > > Sent: Friday, June 12, 2020 5:05 PM
> > >
> > > Hi Alex,
> > >
> > > > From: Alex Williamson 
> > > > Sent: Friday, June 12, 2020 3:30 AM
> > > >
> > > > On Thu, 11 Jun 2020 05:15:21 -0700
> > > > Liu Yi L  wrote:
> > > >
> > > > > IOMMUs that support nesting translation needs report the
> > > > > capability info to userspace, e.g. the format of first level/stage 
> > > > > paging
> > structures.
> > > > >
> > > > > Cc: Kevin Tian 
> > > > > CC: Jacob Pan 
> > > > > Cc: Alex Williamson 
> > > > > Cc: Eric Auger 
> > > > > Cc: Jean-Philippe Brucker 
> > > > > Cc: Joerg Roedel 
> > > > > Cc: Lu Baolu 
> > > > > Signed-off-by: Liu Yi L 
> > > > > Signed-off-by: Jacob Pan 
> > > > > ---
> > > > > @Jean, Eric: as nesting was introduced for ARM, but looks like no
> > > > > actual user of it. right? So I'm wondering if we can reuse
> > > > > DOMAIN_ATTR_NESTING to retrieve nesting info? how about your
> > > opinions?
> > > > >
> > > > >  include/linux/iommu.h  |  1 +
> > > > >  include/uapi/linux/iommu.h | 34
> > > ++
> > > > >  2 files changed, 35 insertions(+)
> > > > >
> > > > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h index
> > > > > 78a26ae..f6e4b49 100644
> > > > > --- a/include/linux/iommu.h
> > > > > +++ b/include/linux/iommu.h
> > > > > @@ -126,6 +126,7 @@ enum iommu_attr {
> > > > >   DOMAIN_ATTR_FSL_PAMUV1,
> > > > >   DOMAIN_ATTR_NESTING,/* two stages of translation */
> > > > >   DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
> > > > > + DOMAIN_ATTR_NESTING_INFO,
> > > > >   DOMAIN_ATTR_MAX,
> > > > >  };
> > > > >
> > > > > diff --git a/include/uapi/linux/iommu.h
> > > > > b/include/uapi/linux/iommu.h index 303f148..02eac73 100644
> > > > > --- a/include/uapi/linux/iommu.h
> > > > > +++ b/include/uapi/linux/iommu.h
> > > > > @@ -332,4 +332,38 @@ struct iommu_gpasid_bind_data {
> > > > >   };
> > > > >  };
> > > > >
> > > > > +struct iommu_nesting_info {
> > > > > + __u32   size;
> > > > > + __u32   format;
> > > > > + __u32   features;
> > > > > +#define IOMMU_NESTING_FEAT_SYSWIDE_PASID (1 << 0)
> > > > > +#define IOMMU_NESTING_FEAT_BIND_PGTBL(1 << 1)
> > > > > +#define IOMMU_NESTING_FEAT_CACHE_INVLD   (1 <<
> 2)
> > > > > + __u32   flags;
> > > > > + __u8data[];
> > > > > +};
> > > > > +
> > > > > +/*
> > > > > + * @flags:   VT-d specific flags. Currently reserved for future
> > > > > + *   extension.
> > > > > + * @addr_width:  The output addr width of first level/stage
> > translation
> > > > > + * @pasid_bits:  Maximum supported PASID bits, 0 represents
> no
> > > PASID
> > > > > + *   support.
> > > > > + * @cap_reg: Describe basic capabilities as defined in VT-d
> > > capability
> > > > > + *   register.
> > > > > + * @cap_mask:Mark valid capability bits in @cap_reg.
> > > > > + * @ecap_reg:Describe the extended capabilities as defined 
> > > > > in VT-d
> > > > > + *   extended capability register.
> > > > > + * @ecap_mask:   Mark the valid capability bits in @ecap_reg.
> > > >
> > > > Please explain this a little further, why do we need to tell
> > > > userspace about cap/ecap register bits that aren't valid through this
> interface?
> > > > Thanks,
> > >
> > > we only want to tell userspace about the bits marked in the
> cap/ecap_mask.
> > > cap/ecap_mask is kind of white-list of the cap/ecap register.
> > > userspace should only care about the bits in the white-list, for other
> > > bits, it should ignore.
> > >
> > > Regards,
> > > Yi Liu
> >
> > For invalid bits if kernel just clears them then do we still need additional
> mask bits
> > to explicitly mark them out? I guess this might be the point that Alex 
> > asked...
> 
> For invalid bits, kernel will clear them. But I think the mask bits is
> still necessary. The mask bits tells user space the bits related to
> nesting. Without it, user space may have no idea about it.

userspace should know which bit is related to nesting and then should
check that bit explicitly...

> 
> Maybe talk about QEMU usage of the cap/ecap bits would help. QEMU
> vIOMMU
> decides cap/ecap bits according to QEMU cmdline. But not all of them are
> compatible with hardware support. Especially, vIOMMU built on nesting.
> So needs to sync the cap/ecap bits with host side. Based on the mask
> bits, QEMU can compare the cap/ecap bits configured by QEMU cmdline with
> the cap/ecap bits reported by this interface. This comparation is limited
> to the nesting related bits in cap/ecap, the other bits are not included
> and can use the configuration by QEMU cmdline.

I didn't get this explanation. Based on patch [15/15], nesting capabilities
are defined as:
+/* Nesting Support Capability Alignment */
+#define 

Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

2020-06-15 Thread Bjorn Helgaas
On Sat, Jun 13, 2020 at 10:30:56PM +0800, Zhangfei Gao wrote:
> On 2020/6/11 下午9:44, Bjorn Helgaas wrote:
> > +++ b/drivers/iommu/iommu.c
> > > > > > > > > > @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device 
> > > > > > > > > > *dev, struct
> > > > > > > > > > fwnode_handle *iommu_fwnode,
> > > > > > > > > > fwspec->iommu_fwnode = iommu_fwnode;
> > > > > > > > > > fwspec->ops = ops;
> > > > > > > > > > dev_iommu_fwspec_set(dev, fwspec);
> > > > > > > > > > +
> > > > > > > > > > +   if (dev_is_pci(dev))
> > > > > > > > > > +   pci_fixup_device(pci_fixup_final, 
> > > > > > > > > > to_pci_dev(dev));
> > > > > > > > > > +
> > > > > > > > > > 
> > > > > > > > > > Then pci_fixup_final will be called twice, the first in 
> > > > > > > > > > pci_bus_add_device.
> > > > > > > > > > Here in iommu_fwspec_init is the second time, specifically 
> > > > > > > > > > for iommu_fwspec.
> > > > > > > > > > Will send this when 5.8-rc1 is open.
> > > > > > > > > Wait, this whole fixup approach seems wrong to me.  No matter 
> > > > > > > > > how you
> > > > > > > > > do the fixup, it's still a fixup, which means it requires 
> > > > > > > > > ongoing
> > > > > > > > > maintenance.  Surely we don't want to have to add the 
> > > > > > > > > Vendor/Device ID
> > > > > > > > > for every new AMBA device that comes along, do we?
> > > > > > > > > 
> > > > > > > > Here the fake pci device has standard PCI cfg space, but 
> > > > > > > > physical
> > > > > > > > implementation is base on AMBA
> > > > > > > > They can provide pasid feature.
> > > > > > > > However,
> > > > > > > > 1, does not support tlp since they are not real pci devices.
> > > > > > > > 2. does not support pri, instead support stall (provided by 
> > > > > > > > smmu)
> > > > > > > > And stall is not a pci feature, so it is not described in 
> > > > > > > > struct pci_dev,
> > > > > > > > but in struct iommu_fwspec.
> > > > > > > > So we use this fixup to tell pci system that the devices can 
> > > > > > > > support stall,
> > > > > > > > and hereby support pasid.
> > > > > > > This did not answer my question.  Are you proposing that we 
> > > > > > > update a
> > > > > > > quirk every time a new AMBA device is released?  I don't think 
> > > > > > > that
> > > > > > > would be a good model.
> > > > > > Yes, you are right, but we do not have any better idea yet.
> > > > > > Currently we have three fake pci devices, which support stall and 
> > > > > > pasid.
> > > > > > We have to let pci system know the device can support pasid, 
> > > > > > because of
> > > > > > stall feature, though not support pri.
> > > > > > Do you have any other ideas?
> > > > > It sounds like the best way would be to allocate a PCI capability for 
> > > > > it, so
> > > > > detection can be done through config space, at least in future 
> > > > > devices,
> > > > > or possibly after a firmware update if the config space in your system
> > > > > is controlled by firmware somewhere.  Once there is a proper mechanism
> > > > > to do this, using fixups to detect the early devices that don't use 
> > > > > that
> > > > > should be uncontroversial. I have no idea what the process or timeline
> > > > > is to add new capabilities into the PCIe specification, or if this one
> > > > > would be acceptable to the PCI SIG at all.
> > > > That sounds like a possibility.  The spec already defines a
> > > > Vendor-Specific Extended Capability (PCIe r5.0, sec 7.9.5) that might
> > > > be a candidate.
> > > Will investigate this, thanks Bjorn
> > FWIW, there's also a Vendor-Specific Capability that can appear in the
> > first 256 bytes of config space (the Vendor-Specific Extended
> > Capability must appear in the "Extended Configuration Space" from
> > 0x100-0xfff).
> Unfortunately our silicon does not have either Vendor-Specific Capability or
> Vendor-Specific Extended Capability.
> 
> Studied commit 8531e283bee66050734fb0e89d53e85fd5ce24a4
> Looks this method requires adding member (like can_stall) to struct pci_dev,
> looks difficult.

The problem is that we don't want to add device IDs every time a new
chip comes out.  Adding one or two device IDs for silicon that's
already released is not a problem as long as you have a strategy for
*future* devices so they don't require a quirk.

> > > > > If detection cannot be done through PCI config space, the next best
> > > > > alternative is to pass auxiliary data through firmware. On DT based
> > > > > machines, you can list non-hotpluggable PCIe devices and add custom
> > > > > properties that could be read during device enumeration. I assume
> > > > > ACPI has something similar, but I have not done that.
> > > Yes, thanks Arnd
> > > > ACPI has _DSM (ACPI v6.3, sec 9.1.1), which might be a candidate.  I
> > > > like this better than a PCI capability because the property you need
> > > > to expose is not a PCI property.
> > > _DSM may not workable, since it is working in runtime.
> > > We need 

RE: [PATCH v2 00/15] vfio: expose virtual Shared Virtual Addressing to VMs

2020-06-15 Thread Tian, Kevin
> From: Stefan Hajnoczi 
> Sent: Monday, June 15, 2020 6:02 PM
> 
> On Thu, Jun 11, 2020 at 05:15:19AM -0700, Liu Yi L wrote:
> > Shared Virtual Addressing (SVA), a.k.a, Shared Virtual Memory (SVM) on
> > Intel platforms allows address space sharing between device DMA and
> > applications. SVA can reduce programming complexity and enhance
> security.
> >
> > This VFIO series is intended to expose SVA usage to VMs. i.e. Sharing
> > guest application address space with passthru devices. This is called
> > vSVA in this series. The whole vSVA enabling requires QEMU/VFIO/IOMMU
> > changes. For IOMMU and QEMU changes, they are in separate series (listed
> > in the "Related series").
> >
> > The high-level architecture for SVA virtualization is as below, the key
> > design of vSVA support is to utilize the dual-stage IOMMU translation (
> > also known as IOMMU nesting translation) capability in host IOMMU.
> >
> >
> > .-.  .---.
> > |   vIOMMU|  | Guest process CR3, FL only|
> > | |  '---'
> > ./
> > | PASID Entry |--- PASID cache flush -
> > '-'   |
> > | |   V
> > | |CR3 in GPA
> > '-'
> > Guest
> > --| Shadow |--|
> >   vv  v
> > Host
> > .-.  .--.
> > |   pIOMMU|  | Bind FL for GVA-GPA  |
> > | |  '--'
> > ./  |
> > | PASID Entry | V (Nested xlate)
> > '\.--.
> > | |   |SL for GPA-HPA, default domain|
> > | |   '--'
> > '-'
> > Where:
> >  - FL = First level/stage one page tables
> >  - SL = Second level/stage two page tables
> 
> Hi,
> Looks like an interesting feature!
> 
> To check I understand this feature: can applications now pass virtual
> addresses to devices instead of translating to IOVAs?
> 
> If yes, can guest applications restrict the vSVA address space so the
> device only has access to certain regions?
> 
> On one hand replacing IOVA translation with virtual addresses simplifies
> the application programming model, but does it give up isolation if the
> device can now access all application memory?
> 

with SVA each application is allocated with a unique PASID to tag its
virtual address space. The device that claims SVA support must guarantee 
that one application can only program the device to access its own virtual
address space (i.e. all DMAs triggered by this application are tagged with
the application's PASID, and are translated by IOMMU's PASID-granular
page table). So, isolation is not sacrificed in SVA.

Thanks
Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 04/12] docs: x86: Add documentation for SVA (Shared Virtual Addressing)

2020-06-15 Thread Fenghua Yu
Hi, Baolu,

On Sat, Jun 13, 2020 at 08:17:40PM +0800, Lu Baolu wrote:
> Hi Fenghua,
> 
> On 2020/6/13 8:41, Fenghua Yu wrote:
> >+implement implement fairness or ensure forward progress can be made.
> 
> Repeated "implement".

Will fix this.

> >+For example, the Intel Data Streaming Accelerator (DSA) uses
> >+intel_svm_bind_mm(), which will do the following.
> 
> The Intel SVM APIs have been deprecated. Drivers should use
> iommu_sva_bind_device() instead. Please also update other places in
> this document.

Will fix this.

Thanks.

-Fenghua

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Peter Zijlstra
On Fri, Jun 12, 2020 at 05:41:33PM -0700, Fenghua Yu wrote:
> @@ -447,6 +458,18 @@ dotraplinkage void do_general_protection(struct pt_regs 
> *regs, long error_code)
>   int ret;
>  
>   RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
> +
> + /*
> +  * Perform the check for a user mode PASID exception before enable
> +  * interrupts. Doing this here ensures that the PASID MSR can be simply
> +  * accessed because the contents are known to be still associated
> +  * with the current process.
> +  */
> + if (user_mode(regs) && fixup_pasid_exception()) {
> + cond_local_irq_enable(regs);
> + return;

OK, so we're done with the exception, lets enable interrupts?

> + }
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 00/12] x86: tag application address space for devices

2020-06-15 Thread Peter Zijlstra
On Fri, Jun 12, 2020 at 05:41:21PM -0700, Fenghua Yu wrote:

> This series only provides simple and basic support for ENQCMD and the MSR:
> 1. Clean up type definitions (patch 1-3). These patches can be in a
>separate series.
>- Define "pasid" as "unsigned int" consistently (patch 1 and 2).
>- Define "flags" as "unsigned int"
> 2. Explain different various technical terms used in the series (patch 4).
> 3. Enumerate support for ENQCMD in the processor (patch 5).
> 4. Handle FPU PASID state and the MSR during context switch (patches 6-7).
> 5. Define "pasid" in mm_struct (patch 8).
> 5. Clear PASID state for new mm and forked and cloned thread (patch 9-10).
> 6. Allocate and free PASID for a process (patch 11).
> 7. Fix up the PASID MSR in #GP handler when one thread in a process
>executes ENQCMD for the first time (patches 12).

If this is per mm, should not switch_mm() update the MSR ? I'm not
seeing that, nor do I see it explained why not.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Peter Zijlstra
On Fri, Jun 12, 2020 at 05:41:33PM -0700, Fenghua Yu wrote:
> +/*
> + * Apply some heuristics to see if the #GP fault was caused by a thread
> + * that hasn't had the IA32_PASID MSR initialized.  If it looks like that
> + * is the problem, try initializing the IA32_PASID MSR. If the heuristic
> + * guesses incorrectly, take one more #GP fault.

How is that going to help? Aren't we then going to run this same
heuristic again and again and again?

> + */
> +bool __fixup_pasid_exception(void)
> +{
> + u64 pasid_msr;
> + unsigned int pasid;
> +
> + /*
> +  * This function is called only when this #GP was triggered from user
> +  * space. So the mm cannot be NULL.
> +  */
> + pasid = current->mm->pasid;
> + /* If the mm doesn't have a valid PASID, then can't help. */
> + if (invalid_pasid(pasid))
> + return false;
> +
> + /*
> +  * Since IRQ is disabled now, the current task still owns the FPU on

That's just weird and confusing. What you want to say is that you rely
on the exception disabling the interrupt.

> +  * this CPU and the PASID MSR can be directly accessed.
> +  *
> +  * If the MSR has a valid PASID, the #GP must be for some other reason.
> +  *
> +  * If rdmsr() is really a performance issue, a TIF_ flag may be
> +  * added to check if the thread has a valid PASID instead of rdmsr().

I don't understand any of this. Nobody except us writes to this MSR, we
should bloody well know what's in it. What gives?

> +  */
> + rdmsrl(MSR_IA32_PASID, pasid_msr);
> + if (pasid_msr & MSR_IA32_PASID_VALID)
> + return false;
> +
> + /* Fix up the MSR if the MSR doesn't have a valid PASID. */
> + wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
> +
> + return true;
> +}
> -- 
> 2.19.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v2 00/15] vfio: expose virtual Shared Virtual Addressing to VMs

2020-06-15 Thread Liu, Yi L
> From: Stefan Hajnoczi 
> Sent: Monday, June 15, 2020 6:02 PM
> 
> On Thu, Jun 11, 2020 at 05:15:19AM -0700, Liu Yi L wrote:
> > Shared Virtual Addressing (SVA), a.k.a, Shared Virtual Memory (SVM) on
> > Intel platforms allows address space sharing between device DMA and
> > applications. SVA can reduce programming complexity and enhance security.
> >
> > This VFIO series is intended to expose SVA usage to VMs. i.e. Sharing
> > guest application address space with passthru devices. This is called
> > vSVA in this series. The whole vSVA enabling requires QEMU/VFIO/IOMMU
> > changes. For IOMMU and QEMU changes, they are in separate series (listed
> > in the "Related series").
> >
> > The high-level architecture for SVA virtualization is as below, the key
> > design of vSVA support is to utilize the dual-stage IOMMU translation (
> > also known as IOMMU nesting translation) capability in host IOMMU.
> >
> >
> > .-.  .---.
> > |   vIOMMU|  | Guest process CR3, FL only|
> > | |  '---'
> > ./
> > | PASID Entry |--- PASID cache flush -
> > '-'   |
> > | |   V
> > | |CR3 in GPA
> > '-'
> > Guest
> > --| Shadow |--|
> >   vv  v
> > Host
> > .-.  .--.
> > |   pIOMMU|  | Bind FL for GVA-GPA  |
> > | |  '--'
> > ./  |
> > | PASID Entry | V (Nested xlate)
> > '\.--.
> > | |   |SL for GPA-HPA, default domain|
> > | |   '--'
> > '-'
> > Where:
> >  - FL = First level/stage one page tables
> >  - SL = Second level/stage two page tables
> 
> Hi,
> Looks like an interesting feature!

thanks for the interest. Stefan :-)

> To check I understand this feature: can applications now pass virtual
> addresses to devices instead of translating to IOVAs?

yes, application could pass virtual addresses to device directly. As
long as the virtual address is mapped in cpu page table, then IOMMU
would get it translated to physical address.

> If yes, can guest applications restrict the vSVA address space so the
> device only has access to certain regions?

do you mean restrict the access of certain virtual address regions of
guest application ? or certain guest memory? :-)

> On one hand replacing IOVA translation with virtual addresses simplifies
> the application programming model, but does it give up isolation if the
> device can now access all application memory?

yeah, you are right, SVA simplifies application programming model. And
today, we do allow access all application memory by SVA. this is also
another benefit of SVA. e.g. say an accelerator gets a copy of data from
a buffer written by cpu. If there is some other data which is directed
by a pointer (a virtual address) within the data got from memory, accelerator
could do another DMA to fetch it without cpu's involvement.

Regards,
Yi Liu

> Thanks,
> Stefan
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[Issue]platform/x86: iommu: System can't shutdown because iommu driver keeps checking the status of DMA_GSTS_TES

2020-06-15 Thread Koba Ko
hi All,
I have a machine and there's only intel gpu.
the secureboot and vt-d is enabled in BIOS.
On the Ubuntu desktop, I do s2idle first and restart the machine.
The machine can't restart successfully, so I need to press the power button
to shutdown.
I tried  each of the following and the issue can't be triggered.
1. disable secure boot in BIOS.
2. intel_iommu=off.
3. intel_iomm=igfx_off.
4. nomodeset
5. i915.modeset=0.

After I investigate further, find inte_iommu keeps checking the status of
DMA_GSTS_TES.
During the procedure of restart, the driver would disable iommu translation
and
check status of DMA_GSTS_TES until status of DMA_GSTS_TES is 0.

If you need more information, I can provide it.

Thanks
*Koba Ko*
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 14/15] vfio: Document dual stage control

2020-06-15 Thread Stefan Hajnoczi
On Thu, Jun 11, 2020 at 05:15:33AM -0700, Liu Yi L wrote:
> From: Eric Auger 
> 
> The VFIO API was enhanced to support nested stage control: a bunch of
> new iotcls and usage guideline.
> 
> Let's document the process to follow to set up nested mode.
> 
> Cc: Kevin Tian 
> CC: Jacob Pan 
> Cc: Alex Williamson 
> Cc: Eric Auger 
> Cc: Jean-Philippe Brucker 
> Cc: Joerg Roedel 
> Cc: Lu Baolu 
> Signed-off-by: Eric Auger 
> Signed-off-by: Liu Yi L 
> ---
> v1 -> v2:
> *) new in v2, compared with Eric's original version, pasid table bind
>and fault reporting is removed as this series doesn't cover them.
>Original version from Eric.
>https://lkml.org/lkml/2020/3/20/700
> 
>  Documentation/driver-api/vfio.rst | 64 
> +++
>  1 file changed, 64 insertions(+)
> 
> diff --git a/Documentation/driver-api/vfio.rst 
> b/Documentation/driver-api/vfio.rst
> index f1a4d3c..06224bd 100644
> --- a/Documentation/driver-api/vfio.rst
> +++ b/Documentation/driver-api/vfio.rst
> @@ -239,6 +239,70 @@ group and can access them as follows::
>   /* Gratuitous device reset and go... */
>   ioctl(device, VFIO_DEVICE_RESET);
>  
> +IOMMU Dual Stage Control
> +
> +
> +Some IOMMUs support 2 stages/levels of translation. Stage corresponds to
> +the ARM terminology while level corresponds to Intel's VTD terminology.
> +In the following text we use either without distinction.
> +
> +This is useful when the guest is exposed with a virtual IOMMU and some
> +devices are assigned to the guest through VFIO. Then the guest OS can use
> +stage 1 (GIOVA -> GPA or GVA->GPA), while the hypervisor uses stage 2 for
> +VM isolation (GPA -> HPA).
> +
> +Under dual stage translation, the guest gets ownership of the stage 1 page
> +tables and also owns stage 1 configuration structures. The hypervisor owns
> +the root configuration structure (for security reason), including stage 2
> +configuration. This works as long configuration structures and page table

s/as long configuration/as long as configuration/

> +format are compatible between the virtual IOMMU and the physical IOMMU.

s/format/formats/

> +
> +Assuming the HW supports it, this nested mode is selected by choosing the
> +VFIO_TYPE1_NESTING_IOMMU type through:
> +
> +ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_NESTING_IOMMU);
> +
> +This forces the hypervisor to use the stage 2, leaving stage 1 available
> +for guest usage. The guest stage 1 format depends on IOMMU vendor, and
> +it is the same with the nesting configuration method. User space should
> +check the format and configuration method after setting nesting type by
> +using:
> +
> +ioctl(container->fd, VFIO_IOMMU_GET_INFO, _info);
> +
> +Details can be found in Documentation/userspace-api/iommu.rst. For Intel
> +VT-d, each stage 1 page table is bound to host by:
> +
> +nesting_op->flags = VFIO_IOMMU_NESTING_OP_BIND_PGTBL;
> +memcpy(_op->data, _data, sizeof(bind_data));
> +ioctl(container->fd, VFIO_IOMMU_NESTING_OP, nesting_op);
> +
> +As mentioned above, guest OS may use stage 1 for GIOVA->GPA or GVA->GPA.
> +GVA->GPA page tables are available when PASID (Process Address Space ID)
> +is exposed to guest. e.g. guest with PASID-capable devices assigned. For
> +such page table binding, the bind_data should include PASID info, which
> +is allocated by guest itself or by host. This depends on hardware vendor
> +e.g. Intel VT-d requires to allocate PASID from host. This requirement is
> +available by VFIO_IOMMU_GET_INFO. User space could allocate PASID from
> +host by:
> +
> +req.flags = VFIO_IOMMU_ALLOC_PASID;
> +ioctl(container, VFIO_IOMMU_PASID_REQUEST, );

It is not clear how the userspace application determines whether PASIDs
must be allocated from the host via VFIO_IOMMU_PASID_REQUEST or if the
guest itself can allocate PASIDs. The text mentions VFIO_IOMMU_GET_INFO
but what exactly should the userspace application check?


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[Issue]platform/x86: iommu: System can't shutdown because iommu driver keeps checking the status of DMA_GSTS_TES

2020-06-15 Thread Koba Ko
hi All,
I have a machine and there's only intel gpu.
the secureboot and vt-d is enabled in BIOS.
On the Ubuntu desktop, I do s2idle first and restart the machine.
The machine can't restart successfully, so I need to press the power
button to shutdown.
I tried  each of the following and the issue can't be triggered.
1. disable secure boot in BIOS.
2. intel_iommu=off.
3. intel_iomm=igfx_off.
4. nomodeset
5. i915.modeset=0.

After I investigate further, find inte_iommu keeps checking the status
of DMA_GSTS_TES.
During the procedure of restart, the driver would disable iommu translation and
check the status of DMA_GSTS_TES until status of DMA_GSTS_TES is 0.

If you need more information, I can provide it.

Thanks
Koba Ko
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 00/15] vfio: expose virtual Shared Virtual Addressing to VMs

2020-06-15 Thread Stefan Hajnoczi
On Thu, Jun 11, 2020 at 05:15:19AM -0700, Liu Yi L wrote:
> Shared Virtual Addressing (SVA), a.k.a, Shared Virtual Memory (SVM) on
> Intel platforms allows address space sharing between device DMA and
> applications. SVA can reduce programming complexity and enhance security.
> 
> This VFIO series is intended to expose SVA usage to VMs. i.e. Sharing
> guest application address space with passthru devices. This is called
> vSVA in this series. The whole vSVA enabling requires QEMU/VFIO/IOMMU
> changes. For IOMMU and QEMU changes, they are in separate series (listed
> in the "Related series").
> 
> The high-level architecture for SVA virtualization is as below, the key
> design of vSVA support is to utilize the dual-stage IOMMU translation (
> also known as IOMMU nesting translation) capability in host IOMMU.
> 
> 
> .-.  .---.
> |   vIOMMU|  | Guest process CR3, FL only|
> | |  '---'
> ./
> | PASID Entry |--- PASID cache flush -
> '-'   |
> | |   V
> | |CR3 in GPA
> '-'
> Guest
> --| Shadow |--|
>   vv  v
> Host
> .-.  .--.
> |   pIOMMU|  | Bind FL for GVA-GPA  |
> | |  '--'
> ./  |
> | PASID Entry | V (Nested xlate)
> '\.--.
> | |   |SL for GPA-HPA, default domain|
> | |   '--'
> '-'
> Where:
>  - FL = First level/stage one page tables
>  - SL = Second level/stage two page tables

Hi,
Looks like an interesting feature!

To check I understand this feature: can applications now pass virtual
addresses to devices instead of translating to IOVAs?

If yes, can guest applications restrict the vSVA address space so the
device only has access to certain regions?

On one hand replacing IOVA translation with virtual addresses simplifies
the application programming model, but does it give up isolation if the
device can now access all application memory?

Thanks,
Stefan


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH] uacce: remove uacce_vma_fault

2020-06-15 Thread Zhangfei Gao
Fix NULL pointer error if removing uacce's parent module during app's
running. SIGBUS is already reported by do_page_fault, so uacce_vma_fault
is not needed. If providing vma_fault, vmf->page has to be filled as well,
required by __do_fault.

Reported-by: Jean-Philippe Brucker 
Signed-off-by: Zhangfei Gao 
---
 drivers/misc/uacce/uacce.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
index 107028e..aa91f69 100644
--- a/drivers/misc/uacce/uacce.c
+++ b/drivers/misc/uacce/uacce.c
@@ -179,14 +179,6 @@ static int uacce_fops_release(struct inode *inode, struct 
file *filep)
return 0;
 }
 
-static vm_fault_t uacce_vma_fault(struct vm_fault *vmf)
-{
-   if (vmf->flags & (FAULT_FLAG_MKWRITE | FAULT_FLAG_WRITE))
-   return VM_FAULT_SIGBUS;
-
-   return 0;
-}
-
 static void uacce_vma_close(struct vm_area_struct *vma)
 {
struct uacce_queue *q = vma->vm_private_data;
@@ -199,7 +191,6 @@ static void uacce_vma_close(struct vm_area_struct *vma)
 }
 
 static const struct vm_operations_struct uacce_vm_ops = {
-   .fault = uacce_vma_fault,
.close = uacce_vma_close,
 };
 
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Peter Zijlstra
On Mon, Jun 15, 2020 at 08:48:54AM -0700, Fenghua Yu wrote:
> Hi, Peter,
> On Mon, Jun 15, 2020 at 09:56:49AM +0200, Peter Zijlstra wrote:
> > On Fri, Jun 12, 2020 at 05:41:33PM -0700, Fenghua Yu wrote:
> > > +/*
> > > + * Apply some heuristics to see if the #GP fault was caused by a thread
> > > + * that hasn't had the IA32_PASID MSR initialized.  If it looks like that
> > > + * is the problem, try initializing the IA32_PASID MSR. If the heuristic
> > > + * guesses incorrectly, take one more #GP fault.
> > 
> > How is that going to help? Aren't we then going to run this same
> > heuristic again and again and again?
> 
> The heuristic always initializes the MSR with the per mm PASID IIF the
> mm has a valid PASID but the MSR doesn't have one. This heuristic usually
> happens only once on the first #GP in a thread.

But it doesn't guarantee the PASID is the right one. Suppose both the mm
has a PASID and the MSR has a VALID one, but the MSR isn't the mm one.
Then we NO-OP. So if the exception was due to us having the wrong PASID,
we stuck.

> If the next #GP still comes in, the heuristic finds out the MSR already
> has a valid PASID and thus will not fixup the MSR any more. The fixup()
> returns "false" and lets others to handle the #GP.
> 
> So the heuristic will be executed once (at most) and won't be executed
> again and again.

So I get that you want to set the PASID on-demand, but I find this all
really weird code to make that happen.

> > > +bool __fixup_pasid_exception(void)
> > > +{
> > > + u64 pasid_msr;
> > > + unsigned int pasid;
> > > +
> > > + /*
> > > +  * This function is called only when this #GP was triggered from user
> > > +  * space. So the mm cannot be NULL.
> > > +  */
> > > + pasid = current->mm->pasid;
> > > + /* If the mm doesn't have a valid PASID, then can't help. */
> > > + if (invalid_pasid(pasid))
> > > + return false;
> > > +
> > > + /*
> > > +  * Since IRQ is disabled now, the current task still owns the FPU on
> > 
> > That's just weird and confusing. What you want to say is that you rely
> > on the exception disabling the interrupt.
> 
> I checked SDM again. You are right. #GP can be interrupted by machine check
> or other interrupts. So I cannot assume the current task still owns the FPU.
> Instead of directly rdmsr() and wrmsr(), I will add helpers that can access
> either the MSR on the processor or the PASID state in the memory.

That's not in fact what I meant, but yes, you can take exceptions while
!IF just fine.

> > > +  * this CPU and the PASID MSR can be directly accessed.
> > > +  *
> > > +  * If the MSR has a valid PASID, the #GP must be for some other reason.
> > > +  *
> > > +  * If rdmsr() is really a performance issue, a TIF_ flag may be
> > > +  * added to check if the thread has a valid PASID instead of rdmsr().
> > 
> > I don't understand any of this. Nobody except us writes to this MSR, we
> > should bloody well know what's in it. What gives?
> 
> Patch 4 describes how to manage the MSR and patch 7 describes the format
> of the MSR (20-bit PASID value and bit 31 is valid bit).
> 
> Are they sufficient to help? Or do you mean something else?

I don't get why you need a rdmsr here, or why not having one would
require a TIF flag. Is that because this MSR is XSAVE/XRSTOR managed?

> > > +  */
> > > + rdmsrl(MSR_IA32_PASID, pasid_msr);
> > > + if (pasid_msr & MSR_IA32_PASID_VALID)
> > > + return false;
> > > +
> > > + /* Fix up the MSR if the MSR doesn't have a valid PASID. */
> > > + wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);

How much more expensive is the wrmsr over the rdmsr? Can't we just
unconditionally write the current PASID and call it a day?
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2] iommu/tegra-smmu: Add missing locks around mapping operations

2020-06-15 Thread Dmitry Osipenko
25.05.2020 22:54, Dmitry Osipenko пишет:
> The mapping operations of the Tegra SMMU driver are subjected to a race
> condition issues because SMMU Address Space isn't allocated and freed
> atomically, while it should be. This patch makes the mapping operations
> atomic, it fixes an accidentally released Host1x Address Space problem
> which happens while running multiple graphics tests in parallel on
> Tegra30, i.e. by having multiple threads racing with each other in the
> Host1x's submission and completion code paths, performing IOVA mappings
> and unmappings in parallel.
> 
> Cc: 
> Signed-off-by: Dmitry Osipenko 
> ---
> 
> Changelog:
> 
> v2: - Now using mutex instead of spinlock.
> 
> - The _locked postfix is replaced with the underscores prefix.
> 

Hello Thierry and Joerg!

Guys, are you okay with the v2 variant? Will be great if we could fix
the issue ASAP since it's quite unpleasant. Thanks in advance!
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 2/2] iommu/amd: Move Kconfig and Makefile bits down into amd directory

2020-06-15 Thread Suravee Suthikulpanit

Reviewed-by: Suravee Suthikulpanit 

Thanks,
Suravee

On 6/13/20 6:11 AM, Jerry Snitselaar wrote:

Move AMD Kconfig and Makefile bits down into the amd directory
with the rest of the AMD specific files.

Cc: Joerg Roedel 
Cc: Suravee Suthikulpanit 
Signed-off-by: Jerry Snitselaar 
---
  drivers/iommu/Kconfig  | 45 +-
  drivers/iommu/Makefile |  5 +
  drivers/iommu/amd/Kconfig  | 44 +
  drivers/iommu/amd/Makefile |  4 
  4 files changed, 50 insertions(+), 48 deletions(-)
  create mode 100644 drivers/iommu/amd/Kconfig
  create mode 100644 drivers/iommu/amd/Makefile

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index b12d4ec124f6..78a8be0053b3 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -132,50 +132,7 @@ config IOMMU_PGTABLES_L2
def_bool y
depends on MSM_IOMMU && MMU && SMP && CPU_DCACHE_DISABLE=n
  
-# AMD IOMMU support

-config AMD_IOMMU
-   bool "AMD IOMMU support"
-   select SWIOTLB
-   select PCI_MSI
-   select PCI_ATS
-   select PCI_PRI
-   select PCI_PASID
-   select IOMMU_API
-   select IOMMU_IOVA
-   select IOMMU_DMA
-   depends on X86_64 && PCI && ACPI
-   ---help---
- With this option you can enable support for AMD IOMMU hardware in
- your system. An IOMMU is a hardware component which provides
- remapping of DMA memory accesses from devices. With an AMD IOMMU you
- can isolate the DMA memory of different devices and protect the
- system from misbehaving device drivers or hardware.
-
- You can find out if your system has an AMD IOMMU if you look into
- your BIOS for an option to enable it or if you have an IVRS ACPI
- table.
-
-config AMD_IOMMU_V2
-   tristate "AMD IOMMU Version 2 driver"
-   depends on AMD_IOMMU
-   select MMU_NOTIFIER
-   ---help---
- This option enables support for the AMD IOMMUv2 features of the IOMMU
- hardware. Select this option if you want to use devices that support
- the PCI PRI and PASID interface.
-
-config AMD_IOMMU_DEBUGFS
-   bool "Enable AMD IOMMU internals in DebugFS"
-   depends on AMD_IOMMU && IOMMU_DEBUGFS
-   ---help---
- !!!WARNING!!!  !!!WARNING!!!  !!!WARNING!!!  !!!WARNING!!!
-
- DO NOT ENABLE THIS OPTION UNLESS YOU REALLY, -REALLY- KNOW WHAT YOU 
ARE DOING!!!
- Exposes AMD IOMMU device internals in DebugFS.
-
- This option is -NOT- intended for production environments, and should
- not generally be enabled.
-
+source "drivers/iommu/amd/Kconfig"
  source "drivers/iommu/intel/Kconfig"
  
  config IRQ_REMAP

diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 71dd2f382e78..f356bc12b1c7 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -1,5 +1,5 @@
  # SPDX-License-Identifier: GPL-2.0
-obj-y += intel/
+obj-y += amd/ intel/
  obj-$(CONFIG_IOMMU_API) += iommu.o
  obj-$(CONFIG_IOMMU_API) += iommu-traces.o
  obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
@@ -12,9 +12,6 @@ obj-$(CONFIG_IOASID) += ioasid.o
  obj-$(CONFIG_IOMMU_IOVA) += iova.o
  obj-$(CONFIG_OF_IOMMU)+= of_iommu.o
  obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o
-obj-$(CONFIG_AMD_IOMMU) += amd/iommu.o amd/init.o amd/quirks.o
-obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd/debugfs.o
-obj-$(CONFIG_AMD_IOMMU_V2) += amd/iommu_v2.o
  obj-$(CONFIG_ARM_SMMU) += arm_smmu.o
  arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o
  obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
diff --git a/drivers/iommu/amd/Kconfig b/drivers/iommu/amd/Kconfig
new file mode 100644
index ..1f061d91e0b8
--- /dev/null
+++ b/drivers/iommu/amd/Kconfig
@@ -0,0 +1,44 @@
+# SPDX-License-Identifier: GPL-2.0-only
+# AMD IOMMU support
+config AMD_IOMMU
+   bool "AMD IOMMU support"
+   select SWIOTLB
+   select PCI_MSI
+   select PCI_ATS
+   select PCI_PRI
+   select PCI_PASID
+   select IOMMU_API
+   select IOMMU_IOVA
+   select IOMMU_DMA
+   depends on X86_64 && PCI && ACPI
+   help
+ With this option you can enable support for AMD IOMMU hardware in
+ your system. An IOMMU is a hardware component which provides
+ remapping of DMA memory accesses from devices. With an AMD IOMMU you
+ can isolate the DMA memory of different devices and protect the
+ system from misbehaving device drivers or hardware.
+
+ You can find out if your system has an AMD IOMMU if you look into
+ your BIOS for an option to enable it or if you have an IVRS ACPI
+ table.
+
+config AMD_IOMMU_V2
+   tristate "AMD IOMMU Version 2 driver"
+   depends on AMD_IOMMU
+   select MMU_NOTIFIER
+   help
+ This option enables support for the AMD IOMMUv2 features of the IOMMU
+ hardware. Select this option if you want to use devices that support
+ the PCI PRI and 

Re: [PATCH v2 00/12] x86: tag application address space for devices

2020-06-15 Thread Fenghua Yu
Hi, Peter,
On Mon, Jun 15, 2020 at 09:52:02AM +0200, Peter Zijlstra wrote:
> On Fri, Jun 12, 2020 at 05:41:21PM -0700, Fenghua Yu wrote:
> 
> > This series only provides simple and basic support for ENQCMD and the MSR:
> > 1. Clean up type definitions (patch 1-3). These patches can be in a
> >separate series.
> >- Define "pasid" as "unsigned int" consistently (patch 1 and 2).
> >- Define "flags" as "unsigned int"
> > 2. Explain different various technical terms used in the series (patch 4).
> > 3. Enumerate support for ENQCMD in the processor (patch 5).
> > 4. Handle FPU PASID state and the MSR during context switch (patches 6-7).
> > 5. Define "pasid" in mm_struct (patch 8).
> > 5. Clear PASID state for new mm and forked and cloned thread (patch 9-10).
> > 6. Allocate and free PASID for a process (patch 11).
> > 7. Fix up the PASID MSR in #GP handler when one thread in a process
> >executes ENQCMD for the first time (patches 12).
> 
> If this is per mm, should not switch_mm() update the MSR ? I'm not
> seeing that, nor do I see it explained why not.

PASID value is per mm and all threads in a process have the same PASID
value in the MSR. However, the MSR is per thread and is context switched
by XSAVES/XRSTROS in patches 6-7.

Thanks.

-Fenghua
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Fenghua Yu
Hi, Peter,
On Mon, Jun 15, 2020 at 09:56:49AM +0200, Peter Zijlstra wrote:
> On Fri, Jun 12, 2020 at 05:41:33PM -0700, Fenghua Yu wrote:
> > +/*
> > + * Apply some heuristics to see if the #GP fault was caused by a thread
> > + * that hasn't had the IA32_PASID MSR initialized.  If it looks like that
> > + * is the problem, try initializing the IA32_PASID MSR. If the heuristic
> > + * guesses incorrectly, take one more #GP fault.
> 
> How is that going to help? Aren't we then going to run this same
> heuristic again and again and again?

The heuristic always initializes the MSR with the per mm PASID IIF the
mm has a valid PASID but the MSR doesn't have one. This heuristic usually
happens only once on the first #GP in a thread.

If the next #GP still comes in, the heuristic finds out the MSR already
has a valid PASID and thus will not fixup the MSR any more. The fixup()
returns "false" and lets others to handle the #GP.

So the heuristic will be executed once (at most) and won't be executed
again and again.

> 
> > + */
> > +bool __fixup_pasid_exception(void)
> > +{
> > +   u64 pasid_msr;
> > +   unsigned int pasid;
> > +
> > +   /*
> > +* This function is called only when this #GP was triggered from user
> > +* space. So the mm cannot be NULL.
> > +*/
> > +   pasid = current->mm->pasid;
> > +   /* If the mm doesn't have a valid PASID, then can't help. */
> > +   if (invalid_pasid(pasid))
> > +   return false;
> > +
> > +   /*
> > +* Since IRQ is disabled now, the current task still owns the FPU on
> 
> That's just weird and confusing. What you want to say is that you rely
> on the exception disabling the interrupt.

I checked SDM again. You are right. #GP can be interrupted by machine check
or other interrupts. So I cannot assume the current task still owns the FPU.
Instead of directly rdmsr() and wrmsr(), I will add helpers that can access
either the MSR on the processor or the PASID state in the memory.

> 
> > +* this CPU and the PASID MSR can be directly accessed.
> > +*
> > +* If the MSR has a valid PASID, the #GP must be for some other reason.
> > +*
> > +* If rdmsr() is really a performance issue, a TIF_ flag may be
> > +* added to check if the thread has a valid PASID instead of rdmsr().
> 
> I don't understand any of this. Nobody except us writes to this MSR, we
> should bloody well know what's in it. What gives?

Patch 4 describes how to manage the MSR and patch 7 describes the format
of the MSR (20-bit PASID value and bit 31 is valid bit).

Are they sufficient to help? Or do you mean something else?

> > +*/
> > +   rdmsrl(MSR_IA32_PASID, pasid_msr);
> > +   if (pasid_msr & MSR_IA32_PASID_VALID)
> > +   return false;
> > +
> > +   /* Fix up the MSR if the MSR doesn't have a valid PASID. */
> > +   wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
> > +
> > +   return true;
> > +}
> > -- 
> > 2.19.1
> > 

Thanks.

-Fenghua
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Luck, Tony
>> The heuristic always initializes the MSR with the per mm PASID IIF the
>> mm has a valid PASID but the MSR doesn't have one. This heuristic usually
>> happens only once on the first #GP in a thread.
>
> But it doesn't guarantee the PASID is the right one. Suppose both the mm
> has a PASID and the MSR has a VALID one, but the MSR isn't the mm one.
> Then we NO-OP. So if the exception was due to us having the wrong PASID,
> we stuck.

ENQCMD only checks the 'valid' bit of the IA32_PASID MSR to decide whether
to #GP or not.  H/W has no concept of the "right" pasid value.

If IA32_PASID is valid with the wrong value ... then the system is about to
see some major corruption because the operations in the accelerator are
not going to translate to the physical addresses for pages owned by the process
that issued the ENQCMD.

-Tony
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu