Re: [v11, 5/8] soc: fsl: add GUTS driver for QorIQ platforms

2016-09-12 Thread Scott Wood
On Mon, 2016-09-12 at 06:39 +, Y.B. Lu wrote:
> Hi Scott,
> 
> Thanks for your review :)
> See my comment inline.
> 
> > 
> > -Original Message-
> > From: Scott Wood [mailto:o...@buserror.net]
> > Sent: Friday, September 09, 2016 11:47 AM
> > To: Y.B. Lu; linux-...@vger.kernel.org; ulf.hans...@linaro.org; Arnd
> > Bergmann
> > Cc: linuxppc-...@lists.ozlabs.org; devicet...@vger.kernel.org; linux-arm-
> > ker...@lists.infradead.org; linux-ker...@vger.kernel.org; linux-
> > c...@vger.kernel.org; linux-...@vger.kernel.org; iommu@lists.linux-
> > foundation.org; net...@vger.kernel.org; Mark Rutland; Rob Herring;
> > Russell King; Jochen Friedrich; Joerg Roedel; Claudiu Manoil; Bhupesh
> > Sharma; Qiang Zhao; Kumar Gala; Santosh Shilimkar; Leo Li; X.B. Xie
> > Subject: Re: [v11, 5/8] soc: fsl: add GUTS driver for QorIQ platforms
> > 
> > On Tue, 2016-09-06 at 16:28 +0800, Yangbo Lu wrote:
> > > 
> > > The global utilities block controls power management, I/O device
> > > enabling, power-onreset(POR) configuration monitoring, alternate
> > > function selection for multiplexed signals,and clock control.
> > > 
> > > This patch adds a driver to manage and access global utilities block.
> > > Initially only reading SVR and registering soc device are supported.
> > > Other guts accesses, such as reading RCW, should eventually be moved
> > > into this driver as well.
> > > 
> > > Signed-off-by: Yangbo Lu 
> > > Signed-off-by: Scott Wood 
> > Don't put my signoff on patches that I didn't put it on
> > myself.  Definitely don't put mine *after* yours on patches that were
> > last modified by you.
> > 
> > If you want to mention that the soc_id encoding was my suggestion, then
> > do so explicitly.
> > 
> [Lu Yangbo-B47093] I found your 'signoff' on this patch at below link.
> http://patchwork.ozlabs.org/patch/649211/
> 
> So, let me just change the order in next version ?
> Signed-off-by: Scott Wood 
> Signed-off-by: Yangbo Lu 

No.  This isn't my patch so my signoff shouldn't be on it.

> [Lu Yangbo-B47093] It's a good idea to move die into .family I think.
> In my opinion, it's better to keep svr and name in soc_id just like your
> suggestion above.
> > 
> > {
> > .soc_id = "svr:0x85490010,name:T1023E,",
> > .family = "QorIQ T1024",
> > }
> The user probably don’t like to learn the svr value. What they want is just
> to match the soc they use.
> It's convenient to use name+rev for them to match a soc.

What the user should want 99% of the time is to match the die (plus revision),
not the soc.

> Regarding shrinking the table, I think it's hard to use svr+mask. Because I
> find many platforms use different masks.
> We couldn’t know the mask according svr value.

The mask would be part of the table:

{
{
.die = "T1024",
.svr = 0x8540,
.mask = 0xfff0,
},
{
.die = "T1040",
.svr = 0x8520,
.mask = 0xfff0,
},
{
.die = "LS1088A",
.svr = 0x8703,
.mask = 0x,
},
...
}

There's a small risk that we get the mask wrong and a different die is created
that matches an existing table, but it doesn't seem too likely, and can easily
be fixed with a kernel update if it happens.

BTW, aren't ls2080a and ls2085a the same die?  And is there no non-E version
of LS2080A/LS2040A?

> > > + do {
> > > + if (!matches->soc_id)
> > > + return NULL;
> > > + if (glob_match(svr_match, matches->soc_id))
> > > + break;
> > > + } while (matches++);
> > Are you expecting "matches++" to ever evaluate as false?
> [Lu Yangbo-B47093] Yes, this is used to match the soc we use in qoriq_soc
> array until getting true. 
> We need to get the name and die information defined in array.

I'm not asking whether the glob_match will ever return true.  I'm saying that
"matches++" will never become NULL.

> > > + /* Register soc device */
> > > + soc_dev_attr = kzalloc(sizeof(*soc_dev_attr), GFP_KERNEL);
> > > + if (!soc_dev_attr) {
> > > + ret = -ENOMEM;
> > > + goto out_unmap;
> > > + }
> > Couldn't this be statically allocated?
> [Lu Yangbo-B47093] Do you mean we define this struct statically ?
> 
> static struct soc_device_attribute soc_dev_attr;

Yes.

> > > +
> > > + soc_dev = soc_device_register(soc_dev_attr);
> > > + if (IS_ERR(soc_dev)) {
> > > + ret = -ENODEV;
> > Why are you changing the error code?
> [Lu Yangbo-B47093] What error code should we use ? :)

ret = PTR_ERR(soc_dev);

+   }
> > > + return 0;
> > > +out:
> > > + kfree(soc_dev_attr->machine);
> > > + kfree(soc_dev_attr->family);
> > > + kfree(soc_dev_attr->soc_id);
> > > + kfree(soc_dev_attr->revision);
> > > + kfree(soc_dev_attr);
> > > +out_unmap:
> > > + iounmap(guts->regs);
> > > 

Re: [RFC PATCH v2 20/20] x86: Add support to make use of Secure Memory Encryption

2016-09-12 Thread Borislav Petkov
On Mon, Aug 22, 2016 at 05:39:08PM -0500, Tom Lendacky wrote:
> This patch adds the support to check if SME has been enabled and if the
> mem_encrypt=on command line option is set. If both of these conditions
> are true, then the encryption mask is set and the kernel is encrypted
> "in place."
> 
> Signed-off-by: Tom Lendacky 
> ---
>  Documentation/kernel-parameters.txt |3 
>  arch/x86/kernel/asm-offsets.c   |2 
>  arch/x86/kernel/mem_encrypt.S   |  302 
> +++
>  arch/x86/mm/mem_encrypt.c   |2 
>  4 files changed, 309 insertions(+)
> 
> diff --git a/Documentation/kernel-parameters.txt 
> b/Documentation/kernel-parameters.txt
> index 46c030a..a1986c8 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2268,6 +2268,9 @@ bytes respectively. Such letter suffixes can also be 
> entirely omitted.
>   memory contents and reserves bad memory
>   regions that are detected.
>  
> + mem_encrypt=on  [X86_64] Enable memory encryption on processors
> + that support this feature.
> +
>   meye.*= [HW] Set MotionEye Camera parameters
>   See Documentation/video4linux/meye.txt.
>  
> diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> index 2bd5c6f..e485ada 100644
> --- a/arch/x86/kernel/asm-offsets.c
> +++ b/arch/x86/kernel/asm-offsets.c
> @@ -85,6 +85,8 @@ void common(void) {
>   OFFSET(BP_init_size, boot_params, hdr.init_size);
>   OFFSET(BP_pref_address, boot_params, hdr.pref_address);
>   OFFSET(BP_code32_start, boot_params, hdr.code32_start);
> + OFFSET(BP_cmd_line_ptr, boot_params, hdr.cmd_line_ptr);
> + OFFSET(BP_ext_cmd_line_ptr, boot_params, ext_cmd_line_ptr);
>  
>   BLANK();
>   DEFINE(PTREGS_SIZE, sizeof(struct pt_regs));
> diff --git a/arch/x86/kernel/mem_encrypt.S b/arch/x86/kernel/mem_encrypt.S
> index f2e0536..bf9f6a9 100644
> --- a/arch/x86/kernel/mem_encrypt.S
> +++ b/arch/x86/kernel/mem_encrypt.S
> @@ -12,13 +12,230 @@
>  
>  #include 
>  
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
>   .text
>   .code64
>  ENTRY(sme_enable)
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> + /* Check for AMD processor */
> + xorl%eax, %eax
> + cpuid
> + cmpl$0x68747541, %ebx   # AuthenticAMD
> + jne .Lmem_encrypt_exit
> + cmpl$0x69746e65, %edx
> + jne .Lmem_encrypt_exit
> + cmpl$0x444d4163, %ecx
> + jne .Lmem_encrypt_exit
> +
> + /* Check for memory encryption leaf */
> + movl$0x8000, %eax
> + cpuid
> + cmpl$0x801f, %eax
> + jb  .Lmem_encrypt_exit
> +
> + /*
> +  * Check for memory encryption feature:
> +  *   CPUID Fn8000_001F[EAX] - Bit 0
> +  * Secure Memory Encryption support
> +  *   CPUID Fn8000_001F[EBX] - Bits 5:0
> +  * Pagetable bit position used to indicate encryption
> +  *   CPUID Fn8000_001F[EBX] - Bits 11:6
> +  * Reduction in physical address space (in bits) when enabled
> +  */
> + movl$0x801f, %eax
> + cpuid
> + bt  $0, %eax
> + jnc .Lmem_encrypt_exit
> +
> + /* Check if BIOS/UEFI has allowed memory encryption */
> + movl$MSR_K8_SYSCFG, %ecx
> + rdmsr
> + bt  $MSR_K8_SYSCFG_MEM_ENCRYPT_BIT, %eax
> + jnc .Lmem_encrypt_exit

Like other people suggested, it would be great if this were in C. Should be
actually readable :)

> +
> + /* Check for the mem_encrypt=on command line option */
> + push%rsi/* Save RSI (real_mode_data) */
> + push%rbx/* Save CPUID information */
> + movlBP_ext_cmd_line_ptr(%rsi), %ecx
> + shlq$32, %rcx
> + movlBP_cmd_line_ptr(%rsi), %edi
> + addq%rcx, %rdi
> + leaqmem_encrypt_enable_option(%rip), %rsi
> + callcmdline_find_option_bool
> + pop %rbx/* Restore CPUID information */
> + pop %rsi/* Restore RSI (real_mode_data) */
> + testl   %eax, %eax
> + jz  .Lno_mem_encrypt

This too.

> +
> + /* Set memory encryption mask */
> + movl%ebx, %ecx
> + andl$0x3f, %ecx
> + bts %ecx, sme_me_mask(%rip)
> +
> +.Lno_mem_encrypt:
> + /*
> +  * BIOS/UEFI has allowed memory encryption so we need to set
> +  * the amount of physical address space reduction even if
> +  * the user decides not to use memory encryption.
> +  */
> + movl%ebx, %ecx
> + shrl$6, %ecx
> + andl$0x3f, %ecx
> + movb%cl, sme_me_loss(%rip)
> +
> +.Lmem_encrypt_exit:
> +#endif   /* CONFIG_AMD_MEM_ENCRYPT */
> +
>   ret
>  ENDPROC(sme_enable)
>  
>  ENTRY(sme_encrypt_kernel)

This should be doable too but I guess you'll have to try it to see.

...

> diff 

Re: [RFC PATCH v2 19/20] x86: Access the setup data through debugfs un-encrypted

2016-09-12 Thread Borislav Petkov
On Mon, Aug 22, 2016 at 05:38:59PM -0500, Tom Lendacky wrote:
> Since the setup data is in memory in the clear, it must be accessed as
> un-encrypted.  Always use ioremap (similar to sysfs setup data support)
> to map the data.
> 
> Signed-off-by: Tom Lendacky 
> ---
>  arch/x86/kernel/kdebugfs.c |   30 +++---
>  1 file changed, 11 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/x86/kernel/kdebugfs.c b/arch/x86/kernel/kdebugfs.c
> index bdb83e4..a58a82e 100644
> --- a/arch/x86/kernel/kdebugfs.c
> +++ b/arch/x86/kernel/kdebugfs.c
> @@ -48,17 +48,13 @@ static ssize_t setup_data_read(struct file *file, char 
> __user *user_buf,
>  
>   pa = node->paddr + sizeof(struct setup_data) + pos;
>   pg = pfn_to_page((pa + count - 1) >> PAGE_SHIFT);
> - if (PageHighMem(pg)) {

Why is it ok to get rid of the PageHighMem() check?

Btw, we did talk earlier in the thread about making __va() clear the SME
mask and then you won't really need to change stuff here. Or?

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v2 11/20] mm: Access BOOT related data in the clear

2016-09-12 Thread Andy Lutomirski
On Aug 22, 2016 6:53 PM, "Tom Lendacky"  wrote:
>
> BOOT data (such as EFI related data) is not encyrpted when the system is
> booted and needs to be accessed as non-encrypted.  Add support to the
> early_memremap API to identify the type of data being accessed so that
> the proper encryption attribute can be applied.  Currently, two types
> of data are defined, KERNEL_DATA and BOOT_DATA.

What happens when you memremap boot services data outside of early
boot?  Matt just added code that does this.

IMO this API is not so great.  It scatters a specialized consideration
all over the place.  Could early_memremap not look up the PA to figure
out what to do?

--Andy

[leaving the rest here for Matt's benefit]

>  unsigned long size,
> +   enum memremap_owner owner,
> +   pgprot_t prot)
> +{
> +   return prot;
> +}
> +
>  void __init early_ioremap_reset(void)
>  {
> early_ioremap_shutdown();
> @@ -213,16 +221,23 @@ early_ioremap(resource_size_t phys_addr, unsigned long 
> size)
>
>  /* Remap memory */
>  void __init *
> -early_memremap(resource_size_t phys_addr, unsigned long size)
> +early_memremap(resource_size_t phys_addr, unsigned long size,
> +  enum memremap_owner owner)
>  {
> -   return (__force void *)__early_ioremap(phys_addr, size,
> -  FIXMAP_PAGE_NORMAL);
> +   pgprot_t prot = early_memremap_pgprot_adjust(phys_addr, size, owner,
> +FIXMAP_PAGE_NORMAL);
> +
> +   return (__force void *)__early_ioremap(phys_addr, size, prot);
>  }
>  #ifdef FIXMAP_PAGE_RO
>  void __init *
> -early_memremap_ro(resource_size_t phys_addr, unsigned long size)
> +early_memremap_ro(resource_size_t phys_addr, unsigned long size,
> + enum memremap_owner owner)
>  {
> -   return (__force void *)__early_ioremap(phys_addr, size, 
> FIXMAP_PAGE_RO);
> +   pgprot_t prot = early_memremap_pgprot_adjust(phys_addr, size, owner,
> +FIXMAP_PAGE_RO);
> +
> +   return (__force void *)__early_ioremap(phys_addr, size, prot);
>  }
>  #endif
>
> @@ -236,7 +251,8 @@ early_memremap_prot(resource_size_t phys_addr, unsigned 
> long size,
>
>  #define MAX_MAP_CHUNK  (NR_FIX_BTMAPS << PAGE_SHIFT)
>
> -void __init copy_from_early_mem(void *dest, phys_addr_t src, unsigned long 
> size)
> +void __init copy_from_early_mem(void *dest, phys_addr_t src, unsigned long 
> size,
> +   enum memremap_owner owner)
>  {
> unsigned long slop, clen;
> char *p;
> @@ -246,7 +262,7 @@ void __init copy_from_early_mem(void *dest, phys_addr_t 
> src, unsigned long size)
> clen = size;
> if (clen > MAX_MAP_CHUNK - slop)
> clen = MAX_MAP_CHUNK - slop;
> -   p = early_memremap(src & PAGE_MASK, clen + slop);
> +   p = early_memremap(src & PAGE_MASK, clen + slop, owner);
> memcpy(dest, p + slop, clen);
> early_memunmap(p, clen + slop);
> dest += clen;
> @@ -265,12 +281,14 @@ early_ioremap(resource_size_t phys_addr, unsigned long 
> size)
>
>  /* Remap memory */
>  void __init *
> -early_memremap(resource_size_t phys_addr, unsigned long size)
> +early_memremap(resource_size_t phys_addr, unsigned long size,
> +  enum memremap_owner owner)
>  {
> return (void *)phys_addr;
>  }
>  void __init *
> -early_memremap_ro(resource_size_t phys_addr, unsigned long size)
> +early_memremap_ro(resource_size_t phys_addr, unsigned long size,
> + enum memremap_owner owner)
>  {
> return (void *)phys_addr;
>  }
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v2 16/20] x86: Check for memory encryption on the APs

2016-09-12 Thread Borislav Petkov
On Mon, Aug 22, 2016 at 05:38:29PM -0500, Tom Lendacky wrote:
> Add support to check if memory encryption is active in the kernel and that
> it has been enabled on the AP. If memory encryption is active in the kernel

A small nit: let's write out "AP" the first time at least: "... on the
Application Processors (AP)." for more clarity.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v2 12/20] x86: Add support for changing memory encryption attribute

2016-09-12 Thread Borislav Petkov
On Mon, Sep 12, 2016 at 10:41:29AM -0500, Tom Lendacky wrote:
> Looking at __change_page_attr_set_clr() isn't it possible for some of
> the pages to be changed before an error is encountered since it is
> looping?  If so, we may still need to flush. The CPA_FLUSHTLB flag
> should take care of a failing case where no attributes have actually
> been changed.

Ah, yes, ok, then yours is correct.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v2 11/20] mm: Access BOOT related data in the clear

2016-09-12 Thread Borislav Petkov
On Mon, Sep 12, 2016 at 10:14:59AM -0500, Tom Lendacky wrote:
> I did run checkpatch against everything, but was always under the
> assumption that I shouldn't change existing warnings/errors like this.
> If it's considered ok since I'm touching that line of code then I'll
> take care of those situations.

Yeah, normally we fix sensible checkpatch warnings/errors when we touch
the code so please do correct them :-)

Thanks.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v2 10/20] x86: Insure that memory areas are encrypted when possible

2016-09-12 Thread Borislav Petkov
On Mon, Sep 12, 2016 at 10:05:36AM -0500, Tom Lendacky wrote:
> I can look into that.  The reason I put this here is this is all the
> early page fault support that is very specific to this file. I modified
> an existing static function to take advantage of the mapping support.

Yeah, but all this code is SME-specific and doesn't belong there.
AFAICT, it uses global/public symbols so there shouldn't be a problem to
have it in mem_encrypt.c.

> Hmmm, maybe... With the change to the early_memremap() the initrd is now
> identified as BOOT_DATA in relocate_initrd() and so it will be mapped
> and copied as non-encyrpted data. But since it was encrypted before the
> call to relocate_initrd() it will copy encrypted bytes which will later
> be accessed encrypted. That isn't clear though, so I'll rework
> reserve_initrd() to perform the sme_early_mem_enc() once at the end
> whether the initrd is re-located or not.

Makes sense.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 04/22] iommu: Introduce iommu_fwspec

2016-09-12 Thread Robin Murphy
Introduce a common structure to hold the per-device firmware data that
most IOMMU drivers need to keep track of. This enables us to configure
much of that data from common firmware code, and consolidate a lot of
the equivalent implementations, device look-up tables, etc. which are
currently strewn across IOMMU drivers.

This will also be enable us to address the outstanding "multiple IOMMUs
on the platform bus" problem by tweaking IOMMU API calls to prefer
dev->fwspec->ops before falling back to dev->bus->iommu_ops, and thus
gracefully handle those troublesome systems which we currently cannot.

As the first user, hook up the OF IOMMU configuration mechanism. The
driver-defined nature of DT cells means that we still need the drivers
to translate and add the IDs themselves, but future users such as the
much less free-form ACPI IORT will be much simpler and self-contained.

CC: Greg Kroah-Hartman 
Suggested-by: Will Deacon 
Signed-off-by: Robin Murphy 

---

- Drop the 'gradual introduction' fiddling and go straight to struct
  device and common code, as it prevents all the silly build issues
  and ultimately makes life simpler for everyone
---
 drivers/iommu/iommu.c| 56 
 drivers/iommu/of_iommu.c |  8 +--
 include/linux/device.h   |  3 +++
 include/linux/iommu.h| 38 
 4 files changed, 103 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index b06d93594436..816e320d3ad8 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static struct kset *iommu_group_kset;
@@ -1613,3 +1614,58 @@ out:
 
return ret;
 }
+
+int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
+ const struct iommu_ops *ops)
+{
+   struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+
+   if (fwspec)
+   return ops == fwspec->ops ? 0 : -EINVAL;
+
+   fwspec = kzalloc(sizeof(*fwspec), GFP_KERNEL);
+   if (!fwspec)
+   return -ENOMEM;
+
+   of_node_get(to_of_node(iommu_fwnode));
+   fwspec->iommu_fwnode = iommu_fwnode;
+   fwspec->ops = ops;
+   dev->iommu_fwspec = fwspec;
+   return 0;
+}
+EXPORT_SYMBOL_GPL(iommu_fwspec_init);
+
+void iommu_fwspec_free(struct device *dev)
+{
+   struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+
+   if (fwspec) {
+   fwnode_handle_put(fwspec->iommu_fwnode);
+   kfree(fwspec);
+   dev->iommu_fwspec = NULL;
+   }
+}
+EXPORT_SYMBOL_GPL(iommu_fwspec_free);
+
+int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids)
+{
+   struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+   size_t size;
+   int i;
+
+   if (!fwspec)
+   return -EINVAL;
+
+   size = offsetof(struct iommu_fwspec, ids[fwspec->num_ids + num_ids]);
+   fwspec = krealloc(dev->iommu_fwspec, size, GFP_KERNEL);
+   if (!fwspec)
+   return -ENOMEM;
+
+   for (i = 0; i < num_ids; i++)
+   fwspec->ids[fwspec->num_ids + i] = ids[i];
+
+   fwspec->num_ids += num_ids;
+   dev->iommu_fwspec = fwspec;
+   return 0;
+}
+EXPORT_SYMBOL_GPL(iommu_fwspec_add_ids);
diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 19e1e8f2f871..5b82862f571f 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -167,7 +167,9 @@ static const struct iommu_ops
return NULL;
 
ops = of_iommu_get_ops(iommu_spec.np);
-   if (!ops || !ops->of_xlate || ops->of_xlate(>dev, _spec))
+   if (!ops || !ops->of_xlate ||
+   iommu_fwspec_init(>dev, _spec.np->fwnode, ops) ||
+   ops->of_xlate(>dev, _spec))
ops = NULL;
 
of_node_put(iommu_spec.np);
@@ -196,7 +198,9 @@ const struct iommu_ops *of_iommu_configure(struct device 
*dev,
np = iommu_spec.np;
ops = of_iommu_get_ops(np);
 
-   if (!ops || !ops->of_xlate || ops->of_xlate(dev, _spec))
+   if (!ops || !ops->of_xlate ||
+   iommu_fwspec_init(dev, >fwnode, ops) ||
+   ops->of_xlate(dev, _spec))
goto err_put_node;
 
of_node_put(np);
diff --git a/include/linux/device.h b/include/linux/device.h
index 38f02814d53a..bc41e87a969b 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -41,6 +41,7 @@ struct device_node;
 struct fwnode_handle;
 struct iommu_ops;
 struct iommu_group;
+struct iommu_fwspec;
 
 struct bus_attribute {
struct attributeattr;
@@ -765,6 +766,7 @@ struct device_dma_parameters {
  * gone away. This should be set by the allocator of the
  * device (i.e. the bus driver that discovered the device).
  * @iommu_group: IOMMU group 

[PATCH v7 19/22] iommu/arm-smmu: Wire up generic configuration support

2016-09-12 Thread Robin Murphy
With everything else now in place, fill in an of_xlate callback and the
appropriate registration to plumb into the generic configuration
machinery, and watch everything just work.

Signed-off-by: Robin Murphy 

---

- Convert to updated fwspec mechanism
---
 drivers/iommu/arm-smmu.c | 168 ++-
 1 file changed, 108 insertions(+), 60 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index c5ee5a51c7ec..f143dbc549b9 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -43,6 +43,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -418,6 +419,8 @@ struct arm_smmu_option_prop {
 
 static atomic_t cavium_smmu_context_count = ATOMIC_INIT(0);
 
+static bool using_legacy_binding, using_generic_binding;
+
 static struct arm_smmu_option_prop arm_smmu_options[] = {
{ ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
{ 0, NULL},
@@ -817,12 +820,6 @@ static int arm_smmu_init_domain_context(struct 
iommu_domain *domain,
if (smmu_domain->smmu)
goto out_unlock;
 
-   /* We're bypassing these SIDs, so don't allocate an actual context */
-   if (domain->type == IOMMU_DOMAIN_DMA) {
-   smmu_domain->smmu = smmu;
-   goto out_unlock;
-   }
-
/*
 * Mapping the requested stage onto what we support is surprisingly
 * complicated, mainly because the spec allows S1+S2 SMMUs without
@@ -981,7 +978,7 @@ static void arm_smmu_destroy_domain_context(struct 
iommu_domain *domain)
void __iomem *cb_base;
int irq;
 
-   if (!smmu || domain->type == IOMMU_DOMAIN_DMA)
+   if (!smmu)
return;
 
/*
@@ -1015,8 +1012,8 @@ static struct iommu_domain 
*arm_smmu_domain_alloc(unsigned type)
if (!smmu_domain)
return NULL;
 
-   if (type == IOMMU_DOMAIN_DMA &&
-   iommu_get_dma_cookie(_domain->domain)) {
+   if (type == IOMMU_DOMAIN_DMA && (using_legacy_binding ||
+   iommu_get_dma_cookie(_domain->domain))) {
kfree(smmu_domain);
return NULL;
}
@@ -1133,19 +1130,22 @@ static int arm_smmu_master_alloc_smes(struct device 
*dev)
mutex_lock(>stream_map_mutex);
/* Figure out a viable stream map entry allocation */
for_each_cfg_sme(fwspec, i, idx) {
+   u16 sid = fwspec->ids[i];
+   u16 mask = fwspec->ids[i] >> SMR_MASK_SHIFT;
+
if (idx != INVALID_SMENDX) {
ret = -EEXIST;
goto out_err;
}
 
-   ret = arm_smmu_find_sme(smmu, fwspec->ids[i], 0);
+   ret = arm_smmu_find_sme(smmu, sid, mask);
if (ret < 0)
goto out_err;
 
idx = ret;
if (smrs && smmu->s2crs[idx].count == 0) {
-   smrs[idx].id = fwspec->ids[i];
-   smrs[idx].mask = 0; /* We don't currently share SMRs */
+   smrs[idx].id = sid;
+   smrs[idx].mask = mask;
smrs[idx].valid = true;
}
smmu->s2crs[idx].count++;
@@ -1203,15 +1203,6 @@ static int arm_smmu_domain_add_master(struct 
arm_smmu_domain *smmu_domain,
u8 cbndx = smmu_domain->cfg.cbndx;
int i, idx;
 
-   /*
-* FIXME: This won't be needed once we have IOMMU-backed DMA ops
-* for all devices behind the SMMU. Note that we need to take
-* care configuring SMRs for devices both a platform_device and
-* and a PCI device (i.e. a PCI host controller)
-*/
-   if (smmu_domain->domain.type == IOMMU_DOMAIN_DMA)
-   type = S2CR_TYPE_BYPASS;
-
for_each_cfg_sme(fwspec, i, idx) {
if (type == s2cr[idx].type && cbndx == s2cr[idx].cbndx)
continue;
@@ -1373,25 +1364,50 @@ static bool arm_smmu_capable(enum iommu_cap cap)
}
 }
 
+static int arm_smmu_match_node(struct device *dev, void *data)
+{
+   return dev->of_node == data;
+}
+
+static struct arm_smmu_device *arm_smmu_get_by_node(struct device_node *np)
+{
+   struct device *dev = driver_find_device(_smmu_driver.driver, NULL,
+   np, arm_smmu_match_node);
+   put_device(dev);
+   return dev ? dev_get_drvdata(dev) : NULL;
+}
+
 static int arm_smmu_add_device(struct device *dev)
 {
struct arm_smmu_device *smmu;
struct arm_smmu_master_cfg *cfg;
-   struct iommu_fwspec *fwspec;
+   struct iommu_fwspec *fwspec = dev->iommu_fwspec;
int i, ret;
 
-   ret = arm_smmu_register_legacy_master(dev, );
-   fwspec = dev->iommu_fwspec;
-   if (ret)
-   goto out_free;
+   if (using_legacy_binding) {
+   ret = 

[PATCH v7 16/22] iommu/arm-smmu: Intelligent SMR allocation

2016-09-12 Thread Robin Murphy
Stream Match Registers are one of the more awkward parts of the SMMUv2
architecture; there are typically never enough to assign one to each
stream ID in the system, and configuring them such that a single ID
matches multiple entries is catastrophically bad - at best, every
transaction raises a global fault; at worst, they go *somewhere*.

To address the former issue, we can mask ID bits such that a single
register may be used to match multiple IDs belonging to the same device
or group, but doing so also heightens the risk of the latter problem
(which can be nasty to debug).

Tackle both problems at once by replacing the simple bitmap allocator
with something much cleverer. Now that we have convenient in-memory
representations of the stream mapping table, it becomes straightforward
to properly validate new SMR entries against the current state, opening
the door to arbitrary masking and SMR sharing.

Another feature which falls out of this is that with IDs shared by
separate devices being automatically accounted for, simply associating a
group pointer with the S2CR offers appropriate group allocation almost
for free, so hook that up in the process.

Signed-off-by: Robin Murphy 

---

- Simplify SMR matching logic, and document it properly
- Explicify INVALID_SMENDX check
---
 drivers/iommu/arm-smmu.c | 201 ---
 1 file changed, 121 insertions(+), 80 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 7a0edef34a22..0e7dbf59335d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -302,6 +302,8 @@ enum arm_smmu_implementation {
 };
 
 struct arm_smmu_s2cr {
+   struct iommu_group  *group;
+   int count;
enum arm_smmu_s2cr_type type;
enum arm_smmu_s2cr_privcfg  privcfg;
u8  cbndx;
@@ -363,6 +365,7 @@ struct arm_smmu_device {
u16 smr_mask_mask;
struct arm_smmu_smr *smrs;
struct arm_smmu_s2cr*s2crs;
+   struct mutexstream_map_mutex;
 
unsigned long   va_size;
unsigned long   ipa_size;
@@ -1041,23 +1044,6 @@ static void arm_smmu_domain_free(struct iommu_domain 
*domain)
kfree(smmu_domain);
 }
 
-static int arm_smmu_alloc_smr(struct arm_smmu_device *smmu)
-{
-   int i;
-
-   for (i = 0; i < smmu->num_mapping_groups; i++)
-   if (!cmpxchg(>smrs[i].valid, false, true))
-   return i;
-
-   return INVALID_SMENDX;
-}
-
-static void arm_smmu_free_smr(struct arm_smmu_device *smmu, int idx)
-{
-   writel_relaxed(~SMR_VALID, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_SMR(idx));
-   WRITE_ONCE(smmu->smrs[idx].valid, false);
-}
-
 static void arm_smmu_write_smr(struct arm_smmu_device *smmu, int idx)
 {
struct arm_smmu_smr *smr = smmu->smrs + idx;
@@ -1085,49 +1071,115 @@ static void arm_smmu_write_sme(struct arm_smmu_device 
*smmu, int idx)
arm_smmu_write_smr(smmu, idx);
 }
 
-static int arm_smmu_master_alloc_smes(struct arm_smmu_device *smmu,
- struct arm_smmu_master_cfg *cfg)
+static int arm_smmu_find_sme(struct arm_smmu_device *smmu, u16 id, u16 mask)
 {
struct arm_smmu_smr *smrs = smmu->smrs;
-   int i, idx;
+   int i, free_idx = -ENOSPC;
 
-   /* Allocate the SMRs on the SMMU */
-   for_each_cfg_sme(cfg, i, idx) {
-   if (idx != INVALID_SMENDX)
-   return -EEXIST;
+   /* Stream indexing is blissfully easy */
+   if (!smrs)
+   return id;
 
-   /* ...except on stream indexing hardware, of course */
-   if (!smrs) {
-   cfg->smendx[i] = cfg->streamids[i];
+   /* Validating SMRs is... less so */
+   for (i = 0; i < smmu->num_mapping_groups; ++i) {
+   if (!smrs[i].valid) {
+   /*
+* Note the first free entry we come across, which
+* we'll claim in the end if nothing else matches.
+*/
+   if (free_idx < 0)
+   free_idx = i;
continue;
}
-
-   idx = arm_smmu_alloc_smr(smmu);
-   if (idx < 0) {
-   dev_err(smmu->dev, "failed to allocate free SMR\n");
-   goto err_free_smrs;
-   }
-   cfg->smendx[i] = idx;
-
-   smrs[idx].id = cfg->streamids[i];
-   smrs[idx].mask = 0; /* We don't currently share SMRs */
+   /*
+* If the new entry is _entirely_ matched by an existing entry,
+* then reuse that, with the guarantee that there also cannot
+* be any subsequent conflicting 

[PATCH v7 15/22] iommu/arm-smmu: Add a stream map entry iterator

2016-09-12 Thread Robin Murphy
We iterate over the SMEs associated with a master config quite a lot in
various places, and are about to do so even more. Let's wrap the idiom
in a handy iterator macro before the repetition gets out of hand.

Tested-by: Lorenzo Pieralisi 
Signed-off-by: Robin Murphy 
---
 drivers/iommu/arm-smmu.c | 26 --
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 3962add0c33b..7a0edef34a22 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -324,6 +324,8 @@ struct arm_smmu_master_cfg {
s16 smendx[MAX_MASTER_STREAMIDS];
 };
 #define INVALID_SMENDX -1
+#define for_each_cfg_sme(cfg, i, idx) \
+   for (i = 0; idx = cfg->smendx[i], i < cfg->num_streamids; ++i)
 
 struct arm_smmu_device {
struct device   *dev;
@@ -1090,8 +1092,8 @@ static int arm_smmu_master_alloc_smes(struct 
arm_smmu_device *smmu,
int i, idx;
 
/* Allocate the SMRs on the SMMU */
-   for (i = 0; i < cfg->num_streamids; ++i) {
-   if (cfg->smendx[i] != INVALID_SMENDX)
+   for_each_cfg_sme(cfg, i, idx) {
+   if (idx != INVALID_SMENDX)
return -EEXIST;
 
/* ...except on stream indexing hardware, of course */
@@ -1115,8 +1117,8 @@ static int arm_smmu_master_alloc_smes(struct 
arm_smmu_device *smmu,
return 0;
 
/* It worked! Now, poke the actual hardware */
-   for (i = 0; i < cfg->num_streamids; ++i)
-   arm_smmu_write_smr(smmu, cfg->smendx[i]);
+   for_each_cfg_sme(cfg, i, idx)
+   arm_smmu_write_smr(smmu, idx);
 
return 0;
 
@@ -1131,15 +1133,13 @@ err_free_smrs:
 static void arm_smmu_master_free_smes(struct arm_smmu_master_cfg *cfg)
 {
struct arm_smmu_device *smmu = cfg->smmu;
-   int i;
+   int i, idx;
 
/*
 * We *must* clear the S2CR first, because freeing the SMR means
 * that it can be re-allocated immediately.
 */
-   for (i = 0; i < cfg->num_streamids; ++i) {
-   int idx = cfg->smendx[i];
-
+   for_each_cfg_sme(cfg, i, idx) {
/* An IOMMU group is torn down by the first device to be 
removed */
if (idx == INVALID_SMENDX)
return;
@@ -1151,9 +1151,9 @@ static void arm_smmu_master_free_smes(struct 
arm_smmu_master_cfg *cfg)
__iowmb();
 
/* Invalidate the SMRs before freeing back to the allocator */
-   for (i = 0; i < cfg->num_streamids; ++i) {
+   for_each_cfg_sme(cfg, i, idx) {
if (smmu->smrs)
-   arm_smmu_free_smr(smmu, cfg->smendx[i]);
+   arm_smmu_free_smr(smmu, idx);
 
cfg->smendx[i] = INVALID_SMENDX;
}
@@ -1162,7 +1162,7 @@ static void arm_smmu_master_free_smes(struct 
arm_smmu_master_cfg *cfg)
 static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
  struct arm_smmu_master_cfg *cfg)
 {
-   int i, ret = 0;
+   int i, idx, ret = 0;
struct arm_smmu_device *smmu = smmu_domain->smmu;
struct arm_smmu_s2cr *s2cr = smmu->s2crs;
enum arm_smmu_s2cr_type type = S2CR_TYPE_TRANS;
@@ -1182,9 +1182,7 @@ static int arm_smmu_domain_add_master(struct 
arm_smmu_domain *smmu_domain,
if (smmu_domain->domain.type == IOMMU_DOMAIN_DMA)
type = S2CR_TYPE_BYPASS;
 
-   for (i = 0; i < cfg->num_streamids; ++i) {
-   int idx = cfg->smendx[i];
-
+   for_each_cfg_sme(cfg, i, idx) {
/* Devices in an IOMMU group may already be configured */
if (type == s2cr[idx].type && cbndx == s2cr[idx].cbndx)
break;
-- 
2.8.1.dirty

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 22/22] iommu/dma: Avoid PCI host bridge windows

2016-09-12 Thread Robin Murphy
With our DMA ops enabled for PCI devices, we should avoid allocating
IOVAs which a host bridge might misinterpret as peer-to-peer DMA and
lead to faults, corruption or other badness. To be safe, punch out holes
for all of the relevant host bridge's windows when initialising a DMA
domain for a PCI device.

CC: Marek Szyprowski 
CC: Inki Dae 
Reported-by: Lorenzo Pieralisi 
Signed-off-by: Robin Murphy 

---

- Squash in the previous drm/exynos fixup
- If need be, this one can probably wait
---
 arch/arm64/mm/dma-mapping.c   |  2 +-
 drivers/gpu/drm/exynos/exynos_drm_iommu.h |  2 +-
 drivers/iommu/dma-iommu.c | 25 -
 include/linux/dma-iommu.h |  3 ++-
 4 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index c4284c432ae8..610d8e53011e 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -827,7 +827,7 @@ static bool do_iommu_attach(struct device *dev, const 
struct iommu_ops *ops,
 * then the IOMMU core will have already configured a group for this
 * device, and allocated the default domain for that group.
 */
-   if (!domain || iommu_dma_init_domain(domain, dma_base, size)) {
+   if (!domain || iommu_dma_init_domain(domain, dma_base, size, dev)) {
pr_warn("Failed to set up IOMMU for device %s; retaining 
platform DMA ops\n",
dev_name(dev));
return false;
diff --git a/drivers/gpu/drm/exynos/exynos_drm_iommu.h 
b/drivers/gpu/drm/exynos/exynos_drm_iommu.h
index c8de4913fdbe..87f6b5672e11 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_iommu.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_iommu.h
@@ -66,7 +66,7 @@ static inline int __exynos_iommu_create_mapping(struct 
exynos_drm_private *priv,
if (ret)
goto free_domain;
 
-   ret = iommu_dma_init_domain(domain, start, size);
+   ret = iommu_dma_init_domain(domain, start, size, NULL);
if (ret)
goto put_cookie;
 
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 4329d18080cf..c5ab8667e6f2 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -103,18 +104,38 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
 }
 EXPORT_SYMBOL(iommu_put_dma_cookie);
 
+static void iova_reserve_pci_windows(struct pci_dev *dev,
+   struct iova_domain *iovad)
+{
+   struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
+   struct resource_entry *window;
+   unsigned long lo, hi;
+
+   resource_list_for_each_entry(window, >windows) {
+   if (resource_type(window->res) != IORESOURCE_MEM &&
+   resource_type(window->res) != IORESOURCE_IO)
+   continue;
+
+   lo = iova_pfn(iovad, window->res->start - window->offset);
+   hi = iova_pfn(iovad, window->res->end - window->offset);
+   reserve_iova(iovad, lo, hi);
+   }
+}
+
 /**
  * iommu_dma_init_domain - Initialise a DMA mapping domain
  * @domain: IOMMU domain previously prepared by iommu_get_dma_cookie()
  * @base: IOVA at which the mappable address space starts
  * @size: Size of IOVA space
+ * @dev: Device the domain is being initialised for
  *
  * @base and @size should be exact multiples of IOMMU page granularity to
  * avoid rounding surprises. If necessary, we reserve the page at address 0
  * to ensure it is an invalid IOVA. It is safe to reinitialise a domain, but
  * any change which could make prior IOVAs invalid will fail.
  */
-int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, u64 
size)
+int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
+   u64 size, struct device *dev)
 {
struct iova_domain *iovad = cookie_iovad(domain);
unsigned long order, base_pfn, end_pfn;
@@ -152,6 +173,8 @@ int iommu_dma_init_domain(struct iommu_domain *domain, 
dma_addr_t base, u64 size
iovad->dma_32bit_pfn = end_pfn;
} else {
init_iova_domain(iovad, 1UL << order, base_pfn, end_pfn);
+   if (dev && dev_is_pci(dev))
+   iova_reserve_pci_windows(to_pci_dev(dev), iovad);
}
return 0;
 }
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 5ee806e41b5c..32c589062bd9 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -30,7 +30,8 @@ int iommu_get_dma_cookie(struct iommu_domain *domain);
 void iommu_put_dma_cookie(struct iommu_domain *domain);
 
 /* Setup call for arch DMA mapping code */
-int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, u64 
size);
+int iommu_dma_init_domain(struct 

[PATCH v7 20/22] iommu/arm-smmu: Set domain geometry

2016-09-12 Thread Robin Murphy
For non-aperture-based IOMMUs, the domain geometry seems to have become
the de-facto way of indicating the input address space size. That is
quite a useful thing from the users' perspective, so let's do the same.

Reviewed-by: Eric Auger 
Tested-by: Lorenzo Pieralisi 
Signed-off-by: Robin Murphy 
---
 drivers/iommu/arm-smmu-v3.c | 2 ++
 drivers/iommu/arm-smmu.c| 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 0c45c1e02e04..15c01c3cd540 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1551,6 +1551,8 @@ static int arm_smmu_domain_finalise(struct iommu_domain 
*domain)
return -ENOMEM;
 
domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+   domain->geometry.aperture_end = (1UL << ias) - 1;
+   domain->geometry.force_aperture = true;
smmu_domain->pgtbl_ops = pgtbl_ops;
 
ret = finalise_stage_fn(smmu_domain, _cfg);
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index f143dbc549b9..fa892d25004d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -940,6 +940,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain 
*domain,
 
/* Update the domain's page sizes to reflect the page table format */
domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+   domain->geometry.aperture_end = (1UL << ias) - 1;
+   domain->geometry.force_aperture = true;
 
/* Initialise the context bank with our page table cfg */
arm_smmu_init_context_bank(smmu_domain, _cfg);
-- 
2.8.1.dirty

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 17/22] iommu/arm-smmu: Convert to iommu_fwspec

2016-09-12 Thread Robin Murphy
In the final step of preparation for full generic configuration support,
swap our fixed-size master_cfg for the generic iommu_fwspec. For the
legacy DT bindings, the driver simply gets to act as its own 'firmware'.
Farewell, arbitrary MAX_MASTER_STREAMIDS!

Signed-off-by: Robin Murphy 

---

- Convert to updated fwspec mechanism
---
 drivers/iommu/arm-smmu.c | 140 ++-
 1 file changed, 78 insertions(+), 62 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 0e7dbf59335d..c5ee5a51c7ec 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -51,9 +52,6 @@
 
 #include "io-pgtable.h"
 
-/* Maximum number of stream IDs assigned to a single device */
-#define MAX_MASTER_STREAMIDS   128
-
 /* Maximum number of context banks per SMMU */
 #define ARM_SMMU_MAX_CBS   128
 
@@ -321,13 +319,13 @@ struct arm_smmu_smr {
 
 struct arm_smmu_master_cfg {
struct arm_smmu_device  *smmu;
-   int num_streamids;
-   u16 streamids[MAX_MASTER_STREAMIDS];
-   s16 smendx[MAX_MASTER_STREAMIDS];
+   s16 smendx[];
 };
 #define INVALID_SMENDX -1
-#define for_each_cfg_sme(cfg, i, idx) \
-   for (i = 0; idx = cfg->smendx[i], i < cfg->num_streamids; ++i)
+#define __fwspec_cfg(fw) ((struct arm_smmu_master_cfg *)fw->iommu_priv)
+#define fwspec_smmu(fw)  (__fwspec_cfg(fw)->smmu)
+#define for_each_cfg_sme(fw, i, idx) \
+   for (i = 0; idx = __fwspec_cfg(fw)->smendx[i], i < fw->num_ids; ++i)
 
 struct arm_smmu_device {
struct device   *dev;
@@ -480,14 +478,16 @@ static int __find_legacy_master_phandle(struct device 
*dev, void *data)
 }
 
 static struct platform_driver arm_smmu_driver;
+static struct iommu_ops arm_smmu_ops;
 
-static int arm_smmu_register_legacy_master(struct device *dev)
+static int arm_smmu_register_legacy_master(struct device *dev,
+  struct arm_smmu_device **smmu)
 {
-   struct arm_smmu_device *smmu;
-   struct arm_smmu_master_cfg *cfg;
+   struct device *smmu_dev;
struct device_node *np;
struct of_phandle_iterator it;
void *data = 
+   u32 *sids;
__be32 pci_sid;
int err;
 
@@ -500,20 +500,13 @@ static int arm_smmu_register_legacy_master(struct device 
*dev)
it.node = np;
err = driver_for_each_device(_smmu_driver.driver, NULL, ,
 __find_legacy_master_phandle);
+   smmu_dev = data;
of_node_put(np);
if (err == 0)
return -ENODEV;
if (err < 0)
return err;
 
-   smmu = dev_get_drvdata(data);
-
-   if (it.cur_count > MAX_MASTER_STREAMIDS) {
-   dev_err(smmu->dev,
-   "reached maximum number (%d) of stream IDs for master 
device %s\n",
-   MAX_MASTER_STREAMIDS, dev_name(dev));
-   return -ENOSPC;
-   }
if (dev_is_pci(dev)) {
/* "mmu-masters" assumes Stream ID == Requester ID */
pci_for_each_dma_alias(to_pci_dev(dev), __arm_smmu_get_pci_sid,
@@ -522,17 +515,20 @@ static int arm_smmu_register_legacy_master(struct device 
*dev)
it.cur_count = 1;
}
 
-   cfg = kzalloc(sizeof(*cfg), GFP_KERNEL);
-   if (!cfg)
+   err = iommu_fwspec_init(dev, _dev->of_node->fwnode,
+   _smmu_ops);
+   if (err)
+   return err;
+
+   sids = kcalloc(it.cur_count, sizeof(*sids), GFP_KERNEL);
+   if (!sids)
return -ENOMEM;
 
-   cfg->smmu = smmu;
-   dev->archdata.iommu = cfg;
-
-   while (it.cur_count--)
-   cfg->streamids[cfg->num_streamids++] = be32_to_cpup(it.cur++);
-
-   return 0;
+   *smmu = dev_get_drvdata(smmu_dev);
+   of_phandle_iterator_args(, sids, it.cur_count);
+   err = iommu_fwspec_add_ids(dev, sids, it.cur_count);
+   kfree(sids);
+   return err;
 }
 
 static int __arm_smmu_alloc_bitmap(unsigned long *map, int start, int end)
@@ -1127,7 +1123,8 @@ static bool arm_smmu_free_sme(struct arm_smmu_device 
*smmu, int idx)
 
 static int arm_smmu_master_alloc_smes(struct device *dev)
 {
-   struct arm_smmu_master_cfg *cfg = dev->archdata.iommu;
+   struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+   struct arm_smmu_master_cfg *cfg = fwspec->iommu_priv;
struct arm_smmu_device *smmu = cfg->smmu;
struct arm_smmu_smr *smrs = smmu->smrs;
struct iommu_group *group;
@@ -1135,19 +1132,19 @@ static int arm_smmu_master_alloc_smes(struct device 
*dev)
 
mutex_lock(>stream_map_mutex);
/* Figure out a viable stream map entry 

[PATCH v7 18/22] Docs: dt: document ARM SMMU generic binding usage

2016-09-12 Thread Robin Murphy
Document how the generic "iommus" binding should be used to describe ARM
SMMU stream IDs instead of the old "mmu-masters" binding.

Acked-by: Rob Herring 
Signed-off-by: Robin Murphy 

---

- Remove missed reference to old "mmu-masters" example
---
 .../devicetree/bindings/iommu/arm,smmu.txt | 63 --
 1 file changed, 48 insertions(+), 15 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt 
b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index 19fe6f2c83f6..e862d1485205 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -35,12 +35,16 @@ conditions.
   interrupt per context bank. In the case of a single,
   combined interrupt, it must be listed multiple times.
 
-- mmu-masters   : A list of phandles to device nodes representing bus
-  masters for which the SMMU can provide a translation
-  and their corresponding StreamIDs (see example below).
-  Each device node linked from this list must have a
-  "#stream-id-cells" property, indicating the number of
-  StreamIDs associated with it.
+- #iommu-cells  : See Documentation/devicetree/bindings/iommu/iommu.txt
+  for details. With a value of 1, each "iommus" entry
+  represents a distinct stream ID emitted by that device
+  into the relevant SMMU.
+
+  SMMUs with stream matching support and complex masters
+  may use a value of 2, where the second cell represents
+  an SMR mask to combine with the ID in the first cell.
+  Care must be taken to ensure the set of matched IDs
+  does not result in conflicts.
 
 ** System MMU optional properties:
 
@@ -56,9 +60,20 @@ conditions.
   aliases of secure registers have to be used during
   SMMU configuration.
 
-Example:
+** Deprecated properties:
 
-smmu {
+- mmu-masters (deprecated in favour of the generic "iommus" binding) :
+  A list of phandles to device nodes representing bus
+  masters for which the SMMU can provide a translation
+  and their corresponding Stream IDs. Each device node
+  linked from this list must have a "#stream-id-cells"
+  property, indicating the number of Stream ID
+  arguments associated with its phandle.
+
+** Examples:
+
+/* SMMU with stream matching or stream indexing */
+smmu1: iommu {
 compatible = "arm,smmu-v1";
 reg = <0xba5e 0x1>;
 #global-interrupts = <2>;
@@ -68,11 +83,29 @@ Example:
  <0 35 4>,
  <0 36 4>,
  <0 37 4>;
-
-/*
- * Two DMA controllers, the first with two StreamIDs (0xd01d
- * and 0xd01e) and the second with only one (0xd11c).
- */
-mmu-masters = < 0xd01d 0xd01e>,
-  < 0xd11c>;
+#iommu-cells = <1>;
+};
+
+/* device with two stream IDs, 0 and 7 */
+master1 {
+iommus = < 0>,
+ < 7>;
+};
+
+
+/* SMMU with stream matching */
+smmu2: iommu {
+...
+#iommu-cells = <2>;
+};
+
+/* device with stream IDs 0 and 7 */
+master2 {
+iommus = < 0 0>,
+ < 7 0>;
+};
+
+/* device with stream IDs 1, 17, 33 and 49 */
+master3 {
+iommus = < 1 0x30>;
 };
-- 
2.8.1.dirty

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 21/22] iommu/dma: Add support for mapping MSIs

2016-09-12 Thread Robin Murphy
When an MSI doorbell is located downstream of an IOMMU, attaching
devices to a DMA ops domain and switching on translation leads to a rude
shock when their attempt to write to the physical address returned by
the irqchip driver faults (or worse, writes into some already-mapped
buffer) and no interrupt is forthcoming.

Address this by adding a hook for relevant irqchip drivers to call from
their compose_msi_msg() callback, to swizzle the physical address with
an appropriatly-mapped IOVA for any device attached to one of our DMA
ops domains.

Acked-by: Thomas Gleixner 
Acked-by: Marc Zyngier 
Signed-off-by: Robin Murphy 
---
 drivers/iommu/dma-iommu.c| 136 ++-
 drivers/irqchip/irq-gic-v2m.c|   3 +
 drivers/irqchip/irq-gic-v3-its.c |   3 +
 include/linux/dma-iommu.h|   9 +++
 4 files changed, 136 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 00c8a08d56e7..4329d18080cf 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -25,10 +25,28 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 
+struct iommu_dma_msi_page {
+   struct list_headlist;
+   dma_addr_t  iova;
+   phys_addr_t phys;
+};
+
+struct iommu_dma_cookie {
+   struct iova_domain  iovad;
+   struct list_headmsi_page_list;
+   spinlock_t  msi_lock;
+};
+
+static inline struct iova_domain *cookie_iovad(struct iommu_domain *domain)
+{
+   return &((struct iommu_dma_cookie *)domain->iova_cookie)->iovad;
+}
+
 int iommu_dma_init(void)
 {
return iova_cache_get();
@@ -43,15 +61,19 @@ int iommu_dma_init(void)
  */
 int iommu_get_dma_cookie(struct iommu_domain *domain)
 {
-   struct iova_domain *iovad;
+   struct iommu_dma_cookie *cookie;
 
if (domain->iova_cookie)
return -EEXIST;
 
-   iovad = kzalloc(sizeof(*iovad), GFP_KERNEL);
-   domain->iova_cookie = iovad;
+   cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
+   if (!cookie)
+   return -ENOMEM;
 
-   return iovad ? 0 : -ENOMEM;
+   spin_lock_init(>msi_lock);
+   INIT_LIST_HEAD(>msi_page_list);
+   domain->iova_cookie = cookie;
+   return 0;
 }
 EXPORT_SYMBOL(iommu_get_dma_cookie);
 
@@ -63,14 +85,20 @@ EXPORT_SYMBOL(iommu_get_dma_cookie);
  */
 void iommu_put_dma_cookie(struct iommu_domain *domain)
 {
-   struct iova_domain *iovad = domain->iova_cookie;
+   struct iommu_dma_cookie *cookie = domain->iova_cookie;
+   struct iommu_dma_msi_page *msi, *tmp;
 
-   if (!iovad)
+   if (!cookie)
return;
 
-   if (iovad->granule)
-   put_iova_domain(iovad);
-   kfree(iovad);
+   if (cookie->iovad.granule)
+   put_iova_domain(>iovad);
+
+   list_for_each_entry_safe(msi, tmp, >msi_page_list, list) {
+   list_del(>list);
+   kfree(msi);
+   }
+   kfree(cookie);
domain->iova_cookie = NULL;
 }
 EXPORT_SYMBOL(iommu_put_dma_cookie);
@@ -88,7 +116,7 @@ EXPORT_SYMBOL(iommu_put_dma_cookie);
  */
 int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, u64 
size)
 {
-   struct iova_domain *iovad = domain->iova_cookie;
+   struct iova_domain *iovad = cookie_iovad(domain);
unsigned long order, base_pfn, end_pfn;
 
if (!iovad)
@@ -155,7 +183,7 @@ int dma_direction_to_prot(enum dma_data_direction dir, bool 
coherent)
 static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
dma_addr_t dma_limit)
 {
-   struct iova_domain *iovad = domain->iova_cookie;
+   struct iova_domain *iovad = cookie_iovad(domain);
unsigned long shift = iova_shift(iovad);
unsigned long length = iova_align(iovad, size) >> shift;
 
@@ -171,7 +199,7 @@ static struct iova *__alloc_iova(struct iommu_domain 
*domain, size_t size,
 /* The IOVA allocator knows what we mapped, so just unmap whatever that was */
 static void __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr)
 {
-   struct iova_domain *iovad = domain->iova_cookie;
+   struct iova_domain *iovad = cookie_iovad(domain);
unsigned long shift = iova_shift(iovad);
unsigned long pfn = dma_addr >> shift;
struct iova *iova = find_iova(iovad, pfn);
@@ -294,7 +322,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t 
size, gfp_t gfp,
void (*flush_page)(struct device *, const void *, phys_addr_t))
 {
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
-   struct iova_domain *iovad = domain->iova_cookie;
+   struct iova_domain *iovad = cookie_iovad(domain);
struct iova *iova;
struct page **pages;
struct sg_table sgt;
@@ -386,7 +414,7 @@ dma_addr_t iommu_dma_map_page(struct device *dev, 

[PATCH v7 14/22] iommu/arm-smmu: Streamline SMMU data lookups

2016-09-12 Thread Robin Murphy
Simplify things somewhat by stashing our arm_smmu_device instance in
drvdata, so that it's readily available to our driver model callbacks.
Then we can excise the private list entirely, since the driver core
already has a perfectly good list of SMMU devices we can use in the one
instance we actually need to. Finally, make a further modest code saving
with the relatively new of_device_get_match_data() helper.

Tested-by: Lorenzo Pieralisi 
Signed-off-by: Robin Murphy 
---
 drivers/iommu/arm-smmu.c | 44 +++-
 1 file changed, 11 insertions(+), 33 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 0a628f2c9297..3962add0c33b 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -370,8 +371,6 @@ struct arm_smmu_device {
u32 num_context_irqs;
unsigned int*irqs;
 
-   struct list_headlist;
-
u32 cavium_id_base; /* Specific to Cavium */
 };
 
@@ -409,9 +408,6 @@ struct arm_smmu_domain {
struct iommu_domain domain;
 };
 
-static DEFINE_SPINLOCK(arm_smmu_devices_lock);
-static LIST_HEAD(arm_smmu_devices);
-
 struct arm_smmu_option_prop {
u32 opt;
const char *prop;
@@ -478,6 +474,8 @@ static int __find_legacy_master_phandle(struct device *dev, 
void *data)
return err;
 }
 
+static struct platform_driver arm_smmu_driver;
+
 static int arm_smmu_register_legacy_master(struct device *dev)
 {
struct arm_smmu_device *smmu;
@@ -495,19 +493,16 @@ static int arm_smmu_register_legacy_master(struct device 
*dev)
}
 
it.node = np;
-   spin_lock(_smmu_devices_lock);
-   list_for_each_entry(smmu, _smmu_devices, list) {
-   err = __find_legacy_master_phandle(smmu->dev, );
-   if (err)
-   break;
-   }
-   spin_unlock(_smmu_devices_lock);
+   err = driver_for_each_device(_smmu_driver.driver, NULL, ,
+__find_legacy_master_phandle);
of_node_put(np);
if (err == 0)
return -ENODEV;
if (err < 0)
return err;
 
+   smmu = dev_get_drvdata(data);
+
if (it.cur_count > MAX_MASTER_STREAMIDS) {
dev_err(smmu->dev,
"reached maximum number (%d) of stream IDs for master 
device %s\n",
@@ -1816,7 +1811,6 @@ MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
 
 static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 {
-   const struct of_device_id *of_id;
const struct arm_smmu_match_data *data;
struct resource *res;
struct arm_smmu_device *smmu;
@@ -1830,8 +1824,7 @@ static int arm_smmu_device_dt_probe(struct 
platform_device *pdev)
}
smmu->dev = dev;
 
-   of_id = of_match_node(arm_smmu_of_match, dev->of_node);
-   data = of_id->data;
+   data = of_device_get_match_data(dev);
smmu->version = data->version;
smmu->model = data->model;
 
@@ -1904,35 +1897,20 @@ static int arm_smmu_device_dt_probe(struct 
platform_device *pdev)
}
}
 
-   INIT_LIST_HEAD(>list);
-   spin_lock(_smmu_devices_lock);
-   list_add(>list, _smmu_devices);
-   spin_unlock(_smmu_devices_lock);
-
+   platform_set_drvdata(pdev, smmu);
arm_smmu_device_reset(smmu);
return 0;
 }
 
 static int arm_smmu_device_remove(struct platform_device *pdev)
 {
-   struct device *dev = >dev;
-   struct arm_smmu_device *curr, *smmu = NULL;
-
-   spin_lock(_smmu_devices_lock);
-   list_for_each_entry(curr, _smmu_devices, list) {
-   if (curr->dev == dev) {
-   smmu = curr;
-   list_del(>list);
-   break;
-   }
-   }
-   spin_unlock(_smmu_devices_lock);
+   struct arm_smmu_device *smmu = platform_get_drvdata(pdev);
 
if (!smmu)
return -ENODEV;
 
if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS))
-   dev_err(dev, "removing device with active domains!\n");
+   dev_err(>dev, "removing device with active domains!\n");
 
/* Turn the thing off */
writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
-- 
2.8.1.dirty

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 07/22] iommu/arm-smmu: Implement of_xlate() for SMMUv3

2016-09-12 Thread Robin Murphy
Now that we can properly describe the mapping between PCI RIDs and
stream IDs via "iommu-map", and have it fed it to the driver
automatically via of_xlate(), rework the SMMUv3 driver to benefit from
that, and get rid of the current misuse of the "iommus" binding.

Since having of_xlate wired up means that masters will now be given the
appropriate DMA ops, we also need to make sure that default domains work
properly. This necessitates dispensing with the "whole group at a time"
notion for attaching to a domain, as devices which share a group get
attached to the group's default domain one by one as they are initially
probed.

Signed-off-by: Robin Murphy 

---

- Convert to updated fwspec mechanism
---
 drivers/iommu/arm-smmu-v3.c | 304 +++-
 1 file changed, 131 insertions(+), 173 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index d7fef5f99bfc..15ba80db6465 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -610,12 +611,9 @@ struct arm_smmu_device {
struct arm_smmu_strtab_cfg  strtab_cfg;
 };
 
-/* SMMU private data for an IOMMU group */
-struct arm_smmu_group {
+/* SMMU private data for each master */
+struct arm_smmu_master_data {
struct arm_smmu_device  *smmu;
-   struct arm_smmu_domain  *domain;
-   int num_sids;
-   u32 *sids;
struct arm_smmu_strtab_ent  ste;
 };
 
@@ -1555,20 +1553,6 @@ static int arm_smmu_domain_finalise(struct iommu_domain 
*domain)
return ret;
 }
 
-static struct arm_smmu_group *arm_smmu_group_get(struct device *dev)
-{
-   struct iommu_group *group;
-   struct arm_smmu_group *smmu_group;
-
-   group = iommu_group_get(dev);
-   if (!group)
-   return NULL;
-
-   smmu_group = iommu_group_get_iommudata(group);
-   iommu_group_put(group);
-   return smmu_group;
-}
-
 static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
 {
__le64 *step;
@@ -1591,27 +1575,17 @@ static __le64 *arm_smmu_get_step_for_sid(struct 
arm_smmu_device *smmu, u32 sid)
return step;
 }
 
-static int arm_smmu_install_ste_for_group(struct arm_smmu_group *smmu_group)
+static int arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec)
 {
int i;
-   struct arm_smmu_domain *smmu_domain = smmu_group->domain;
-   struct arm_smmu_strtab_ent *ste = _group->ste;
-   struct arm_smmu_device *smmu = smmu_group->smmu;
+   struct arm_smmu_master_data *master = fwspec->iommu_priv;
+   struct arm_smmu_device *smmu = master->smmu;
 
-   if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
-   ste->s1_cfg = _domain->s1_cfg;
-   ste->s2_cfg = NULL;
-   arm_smmu_write_ctx_desc(smmu, ste->s1_cfg);
-   } else {
-   ste->s1_cfg = NULL;
-   ste->s2_cfg = _domain->s2_cfg;
-   }
-
-   for (i = 0; i < smmu_group->num_sids; ++i) {
-   u32 sid = smmu_group->sids[i];
+   for (i = 0; i < fwspec->num_ids; ++i) {
+   u32 sid = fwspec->ids[i];
__le64 *step = arm_smmu_get_step_for_sid(smmu, sid);
 
-   arm_smmu_write_strtab_ent(smmu, sid, step, ste);
+   arm_smmu_write_strtab_ent(smmu, sid, step, >ste);
}
 
return 0;
@@ -1619,13 +1593,11 @@ static int arm_smmu_install_ste_for_group(struct 
arm_smmu_group *smmu_group)
 
 static void arm_smmu_detach_dev(struct device *dev)
 {
-   struct arm_smmu_group *smmu_group = arm_smmu_group_get(dev);
+   struct arm_smmu_master_data *master = dev->iommu_fwspec->iommu_priv;
 
-   smmu_group->ste.bypass = true;
-   if (arm_smmu_install_ste_for_group(smmu_group) < 0)
+   master->ste.bypass = true;
+   if (arm_smmu_install_ste_for_dev(dev->iommu_fwspec) < 0)
dev_warn(dev, "failed to install bypass STE\n");
-
-   smmu_group->domain = NULL;
 }
 
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
@@ -1633,16 +1605,20 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
int ret = 0;
struct arm_smmu_device *smmu;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-   struct arm_smmu_group *smmu_group = arm_smmu_group_get(dev);
+   struct arm_smmu_master_data *master;
+   struct arm_smmu_strtab_ent *ste;
 
-   if (!smmu_group)
+   if (!dev->iommu_fwspec)
return -ENOENT;
 
+   master = dev->iommu_fwspec->iommu_priv;
+   smmu = master->smmu;
+   ste = >ste;
+
/* Already attached to a different domain? */
-   if (smmu_group->domain && smmu_group->domain != smmu_domain)
+   if (!ste->bypass)

[PATCH v7 11/22] iommu/arm-smmu: Consolidate stream map entry state

2016-09-12 Thread Robin Murphy
In order to consider SMR masking, we really want to be able to validate
ID/mask pairs against existing SMR contents to prevent stream match
conflicts, which at best would cause transactions to fault unexpectedly,
and at worst lead to silent unpredictable behaviour. With our SMMU
instance data holding only an allocator bitmap, and the SMR values
themselves scattered across master configs hanging off devices which we
may have no way of finding, there's essentially no way short of digging
everything back out of the hardware. Similarly, the thought of power
management ops to support suspend/resume faces the exact same problem.

By massaging the software state into a closer shape to the underlying
hardware, everything comes together quite nicely; the allocator and the
high-level view of the data become a single centralised state which we
can easily keep track of, and to which any updates can be validated in
full before being synchronised to the hardware itself.

Signed-off-by: Robin Murphy 

---

- Make INVALID_SMENDX checks explicit
---
 drivers/iommu/arm-smmu.c | 147 +++
 1 file changed, 86 insertions(+), 61 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index f86d7887f69a..dfe13780ba54 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -28,6 +28,7 @@
 
 #define pr_fmt(fmt) "arm-smmu: " fmt
 
+#include 
 #include 
 #include 
 #include 
@@ -55,9 +56,6 @@
 /* Maximum number of context banks per SMMU */
 #define ARM_SMMU_MAX_CBS   128
 
-/* Maximum number of mapping groups per SMMU */
-#define ARM_SMMU_MAX_SMRS  128
-
 /* SMMU global address space */
 #define ARM_SMMU_GR0(smmu) ((smmu)->base)
 #define ARM_SMMU_GR1(smmu) ((smmu)->base + (1 << (smmu)->pgshift))
@@ -295,16 +293,17 @@ enum arm_smmu_implementation {
 };
 
 struct arm_smmu_smr {
-   u8  idx;
u16 mask;
u16 id;
+   boolvalid;
 };
 
 struct arm_smmu_master_cfg {
int num_streamids;
u16 streamids[MAX_MASTER_STREAMIDS];
-   struct arm_smmu_smr *smrs;
+   s16 smendx[MAX_MASTER_STREAMIDS];
 };
+#define INVALID_SMENDX -1
 
 struct arm_smmu_master {
struct device_node  *of_node;
@@ -346,7 +345,7 @@ struct arm_smmu_device {
u32 num_mapping_groups;
u16 streamid_mask;
u16 smr_mask_mask;
-   DECLARE_BITMAP(smr_map, ARM_SMMU_MAX_SMRS);
+   struct arm_smmu_smr *smrs;
 
unsigned long   va_size;
unsigned long   ipa_size;
@@ -550,6 +549,7 @@ static int register_smmu_master(struct arm_smmu_device 
*smmu,
return -ERANGE;
}
master->cfg.streamids[i] = streamid;
+   master->cfg.smendx[i] = INVALID_SMENDX;
}
return insert_smmu_master(smmu, master);
 }
@@ -1080,79 +1080,91 @@ static void arm_smmu_domain_free(struct iommu_domain 
*domain)
kfree(smmu_domain);
 }
 
-static int arm_smmu_master_configure_smrs(struct arm_smmu_device *smmu,
- struct arm_smmu_master_cfg *cfg)
+static int arm_smmu_alloc_smr(struct arm_smmu_device *smmu)
 {
int i;
-   struct arm_smmu_smr *smrs;
-   void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
 
-   if (!(smmu->features & ARM_SMMU_FEAT_STREAM_MATCH))
-   return 0;
+   for (i = 0; i < smmu->num_mapping_groups; i++)
+   if (!cmpxchg(>smrs[i].valid, false, true))
+   return i;
 
-   if (cfg->smrs)
-   return -EEXIST;
+   return INVALID_SMENDX;
+}
 
-   smrs = kmalloc_array(cfg->num_streamids, sizeof(*smrs), GFP_KERNEL);
-   if (!smrs) {
-   dev_err(smmu->dev, "failed to allocate %d SMRs\n",
-   cfg->num_streamids);
-   return -ENOMEM;
-   }
+static void arm_smmu_free_smr(struct arm_smmu_device *smmu, int idx)
+{
+   writel_relaxed(~SMR_VALID, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_SMR(idx));
+   WRITE_ONCE(smmu->smrs[idx].valid, false);
+}
+
+static void arm_smmu_write_smr(struct arm_smmu_device *smmu, int idx)
+{
+   struct arm_smmu_smr *smr = smmu->smrs + idx;
+   u32 reg = (smr->id & smmu->streamid_mask) << SMR_ID_SHIFT |
+ (smr->mask & smmu->smr_mask_mask) << SMR_MASK_SHIFT;
+
+   if (smr->valid)
+   reg |= SMR_VALID;
+   writel_relaxed(reg, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_SMR(idx));
+}
+
+static int arm_smmu_master_alloc_smes(struct arm_smmu_device *smmu,
+ struct 

[PATCH v7 12/22] iommu/arm-smmu: Keep track of S2CR state

2016-09-12 Thread Robin Murphy
Making S2CRs first-class citizens within the driver with a high-level
representation of their state offers a neat solution to a few problems:

Firstly, the information about which context a device's stream IDs are
associated with is already present by necessity in the S2CR. With that
state easily accessible we can refer directly to it and obviate the need
to track an IOMMU domain in each device's archdata (its earlier purpose
of enforcing correct attachment of multi-device groups now being handled
by the IOMMU core itself).

Secondly, the core API now deprecates explicit domain detach and expects
domain attach to move devices smoothly from one domain to another; for
SMMUv2, this notion maps directly to simply rewriting the S2CRs assigned
to the device. By giving the driver a suitable abstraction of those
S2CRs to work with, we can massively reduce the overhead of the current
heavy-handed "detach, free resources, reallocate resources, attach"
approach.

Thirdly, making the software state hardware-shaped and attached to the
SMMU instance once again makes suspend/resume of this register group
that much simpler to implement in future.

Signed-off-by: Robin Murphy 

---

- Make INVALID_SMENDX checks explicit
---
 drivers/iommu/arm-smmu.c | 159 +++
 1 file changed, 93 insertions(+), 66 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index dfe13780ba54..69b6cab65421 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -170,12 +170,20 @@
 #define S2CR_CBNDX_MASK0xff
 #define S2CR_TYPE_SHIFT16
 #define S2CR_TYPE_MASK 0x3
-#define S2CR_TYPE_TRANS(0 << S2CR_TYPE_SHIFT)
-#define S2CR_TYPE_BYPASS   (1 << S2CR_TYPE_SHIFT)
-#define S2CR_TYPE_FAULT(2 << S2CR_TYPE_SHIFT)
+enum arm_smmu_s2cr_type {
+   S2CR_TYPE_TRANS,
+   S2CR_TYPE_BYPASS,
+   S2CR_TYPE_FAULT,
+};
 
 #define S2CR_PRIVCFG_SHIFT 24
-#define S2CR_PRIVCFG_UNPRIV(2 << S2CR_PRIVCFG_SHIFT)
+#define S2CR_PRIVCFG_MASK  0x3
+enum arm_smmu_s2cr_privcfg {
+   S2CR_PRIVCFG_DEFAULT,
+   S2CR_PRIVCFG_DIPAN,
+   S2CR_PRIVCFG_UNPRIV,
+   S2CR_PRIVCFG_PRIV,
+};
 
 /* Context bank attribute registers */
 #define ARM_SMMU_GR1_CBAR(n)   (0x0 + ((n) << 2))
@@ -292,6 +300,16 @@ enum arm_smmu_implementation {
CAVIUM_SMMUV2,
 };
 
+struct arm_smmu_s2cr {
+   enum arm_smmu_s2cr_type type;
+   enum arm_smmu_s2cr_privcfg  privcfg;
+   u8  cbndx;
+};
+
+#define s2cr_init_val (struct arm_smmu_s2cr){  \
+   .type = disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS,\
+}
+
 struct arm_smmu_smr {
u16 mask;
u16 id;
@@ -346,6 +364,7 @@ struct arm_smmu_device {
u16 streamid_mask;
u16 smr_mask_mask;
struct arm_smmu_smr *smrs;
+   struct arm_smmu_s2cr*s2crs;
 
unsigned long   va_size;
unsigned long   ipa_size;
@@ -1108,6 +1127,23 @@ static void arm_smmu_write_smr(struct arm_smmu_device 
*smmu, int idx)
writel_relaxed(reg, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_SMR(idx));
 }
 
+static void arm_smmu_write_s2cr(struct arm_smmu_device *smmu, int idx)
+{
+   struct arm_smmu_s2cr *s2cr = smmu->s2crs + idx;
+   u32 reg = (s2cr->type & S2CR_TYPE_MASK) << S2CR_TYPE_SHIFT |
+ (s2cr->cbndx & S2CR_CBNDX_MASK) << S2CR_CBNDX_SHIFT |
+ (s2cr->privcfg & S2CR_PRIVCFG_MASK) << S2CR_PRIVCFG_SHIFT;
+
+   writel_relaxed(reg, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_S2CR(idx));
+}
+
+static void arm_smmu_write_sme(struct arm_smmu_device *smmu, int idx)
+{
+   arm_smmu_write_s2cr(smmu, idx);
+   if (smmu->smrs)
+   arm_smmu_write_smr(smmu, idx);
+}
+
 static int arm_smmu_master_alloc_smes(struct arm_smmu_device *smmu,
  struct arm_smmu_master_cfg *cfg)
 {
@@ -1158,6 +1194,23 @@ static void arm_smmu_master_free_smes(struct 
arm_smmu_device *smmu,
 {
int i;
 
+   /*
+* We *must* clear the S2CR first, because freeing the SMR means
+* that it can be re-allocated immediately.
+*/
+   for (i = 0; i < cfg->num_streamids; ++i) {
+   int idx = cfg->smendx[i];
+
+   /* An IOMMU group is torn down by the first device to be 
removed */
+   if (idx == INVALID_SMENDX)
+   return;
+
+   smmu->s2crs[idx] = s2cr_init_val;
+   arm_smmu_write_s2cr(smmu, idx);
+   }
+   /* Sync S2CR updates before touching anything else */
+   __iowmb();
+
/* Invalidate the SMRs before freeing back 

[PATCH v7 13/22] iommu/arm-smmu: Refactor mmu-masters handling

2016-09-12 Thread Robin Murphy
To be able to support the generic bindings and handle of_xlate() calls,
we need to be able to associate SMMUs and stream IDs directly with
devices *before* allocating IOMMU groups. Furthermore, to support real
default domains with multi-device groups we also have to handle domain
attach on a per-device basis, as the "whole group at a time" assumption
fails to properly handle subsequent devices added to a group after the
first has already triggered default domain creation and attachment.

To that end, use the now-vacant dev->archdata.iommu field for easy
config and SMMU instance lookup, and unify config management by chopping
down the platform-device-specific tree and probing the "mmu-masters"
property on-demand instead. This may add a bit of one-off overhead to
initially adding a new device, but we're about to deprecate that binding
in favour of the inherently-more-efficient generic ones anyway.

For the sake of simplicity, this patch does temporarily regress the case
of aliasing PCI devices by losing the duplicate stream ID detection that
the previous per-group config had. Stay tuned, because we'll be back to
fix that in a better and more general way momentarily...

Tested-by: Lorenzo Pieralisi 
Signed-off-by: Robin Murphy 
---
 drivers/iommu/arm-smmu.c | 382 +--
 1 file changed, 107 insertions(+), 275 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 69b6cab65421..0a628f2c9297 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -317,18 +317,13 @@ struct arm_smmu_smr {
 };
 
 struct arm_smmu_master_cfg {
+   struct arm_smmu_device  *smmu;
int num_streamids;
u16 streamids[MAX_MASTER_STREAMIDS];
s16 smendx[MAX_MASTER_STREAMIDS];
 };
 #define INVALID_SMENDX -1
 
-struct arm_smmu_master {
-   struct device_node  *of_node;
-   struct rb_node  node;
-   struct arm_smmu_master_cfg  cfg;
-};
-
 struct arm_smmu_device {
struct device   *dev;
 
@@ -376,7 +371,6 @@ struct arm_smmu_device {
unsigned int*irqs;
 
struct list_headlist;
-   struct rb_root  masters;
 
u32 cavium_id_base; /* Specific to Cavium */
 };
@@ -415,12 +409,6 @@ struct arm_smmu_domain {
struct iommu_domain domain;
 };
 
-struct arm_smmu_phandle_args {
-   struct device_node *np;
-   int args_count;
-   uint32_t args[MAX_MASTER_STREAMIDS];
-};
-
 static DEFINE_SPINLOCK(arm_smmu_devices_lock);
 static LIST_HEAD(arm_smmu_devices);
 
@@ -462,132 +450,89 @@ static struct device_node *dev_get_dev_node(struct 
device *dev)
 
while (!pci_is_root_bus(bus))
bus = bus->parent;
-   return bus->bridge->parent->of_node;
+   return of_node_get(bus->bridge->parent->of_node);
}
 
-   return dev->of_node;
+   return of_node_get(dev->of_node);
 }
 
-static struct arm_smmu_master *find_smmu_master(struct arm_smmu_device *smmu,
-   struct device_node *dev_node)
+static int __arm_smmu_get_pci_sid(struct pci_dev *pdev, u16 alias, void *data)
 {
-   struct rb_node *node = smmu->masters.rb_node;
-
-   while (node) {
-   struct arm_smmu_master *master;
-
-   master = container_of(node, struct arm_smmu_master, node);
-
-   if (dev_node < master->of_node)
-   node = node->rb_left;
-   else if (dev_node > master->of_node)
-   node = node->rb_right;
-   else
-   return master;
-   }
-
-   return NULL;
+   *((__be32 *)data) = cpu_to_be32(alias);
+   return 0; /* Continue walking */
 }
 
-static struct arm_smmu_master_cfg *
-find_smmu_master_cfg(struct device *dev)
+static int __find_legacy_master_phandle(struct device *dev, void *data)
 {
-   struct arm_smmu_master_cfg *cfg = NULL;
-   struct iommu_group *group = iommu_group_get(dev);
+   struct of_phandle_iterator *it = *(void **)data;
+   struct device_node *np = it->node;
+   int err;
 
-   if (group) {
-   cfg = iommu_group_get_iommudata(group);
-   iommu_group_put(group);
-   }
-
-   return cfg;
-}
-
-static int insert_smmu_master(struct arm_smmu_device *smmu,
- struct arm_smmu_master *master)
-{
-   struct rb_node **new, *parent;
-
-   new = >masters.rb_node;
-   parent = NULL;
-   while (*new) {
-   struct arm_smmu_master *this
-   = container_of(*new, struct arm_smmu_master, node);
-
-   parent = *new;
-   if 

[PATCH v7 10/22] iommu/arm-smmu: Handle stream IDs more dynamically

2016-09-12 Thread Robin Murphy
Rather than assuming fixed worst-case values for stream IDs and SMR
masks, keep track of whatever implemented bits the hardware actually
reports. This also obviates the slightly questionable validation of SMR
fields in isolation - rather than aborting the whole SMMU probe for a
hardware configuration which is still architecturally valid, we can
simply refuse masters later if they try to claim an unrepresentable ID
or mask (which almost certainly implies a DT error anyway).

Acked-by: Will Deacon 
Tested-by: Lorenzo Pieralisi 
Signed-off-by: Robin Murphy 
---
 drivers/iommu/arm-smmu.c | 43 ++-
 1 file changed, 22 insertions(+), 21 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 4b1c87e947fd..f86d7887f69a 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -165,9 +165,7 @@
 #define ARM_SMMU_GR0_SMR(n)(0x800 + ((n) << 2))
 #define SMR_VALID  (1 << 31)
 #define SMR_MASK_SHIFT 16
-#define SMR_MASK_MASK  0x7fff
 #define SMR_ID_SHIFT   0
-#define SMR_ID_MASK0x7fff
 
 #define ARM_SMMU_GR0_S2CR(n)   (0xc00 + ((n) << 2))
 #define S2CR_CBNDX_SHIFT   0
@@ -346,6 +344,8 @@ struct arm_smmu_device {
atomic_tirptndx;
 
u32 num_mapping_groups;
+   u16 streamid_mask;
+   u16 smr_mask_mask;
DECLARE_BITMAP(smr_map, ARM_SMMU_MAX_SMRS);
 
unsigned long   va_size;
@@ -1715,39 +1715,40 @@ static int arm_smmu_device_cfg_probe(struct 
arm_smmu_device *smmu)
dev_notice(smmu->dev,
   "\t(IDR0.CTTW overridden by dma-coherent 
property)\n");
 
+   /* Max. number of entries we have for stream matching/indexing */
+   size = 1 << ((id >> ID0_NUMSIDB_SHIFT) & ID0_NUMSIDB_MASK);
+   smmu->streamid_mask = size - 1;
if (id & ID0_SMS) {
-   u32 smr, sid, mask;
+   u32 smr;
 
smmu->features |= ARM_SMMU_FEAT_STREAM_MATCH;
-   smmu->num_mapping_groups = (id >> ID0_NUMSMRG_SHIFT) &
-  ID0_NUMSMRG_MASK;
-   if (smmu->num_mapping_groups == 0) {
+   size = (id >> ID0_NUMSMRG_SHIFT) & ID0_NUMSMRG_MASK;
+   if (size == 0) {
dev_err(smmu->dev,
"stream-matching supported, but no SMRs 
present!\n");
return -ENODEV;
}
 
-   smr = SMR_MASK_MASK << SMR_MASK_SHIFT;
-   smr |= (SMR_ID_MASK << SMR_ID_SHIFT);
+   /*
+* SMR.ID bits may not be preserved if the corresponding MASK
+* bits are set, so check each one separately. We can reject
+* masters later if they try to claim IDs outside these masks.
+*/
+   smr = smmu->streamid_mask << SMR_ID_SHIFT;
writel_relaxed(smr, gr0_base + ARM_SMMU_GR0_SMR(0));
smr = readl_relaxed(gr0_base + ARM_SMMU_GR0_SMR(0));
+   smmu->streamid_mask = smr >> SMR_ID_SHIFT;
 
-   mask = (smr >> SMR_MASK_SHIFT) & SMR_MASK_MASK;
-   sid = (smr >> SMR_ID_SHIFT) & SMR_ID_MASK;
-   if ((mask & sid) != sid) {
-   dev_err(smmu->dev,
-   "SMR mask bits (0x%x) insufficient for ID field 
(0x%x)\n",
-   mask, sid);
-   return -ENODEV;
-   }
+   smr = smmu->streamid_mask << SMR_MASK_SHIFT;
+   writel_relaxed(smr, gr0_base + ARM_SMMU_GR0_SMR(0));
+   smr = readl_relaxed(gr0_base + ARM_SMMU_GR0_SMR(0));
+   smmu->smr_mask_mask = smr >> SMR_MASK_SHIFT;
 
dev_notice(smmu->dev,
-  "\tstream matching with %u register groups, mask 
0x%x",
-  smmu->num_mapping_groups, mask);
-   } else {
-   smmu->num_mapping_groups = (id >> ID0_NUMSIDB_SHIFT) &
-  ID0_NUMSIDB_MASK;
+  "\tstream matching with %lu register groups, mask 
0x%x",
+  size, smmu->smr_mask_mask);
}
+   smmu->num_mapping_groups = size;
 
if (smmu->version < ARM_SMMU_V2 || !(id & ID0_PTFS_NO_AARCH32)) {
smmu->features |= ARM_SMMU_FEAT_FMT_AARCH32_L;
-- 
2.8.1.dirty

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 00/22] Generic DT bindings for PCI IOMMUs and ARM SMMU

2016-09-12 Thread Robin Murphy
Hi all,

To any more confusing fixups and crazily numbered extra patches, here's
a quick v7 with everything rebased into the right order. The significant
change this time is to implement iommu_fwspec properly from the start,
which ends up being far simpler and more robust than faffing about
introducing it somewhere 'less intrusive' to move toward core code later.

New branch in the logical place:

git://linux-arm.org/linux-rm iommu/generic-v7

Robin.

Mark Rutland (1):
  Docs: dt: add PCI IOMMU map bindings

Robin Murphy (21):
  of/irq: Break out msi-map lookup (again)
  iommu/of: Handle iommu-map property for PCI
  iommu: Introduce iommu_fwspec
  Docs: dt: document ARM SMMUv3 generic binding usage
  iommu/arm-smmu: Fall back to global bypass
  iommu/arm-smmu: Implement of_xlate() for SMMUv3
  iommu/arm-smmu: Support non-PCI devices with SMMUv3
  iommu/arm-smmu: Set PRIVCFG in stage 1 STEs
  iommu/arm-smmu: Handle stream IDs more dynamically
  iommu/arm-smmu: Consolidate stream map entry state
  iommu/arm-smmu: Keep track of S2CR state
  iommu/arm-smmu: Refactor mmu-masters handling
  iommu/arm-smmu: Streamline SMMU data lookups
  iommu/arm-smmu: Add a stream map entry iterator
  iommu/arm-smmu: Intelligent SMR allocation
  iommu/arm-smmu: Convert to iommu_fwspec
  Docs: dt: document ARM SMMU generic binding usage
  iommu/arm-smmu: Wire up generic configuration support
  iommu/arm-smmu: Set domain geometry
  iommu/dma: Add support for mapping MSIs
  iommu/dma: Avoid PCI host bridge windows

 .../devicetree/bindings/iommu/arm,smmu-v3.txt  |   8 +-
 .../devicetree/bindings/iommu/arm,smmu.txt |  63 +-
 .../devicetree/bindings/pci/pci-iommu.txt  | 171 
 arch/arm64/mm/dma-mapping.c|   2 +-
 drivers/gpu/drm/exynos/exynos_drm_iommu.h  |   2 +-
 drivers/iommu/Kconfig  |   2 +-
 drivers/iommu/arm-smmu-v3.c| 386 +
 drivers/iommu/arm-smmu.c   | 962 ++---
 drivers/iommu/dma-iommu.c  | 161 +++-
 drivers/iommu/iommu.c  |  56 ++
 drivers/iommu/of_iommu.c   |  52 +-
 drivers/irqchip/irq-gic-v2m.c  |   3 +
 drivers/irqchip/irq-gic-v3-its.c   |   3 +
 drivers/of/irq.c   |  78 +-
 drivers/of/of_pci.c| 102 +++
 include/linux/device.h |   3 +
 include/linux/dma-iommu.h  |  12 +-
 include/linux/iommu.h  |  38 +
 include/linux/of_pci.h |  10 +
 19 files changed, 1323 insertions(+), 791 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/pci/pci-iommu.txt

-- 
2.8.1.dirty

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 08/22] iommu/arm-smmu: Support non-PCI devices with SMMUv3

2016-09-12 Thread Robin Murphy
With the device <-> stream ID relationship suitably abstracted and
of_xlate() hooked up, the PCI dependency now looks, and is, entirely
arbitrary. Any bus using the of_dma_configure() mechanism will work,
so extend support to the platform and AMBA buses which do just that.

Acked-by: Will Deacon 
Tested-by: Lorenzo Pieralisi 
Signed-off-by: Robin Murphy 
---
 drivers/iommu/Kconfig   |  2 +-
 drivers/iommu/arm-smmu-v3.c | 37 +++--
 2 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index d432ca828472..8ee54d71c7eb 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -309,7 +309,7 @@ config ARM_SMMU
 
 config ARM_SMMU_V3
bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
-   depends on ARM64 && PCI
+   depends on ARM64
select IOMMU_API
select IOMMU_IO_PGTABLE_LPAE
select GENERIC_MSI_IRQ_DOMAIN
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 15ba80db6465..52860bcf80f2 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -35,6 +35,8 @@
 #include 
 #include 
 
+#include 
+
 #include "io-pgtable.h"
 
 /* MMIO registers */
@@ -1805,6 +1807,23 @@ static void arm_smmu_remove_device(struct device *dev)
iommu_fwspec_free(dev);
 }
 
+static struct iommu_group *arm_smmu_device_group(struct device *dev)
+{
+   struct iommu_group *group;
+
+   /*
+* We don't support devices sharing stream IDs other than PCI RID
+* aliases, since the necessary ID-to-device lookup becomes rather
+* impractical given a potential sparse 32-bit stream ID space.
+*/
+   if (dev_is_pci(dev))
+   group = pci_device_group(dev);
+   else
+   group = generic_device_group(dev);
+
+   return group;
+}
+
 static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
enum iommu_attr attr, void *data)
 {
@@ -1851,10 +1870,6 @@ out_unlock:
 
 static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 {
-   /* We only support PCI, for now */
-   if (!dev_is_pci(dev))
-   return -ENODEV;
-
return iommu_fwspec_add_ids(dev, args->args, 1);
 }
 
@@ -1869,7 +1884,7 @@ static struct iommu_ops arm_smmu_ops = {
.iova_to_phys   = arm_smmu_iova_to_phys,
.add_device = arm_smmu_add_device,
.remove_device  = arm_smmu_remove_device,
-   .device_group   = pci_device_group,
+   .device_group   = arm_smmu_device_group,
.domain_get_attr= arm_smmu_domain_get_attr,
.domain_set_attr= arm_smmu_domain_set_attr,
.of_xlate   = arm_smmu_of_xlate,
@@ -2613,8 +2628,18 @@ static int arm_smmu_device_dt_probe(struct 
platform_device *pdev)
 
/* And we're up. Go go go! */
of_iommu_set_ops(dev->of_node, _smmu_ops);
+#ifdef CONFIG_PCI
pci_request_acs();
-   return bus_set_iommu(_bus_type, _smmu_ops);
+   ret = bus_set_iommu(_bus_type, _smmu_ops);
+   if (ret)
+   return ret;
+#endif
+#ifdef CONFIG_ARM_AMBA
+   ret = bus_set_iommu(_bustype, _smmu_ops);
+   if (ret)
+   return ret;
+#endif
+   return bus_set_iommu(_bus_type, _smmu_ops);
 }
 
 static int arm_smmu_device_remove(struct platform_device *pdev)
-- 
2.8.1.dirty

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 05/22] Docs: dt: document ARM SMMUv3 generic binding usage

2016-09-12 Thread Robin Murphy
We're about to ratify our use of the generic binding, so document it.

CC: Rob Herring 
CC: Mark Rutland 
Signed-off-by: Robin Murphy 

---

- Reference PCI "iommu-map" binding instead, as that's our main concern
- Fix "IDs" typo
---
 Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt 
b/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
index 7b94c88cf2ee..be57550e14e4 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
@@ -27,6 +27,12 @@ the PCIe specification.
   * "cmdq-sync" - CMD_SYNC complete
   * "gerror"- Global Error activated
 
+- #iommu-cells  : See the generic IOMMU binding described in
+devicetree/bindings/pci/pci-iommu.txt
+  for details. For SMMUv3, must be 1, with each cell
+  describing a single stream ID. All possible stream
+  IDs which a device may emit must be described.
+
 ** SMMUv3 optional properties:
 
 - dma-coherent  : Present if DMA operations made by the SMMU (page
@@ -54,6 +60,6 @@ the PCIe specification.
  ;
 interrupt-names = "eventq", "priq", "cmdq-sync", "gerror";
 dma-coherent;
-#iommu-cells = <0>;
+#iommu-cells = <1>;
 msi-parent = < 0xff>;
 };
-- 
2.8.1.dirty

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 06/22] iommu/arm-smmu: Fall back to global bypass

2016-09-12 Thread Robin Murphy
Unlike SMMUv2, SMMUv3 has no easy way to bypass unknown stream IDs,
other than allocating and filling in the entire stream table with bypass
entries, which for some configurations would waste *gigabytes* of RAM.
Otherwise, all transactions on unknown stream IDs will simply be aborted
with a C_BAD_STREAMID event.

Rather than render the system unusable in the case of an invalid DT,
avoid enabling the SMMU altogether such that everything bypasses
(though letting the explicit disable_bypass option take precedence).

Signed-off-by: Robin Murphy 

---

- Implement proper GBPA update procedure per the spec
---
 drivers/iommu/arm-smmu-v3.c | 48 +
 1 file changed, 44 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 5db6931c715c..d7fef5f99bfc 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -123,6 +123,10 @@
 #define CR2_RECINVSID  (1 << 1)
 #define CR2_E2H(1 << 0)
 
+#define ARM_SMMU_GBPA  0x44
+#define GBPA_ABORT (1 << 20)
+#define GBPA_UPDATE(1 << 31)
+
 #define ARM_SMMU_IRQ_CTRL  0x50
 #define IRQ_CTRL_EVTQ_IRQEN(1 << 2)
 #define IRQ_CTRL_PRIQ_IRQEN(1 << 1)
@@ -2124,6 +2128,24 @@ static int arm_smmu_write_reg_sync(struct 
arm_smmu_device *smmu, u32 val,
  1, ARM_SMMU_POLL_TIMEOUT_US);
 }
 
+/* GBPA is "special" */
+static int arm_smmu_update_gbpa(struct arm_smmu_device *smmu, u32 set, u32 clr)
+{
+   int ret;
+   u32 reg, __iomem *gbpa = smmu->base + ARM_SMMU_GBPA;
+
+   ret = readl_relaxed_poll_timeout(gbpa, reg, !(reg & GBPA_UPDATE),
+1, ARM_SMMU_POLL_TIMEOUT_US);
+   if (ret)
+   return ret;
+
+   reg &= ~clr;
+   reg |= set;
+   writel_relaxed(reg | GBPA_UPDATE, gbpa);
+   return readl_relaxed_poll_timeout(gbpa, reg, !(reg & GBPA_UPDATE),
+ 1, ARM_SMMU_POLL_TIMEOUT_US);
+}
+
 static void arm_smmu_free_msis(void *data)
 {
struct device *dev = data;
@@ -2269,7 +2291,7 @@ static int arm_smmu_device_disable(struct arm_smmu_device 
*smmu)
return ret;
 }
 
-static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
+static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
 {
int ret;
u32 reg, enables;
@@ -2370,8 +2392,17 @@ static int arm_smmu_device_reset(struct arm_smmu_device 
*smmu)
return ret;
}
 
-   /* Enable the SMMU interface */
-   enables |= CR0_SMMUEN;
+
+   /* Enable the SMMU interface, or ensure bypass */
+   if (!bypass || disable_bypass) {
+   enables |= CR0_SMMUEN;
+   } else {
+   ret = arm_smmu_update_gbpa(smmu, 0, GBPA_ABORT);
+   if (ret) {
+   dev_err(smmu->dev, "GBPA not responding to update\n");
+   return ret;
+   }
+   }
ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
  ARM_SMMU_CR0ACK);
if (ret) {
@@ -2570,6 +2601,15 @@ static int arm_smmu_device_dt_probe(struct 
platform_device *pdev)
struct resource *res;
struct arm_smmu_device *smmu;
struct device *dev = >dev;
+   bool bypass = true;
+   u32 cells;
+
+   if (of_property_read_u32(dev->of_node, "#iommu-cells", ))
+   dev_err(dev, "missing #iommu-cells property\n");
+   else if (cells != 1)
+   dev_err(dev, "invalid #iommu-cells value (%d)\n", cells);
+   else
+   bypass = false;
 
smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
if (!smmu) {
@@ -2622,7 +2662,7 @@ static int arm_smmu_device_dt_probe(struct 
platform_device *pdev)
platform_set_drvdata(pdev, smmu);
 
/* Reset the device */
-   return arm_smmu_device_reset(smmu);
+   return arm_smmu_device_reset(smmu, bypass);
 }
 
 static int arm_smmu_device_remove(struct platform_device *pdev)
-- 
2.8.1.dirty

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 03/22] iommu/of: Handle iommu-map property for PCI

2016-09-12 Thread Robin Murphy
Now that we have a way to pick up the RID translation and target IOMMU,
hook up of_iommu_configure() to bring PCI devices into the of_xlate
mechanism and allow them IOMMU-backed DMA ops without the need for
driver-specific handling.

Reviewed-by: Will Deacon 
Signed-off-by: Robin Murphy 

---

- Split PCI handling into a separate function
---
 drivers/iommu/of_iommu.c | 46 +-
 1 file changed, 41 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 57f23eaaa2f9..19e1e8f2f871 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static const struct of_device_id __iommu_of_table_sentinel
@@ -134,6 +135,45 @@ const struct iommu_ops *of_iommu_get_ops(struct 
device_node *np)
return ops;
 }
 
+static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
+{
+   struct of_phandle_args *iommu_spec = data;
+
+   iommu_spec->args[0] = alias;
+   return iommu_spec->np == pdev->bus->dev.of_node;
+}
+
+static const struct iommu_ops
+*of_pci_iommu_configure(struct pci_dev *pdev, struct device_node *bridge_np)
+{
+   const struct iommu_ops *ops;
+   struct of_phandle_args iommu_spec;
+
+   /*
+* Start by tracing the RID alias down the PCI topology as
+* far as the host bridge whose OF node we have...
+* (we're not even attempting to handle multi-alias devices yet)
+*/
+   iommu_spec.args_count = 1;
+   iommu_spec.np = bridge_np;
+   pci_for_each_dma_alias(pdev, __get_pci_rid, _spec);
+   /*
+* ...then find out what that becomes once it escapes the PCI
+* bus into the system beyond, and which IOMMU it ends up at.
+*/
+   iommu_spec.np = NULL;
+   if (of_pci_map_rid(bridge_np, iommu_spec.args[0], "iommu-map",
+  "iommu-map-mask", _spec.np, iommu_spec.args))
+   return NULL;
+
+   ops = of_iommu_get_ops(iommu_spec.np);
+   if (!ops || !ops->of_xlate || ops->of_xlate(>dev, _spec))
+   ops = NULL;
+
+   of_node_put(iommu_spec.np);
+   return ops;
+}
+
 const struct iommu_ops *of_iommu_configure(struct device *dev,
   struct device_node *master_np)
 {
@@ -142,12 +182,8 @@ const struct iommu_ops *of_iommu_configure(struct device 
*dev,
const struct iommu_ops *ops = NULL;
int idx = 0;
 
-   /*
-* We can't do much for PCI devices without knowing how
-* device IDs are wired up from the PCI bus to the IOMMU.
-*/
if (dev_is_pci(dev))
-   return NULL;
+   return of_pci_iommu_configure(to_pci_dev(dev), master_np);
 
/*
 * We don't currently walk up the tree looking for a parent IOMMU.
-- 
2.8.1.dirty

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 01/22] Docs: dt: add PCI IOMMU map bindings

2016-09-12 Thread Robin Murphy
From: Mark Rutland 

The existing IOMMU bindings are able to specify the relationship between
masters and IOMMUs, but they are insufficient for describing the general
case of hotpluggable busses such as PCI where the set of masters is not
known until runtime, and the relationship between masters and IOMMUs is
a property of the integration of the system.

This patch adds a generic binding for mapping PCI devices to IOMMUs,
using a new iommu-map property (specific to PCI*) which may be used to
map devices (identified by their Requester ID) to sideband data for the
IOMMU which they master through.

Acked-by: Rob Herring 
Acked-by: Will Deacon 
Signed-off-by: Mark Rutland 
---
 .../devicetree/bindings/pci/pci-iommu.txt  | 171 +
 1 file changed, 171 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pci/pci-iommu.txt

diff --git a/Documentation/devicetree/bindings/pci/pci-iommu.txt 
b/Documentation/devicetree/bindings/pci/pci-iommu.txt
new file mode 100644
index ..56c829621b9a
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/pci-iommu.txt
@@ -0,0 +1,171 @@
+This document describes the generic device tree binding for describing the
+relationship between PCI(e) devices and IOMMU(s).
+
+Each PCI(e) device under a root complex is uniquely identified by its Requester
+ID (AKA RID). A Requester ID is a triplet of a Bus number, Device number, and
+Function number.
+
+For the purpose of this document, when treated as a numeric value, a RID is
+formatted such that:
+
+* Bits [15:8] are the Bus number.
+* Bits [7:3] are the Device number.
+* Bits [2:0] are the Function number.
+* Any other bits required for padding must be zero.
+
+IOMMUs may distinguish PCI devices through sideband data derived from the
+Requester ID. While a given PCI device can only master through one IOMMU, a
+root complex may split masters across a set of IOMMUs (e.g. with one IOMMU per
+bus).
+
+The generic 'iommus' property is insufficient to describe this relationship,
+and a mechanism is required to map from a PCI device to its IOMMU and sideband
+data.
+
+For generic IOMMU bindings, see
+Documentation/devicetree/bindings/iommu/iommu.txt.
+
+
+PCI root complex
+
+
+Optional properties
+---
+
+- iommu-map: Maps a Requester ID to an IOMMU and associated iommu-specifier
+  data.
+
+  The property is an arbitrary number of tuples of
+  (rid-base,iommu,iommu-base,length).
+
+  Any RID r in the interval [rid-base, rid-base + length) is associated with
+  the listed IOMMU, with the iommu-specifier (r - rid-base + iommu-base).
+
+- iommu-map-mask: A mask to be applied to each Requester ID prior to being
+  mapped to an iommu-specifier per the iommu-map property.
+
+
+Example (1)
+===
+
+/ {
+   #address-cells = <1>;
+   #size-cells = <1>;
+
+   iommu: iommu@a {
+   reg = <0xa 0x1>;
+   compatible = "vendor,some-iommu";
+   #iommu-cells = <1>;
+   };
+
+   pci: pci@f {
+   reg = <0xf 0x1>;
+   compatible = "vendor,pcie-root-complex";
+   device_type = "pci";
+
+   /*
+* The sideband data provided to the IOMMU is the RID,
+* identity-mapped.
+*/
+   iommu-map = <0x0  0x0 0x1>;
+   };
+};
+
+
+Example (2)
+===
+
+/ {
+   #address-cells = <1>;
+   #size-cells = <1>;
+
+   iommu: iommu@a {
+   reg = <0xa 0x1>;
+   compatible = "vendor,some-iommu";
+   #iommu-cells = <1>;
+   };
+
+   pci: pci@f {
+   reg = <0xf 0x1>;
+   compatible = "vendor,pcie-root-complex";
+   device_type = "pci";
+
+   /*
+* The sideband data provided to the IOMMU is the RID with the
+* function bits masked out.
+*/
+   iommu-map = <0x0  0x0 0x1>;
+   iommu-map-mask = <0xfff8>;
+   };
+};
+
+
+Example (3)
+===
+
+/ {
+   #address-cells = <1>;
+   #size-cells = <1>;
+
+   iommu: iommu@a {
+   reg = <0xa 0x1>;
+   compatible = "vendor,some-iommu";
+   #iommu-cells = <1>;
+   };
+
+   pci: pci@f {
+   reg = <0xf 0x1>;
+   compatible = "vendor,pcie-root-complex";
+   device_type = "pci";
+
+   /*
+* The sideband data provided to the IOMMU is the RID,
+* but the high bits of the bus number are flipped.
+*/
+   iommu-map = <0x  0x8000 0x8000>,
+   <0x8000  0x 0x8000>;
+   };
+};
+
+
+Example (4)
+===
+
+/ {
+   #address-cells = <1>;
+   #size-cells = <1>;
+
+   iommu_a: iommu@a {
+   reg = <0xa 0x1>;
+ 

[PATCH v7 02/22] of/irq: Break out msi-map lookup (again)

2016-09-12 Thread Robin Murphy
The PCI msi-map code is already doing double-duty translating IDs and
retrieving MSI parents, which unsurprisingly is the same functionality
we need for the identically-formatted PCI iommu-map property. Drag the
core parsing routine up yet another layer into the general OF-PCI code,
and further generalise it for either kind of lookup in either flavour
of map property.

Acked-by: Rob Herring 
Acked-by: Marc Zyngier 
Tested-by: Lorenzo Pieralisi 
Signed-off-by: Robin Murphy 
---
 drivers/of/irq.c   |  78 ++---
 drivers/of/of_pci.c| 102 +
 include/linux/of_pci.h |  10 +
 3 files changed, 116 insertions(+), 74 deletions(-)

diff --git a/drivers/of/irq.c b/drivers/of/irq.c
index a2e68f740eda..393fea85eb4e 100644
--- a/drivers/of/irq.c
+++ b/drivers/of/irq.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -592,87 +593,16 @@ static u32 __of_msi_map_rid(struct device *dev, struct 
device_node **np,
u32 rid_in)
 {
struct device *parent_dev;
-   struct device_node *msi_controller_node;
-   struct device_node *msi_np = *np;
-   u32 map_mask, masked_rid, rid_base, msi_base, rid_len, phandle;
-   int msi_map_len;
-   bool matched;
u32 rid_out = rid_in;
-   const __be32 *msi_map = NULL;
 
/*
 * Walk up the device parent links looking for one with a
 * "msi-map" property.
 */
-   for (parent_dev = dev; parent_dev; parent_dev = parent_dev->parent) {
-   if (!parent_dev->of_node)
-   continue;
-
-   msi_map = of_get_property(parent_dev->of_node,
- "msi-map", _map_len);
-   if (!msi_map)
-   continue;
-
-   if (msi_map_len % (4 * sizeof(__be32))) {
-   dev_err(parent_dev, "Error: Bad msi-map length: %d\n",
-   msi_map_len);
-   return rid_out;
-   }
-   /* We have a good parent_dev and msi_map, let's use them. */
-   break;
-   }
-   if (!msi_map)
-   return rid_out;
-
-   /* The default is to select all bits. */
-   map_mask = 0x;
-
-   /*
-* Can be overridden by "msi-map-mask" property.  If
-* of_property_read_u32() fails, the default is used.
-*/
-   of_property_read_u32(parent_dev->of_node, "msi-map-mask", _mask);
-
-   masked_rid = map_mask & rid_in;
-   matched = false;
-   while (!matched && msi_map_len >= 4 * sizeof(__be32)) {
-   rid_base = be32_to_cpup(msi_map + 0);
-   phandle = be32_to_cpup(msi_map + 1);
-   msi_base = be32_to_cpup(msi_map + 2);
-   rid_len = be32_to_cpup(msi_map + 3);
-
-   if (rid_base & ~map_mask) {
-   dev_err(parent_dev,
-   "Invalid msi-map translation - msi-map-mask 
(0x%x) ignores rid-base (0x%x)\n",
-   map_mask, rid_base);
-   return rid_out;
-   }
-
-   msi_controller_node = of_find_node_by_phandle(phandle);
-
-   matched = (masked_rid >= rid_base &&
-  masked_rid < rid_base + rid_len);
-   if (msi_np)
-   matched &= msi_np == msi_controller_node;
-
-   if (matched && !msi_np) {
-   *np = msi_np = msi_controller_node;
+   for (parent_dev = dev; parent_dev; parent_dev = parent_dev->parent)
+   if (!of_pci_map_rid(parent_dev->of_node, rid_in, "msi-map",
+   "msi-map-mask", np, _out))
break;
-   }
-
-   of_node_put(msi_controller_node);
-   msi_map_len -= 4 * sizeof(__be32);
-   msi_map += 4;
-   }
-   if (!matched)
-   return rid_out;
-
-   rid_out = masked_rid - rid_base + msi_base;
-   dev_dbg(dev,
-   "msi-map at: %s, using mask %08x, rid-base: %08x, msi-base: 
%08x, length: %08x, rid: %08x -> %08x\n",
-   dev_name(parent_dev), map_mask, rid_base, msi_base,
-   rid_len, rid_in, rid_out);
-
return rid_out;
 }
 
diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 589b30c68e14..b58be12ab277 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -308,3 +308,105 @@ struct msi_controller 
*of_pci_find_msi_chip_by_node(struct device_node *of_node)
 EXPORT_SYMBOL_GPL(of_pci_find_msi_chip_by_node);
 
 #endif /* CONFIG_PCI_MSI */
+
+/**
+ * of_pci_map_rid - Translate a requester ID through a downstream mapping.
+ * @np: root complex device node.
+ * @rid: PCI requester ID to map.
+ * 

Re: [RFC PATCH v2 13/20] x86: Decrypt trampoline area if memory encryption is active

2016-09-12 Thread Tom Lendacky
On 09/09/2016 12:34 PM, Borislav Petkov wrote:
> On Mon, Aug 22, 2016 at 05:37:57PM -0500, Tom Lendacky wrote:
>> When Secure Memory Encryption is enabled, the trampoline area must not
>> be encrypted. A cpu running in real mode will not be able to decrypt
> 
> s/cpu/CPU/... always :-)

Ok.

> 
>> memory that has been encrypted because it will not be able to use addresses
>> with the memory encryption mask.
>>
>> Signed-off-by: Tom Lendacky 
>> ---
>>  arch/x86/realmode/init.c |9 +
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
>> index 5db706f1..f74925f 100644
>> --- a/arch/x86/realmode/init.c
>> +++ b/arch/x86/realmode/init.c
>> @@ -6,6 +6,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  struct real_mode_header *real_mode_header;
>>  u32 *trampoline_cr4_features;
>> @@ -130,6 +131,14 @@ static void __init set_real_mode_permissions(void)
>>  unsigned long text_start =
>>  (unsigned long) __va(real_mode_header->text_start);
>>  
>> +/*
>> + * If memory encryption is active, the trampoline area will need to
>> + * be in non-encrypted memory in order to bring up other processors
> 
> Let's stick with either "unencrypted" - I'd prefer that one - or
> "non-encrypted" nomenclature so that there's no distraction. I see both
> versions in the patchset.

Yup, I'll audit the code and make everything consistent.

Thanks,
Tom

> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v2 12/20] x86: Add support for changing memory encryption attribute

2016-09-12 Thread Tom Lendacky
On 09/09/2016 12:23 PM, Borislav Petkov wrote:
> On Mon, Aug 22, 2016 at 05:37:49PM -0500, Tom Lendacky wrote:
>> This patch adds support to be change the memory encryption attribute for
>> one or more memory pages.
>>
>> Signed-off-by: Tom Lendacky 
>> ---
>>  arch/x86/include/asm/cacheflush.h  |3 +
>>  arch/x86/include/asm/mem_encrypt.h |   13 ++
>>  arch/x86/mm/mem_encrypt.c  |   43 +
>>  arch/x86/mm/pageattr.c |   75 
>> 
>>  4 files changed, 134 insertions(+)
> 
> ...
> 
>> diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
>> index 72c292d..0ba9382 100644
>> --- a/arch/x86/mm/pageattr.c
>> +++ b/arch/x86/mm/pageattr.c
>> @@ -1728,6 +1728,81 @@ int set_memory_4k(unsigned long addr, int numpages)
>>  __pgprot(0), 1, 0, NULL);
>>  }
>>  
>> +static int __set_memory_enc_dec(struct cpa_data *cpa)
>> +{
>> +unsigned long addr;
>> +int numpages;
>> +int ret;
>> +
>> +if (*cpa->vaddr & ~PAGE_MASK) {
>> +*cpa->vaddr &= PAGE_MASK;
>> +
>> +/* People should not be passing in unaligned addresses */
>> +WARN_ON_ONCE(1);
> 
> Let's make this more user-friendly:
> 
>   if (WARN_ONCE(*cpa->vaddr & ~PAGE_MASK, "Misaligned address: 0x%lx\n", 
> *cpa->vaddr))
>   *cpa->vaddr &= PAGE_MASK;

Will do.

> 
>> +}
>> +
>> +addr = *cpa->vaddr;
>> +numpages = cpa->numpages;
>> +
>> +/* Must avoid aliasing mappings in the highmem code */
>> +kmap_flush_unused();
>> +vm_unmap_aliases();
>> +
>> +ret = __change_page_attr_set_clr(cpa, 1);
>> +
>> +/* Check whether we really changed something */
>> +if (!(cpa->flags & CPA_FLUSHTLB))
>> +goto out;
>> +
>> +/*
>> + * On success we use CLFLUSH, when the CPU supports it to
>> + * avoid the WBINVD.
>> + */
>> +if (!ret && static_cpu_has(X86_FEATURE_CLFLUSH))
>> +cpa_flush_range(addr, numpages, 1);
>> +else
>> +cpa_flush_all(1);
> 
> So if we fail (ret != 0) we do WBINVD unconditionally even if we don't
> have to?

Looking at __change_page_attr_set_clr() isn't it possible for some of
the pages to be changed before an error is encountered since it is
looping?  If so, we may still need to flush. The CPA_FLUSHTLB flag
should take care of a failing case where no attributes have actually
been changed.

Thanks,
Tom

> 
> Don't you want this instead:
> 
> ret = __change_page_attr_set_clr(cpa, 1);
> if (ret)
> goto out;
> 
> /* Check whether we really changed something */
> if (!(cpa->flags & CPA_FLUSHTLB))
> goto out;
> 
> /*
>  * On success we use CLFLUSH, when the CPU supports it to
>  * avoid the WBINVD.
>  */
> if (static_cpu_has(X86_FEATURE_CLFLUSH))
> cpa_flush_range(addr, numpages, 1);
> else
> cpa_flush_all(1);
> 
> out:
> return ret;
> }
> 
> ?
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v2 11/20] mm: Access BOOT related data in the clear

2016-09-12 Thread Tom Lendacky
On 09/09/2016 11:38 AM, Borislav Petkov wrote:
> On Mon, Aug 22, 2016 at 05:37:38PM -0500, Tom Lendacky wrote:
>> BOOT data (such as EFI related data) is not encyrpted when the system is
>> booted and needs to be accessed as non-encrypted.  Add support to the
>> early_memremap API to identify the type of data being accessed so that
>> the proper encryption attribute can be applied.  Currently, two types
>> of data are defined, KERNEL_DATA and BOOT_DATA.
>>
>> Signed-off-by: Tom Lendacky 
>> ---
> 
> ...
> 
>> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
>> index 031db21..e3bdc5a 100644
>> --- a/arch/x86/mm/ioremap.c
>> +++ b/arch/x86/mm/ioremap.c
>> @@ -419,6 +419,25 @@ void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr)
>>  iounmap((void __iomem *)((unsigned long)addr & PAGE_MASK));
>>  }
>>  
>> +/*
>> + * Architecure override of __weak function to adjust the protection 
>> attributes
>> + * used when remapping memory.
>> + */
>> +pgprot_t __init early_memremap_pgprot_adjust(resource_size_t phys_addr,
>> + unsigned long size,
>> + enum memremap_owner owner,
>> + pgprot_t prot)
>> +{
>> +/*
>> + * If memory encryption is enabled and BOOT_DATA is being mapped
>> + * then remove the encryption bit.
>> + */
>> +if (_PAGE_ENC && (owner == BOOT_DATA))
>> +prot = __pgprot(pgprot_val(prot) & ~_PAGE_ENC);
>> +
>> +return prot;
>> +}
>> +
> 
> Hmm, so AFAICT, only arch/x86/xen needs KERNEL_DATA and everything else
> is BOOT_DATA.
> 
> So instead of touching so many files and changing early_memremap(),
> why can't you remove _PAGE_ENC by default on x86 and define a specific
> early_memremap() for arch/x86/xen/ which you call there?
> 
> That would make this patch soo much smaller and the change simpler.

Yes it would.  I'll take a look into that.

> 
> ...
> 
>> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
>> index 5a2631a..f9286c6 100644
>> --- a/drivers/firmware/efi/efi.c
>> +++ b/drivers/firmware/efi/efi.c
>> @@ -386,7 +386,7 @@ int __init efi_mem_desc_lookup(u64 phys_addr, 
>> efi_memory_desc_t *out_md)
>>   * So just always get our own virtual map on the CPU.
>>   *
>>   */
>> -md = early_memremap(p, sizeof (*md));
>> +md = early_memremap(p, sizeof (*md), BOOT_DATA);
> 
> WARNING: space prohibited between function name and open parenthesis '('
> #432: FILE: drivers/firmware/efi/efi.c:389:
> +   md = early_memremap(p, sizeof (*md), BOOT_DATA);
> 
> Please integrate checkpatch.pl into your workflow so that you can catch
> small style nits like this. And don't take its output too seriously... :-)

I did run checkpatch against everything, but was always under the
assumption that I shouldn't change existing warnings/errors like this.
If it's considered ok since I'm touching that line of code then I'll
take care of those situations.

Thanks,
Tom

> 
>>  if (!md) {
>>  pr_err_once("early_memremap(%pa, %zu) failed.\n",
>>  , sizeof (*md));
>> @@ -501,7 +501,8 @@ int __init efi_config_parse_tables(void *config_tables, 
>> int count, int sz,
>>  if (efi.properties_table != EFI_INVALID_TABLE_ADDR) {
>>  efi_properties_table_t *tbl;
>>  
>> -tbl = early_memremap(efi.properties_table, sizeof(*tbl));
>> +tbl = early_memremap(efi.properties_table, sizeof(*tbl),
>> + BOOT_DATA);
>>  if (tbl == NULL) {
>>  pr_err("Could not map Properties table!\n");
>>  return -ENOMEM;
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v2 10/20] x86: Insure that memory areas are encrypted when possible

2016-09-12 Thread Tom Lendacky
On 09/09/2016 10:53 AM, Borislav Petkov wrote:
> On Mon, Aug 22, 2016 at 05:37:23PM -0500, Tom Lendacky wrote:
>> Encrypt memory areas in place when possible (e.g. zero page, etc.) so
>> that special handling isn't needed afterwards.
>>
>> Signed-off-by: Tom Lendacky 
>> ---
>>  arch/x86/kernel/head64.c |   93 
>> --
>>  arch/x86/kernel/setup.c  |8 
>>  2 files changed, 96 insertions(+), 5 deletions(-)
> 
> ...
> 
>> +int __init early_make_pgtable(unsigned long address)
>> +{
>> +unsigned long physaddr = address - __PAGE_OFFSET;
>> +pmdval_t pmd;
>> +
>> +pmd = (physaddr & PMD_MASK) + early_pmd_flags;
>> +
>> +return __early_make_pgtable(address, pmd);
>> +}
>> +
>> +static void __init create_unencrypted_mapping(void *address, unsigned long 
>> size)
>> +{
>> +unsigned long physaddr = (unsigned long)address - __PAGE_OFFSET;
>> +pmdval_t pmd_flags, pmd;
>> +
>> +if (!sme_me_mask)
>> +return;
>> +
>> +/* Clear the encryption mask from the early_pmd_flags */
>> +pmd_flags = early_pmd_flags & ~sme_me_mask;
>> +
>> +do {
>> +pmd = (physaddr & PMD_MASK) + pmd_flags;
>> +__early_make_pgtable((unsigned long)address, pmd);
>> +
>> +address += PMD_SIZE;
>> +physaddr += PMD_SIZE;
>> +size = (size < PMD_SIZE) ? 0 : size - PMD_SIZE;
>> +} while (size);
>> +}
>> +
>> +static void __init __clear_mapping(unsigned long address)
> 
> Should be called something with "pmd" in the name as it clears a PMD,
> i.e. __clear_pmd_mapping or so.

Ok.

> 
>> +{
>> +unsigned long physaddr = address - __PAGE_OFFSET;
>> +pgdval_t pgd, *pgd_p;
>> +pudval_t pud, *pud_p;
>> +pmdval_t *pmd_p;
>> +
>> +/* Invalid address or early pgt is done ?  */
>> +if (physaddr >= MAXMEM ||
>> +read_cr3() != __sme_pa_nodebug(early_level4_pgt))
>> +return;
>> +
>> +pgd_p = _level4_pgt[pgd_index(address)].pgd;
>> +pgd = *pgd_p;
>> +
>> +if (!pgd)
>> +return;
>> +
>> +/*
>> + * The use of __START_KERNEL_map rather than __PAGE_OFFSET here matches
>> + * __early_make_pgtable where the entry was created.
>> + */
>> +pud_p = (pudval_t *)((pgd & PTE_PFN_MASK) + __START_KERNEL_map - 
>> phys_base);
>> +pud_p += pud_index(address);
>> +pud = *pud_p;
>> +
>> +if (!pud)
>> +return;
>> +
>> +pmd_p = (pmdval_t *)((pud & PTE_PFN_MASK) + __START_KERNEL_map - 
>> phys_base);
>> +pmd_p[pmd_index(address)] = 0;
>> +}
>> +
>> +static void __init clear_mapping(void *address, unsigned long size)
>> +{
>> +if (!sme_me_mask)
>> +return;
>> +
>> +do {
>> +__clear_mapping((unsigned long)address);
>> +
>> +address += PMD_SIZE;
>> +size = (size < PMD_SIZE) ? 0 : size - PMD_SIZE;
>> +} while (size);
>> +}
>> +
>> +static void __init sme_memcpy(void *dst, void *src, unsigned long size)
>> +{
>> +create_unencrypted_mapping(src, size);
>> +memcpy(dst, src, size);
>> +clear_mapping(src, size);
>> +}
>> +
> 
> In any case, this whole functionality is SME-specific and should be
> somewhere in an SME-specific file. arch/x86/mm/mem_encrypt.c or so...

I can look into that.  The reason I put this here is this is all the
early page fault support that is very specific to this file. I modified
an existing static function to take advantage of the mapping support.

> 
>>  /* Don't add a printk in there. printk relies on the PDA which is not 
>> initialized 
>> yet. */
>>  static void __init clear_bss(void)
>> @@ -122,12 +205,12 @@ static void __init copy_bootdata(char *real_mode_data)
>>  char * command_line;
>>  unsigned long cmd_line_ptr;
>>  
>> -memcpy(_params, real_mode_data, sizeof boot_params);
>> +sme_memcpy(_params, real_mode_data, sizeof boot_params);
> 
> checkpatch.pl:
> 
> WARNING: sizeof boot_params should be sizeof(boot_params)
> #155: FILE: arch/x86/kernel/head64.c:208:
> +   sme_memcpy(_params, real_mode_data, sizeof boot_params);

I can fix that.

> 
>>  sanitize_boot_params(_params);
>>  cmd_line_ptr = get_cmd_line_ptr();
>>  if (cmd_line_ptr) {
>>  command_line = __va(cmd_line_ptr);
>> -memcpy(boot_command_line, command_line, COMMAND_LINE_SIZE);
>> +sme_memcpy(boot_command_line, command_line, COMMAND_LINE_SIZE);
>>  }
>>  }
>>  
>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
>> index 1489da8..1fdaa11 100644
>> --- a/arch/x86/kernel/setup.c
>> +++ b/arch/x86/kernel/setup.c
>> @@ -114,6 +114,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  /*
>>   * max_low_pfn_mapped: highest direct mapped pfn under 4GB
>> @@ -376,6 +377,13 @@ static void __init reserve_initrd(void)
>>  !ramdisk_image || !ramdisk_size)
>>  return; /* No initrd provided by bootloader */
>>  
>> 

Re: [RFC PATCH v2 18/20] x86/kvm: Enable Secure Memory Encryption of nested page tables

2016-09-12 Thread Borislav Petkov
On Mon, Aug 22, 2016 at 05:38:49PM -0500, Tom Lendacky wrote:
> Update the KVM support to include the memory encryption mask when creating
> and using nested page tables.
> 
> Signed-off-by: Tom Lendacky 
> ---
>  arch/x86/include/asm/kvm_host.h |3 ++-
>  arch/x86/kvm/mmu.c  |8 ++--
>  arch/x86/kvm/vmx.c  |3 ++-
>  arch/x86/kvm/x86.c  |3 ++-
>  4 files changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 33ae3a4..c51c1cb 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1039,7 +1039,8 @@ void kvm_mmu_setup(struct kvm_vcpu *vcpu);
>  void kvm_mmu_init_vm(struct kvm *kvm);
>  void kvm_mmu_uninit_vm(struct kvm *kvm);
>  void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
> - u64 dirty_mask, u64 nx_mask, u64 x_mask, u64 p_mask);
> + u64 dirty_mask, u64 nx_mask, u64 x_mask, u64 p_mask,
> + u64 me_mask);

Why do you need a separate mask?

arch/x86/kvm/mmu.c::set_spte() ORs in shadow_present_mask
unconditionally. So you can simply do:


kvm_mmu_set_mask_ptes(PT_USER_MASK, PT_ACCESSED_MASK,
  PT_DIRTY_MASK, PT64_NX_MASK, 0,
  PT_PRESENT_MASK | sme_me_mask);

and have this change much simpler.

>  void kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
>  void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 3d4cc8cc..a7040f4 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -122,7 +122,7 @@ module_param(dbg, bool, 0644);
>   * PT32_LEVEL_BITS))) - 1))
>  
>  #define PT64_PERM_MASK (PT_PRESENT_MASK | PT_WRITABLE_MASK | 
> shadow_user_mask \
> - | shadow_x_mask | shadow_nx_mask)
> + | shadow_x_mask | shadow_nx_mask | shadow_me_mask)

This would be sme_me_mask, of course, like with the baremetal masks.

Or am I missing something?

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v2 07/20] x86: Provide general kernel support for memory encryption

2016-09-12 Thread Tom Lendacky
On 09/08/2016 08:55 AM, Borislav Petkov wrote:
> On Thu, Sep 08, 2016 at 08:26:27AM -0500, Tom Lendacky wrote:
>> When does this value get initialized?  Since _PAGE_ENC is #defined to
>> sme_me_mask, which is not set until the boot process begins, I'm afraid
>> we'd end up using the initial value of sme_me_mask, which is zero.  Do
>> I have that right?
> 
> Hmm, but then that would hold true for all the other defines where you
> OR-in _PAGE_ENC, no?

As long as the #define is not a global variable like this one it's ok.

> 
> In any case, the preprocessed source looks like this:
> 
> pmdval_t early_pmd_flags = (((pteval_t)(1)) << 0) | (((pteval_t)(1)) << 
> 1) | (((pteval_t)(1)) << 6) | (((pteval_t)(1)) << 5) | (((pteval_t)(1)) << 
> 8)) | (((pteval_t)(1)) << 63)) | (((pteval_t)(1)) << 7)) | sme_me_mask) & 
> ~pteval_t)(1)) << 8) | (((pteval_t)(1)) << 63));
> 
> but the problem is later, when building:
> 
> arch/x86/kernel/head64.c:39:28: error: initializer element is not constant
>  pmdval_t early_pmd_flags = (__PAGE_KERNEL_LARGE | _PAGE_ENC) & 
> ~(_PAGE_GLOBAL | _PAGE_NX);
> ^
> scripts/Makefile.build:153: recipe for target 'arch/x86/kernel/head64.s' 
> failed
> 
> so I guess not. :-\
> 
> Ok, then at least please put the early_pmd_flags init after
> sme_early_init() along with a small comment explaning what happens.

Let me verify that we won't possibly take any kind of page fault during
sme_early_init() and cause a page to be not be marked encrypted. Or... I
can add a comment indicating I need to set this as early as possible to
cover any page faults that might occur.

Thanks,
Tom

> 
> Thanks.
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v2 16/20] x86: Check for memory encryption on the APs

2016-09-12 Thread Borislav Petkov
On Mon, Aug 22, 2016 at 05:38:29PM -0500, Tom Lendacky wrote:
> Add support to check if memory encryption is active in the kernel and that
> it has been enabled on the AP. If memory encryption is active in the kernel
> but has not been enabled on the AP then do not allow the AP to continue
> start up.
> 
> Signed-off-by: Tom Lendacky 
> ---
>  arch/x86/include/asm/msr-index.h |2 ++
>  arch/x86/include/asm/realmode.h  |   12 
>  arch/x86/realmode/init.c |4 
>  arch/x86/realmode/rm/trampoline_64.S |   19 +++
>  4 files changed, 37 insertions(+)

...

> diff --git a/arch/x86/realmode/rm/trampoline_64.S 
> b/arch/x86/realmode/rm/trampoline_64.S
> index dac7b20..94e29f4 100644
> --- a/arch/x86/realmode/rm/trampoline_64.S
> +++ b/arch/x86/realmode/rm/trampoline_64.S
> @@ -30,6 +30,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "realmode.h"
>  
>   .text
> @@ -92,6 +93,23 @@ ENTRY(startup_32)
>   movl%edx, %fs
>   movl%edx, %gs
>  
> + /* Check for memory encryption support */
> + bt  $TH_FLAGS_SME_ENABLE_BIT, pa_tr_flags
> + jnc .Ldone
> + movl$MSR_K8_SYSCFG, %ecx
> + rdmsr
> + bt  $MSR_K8_SYSCFG_MEM_ENCRYPT_BIT, %eax
> + jc  .Ldone
> +
> + /*
> +  * Memory encryption is enabled but the MSR has not been set on this
> +  * CPU so we can't continue

Hmm, let me try to parse this correctly: BSP has SME enabled but the
BIOS might not've set this on the AP? Really? Is that even possible?

Because if SME is enabled, that means that MSR_K8_SYSCFG[23] on the BSP
is set, right?

Also, I want to rule out here simple BIOS idiocy: if the only problem
with the bit not being set in the AP is because some BIOS monkey forgot
to do so, then we should try to set it ourselves and not die for no real
reason.

Or is there another issue?

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v2 15/20] iommu/amd: AMD IOMMU support for memory encryption

2016-09-12 Thread Borislav Petkov
On Mon, Aug 22, 2016 at 05:38:20PM -0500, Tom Lendacky wrote:
> Add support to the AMD IOMMU driver to set the memory encryption mask if
> memory encryption is enabled.
> 
> Signed-off-by: Tom Lendacky 
> ---
>  arch/x86/include/asm/mem_encrypt.h |2 ++
>  arch/x86/mm/mem_encrypt.c  |5 +
>  drivers/iommu/amd_iommu.c  |   10 ++
>  3 files changed, 17 insertions(+)
> 
> diff --git a/arch/x86/include/asm/mem_encrypt.h 
> b/arch/x86/include/asm/mem_encrypt.h
> index 384fdfb..e395729 100644
> --- a/arch/x86/include/asm/mem_encrypt.h
> +++ b/arch/x86/include/asm/mem_encrypt.h
> @@ -36,6 +36,8 @@ void __init sme_early_init(void);
>  /* Architecture __weak replacement functions */
>  void __init mem_encrypt_init(void);
>  
> +unsigned long amd_iommu_get_me_mask(void);
> +
>  unsigned long swiotlb_get_me_mask(void);
>  void swiotlb_set_mem_dec(void *vaddr, unsigned long size);
>  
> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index 6b2e8bf..2f28d87 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -185,6 +185,11 @@ void __init mem_encrypt_init(void)
>   swiotlb_clear_encryption();
>  }
>  
> +unsigned long amd_iommu_get_me_mask(void)
> +{
> + return sme_me_mask;
> +}
> +
>  unsigned long swiotlb_get_me_mask(void)
>  {
>   return sme_me_mask;
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 96de97a..63995e3 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -166,6 +166,15 @@ struct dma_ops_domain {
>  static struct iova_domain reserved_iova_ranges;
>  static struct lock_class_key reserved_rbtree_key;
>  
> +/*
> + * Support for memory encryption. If memory encryption is supported, then an
> + * override to this function will be provided.
> + */
> +unsigned long __weak amd_iommu_get_me_mask(void)
> +{
> + return 0;
> +}

So instead of adding a function each time which returns sme_me_mask
for each user it has, why don't you add a single function which
returns sme_me_mask in mem_encrypt.c and add an inline in the header
mem_encrypt.h which returns 0 for the !CONFIG_AMD_MEM_ENCRYPT case.

This all is still funny because we access sme_me_mask directly for the
different KERNEL_* masks but then you're adding an accessor function.

So what you should do instead, IMHO, is either hide sme_me_mask
altogether and use the accessor functions only (not sure if that would
work in all cases) or expose sme_me_mask unconditionally and have it be
0 if CONFIG_AMD_MEM_ENCRYPT is not enabled so that it just works.

Or is there a third, more graceful variant?

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v2 14/20] x86: DMA support for memory encryption

2016-09-12 Thread Borislav Petkov
On Mon, Aug 22, 2016 at 05:38:07PM -0500, Tom Lendacky wrote:
> Since DMA addresses will effectively look like 48-bit addresses when the
> memory encryption mask is set, SWIOTLB is needed if the DMA mask of the
> device performing the DMA does not support 48-bits. SWIOTLB will be
> initialized to create un-encrypted bounce buffers for use by these devices.
> 
> Signed-off-by: Tom Lendacky 
> ---
>  arch/x86/include/asm/dma-mapping.h |5 ++-
>  arch/x86/include/asm/mem_encrypt.h |6 +++
>  arch/x86/kernel/pci-dma.c  |   11 --
>  arch/x86/kernel/pci-nommu.c|2 +
>  arch/x86/kernel/pci-swiotlb.c  |8 +++--
>  arch/x86/mm/mem_encrypt.c  |   22 
>  include/linux/swiotlb.h|1 +
>  init/main.c|   13 +++
>  lib/swiotlb.c  |   64 
> 
>  9 files changed, 115 insertions(+), 17 deletions(-)

...

> @@ -172,3 +174,23 @@ void __init sme_early_init(void)
>   for (i = 0; i < ARRAY_SIZE(protection_map); i++)
>   protection_map[i] = __pgprot(pgprot_val(protection_map[i]) | 
> sme_me_mask);
>  }
> +
> +/* Architecture __weak replacement functions */
> +void __init mem_encrypt_init(void)
> +{
> + if (!sme_me_mask)
> + return;
> +
> + /* Make SWIOTLB use an unencrypted DMA area */
> + swiotlb_clear_encryption();
> +}
> +
> +unsigned long swiotlb_get_me_mask(void)

This could just as well be named to something more generic:

swiotlb_get_clear_dma_mask() or so which basically means the mask of
bits which get cleared before returning DMA addresses...

> +{
> + return sme_me_mask;
> +}
> +
> +void swiotlb_set_mem_dec(void *vaddr, unsigned long size)
> +{
> + sme_set_mem_dec(vaddr, size);
> +}
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index 5f81f8a..5c909fc 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -29,6 +29,7 @@ int swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, 
> int verbose);
>  extern unsigned long swiotlb_nr_tbl(void);
>  unsigned long swiotlb_size_or_default(void);
>  extern int swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs);
> +extern void __init swiotlb_clear_encryption(void);
>  
>  /*
>   * Enumeration for sync targets
> diff --git a/init/main.c b/init/main.c
> index a8a58e2..82c7cd9 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -458,6 +458,10 @@ void __init __weak thread_stack_cache_init(void)
>  }
>  #endif
>  
> +void __init __weak mem_encrypt_init(void)
> +{
> +}
> +
>  /*
>   * Set up kernel memory allocators
>   */
> @@ -598,6 +602,15 @@ asmlinkage __visible void __init start_kernel(void)
>*/
>   locking_selftest();
>  
> + /*
> +  * This needs to be called before any devices perform DMA
> +  * operations that might use the swiotlb bounce buffers.
> +  * This call will mark the bounce buffers as un-encrypted so
> +  * that the usage of them will not cause "plain-text" data

...that their usage will not cause ...

> +  * to be decrypted when accessed.
> +  */
> + mem_encrypt_init();
> +
>  #ifdef CONFIG_BLK_DEV_INITRD
>   if (initrd_start && !initrd_below_start_ok &&
>   page_to_pfn(virt_to_page((void *)initrd_start)) < min_low_pfn) {
> diff --git a/lib/swiotlb.c b/lib/swiotlb.c
> index 22e13a0..15d5741 100644
> --- a/lib/swiotlb.c
> +++ b/lib/swiotlb.c
> @@ -131,6 +131,26 @@ unsigned long swiotlb_size_or_default(void)
>   return size ? size : (IO_TLB_DEFAULT_SIZE);
>  }
>  
> +/*
> + * Support for memory encryption. If memory encryption is supported, then an
> + * override to these functions will be provided.
> + */

No need for that comment.

> +unsigned long __weak swiotlb_get_me_mask(void)
> +{
> + return 0;
> +}
> +
> +void __weak swiotlb_set_mem_dec(void *vaddr, unsigned long size)
> +{
> +}
> +
> +/* For swiotlb, clear memory encryption mask from dma addresses */
> +static dma_addr_t swiotlb_phys_to_dma(struct device *hwdev,
> +   phys_addr_t address)
> +{
> + return phys_to_dma(hwdev, address) & ~swiotlb_get_me_mask();
> +}
> +
>  /* Note that this doesn't work with highmem page */
>  static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev,
> volatile void *address)
> @@ -159,6 +179,30 @@ void swiotlb_print_info(void)
>  bytes >> 20, vstart, vend - 1);
>  }
>  
> +/*
> + * If memory encryption is active, the DMA address for an encrypted page may
> + * be beyond the range of the device. If bounce buffers are required be sure
> + * that they are not on an encrypted page. This should be called before the
> + * iotlb area is used.
> + */
> +void __init swiotlb_clear_encryption(void)
> +{
> + void *vaddr;
> + unsigned long bytes;
> +
> + if (no_iotlb_memory || !io_tlb_start || late_alloc)
> + return;
> +
> + vaddr = 

Re: [PATCH 21/20] drm/exynos: Fix iommu_dma_init_domain prototype change

2016-09-12 Thread Will Deacon
On Fri, Sep 09, 2016 at 07:17:46PM +0100, Robin Murphy wrote:
> When adding an extra argument to a function, one really should try a bit
> harder to catch *all* the callers...
> 
> CC: Marek Szyprowski 
> CC: Inki Dae 
> CC: David Airlie 
> CC: dri-de...@lists.freedesktop.org
> Signed-off-by: Robin Murphy 
> 
> ---
> 
> Ideally, this should be squashed into "iommu/dma: Avoid PCI host bridge
> windows" to avoid potential bisection breakage. Sorry!

I'll squash this in when I push out the final series..

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 4.5/20] Docs: dt: document ARM SMMUv3 generic binding usage

2016-09-12 Thread Will Deacon
On Fri, Sep 09, 2016 at 07:17:48PM +0100, Robin Murphy wrote:
> Now that we've ratified SMMUv3's use of the generic binding, document it.
> 
> CC: Rob Herring 
> CC: Mark Rutland 
> Signed-off-by: Robin Murphy 
> ---
>  Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt 
> b/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
> index 7b94c88cf2ee..69a694f70bea 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
> @@ -27,6 +27,12 @@ the PCIe specification.
>* "cmdq-sync" - CMD_SYNC complete
>* "gerror"- Global Error activated
>  
> +- #iommu-cells  : See the generic IOMMU binding described in
> +devicetree/bindings/iommu/iommu.txt

That file has a weird "Notes:" section describing the application of
"iommus" to a PCI host bridge. We should probably rip that out, because
it seems to be going directly against the approach we've ended up taking.

In fact, replacing that with a cross reference to

  devicetree/bindings/pci/pci-iommu.txt

would make much more sense to me.

> +  for details. For SMMUv3, must be 1, with each cell
> +  describing a single stream ID. All possible stream
> +  ID which a device may emit must be described.

IDs

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 22/20] iommu/arm-smmu: Fall back to global bypass

2016-09-12 Thread Will Deacon
On Fri, Sep 09, 2016 at 07:17:47PM +0100, Robin Murphy wrote:
> Unlike SMMUv2, SMMUv3 has no easy way to bypass unknown stream IDs,
> other than allocating and filling in the entire stream table with bypass
> entries, which for some configurations would waste *gigabytes* of RAM.
> Otherwise, all transactions on unknown stream IDs will simply be aborted
> with a C_BAD_STREAMID event.
> 
> Rather than render the system unusable in the case of an invalid DT,
> avoid enabling the SMMU altogether such that everything bypasses
> (though letting the explicit disable_bypass option take precedence).
> 
> Signed-off-by: Robin Murphy 
> ---
>  drivers/iommu/arm-smmu-v3.c | 28 +++-
>  1 file changed, 23 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index be293b5aa896..859b80c83946 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -126,6 +126,9 @@
>  #define CR2_RECINVSID(1 << 1)
>  #define CR2_E2H  (1 << 0)
>  
> +#define ARM_SMMU_GBPA0x44
> +#define GBPA_ABORT   (1 << 20)
> +
>  #define ARM_SMMU_IRQ_CTRL0x50
>  #define IRQ_CTRL_EVTQ_IRQEN  (1 << 2)
>  #define IRQ_CTRL_PRIQ_IRQEN  (1 << 1)
> @@ -2242,7 +2245,7 @@ static int arm_smmu_device_disable(struct 
> arm_smmu_device *smmu)
>   return ret;
>  }
>  
> -static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
> +static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
>  {
>   int ret;
>   u32 reg, enables;
> @@ -2343,8 +2346,14 @@ static int arm_smmu_device_reset(struct 
> arm_smmu_device *smmu)
>   return ret;
>   }
>  
> - /* Enable the SMMU interface */
> - enables |= CR0_SMMUEN;
> +
> + /* Enable the SMMU interface, or ensure bypass */
> + if (!bypass || disable_bypass) {
> + enables |= CR0_SMMUEN;
> + } else {
> + reg = readl_relaxed(smmu->base + ARM_SMMU_GBPA);
> + writel_relaxed(reg & ~GBPA_ABORT, smmu->base + ARM_SMMU_GBPA);
> + }

I think this invokes the CONSTRAINED UNPREDICTABLE monster, because the
GBPA register has some a special update procedure involving the 'update'
bit (bit 31).

You might be able to reuse arm_smmu_write_reg_sync to poll for completion
with offset 0. I'm happy to assume that the update bit is initially clear.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [v11, 5/8] soc: fsl: add GUTS driver for QorIQ platforms

2016-09-12 Thread Y.B. Lu
Hi Scott,

Thanks for your review :)
See my comment inline.

> -Original Message-
> From: Scott Wood [mailto:o...@buserror.net]
> Sent: Friday, September 09, 2016 11:47 AM
> To: Y.B. Lu; linux-...@vger.kernel.org; ulf.hans...@linaro.org; Arnd
> Bergmann
> Cc: linuxppc-...@lists.ozlabs.org; devicet...@vger.kernel.org; linux-arm-
> ker...@lists.infradead.org; linux-ker...@vger.kernel.org; linux-
> c...@vger.kernel.org; linux-...@vger.kernel.org; iommu@lists.linux-
> foundation.org; net...@vger.kernel.org; Mark Rutland; Rob Herring;
> Russell King; Jochen Friedrich; Joerg Roedel; Claudiu Manoil; Bhupesh
> Sharma; Qiang Zhao; Kumar Gala; Santosh Shilimkar; Leo Li; X.B. Xie
> Subject: Re: [v11, 5/8] soc: fsl: add GUTS driver for QorIQ platforms
> 
> On Tue, 2016-09-06 at 16:28 +0800, Yangbo Lu wrote:
> > The global utilities block controls power management, I/O device
> > enabling, power-onreset(POR) configuration monitoring, alternate
> > function selection for multiplexed signals,and clock control.
> >
> > This patch adds a driver to manage and access global utilities block.
> > Initially only reading SVR and registering soc device are supported.
> > Other guts accesses, such as reading RCW, should eventually be moved
> > into this driver as well.
> >
> > Signed-off-by: Yangbo Lu 
> > Signed-off-by: Scott Wood 
> 
> Don't put my signoff on patches that I didn't put it on
> myself.  Definitely don't put mine *after* yours on patches that were
> last modified by you.
> 
> If you want to mention that the soc_id encoding was my suggestion, then
> do so explicitly.
> 

[Lu Yangbo-B47093] I found your 'signoff' on this patch at below link.
http://patchwork.ozlabs.org/patch/649211/

So, let me just change the order in next version ?
Signed-off-by: Scott Wood 
Signed-off-by: Yangbo Lu 

> > +/* SoC attribute definition for QorIQ platform */ static const struct
> > +soc_device_attribute qoriq_soc[] = { #ifdef CONFIG_PPC
> > +   /*
> > +    * Power Architecture-based SoCs T Series
> > +    */
> > +
> > +   /* SoC: T1024/T1014/T1023/T1013 Rev: 1.0 */
> > +   { .soc_id   = "svr:0x85400010,name:T1024,die:T1024",
> > +     .revision = "1.0",
> > +   },
> > +   { .soc_id   = "svr:0x85480010,name:T1024E,die:T1024",
> > +     .revision = "1.0",
> > +   },
> 
> Revision could be computed from the low 8 bits of SVR (just as you do for
> unknown SVRs).
>
 
[Lu Yangbo-B47093] Yes, you're right. Will remove it here.

> We could move the die name into .family:
> 
>   {
>   .soc_id = "svr:0x85490010,name:T1023E,",
>   .family = "QorIQ T1024",
>   }
> 
> I see you dropped svre (and the trailing comma), though I guess the vast
> majority of potential users will be looking at .family.  In which case do
> we even need name?  If we just make the soc_id be "svr:0x" then
> we could shrink the table to an svr+mask that identifies each die.  I'd
> still want to keep the "svr:" even if we're giving up on the general
> tagging system, to make it clear what the number refers to, and to
> provide some defense against users who match only against soc_id rather
> than soc_id+family.  Or we could go further and format soc_id as "QorIQ
> SVR 0x" so that soc_id-only matches are fully acceptable rather
> than just less dangerous.

[Lu Yangbo-B47093] It's a good idea to move die into .family I think.
In my opinion, it's better to keep svr and name in soc_id just like your 
suggestion above.
>   {
>   .soc_id = "svr:0x85490010,name:T1023E,",
>   .family = "QorIQ T1024",
>   }
The user probably don’t like to learn the svr value. What they want is just to 
match the soc they use.
It's convenient to use name+rev for them to match a soc.

Regarding shrinking the table, I think it's hard to use svr+mask. Because I 
find many platforms use different masks.
We couldn’t know the mask according svr value.

> 
> > +static const struct soc_device_attribute *fsl_soc_device_match(
> > +   unsigned int svr, const struct soc_device_attribute *matches) {
> > +   char svr_match[50];
> > +   int n;
> > +
> > +   n = sprintf(svr_match, "*%08x*", svr);
> 
> n = sprintf(svr_match, "svr:0x%08x,*", svr);
> 
> (according to the current encoding)
> 

[Lu Yangbo-B47093] Ok. Will do that.

> > +
> > +   do {
> > +   if (!matches->soc_id)
> > +   return NULL;
> > +   if (glob_match(svr_match, matches->soc_id))
> > +   break;
> > +   } while (matches++);
> 
> Are you expecting "matches++" to ever evaluate as false?

[Lu Yangbo-B47093] Yes, this is used to match the soc we use in qoriq_soc array 
until getting true. 
We need to get the name and die information defined in array.

> 
> > +   /* Register soc device */
> > +   soc_dev_attr = kzalloc(sizeof(*soc_dev_attr), GFP_KERNEL);
> > +   if (!soc_dev_attr) {
> > +   ret = -ENOMEM;
> > +