Re: [RFC PATCH v2 19/20] x86: Access the setup data through debugfs un-encrypted

2016-09-15 Thread Tom Lendacky
On 09/14/2016 09:51 AM, Borislav Petkov wrote:
> On Wed, Sep 14, 2016 at 09:29:41AM -0500, Tom Lendacky wrote:
>> This is still required because just using the __va() would still cause
>> the mapping created to have the encryption bit set. The ioremap call
>> will result in the mapping not having the encryption bit set.
> 
> I meant this: https://lkml.kernel.org/r/20160902181447.ga25...@nazgul.tnic
> 
> Wouldn't simply clearing the SME mask work?
> 
> #define __va(x)   ((void *)(((unsigned 
> long)(x)+PAGE_OFFSET) & ~sme_me_mask))
> 
> Or are you saying, one needs the whole noodling through ioremap_cache()
> because the data is already encrypted and accessing it with sme_me_mask
> cleared would simply give you the encrypted garbage?

The problem is that this physical address does not contain the
encryption bit, and even if it did, it wouldn't matter.  The __va()
define creates a virtual address that will be mapped as encrypted given
the current approach (which is how I found this).  It's only ioremap()
that would create a mapping without the encryption attribute and since
this is unencrypted data it needs to be access accordingly.

Thanks,
Tom

> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v2 15/20] iommu/amd: AMD IOMMU support for memory encryption

2016-09-15 Thread Tom Lendacky
On 09/14/2016 09:41 AM, Borislav Petkov wrote:
> On Wed, Sep 14, 2016 at 08:45:44AM -0500, Tom Lendacky wrote:
>> Currently, mem_encrypt.h only lives in the arch/x86 directory so it
>> wouldn't be able to be included here without breaking other archs.
> 
> I'm wondering if it would be simpler to move only sme_me_mask to an
> arch-agnostic header just so that we save us all the code duplication.
> 
> Hmmm.

If I do that, then I could put an #ifdef in the header to include the
asm/mem_encrypt.h if the memory encryption is configured, else set the
value to zero.  I'll look into this.  One immediate question becomes do
we keep the name very specific vs. making it more generic, sme_me_mask
vs me_mask, etc.

Thanks,
Tom

> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v2 11/20] mm: Access BOOT related data in the clear

2016-09-15 Thread Tom Lendacky
On 09/15/2016 04:57 AM, Matt Fleming wrote:
> On Wed, 14 Sep, at 09:20:44AM, Tom Lendacky wrote:
>> On 09/12/2016 11:55 AM, Andy Lutomirski wrote:
>>> On Aug 22, 2016 6:53 PM, "Tom Lendacky"  wrote:

 BOOT data (such as EFI related data) is not encyrpted when the system is
 booted and needs to be accessed as non-encrypted.  Add support to the
 early_memremap API to identify the type of data being accessed so that
 the proper encryption attribute can be applied.  Currently, two types
 of data are defined, KERNEL_DATA and BOOT_DATA.
>>>
>>> What happens when you memremap boot services data outside of early
>>> boot?  Matt just added code that does this.
>>>
>>> IMO this API is not so great.  It scatters a specialized consideration
>>> all over the place.  Could early_memremap not look up the PA to figure
>>> out what to do?
>>
>> Yes, I could see if the PA falls outside of the kernel usable area and,
>> if so, remove the memory encryption attribute from the mapping (for both
>> early_memremap and memremap).
>>
>> Let me look into that, I would prefer something along that line over
>> this change.
> 
> So, the last time we talked about using the address to figure out
> whether to encrypt/decrypt you said,
> 
>  "I looked into this and this would be a large change also to parse
>   tables and build lists."
> 
> Has something changed that makes this approach easier?

The original idea of parsing the tables and building a list was
a large change.  This approach would be simpler by just checking if
the PA is outside the kernel usable area, and if so, removing the
encryption bit.

> 
> And again, you need to be careful with the EFI kexec code paths, since
> you've got a mixture of boot and kernel data being passed. In
> particular the EFI memory map is allocated by the firmware on first
> boot (BOOT_DATA) but by the kernel on kexec (KERNEL_DATA).
> 
> That's one of the reasons I suggested requiring the caller to decide
> on BOOT_DATA vs KERNEL_DATA - when you start looking at kexec the
> distinction isn't easily made.

Yeah, for kexec I think I'll need to make sure that everything looks
like it came from the BIOS/UEFI/bootloader.  If all of the kexec
pieces are allocated with un-encrypted memory, then the boot path
should remain the same.  That's the piece I need to investigate
further.

Thanks,
Tom

> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 00/22] Generic DT bindings for PCI IOMMUs and ARM SMMU

2016-09-15 Thread Auger Eric
Hi Robin,
On 15/09/2016 12:15, Robin Murphy wrote:
> On 15/09/16 10:29, Auger Eric wrote:
>> Hi Robin,
>>
>> On 14/09/2016 14:53, Robin Murphy wrote:
>>> On 14/09/16 13:32, Auger Eric wrote:
 Hi,
 On 14/09/2016 12:35, Robin Murphy wrote:
> On 14/09/16 09:41, Auger Eric wrote:
>> Hi,
>>
>> On 12/09/2016 18:13, Robin Murphy wrote:
>>> Hi all,
>>>
>>> To any more confusing fixups and crazily numbered extra patches, here's
>>> a quick v7 with everything rebased into the right order. The significant
>>> change this time is to implement iommu_fwspec properly from the start,
>>> which ends up being far simpler and more robust than faffing about
>>> introducing it somewhere 'less intrusive' to move toward core code 
>>> later.
>>>
>>> New branch in the logical place:
>>>
>>> git://linux-arm.org/linux-rm iommu/generic-v7
>>
>> For information, as discussed privately with Robin I experience some
>> regressions with the former and now deprecated dt description.
>>
>> on my AMD Overdrive board and my old dt description I now only see a
>> single group:
>>
>> /sys/kernel/iommu_groups/
>> /sys/kernel/iommu_groups/0
>> /sys/kernel/iommu_groups/0/devices
>> /sys/kernel/iommu_groups/0/devices/e070.xgmac
>>
>> whereas I formerly see
>>
>> /sys/kernel/iommu_groups/
>> /sys/kernel/iommu_groups/3
>> /sys/kernel/iommu_groups/3/devices
>> /sys/kernel/iommu_groups/3/devices/:00:00.0
>> /sys/kernel/iommu_groups/1
>> /sys/kernel/iommu_groups/1/devices
>> /sys/kernel/iommu_groups/1/devices/e070.xgmac
>> /sys/kernel/iommu_groups/4
>> /sys/kernel/iommu_groups/4/devices
>> /sys/kernel/iommu_groups/4/devices/:00:02.2
>> /sys/kernel/iommu_groups/4/devices/:01:00.1
>> /sys/kernel/iommu_groups/4/devices/:00:02.0
>> /sys/kernel/iommu_groups/4/devices/:01:00.0
>> /sys/kernel/iommu_groups/2
>> /sys/kernel/iommu_groups/2/devices
>> /sys/kernel/iommu_groups/2/devices/e090.xgmac
>> /sys/kernel/iommu_groups/0
>> /sys/kernel/iommu_groups/0/devices
>> /sys/kernel/iommu_groups/0/devices/f000.pcie
>>
>> This is the group topology without ACS override. Applying the non
>> upstreamed "pci: Enable overrides for missing ACS capabilities" I used
>> to see separate groups for each PCIe components. Now I don't see any
>> difference with and without ACS override.
>
> OK, having reproduced on my Juno, the problem looks to be that
> of_for_each_phandle() leaves err set to -ENOENT after successfully
> walking a phandle list, which makes __find_legacy_master_phandle()
> always bail out after the first SMMU.
>
> Can you confirm that the following diff fixes things for you?

 Well it improves but there are still differences in the group topology.
 The PFs now are in group 0.

 root@trusty:~# lspci -nk
 00:00.0 0600: 1022:1a00
 Subsystem: 1022:1a00
 00:02.0 0600: 1022:1a01
 00:02.2 0604: 1022:1a02
 Kernel driver in use: pcieport
 01:00.0 0200: 8086:1521 (rev 01)
 Subsystem: 8086:0002
 Kernel driver in use: igb
 01:00.1 0200: 8086:1521 (rev 01)
 Subsystem: 8086:0002
 Kernel driver in use: igb


 with your series + fix:
 /sys/kernel/iommu_groups/
 /sys/kernel/iommu_groups/3
 /sys/kernel/iommu_groups/3/devices
 /sys/kernel/iommu_groups/3/devices/:00:00.0
 /sys/kernel/iommu_groups/1
 /sys/kernel/iommu_groups/1/devices
 /sys/kernel/iommu_groups/1/devices/e070.xgmac
 /sys/kernel/iommu_groups/4
 /sys/kernel/iommu_groups/4/devices
 /sys/kernel/iommu_groups/4/devices/:00:02.2
 /sys/kernel/iommu_groups/4/devices/:00:02.0
 /sys/kernel/iommu_groups/2
 /sys/kernel/iommu_groups/2/devices
 /sys/kernel/iommu_groups/2/devices/e090.xgmac
 /sys/kernel/iommu_groups/0
 /sys/kernel/iommu_groups/0/devices
 /sys/kernel/iommu_groups/0/devices/:01:00.1
 /sys/kernel/iommu_groups/0/devices/f000.pcie
 /sys/kernel/iommu_groups/0/devices/:01:00.0

 Before (4.8-rc5):

 /sys/kernel/iommu_groups/
 /sys/kernel/iommu_groups/3
 /sys/kernel/iommu_groups/3/devices
 /sys/kernel/iommu_groups/3/devices/:00:00.0
 /sys/kernel/iommu_groups/1
 /sys/kernel/iommu_groups/1/devices
 /sys/kernel/iommu_groups/1/devices/e070.xgmac
 /sys/kernel/iommu_groups/4
 /sys/kernel/iommu_groups/4/devices
 /sys/kernel/iommu_groups/4/devices/:00:02.2
 /sys/kernel/iommu_groups/4/devices/:01:00.1
 /sys/kernel/iommu_groups/4/devices/:00:02.0
 /sys/kernel/iommu_groups/4/devices/:01:00.0
 /sys/kernel/iommu_groups/2
 /sys/kernel/iommu_groups/2/devices
 /sys/kernel/iommu_groups/2/devices/e090.xgmac
 

Re: [PATCHv9 0/6] dmaengine: rcar-dmac: add iommu support for slave transfers

2016-09-15 Thread Vinod Koul
On Wed, Aug 10, 2016 at 11:07:10PM +0530, Vinod Koul wrote:
> On Wed, Aug 10, 2016 at 01:22:13PM +0200, Niklas Söderlund wrote:
> > Hi,
> > 
> > This series tries to solve the problem with DMA with device registers
> > (MMIO registers) that are behind an IOMMU for the rcar-dmac driver. A
> > recent patch '9575632 (dmaengine: make slave address physical)'
> > clarifies that DMA slave address provided by clients is the physical
> > address. This puts the task of mapping the DMA slave address from a
> > phys_addr_t to a dma_addr_t on the DMA engine.
> > 
> > Without an IOMMU this is easy since the phys_addr_t and dma_addr_t are
> > the same and no special care is needed. However if you have a IOMMU you
> > need to map the DMA slave phys_addr_t to a dma_addr_t using something
> > like this.
> > 
> > This series is based on top of v4.8-rc1. And I'm hoping to be able to 
> > collect a
> > Ack from Russell King on patch 4/6 that adds the ARM specific part and then 
> > be
> > able to take the whole series through the dmaengine tree. If this is not the
> > best route I'm more then happy to do it another way.
> > 
> > It's tested on a Koelsch with CONFIG_IPMMU_VMSA and by enabling the
> > ipmmu_ds node in r8a7791.dtsi. I verified operation by interacting with
> > /dev/mmcblk1, i2c and the serial console which are devices behind the
> > iommu.
> 
> As I said in last one, the dmaengine parts look fine to me. But to go thru
> dmaengine tree I would need ACK on non dmaengine patches.

I havent heard back from this one and I am inclined to merge this one now.
If anyone has any objects, please speak up now...

Also ACKs welcome...

-- 
~Vinod
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v5 0/8] Fix kdump faults on system with amd iommu

2016-09-15 Thread Baoquan He
This is v5 post. In fact in v3 the solution is correct. Just unluckily 
I got a AMD machine with bnx2 NIC which can't reset itself during driver
init. It made me very unconfident with my understanding about the fix.
Now with below fix the AMD machine with bnx2 NIC can also work well
to dump and there's no IO_PAGE_FAULT seen any more. Now network maintainer
has picked it up.

bnx2: Reset device during driver initialization
https://www.mail-archive.com/netdev@vger.kernel.org/msg127336.html

The principle of the fix is similar to intel iommu. Just defer the assignment
of device to domain to device driver init. But there's difference than
intel iommu. AMD iommu create protection domain and assign device to
domain in iommu driver init stage. So in this patchset I just allow the
assignment of device to domain in software level, but defer updating the
domain info, especially the pte_root to dev table entry to device driver
init stage.


Baoquan He (8):
  iommu/amd: Detect pre enabled translation
  iommu/amd: add early_enable_iommu() wrapper function
  iommu/amd: Define bit fields for DTE particularly
  iommu/amd: Add function copy_dev_tables
  iommu/amd: copy old trans table from old kernel
  iommu/amd: Do not re-enable dev table entries in kdump
  iommu/amd: Don't update domain info to dte entry at iommu init stage
  iommu/amd: Update domain into to dte entry during device driver init

 drivers/iommu/amd_iommu.c   |  49 +--
 drivers/iommu/amd_iommu_init.c  | 135 
 drivers/iommu/amd_iommu_proto.h |   1 +
 drivers/iommu/amd_iommu_types.h |  23 +--
 4 files changed, 187 insertions(+), 21 deletions(-)

-- 
2.5.5

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v5 8/8] iommu/amd: Update domain into to dte entry during device driver init

2016-09-15 Thread Baoquan He
All devices are supposed to reset themselves at device driver initialization
stage. At this time if in kdump kernel those on-flight DMA will be stopped
because of device reset. It's best time to update the protection domain info,
especially pte_root, to dte entry which the device relates to.

Signed-off-by: Baoquan He 
---
 drivers/iommu/amd_iommu.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 6c37300..00b64ee 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2310,6 +2310,10 @@ static dma_addr_t __map_single(struct device *dev,
unsigned int pages;
int prot = 0;
int i;
+   struct iommu_dev_data *dev_data = get_dev_data(dev);
+   struct protection_domain *domain = get_domain(dev);
+   u16 alias = amd_iommu_alias_table[dev_data->devid];
+   struct amd_iommu *iommu = amd_iommu_rlookup_table[dev_data->devid];
 
pages = iommu_num_pages(paddr, size, PAGE_SIZE);
paddr &= PAGE_MASK;
@@ -2319,6 +2323,13 @@ static dma_addr_t __map_single(struct device *dev,
goto out;
 
prot = dir2prot(direction);
+   if (translation_pre_enabled(iommu) && !dev_data->domain_updated) {
+   dev_data->domain_updated = true;
+   set_dte_entry(dev_data->devid, domain, dev_data->ats.enabled);
+   if (alias != dev_data->devid)
+   set_dte_entry(alias, domain, dev_data->ats.enabled);
+   device_flush_dte(dev_data);
+   }
 
start = address;
for (i = 0; i < pages; ++i) {
@@ -2470,6 +2481,9 @@ static int map_sg(struct device *dev, struct scatterlist 
*sglist,
struct scatterlist *s;
unsigned long address;
u64 dma_mask;
+   struct iommu_dev_data *dev_data = get_dev_data(dev);
+   u16 alias = amd_iommu_alias_table[dev_data->devid];
+   struct amd_iommu *iommu = amd_iommu_rlookup_table[dev_data->devid];
 
domain = get_domain(dev);
if (IS_ERR(domain))
@@ -2485,6 +2499,13 @@ static int map_sg(struct device *dev, struct scatterlist 
*sglist,
goto out_err;
 
prot = dir2prot(direction);
+   if (translation_pre_enabled(iommu) && !dev_data->domain_updated) {
+   dev_data->domain_updated = true;
+   set_dte_entry(dev_data->devid, domain, dev_data->ats.enabled);
+   if (alias != dev_data->devid)
+   set_dte_entry(alias, domain, dev_data->ats.enabled);
+   device_flush_dte(dev_data);
+   }
 
/* Map all sg entries */
for_each_sg(sglist, s, nelems, i) {
-- 
2.5.5

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v5 7/8] iommu/amd: Don't update domain info to dte entry at iommu init stage

2016-09-15 Thread Baoquan He
AMD iommu creates protection domain and assign each device to it during
iommu driver initialization stage. This happened just after system pci
bus scanning stage, and much earlier than device driver init stage. So
at this time if in kdump kernel the domain info, especially pte_root,
can't be updated to dte entry. We should wait until device driver init
stage.

Signed-off-by: Baoquan He 
---
 drivers/iommu/amd_iommu.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index fcb69ff..6c37300 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -137,6 +137,7 @@ struct iommu_dev_data {
bool pri_tlp; /* PASID TLB required for
 PPR completions */
u32 errata;   /* Bitmap for errata to apply */
+   bool domain_updated;
 };
 
 /*
@@ -1708,6 +1709,15 @@ static void set_dte_entry(u16 devid, struct 
protection_domain *domain, bool ats)
 {
u64 pte_root = 0;
u64 flags = 0;
+   struct iommu_dev_data *dev_data;
+   struct amd_iommu *iommu = amd_iommu_rlookup_table[devid];
+
+   dev_data = find_dev_data(devid);
+if (!dev_data)
+return;
+
+   if (translation_pre_enabled(iommu) && !dev_data->domain_updated)
+   return;
 
if (domain->mode != PAGE_MODE_NONE)
pte_root = virt_to_phys(domain->pt_root);
@@ -1756,6 +1766,14 @@ static void set_dte_entry(u16 devid, struct 
protection_domain *domain, bool ats)
 
 static void clear_dte_entry(u16 devid)
 {
+   struct iommu_dev_data *dev_data;
+   struct amd_iommu *iommu = amd_iommu_rlookup_table[devid];
+
+   dev_data = find_dev_data(devid);
+if (!dev_data)
+return;
+   if (translation_pre_enabled(iommu) && !dev_data->domain_updated)
+   return;
/* remove entry from the device table seen by the hardware */
amd_iommu_dev_table[devid].data[0]  = DTE_FLAG_V | DTE_FLAG_TV;
amd_iommu_dev_table[devid].data[1] &= DTE_FLAG_MASK;
-- 
2.5.5

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v5 4/8] iommu/amd: Add function copy_dev_tables

2016-09-15 Thread Baoquan He
Add function copy_dev_tables to copy the old DEV table entry of the panicked
kernel to the new allocated DEV table. Since all iommu share the same DTE table
the copy only need be done once as long as the physical address of old DEV table
is retrieved from iommu reg. Besides the old domain id occupied in 1st kernel
need be reserved to avoid touching the old io-page tables so that on-flight DMA
can continue looking up.

And define MACRO DEV_DOMID_MASK to replace magic number 0xULL because
it need be reused in copy_dev_tables.

Signed-off-by: Baoquan He 
---
 drivers/iommu/amd_iommu.c   |  2 +-
 drivers/iommu/amd_iommu_init.c  | 40 
 drivers/iommu/amd_iommu_types.h |  1 +
 3 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 995b050..fcb69ff 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1747,7 +1747,7 @@ static void set_dte_entry(u16 devid, struct 
protection_domain *domain, bool ats)
flags|= tmp;
}
 
-   flags &= ~(0xUL);
+   flags &= ~DEV_DOMID_MASK;
flags |= domain->id;
 
amd_iommu_dev_table[devid].data[1]  = flags;
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 77c44c8..ce49641 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -717,6 +717,46 @@ static int get_dev_entry_bit(u16 devid, u8 bit)
 }
 
 
+static int copy_dev_tables(void)
+{
+   u64 entry;
+   u32 lo, hi, devid;
+   phys_addr_t old_devtb_phys;
+   struct dev_table_entry *old_devtb = NULL;
+   u16 dom_id, dte_v;
+   struct amd_iommu *iommu;
+   static int copied;
+
+for_each_iommu(iommu) {
+   if (!translation_pre_enabled(iommu)) {
+   pr_err("IOMMU:%d is not pre-enabled!/n", iommu->index);
+   return -1;
+   }
+
+   if (copied)
+   continue;
+
+lo = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET);
+hi = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET + 4);
+entry = (((u64) hi) << 32) + lo;
+old_devtb_phys = entry & PAGE_MASK;
+old_devtb = memremap(old_devtb_phys, dev_table_size, 
MEMREMAP_WB);
+   if (!old_devtb)
+   return -1;
+for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) {
+amd_iommu_dev_table[devid] = old_devtb[devid];
+dom_id = amd_iommu_dev_table[devid].data[1] & 
DEV_DOMID_MASK;
+   dte_v = amd_iommu_dev_table[devid].data[0] & DTE_FLAG_V;
+   if (!dte_v || !dom_id)
+   continue;
+__set_bit(dom_id, amd_iommu_pd_alloc_bitmap);
+}
+   memunmap(old_devtb);
+   copied = 1;
+}
+   return 0;
+}
+
 void amd_iommu_apply_erratum_63(u16 devid)
 {
int sysmgt;
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 809944a..a1ccede 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -310,6 +310,7 @@
 #define DTE_FLAG_MASK  (0x3ffULL << 32)
 #define DTE_GLX_SHIFT  (56)
 #define DTE_GLX_MASK   (3)
+#define DEV_DOMID_MASK 0xULL
 
 #define DTE_GCR3_VAL_A(x)  (((x) >> 12) & 0x7ULL)
 #define DTE_GCR3_VAL_B(x)  (((x) >> 15) & 0x0ULL)
-- 
2.5.5

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v5 6/8] iommu/amd: Do not re-enable dev table entries in kdump

2016-09-15 Thread Baoquan He
This enabling should have been done in normal kernel. It's unnecessary
to enable it again in kdump kernel.

And clean up the function comments of init_device_table_dma.

Signed-off-by: Baoquan He 
---
 drivers/iommu/amd_iommu_init.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 47a8fc9..8d5db2e 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1651,7 +1651,12 @@ static int __init amd_iommu_init_pci(void)
 */
ret = amd_iommu_init_api();
 
-   init_device_table_dma();
+   for_each_iommu(iommu) {
+   if ( !translation_pre_enabled(iommu) ) {
+   init_device_table_dma();
+   break;
+   }
+   }
 
for_each_iommu(iommu)
iommu_flush_all_caches(iommu);
@@ -1829,8 +1834,7 @@ static int __init init_memory_definitions(struct 
acpi_table_header *table)
 }
 
 /*
- * Init the device table to not allow DMA access for devices and
- * suppress all page faults
+ * Init the device table to not allow DMA access for devices.
  */
 static void init_device_table_dma(void)
 {
-- 
2.5.5

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v5 3/8] iommu/amd: Define bit fields for DTE particularly

2016-09-15 Thread Baoquan He
In amd-vi spec several bits of IO PTE fields and DTE fields are similar
so that both of them can share the same MACRO definition. However
defining their respecitve bit fields can make code more read-able. So
do it in this patch.

Signed-off-by: Baoquan He 
---
 drivers/iommu/amd_iommu.c   |  8 
 drivers/iommu/amd_iommu_types.h | 18 ++
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 96de97a..995b050 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1388,9 +1388,9 @@ static int iommu_map_page(struct protection_domain *dom,
 
if (count > 1) {
__pte = PAGE_SIZE_PTE(phys_addr, page_size);
-   __pte |= PM_LEVEL_ENC(7) | IOMMU_PTE_P | IOMMU_PTE_FC;
+   __pte |= PM_LEVEL_ENC(7) | IOMMU_PTE_PR | IOMMU_PTE_FC;
} else
-   __pte = phys_addr | IOMMU_PTE_P | IOMMU_PTE_FC;
+   __pte = phys_addr | IOMMU_PTE_PR | IOMMU_PTE_FC;
 
if (prot & IOMMU_PROT_IR)
__pte |= IOMMU_PTE_IR;
@@ -1714,7 +1714,7 @@ static void set_dte_entry(u16 devid, struct 
protection_domain *domain, bool ats)
 
pte_root |= (domain->mode & DEV_ENTRY_MODE_MASK)
<< DEV_ENTRY_MODE_SHIFT;
-   pte_root |= IOMMU_PTE_IR | IOMMU_PTE_IW | IOMMU_PTE_P | IOMMU_PTE_TV;
+   pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
 
flags = amd_iommu_dev_table[devid].data[1];
 
@@ -1757,7 +1757,7 @@ static void set_dte_entry(u16 devid, struct 
protection_domain *domain, bool ats)
 static void clear_dte_entry(u16 devid)
 {
/* remove entry from the device table seen by the hardware */
-   amd_iommu_dev_table[devid].data[0]  = IOMMU_PTE_P | IOMMU_PTE_TV;
+   amd_iommu_dev_table[devid].data[0]  = DTE_FLAG_V | DTE_FLAG_TV;
amd_iommu_dev_table[devid].data[1] &= DTE_FLAG_MASK;
 
amd_iommu_apply_erratum_63(devid);
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 7781953..809944a 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -239,7 +239,7 @@
 #define PM_LEVEL_INDEX(x, a)   (((a) >> PM_LEVEL_SHIFT((x))) & 0x1ffULL)
 #define PM_LEVEL_ENC(x)(((x) << 9) & 0xe00ULL)
 #define PM_LEVEL_PDE(x, a) ((a) | PM_LEVEL_ENC((x)) | \
-IOMMU_PTE_P | IOMMU_PTE_IR | IOMMU_PTE_IW)
+IOMMU_PTE_PR | IOMMU_PTE_IR | IOMMU_PTE_IW)
 #define PM_PTE_LEVEL(pte)  (((pte) >> 9) & 0x7ULL)
 
 #define PM_MAP_4k  0
@@ -288,13 +288,23 @@
 #define PTE_LEVEL_PAGE_SIZE(level) \
(1ULL << (12 + (9 * (level
 
-#define IOMMU_PTE_P  (1ULL << 0)
-#define IOMMU_PTE_TV (1ULL << 1)
+/*
+ * Bit value definition for I/O PTE fields
+ */
+#define IOMMU_PTE_PR (1ULL << 0)
 #define IOMMU_PTE_U  (1ULL << 59)
 #define IOMMU_PTE_FC (1ULL << 60)
 #define IOMMU_PTE_IR (1ULL << 61)
 #define IOMMU_PTE_IW (1ULL << 62)
 
+/*
+ * Bit value definition for DTE fields
+ */
+#define DTE_FLAG_V  (1ULL << 0)
+#define DTE_FLAG_TV (1ULL << 1)
+#define DTE_FLAG_IR (1ULL << 61)
+#define DTE_FLAG_IW (1ULL << 62)
+
 #define DTE_FLAG_IOTLB (1ULL << 32)
 #define DTE_FLAG_GV(1ULL << 55)
 #define DTE_FLAG_MASK  (0x3ffULL << 32)
@@ -316,7 +326,7 @@
 #define GCR3_VALID 0x01ULL
 
 #define IOMMU_PAGE_MASK (((1ULL << 52) - 1) & ~0xfffULL)
-#define IOMMU_PTE_PRESENT(pte) ((pte) & IOMMU_PTE_P)
+#define IOMMU_PTE_PRESENT(pte) ((pte) & IOMMU_PTE_PR)
 #define IOMMU_PTE_PAGE(pte) (phys_to_virt((pte) & IOMMU_PAGE_MASK))
 #define IOMMU_PTE_MODE(pte) (((pte) >> 9) & 0x07)
 
-- 
2.5.5

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v5 5/8] iommu/amd: copy old trans table from old kernel

2016-09-15 Thread Baoquan He
Here several things need be done:
1) If iommu is pre-enabled in a normal kernel, just disable it and print
   warning.
2) If failed to copy dev table of old kernel, continue to proceed as
   it does in normal kernel.
3) Re-enable event/cmd buffer and install the new DTE table to reg.
4) Flush all caches

Signed-off-by: Baoquan He 
---
 drivers/iommu/amd_iommu_init.c | 44 +++---
 1 file changed, 41 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index ce49641..47a8fc9 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -34,7 +34,7 @@
 #include 
 #include 
 #include 
-
+#include 
 #include "amd_iommu_proto.h"
 #include "amd_iommu_types.h"
 #include "irq_remapping.h"
@@ -1344,6 +1344,12 @@ static int __init init_iommu_one(struct amd_iommu 
*iommu, struct ivhd_header *h)
iommu->int_enabled = false;
 
init_translation_status(iommu);
+   if (translation_pre_enabled(iommu) && !is_kdump_kernel()) {
+   iommu_disable(iommu);
+   clear_translation_pre_enabled(iommu);
+   pr_warn("Translation was enabled for IOMMU:%d but we are not in 
kdump mode\n",
+   iommu->index);
+   }
 
if (translation_pre_enabled(iommu))
pr_warn("Translation is already enabled - trying to copy 
translation structures\n");
@@ -1946,9 +1952,41 @@ static void early_enable_iommu(struct amd_iommu *iommu)
 static void early_enable_iommus(void)
 {
struct amd_iommu *iommu;
+   bool is_pre_enabled=false;
 
-   for_each_iommu(iommu)
-   early_enable_iommu(iommu);
+   for_each_iommu(iommu) {
+   if ( translation_pre_enabled(iommu) ) {
+   is_pre_enabled = true;
+   break;
+   }
+   }
+
+   if ( !is_pre_enabled) {
+   for_each_iommu(iommu)
+   early_enable_iommu(iommu);
+   } else {
+   if (copy_dev_tables()) {
+   pr_err("Failed to copy DEV table from previous 
kernel.\n");
+   /*
+* If failed to copy dev tables from old kernel, 
continue to proceed
+* as it does in normal kernel.
+*/
+   for_each_iommu(iommu) {
+   clear_translation_pre_enabled(iommu);
+   early_enable_iommu(iommu);
+   }
+   } else {
+   pr_info("Copied DEV table from previous kernel.\n");
+   for_each_iommu(iommu) {
+   iommu_feature_disable(iommu, CONTROL_CMDBUF_EN);
+   iommu_feature_disable(iommu, 
CONTROL_EVT_LOG_EN);
+   iommu_enable_command_buffer(iommu);
+   iommu_enable_event_buffer(iommu);
+   iommu_set_device_table(iommu);
+   iommu_flush_all_caches(iommu);
+   }
+   }
+   }
 }
 
 static void enable_iommus_v2(void)
-- 
2.5.5

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v5 2/8] iommu/amd: add early_enable_iommu() wrapper function

2016-09-15 Thread Baoquan He
Move per iommu enabling code into a wrapper function early_enable_iommu().
This can make later kdump change easier.

Signed-off-by: Baoquan He 
---
 drivers/iommu/amd_iommu_init.c | 24 ++--
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 9bf1a04..77c44c8 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1887,6 +1887,18 @@ static void iommu_apply_resume_quirks(struct amd_iommu 
*iommu)
   iommu->stored_addr_lo | 1);
 }
 
+static void early_enable_iommu(struct amd_iommu *iommu)
+{
+   iommu_disable(iommu);
+   iommu_init_flags(iommu);
+   iommu_set_device_table(iommu);
+   iommu_enable_command_buffer(iommu);
+   iommu_enable_event_buffer(iommu);
+   iommu_set_exclusion_range(iommu);
+   iommu_enable(iommu);
+   iommu_flush_all_caches(iommu);
+}
+
 /*
  * This function finally enables all IOMMUs found in the system after
  * they have been initialized
@@ -1895,16 +1907,8 @@ static void early_enable_iommus(void)
 {
struct amd_iommu *iommu;
 
-   for_each_iommu(iommu) {
-   iommu_disable(iommu);
-   iommu_init_flags(iommu);
-   iommu_set_device_table(iommu);
-   iommu_enable_command_buffer(iommu);
-   iommu_enable_event_buffer(iommu);
-   iommu_set_exclusion_range(iommu);
-   iommu_enable(iommu);
-   iommu_flush_all_caches(iommu);
-   }
+   for_each_iommu(iommu)
+   early_enable_iommu(iommu);
 }
 
 static void enable_iommus_v2(void)
-- 
2.5.5

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v5 1/8] iommu/amd: Detect pre enabled translation

2016-09-15 Thread Baoquan He
Add functions to check whether translation is already enabled in IOMMU.

Signed-off-by: Baoquan He 
---
 drivers/iommu/amd_iommu_init.c  | 25 +
 drivers/iommu/amd_iommu_proto.h |  1 +
 drivers/iommu/amd_iommu_types.h |  4 
 3 files changed, 30 insertions(+)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 59741ea..9bf1a04 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -247,6 +247,26 @@ static int amd_iommu_enable_interrupts(void);
 static int __init iommu_go_to_state(enum iommu_init_state state);
 static void init_device_table_dma(void);
 
+
+bool translation_pre_enabled(struct amd_iommu *iommu)
+{
+   return (iommu->flags & AMD_IOMMU_FLAG_TRANS_PRE_ENABLED);
+}
+
+static void clear_translation_pre_enabled(struct amd_iommu *iommu)
+{
+iommu->flags &= ~AMD_IOMMU_FLAG_TRANS_PRE_ENABLED;
+}
+
+static void init_translation_status(struct amd_iommu *iommu)
+{
+   u32 ctrl;
+
+   ctrl = readl(iommu->mmio_base + MMIO_CONTROL_OFFSET);
+   if (ctrl & (1<flags |= AMD_IOMMU_FLAG_TRANS_PRE_ENABLED;
+}
+
 static int iommu_pc_get_set_reg_val(struct amd_iommu *iommu,
u8 bank, u8 cntr, u8 fxn,
u64 *value, bool is_write);
@@ -1283,6 +1303,11 @@ static int __init init_iommu_one(struct amd_iommu 
*iommu, struct ivhd_header *h)
 
iommu->int_enabled = false;
 
+   init_translation_status(iommu);
+
+   if (translation_pre_enabled(iommu))
+   pr_warn("Translation is already enabled - trying to copy 
translation structures\n");
+
ret = init_iommu_from_acpi(iommu, h);
if (ret)
return ret;
diff --git a/drivers/iommu/amd_iommu_proto.h b/drivers/iommu/amd_iommu_proto.h
index 0bd9eb3..f066e01 100644
--- a/drivers/iommu/amd_iommu_proto.h
+++ b/drivers/iommu/amd_iommu_proto.h
@@ -98,4 +98,5 @@ static inline bool iommu_feature(struct amd_iommu *iommu, u64 
f)
return !!(iommu->features & f);
 }
 
+extern bool translation_pre_enabled(struct amd_iommu *iommu);
 #endif /* _ASM_X86_AMD_IOMMU_PROTO_H  */
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index caf5e38..7781953 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -384,6 +384,7 @@ extern struct kmem_cache *amd_iommu_irq_cache;
 #define APERTURE_PAGE_INDEX(a) (((a) >> 21) & 0x3fULL)
 
 
+
 /*
  * This struct is used to pass information about
  * incoming PPR faults around.
@@ -401,6 +402,8 @@ struct amd_iommu_fault {
 struct iommu_domain;
 struct irq_domain;
 
+#define AMD_IOMMU_FLAG_TRANS_PRE_ENABLED  (1 << 0)
+
 /*
  * This structure contains generic data for  IOMMU protection domains
  * independent of their use.
@@ -524,6 +527,7 @@ struct amd_iommu {
struct irq_domain *ir_domain;
struct irq_domain *msi_domain;
 #endif
+   u32 flags;
 };
 
 #define ACPIHID_UID_LEN 256
-- 
2.5.5

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 00/22] Generic DT bindings for PCI IOMMUs and ARM SMMU

2016-09-15 Thread Robin Murphy
On 15/09/16 10:29, Auger Eric wrote:
> Hi Robin,
> 
> On 14/09/2016 14:53, Robin Murphy wrote:
>> On 14/09/16 13:32, Auger Eric wrote:
>>> Hi,
>>> On 14/09/2016 12:35, Robin Murphy wrote:
 On 14/09/16 09:41, Auger Eric wrote:
> Hi,
>
> On 12/09/2016 18:13, Robin Murphy wrote:
>> Hi all,
>>
>> To any more confusing fixups and crazily numbered extra patches, here's
>> a quick v7 with everything rebased into the right order. The significant
>> change this time is to implement iommu_fwspec properly from the start,
>> which ends up being far simpler and more robust than faffing about
>> introducing it somewhere 'less intrusive' to move toward core code later.
>>
>> New branch in the logical place:
>>
>> git://linux-arm.org/linux-rm iommu/generic-v7
>
> For information, as discussed privately with Robin I experience some
> regressions with the former and now deprecated dt description.
>
> on my AMD Overdrive board and my old dt description I now only see a
> single group:
>
> /sys/kernel/iommu_groups/
> /sys/kernel/iommu_groups/0
> /sys/kernel/iommu_groups/0/devices
> /sys/kernel/iommu_groups/0/devices/e070.xgmac
>
> whereas I formerly see
>
> /sys/kernel/iommu_groups/
> /sys/kernel/iommu_groups/3
> /sys/kernel/iommu_groups/3/devices
> /sys/kernel/iommu_groups/3/devices/:00:00.0
> /sys/kernel/iommu_groups/1
> /sys/kernel/iommu_groups/1/devices
> /sys/kernel/iommu_groups/1/devices/e070.xgmac
> /sys/kernel/iommu_groups/4
> /sys/kernel/iommu_groups/4/devices
> /sys/kernel/iommu_groups/4/devices/:00:02.2
> /sys/kernel/iommu_groups/4/devices/:01:00.1
> /sys/kernel/iommu_groups/4/devices/:00:02.0
> /sys/kernel/iommu_groups/4/devices/:01:00.0
> /sys/kernel/iommu_groups/2
> /sys/kernel/iommu_groups/2/devices
> /sys/kernel/iommu_groups/2/devices/e090.xgmac
> /sys/kernel/iommu_groups/0
> /sys/kernel/iommu_groups/0/devices
> /sys/kernel/iommu_groups/0/devices/f000.pcie
>
> This is the group topology without ACS override. Applying the non
> upstreamed "pci: Enable overrides for missing ACS capabilities" I used
> to see separate groups for each PCIe components. Now I don't see any
> difference with and without ACS override.

 OK, having reproduced on my Juno, the problem looks to be that
 of_for_each_phandle() leaves err set to -ENOENT after successfully
 walking a phandle list, which makes __find_legacy_master_phandle()
 always bail out after the first SMMU.

 Can you confirm that the following diff fixes things for you?
>>>
>>> Well it improves but there are still differences in the group topology.
>>> The PFs now are in group 0.
>>>
>>> root@trusty:~# lspci -nk
>>> 00:00.0 0600: 1022:1a00
>>> Subsystem: 1022:1a00
>>> 00:02.0 0600: 1022:1a01
>>> 00:02.2 0604: 1022:1a02
>>> Kernel driver in use: pcieport
>>> 01:00.0 0200: 8086:1521 (rev 01)
>>> Subsystem: 8086:0002
>>> Kernel driver in use: igb
>>> 01:00.1 0200: 8086:1521 (rev 01)
>>> Subsystem: 8086:0002
>>> Kernel driver in use: igb
>>>
>>>
>>> with your series + fix:
>>> /sys/kernel/iommu_groups/
>>> /sys/kernel/iommu_groups/3
>>> /sys/kernel/iommu_groups/3/devices
>>> /sys/kernel/iommu_groups/3/devices/:00:00.0
>>> /sys/kernel/iommu_groups/1
>>> /sys/kernel/iommu_groups/1/devices
>>> /sys/kernel/iommu_groups/1/devices/e070.xgmac
>>> /sys/kernel/iommu_groups/4
>>> /sys/kernel/iommu_groups/4/devices
>>> /sys/kernel/iommu_groups/4/devices/:00:02.2
>>> /sys/kernel/iommu_groups/4/devices/:00:02.0
>>> /sys/kernel/iommu_groups/2
>>> /sys/kernel/iommu_groups/2/devices
>>> /sys/kernel/iommu_groups/2/devices/e090.xgmac
>>> /sys/kernel/iommu_groups/0
>>> /sys/kernel/iommu_groups/0/devices
>>> /sys/kernel/iommu_groups/0/devices/:01:00.1
>>> /sys/kernel/iommu_groups/0/devices/f000.pcie
>>> /sys/kernel/iommu_groups/0/devices/:01:00.0
>>>
>>> Before (4.8-rc5):
>>>
>>> /sys/kernel/iommu_groups/
>>> /sys/kernel/iommu_groups/3
>>> /sys/kernel/iommu_groups/3/devices
>>> /sys/kernel/iommu_groups/3/devices/:00:00.0
>>> /sys/kernel/iommu_groups/1
>>> /sys/kernel/iommu_groups/1/devices
>>> /sys/kernel/iommu_groups/1/devices/e070.xgmac
>>> /sys/kernel/iommu_groups/4
>>> /sys/kernel/iommu_groups/4/devices
>>> /sys/kernel/iommu_groups/4/devices/:00:02.2
>>> /sys/kernel/iommu_groups/4/devices/:01:00.1
>>> /sys/kernel/iommu_groups/4/devices/:00:02.0
>>> /sys/kernel/iommu_groups/4/devices/:01:00.0
>>> /sys/kernel/iommu_groups/2
>>> /sys/kernel/iommu_groups/2/devices
>>> /sys/kernel/iommu_groups/2/devices/e090.xgmac
>>> /sys/kernel/iommu_groups/0
>>> /sys/kernel/iommu_groups/0/devices
>>> /sys/kernel/iommu_groups/0/devices/f000.pcie
>>
>> Your DT claims that f000.pcie (i.e. the "platform device" side of
>> the 

Re: [RFC PATCH v2 11/20] mm: Access BOOT related data in the clear

2016-09-15 Thread Matt Fleming
On Wed, 14 Sep, at 09:20:44AM, Tom Lendacky wrote:
> On 09/12/2016 11:55 AM, Andy Lutomirski wrote:
> > On Aug 22, 2016 6:53 PM, "Tom Lendacky"  wrote:
> >>
> >> BOOT data (such as EFI related data) is not encyrpted when the system is
> >> booted and needs to be accessed as non-encrypted.  Add support to the
> >> early_memremap API to identify the type of data being accessed so that
> >> the proper encryption attribute can be applied.  Currently, two types
> >> of data are defined, KERNEL_DATA and BOOT_DATA.
> > 
> > What happens when you memremap boot services data outside of early
> > boot?  Matt just added code that does this.
> > 
> > IMO this API is not so great.  It scatters a specialized consideration
> > all over the place.  Could early_memremap not look up the PA to figure
> > out what to do?
> 
> Yes, I could see if the PA falls outside of the kernel usable area and,
> if so, remove the memory encryption attribute from the mapping (for both
> early_memremap and memremap).
> 
> Let me look into that, I would prefer something along that line over
> this change.

So, the last time we talked about using the address to figure out
whether to encrypt/decrypt you said,

 "I looked into this and this would be a large change also to parse
  tables and build lists."

Has something changed that makes this approach easier?

And again, you need to be careful with the EFI kexec code paths, since
you've got a mixture of boot and kernel data being passed. In
particular the EFI memory map is allocated by the firmware on first
boot (BOOT_DATA) but by the kernel on kexec (KERNEL_DATA).

That's one of the reasons I suggested requiring the caller to decide
on BOOT_DATA vs KERNEL_DATA - when you start looking at kexec the
distinction isn't easily made.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 00/22] Generic DT bindings for PCI IOMMUs and ARM SMMU

2016-09-15 Thread Auger Eric
Hi Robin,

On 14/09/2016 14:53, Robin Murphy wrote:
> On 14/09/16 13:32, Auger Eric wrote:
>> Hi,
>> On 14/09/2016 12:35, Robin Murphy wrote:
>>> On 14/09/16 09:41, Auger Eric wrote:
 Hi,

 On 12/09/2016 18:13, Robin Murphy wrote:
> Hi all,
>
> To any more confusing fixups and crazily numbered extra patches, here's
> a quick v7 with everything rebased into the right order. The significant
> change this time is to implement iommu_fwspec properly from the start,
> which ends up being far simpler and more robust than faffing about
> introducing it somewhere 'less intrusive' to move toward core code later.
>
> New branch in the logical place:
>
> git://linux-arm.org/linux-rm iommu/generic-v7

 For information, as discussed privately with Robin I experience some
 regressions with the former and now deprecated dt description.

 on my AMD Overdrive board and my old dt description I now only see a
 single group:

 /sys/kernel/iommu_groups/
 /sys/kernel/iommu_groups/0
 /sys/kernel/iommu_groups/0/devices
 /sys/kernel/iommu_groups/0/devices/e070.xgmac

 whereas I formerly see

 /sys/kernel/iommu_groups/
 /sys/kernel/iommu_groups/3
 /sys/kernel/iommu_groups/3/devices
 /sys/kernel/iommu_groups/3/devices/:00:00.0
 /sys/kernel/iommu_groups/1
 /sys/kernel/iommu_groups/1/devices
 /sys/kernel/iommu_groups/1/devices/e070.xgmac
 /sys/kernel/iommu_groups/4
 /sys/kernel/iommu_groups/4/devices
 /sys/kernel/iommu_groups/4/devices/:00:02.2
 /sys/kernel/iommu_groups/4/devices/:01:00.1
 /sys/kernel/iommu_groups/4/devices/:00:02.0
 /sys/kernel/iommu_groups/4/devices/:01:00.0
 /sys/kernel/iommu_groups/2
 /sys/kernel/iommu_groups/2/devices
 /sys/kernel/iommu_groups/2/devices/e090.xgmac
 /sys/kernel/iommu_groups/0
 /sys/kernel/iommu_groups/0/devices
 /sys/kernel/iommu_groups/0/devices/f000.pcie

 This is the group topology without ACS override. Applying the non
 upstreamed "pci: Enable overrides for missing ACS capabilities" I used
 to see separate groups for each PCIe components. Now I don't see any
 difference with and without ACS override.
>>>
>>> OK, having reproduced on my Juno, the problem looks to be that
>>> of_for_each_phandle() leaves err set to -ENOENT after successfully
>>> walking a phandle list, which makes __find_legacy_master_phandle()
>>> always bail out after the first SMMU.
>>>
>>> Can you confirm that the following diff fixes things for you?
>>
>> Well it improves but there are still differences in the group topology.
>> The PFs now are in group 0.
>>
>> root@trusty:~# lspci -nk
>> 00:00.0 0600: 1022:1a00
>> Subsystem: 1022:1a00
>> 00:02.0 0600: 1022:1a01
>> 00:02.2 0604: 1022:1a02
>> Kernel driver in use: pcieport
>> 01:00.0 0200: 8086:1521 (rev 01)
>> Subsystem: 8086:0002
>> Kernel driver in use: igb
>> 01:00.1 0200: 8086:1521 (rev 01)
>> Subsystem: 8086:0002
>> Kernel driver in use: igb
>>
>>
>> with your series + fix:
>> /sys/kernel/iommu_groups/
>> /sys/kernel/iommu_groups/3
>> /sys/kernel/iommu_groups/3/devices
>> /sys/kernel/iommu_groups/3/devices/:00:00.0
>> /sys/kernel/iommu_groups/1
>> /sys/kernel/iommu_groups/1/devices
>> /sys/kernel/iommu_groups/1/devices/e070.xgmac
>> /sys/kernel/iommu_groups/4
>> /sys/kernel/iommu_groups/4/devices
>> /sys/kernel/iommu_groups/4/devices/:00:02.2
>> /sys/kernel/iommu_groups/4/devices/:00:02.0
>> /sys/kernel/iommu_groups/2
>> /sys/kernel/iommu_groups/2/devices
>> /sys/kernel/iommu_groups/2/devices/e090.xgmac
>> /sys/kernel/iommu_groups/0
>> /sys/kernel/iommu_groups/0/devices
>> /sys/kernel/iommu_groups/0/devices/:01:00.1
>> /sys/kernel/iommu_groups/0/devices/f000.pcie
>> /sys/kernel/iommu_groups/0/devices/:01:00.0
>>
>> Before (4.8-rc5):
>>
>> /sys/kernel/iommu_groups/
>> /sys/kernel/iommu_groups/3
>> /sys/kernel/iommu_groups/3/devices
>> /sys/kernel/iommu_groups/3/devices/:00:00.0
>> /sys/kernel/iommu_groups/1
>> /sys/kernel/iommu_groups/1/devices
>> /sys/kernel/iommu_groups/1/devices/e070.xgmac
>> /sys/kernel/iommu_groups/4
>> /sys/kernel/iommu_groups/4/devices
>> /sys/kernel/iommu_groups/4/devices/:00:02.2
>> /sys/kernel/iommu_groups/4/devices/:01:00.1
>> /sys/kernel/iommu_groups/4/devices/:00:02.0
>> /sys/kernel/iommu_groups/4/devices/:01:00.0
>> /sys/kernel/iommu_groups/2
>> /sys/kernel/iommu_groups/2/devices
>> /sys/kernel/iommu_groups/2/devices/e090.xgmac
>> /sys/kernel/iommu_groups/0
>> /sys/kernel/iommu_groups/0/devices
>> /sys/kernel/iommu_groups/0/devices/f000.pcie
> 
> Your DT claims that f000.pcie (i.e. the "platform device" side of
> the host controller) owns the IDs 0x100 0x101 0x102 0x103 0x200 0x201
> 0x202 0x203 0x300 0x301 0x302 0x303 0x400 0x401 0x402 0x403. Thus when
> new devices (the PCI PFs) come 

Re: [PATCH v2] iommu/amd: Don't put completion-wait semaphore on stack

2016-09-15 Thread Ingo Molnar

* Joerg Roedel  wrote:

> Hi Ingo,
> 
> On Thu, Sep 15, 2016 at 07:44:35AM +0200, Ingo Molnar wrote:
> > Yeah, I can still remove it - just zapped it in fact.
> 
> Thanks, and sorry for the hassle. Here is the v2 patch that has the
> correct locking. I tested it with and without lockdep enabled and also
> under some load. Looks all fine.

Thanks a lot for the quick response!

Ingo
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 4/4] iommu/amd: No need to wait iommu completion if no dte irq entry change

2016-09-15 Thread Baoquan He
This is a clean up. In get_irq_table() only if DTE entry is changed
iommu_completion_wait() need be called. Otherwise no need to do it.

Signed-off-by: Baoquan He 
---
 drivers/iommu/amd_iommu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index a9f78c2..461c2fe 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3581,7 +3581,7 @@ static struct irq_remap_table *get_irq_table(u16 devid, 
bool ioapic)
 
table = irq_lookup_table[devid];
if (table)
-   goto out;
+   goto out_unlock;
 
alias = amd_iommu_alias_table[devid];
table = irq_lookup_table[alias];
@@ -3595,7 +3595,7 @@ static struct irq_remap_table *get_irq_table(u16 devid, 
bool ioapic)
/* Nothing there yet, allocate new irq remapping table */
table = kzalloc(sizeof(*table), GFP_ATOMIC);
if (!table)
-   goto out;
+   goto out_unlock;
 
/* Initialize table spin-lock */
spin_lock_init(>lock);
@@ -3608,7 +3608,7 @@ static struct irq_remap_table *get_irq_table(u16 devid, 
bool ioapic)
if (!table->table) {
kfree(table);
table = NULL;
-   goto out;
+   goto out_unlock;
}
 
memset(table->table, 0, MAX_IRQS_PER_TABLE * sizeof(u32));
-- 
2.5.5

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 0/4] iommu/amd: Clean up patches

2016-09-15 Thread Baoquan He
These were found out when I tried to fix the kdump failure on system
with AMD iommu. Pack them into this patchset since they are not related
to the kdump issue and each other.

Baoquan He (4):
  iommu/amd: clean up the cmpxchg64 invocation
  iommu/amd: Use standard bitmap operation to set bitmap
  iommu/amd: Free domain id when free a domain of struct dma_ops_domain
  iommu/amd: No need to wait iommu completion if no dte irq entry change

 drivers/iommu/amd_iommu.c  | 12 
 drivers/iommu/amd_iommu_init.c |  3 ++-
 2 files changed, 10 insertions(+), 5 deletions(-)

-- 
2.5.5

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 3/4] iommu/amd: Free domain id when free a domain of struct dma_ops_domain

2016-09-15 Thread Baoquan He
The current code missed freeing domain id when free a domain of
struct dma_ops_domain.

Signed-off-by: Baoquan He 
---
 drivers/iommu/amd_iommu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 160fc6a..a9f78c2 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1655,6 +1655,9 @@ static void dma_ops_domain_free(struct dma_ops_domain 
*dom)
 
free_pagetable(>domain);
 
+   if (dom->domain.id)
+   domain_id_free(dom->domain.id);
+
kfree(dom);
 }
 
-- 
2.5.5

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 2/4] iommu/amd: Use standard bitmap operation to set bitmap

2016-09-15 Thread Baoquan He
It will be more readable and safer than the old setting.

Signed-off-by: Baoquan He 
---
 drivers/iommu/amd_iommu_init.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 59741ea..3e810c6 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -2136,7 +2137,7 @@ static int __init early_amd_iommu_init(void)
 * never allocate domain 0 because its used as the non-allocated and
 * error value placeholder
 */
-   amd_iommu_pd_alloc_bitmap[0] = 1;
+   __set_bit(0, amd_iommu_pd_alloc_bitmap);
 
spin_lock_init(_iommu_pd_lock);
 
-- 
2.5.5

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/4] iommu/amd: clean up the cmpxchg64 invocation

2016-09-15 Thread Baoquan He
Change it as it's designed for and keep it consistent with other
places.

Signed-off-by: Baoquan He 
---
 drivers/iommu/amd_iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 96de97a..160fc6a 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1274,7 +1274,8 @@ static u64 *alloc_pte(struct protection_domain *domain,
 
__npte = PM_LEVEL_PDE(level, virt_to_phys(page));
 
-   if (cmpxchg64(pte, __pte, __npte)) {
+   /* pte could have been changed somewhere. */
+   if (cmpxchg64(pte, __pte, __npte) != __pte) {
free_page((unsigned long)page);
continue;
}
-- 
2.5.5

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 22/22] iommu/dma: Avoid PCI host bridge windows

2016-09-15 Thread Marek Szyprowski

Hi Robin,


On 2016-09-14 15:25, Robin Murphy wrote:

On 14/09/16 13:35, Marek Szyprowski wrote:

On 2016-09-14 13:10, Robin Murphy wrote:

On 14/09/16 11:55, Marek Szyprowski wrote:

On 2016-09-12 18:14, Robin Murphy wrote:

With our DMA ops enabled for PCI devices, we should avoid allocating
IOVAs which a host bridge might misinterpret as peer-to-peer DMA and
lead to faults, corruption or other badness. To be safe, punch out
holes
for all of the relevant host bridge's windows when initialising a DMA
domain for a PCI device.

CC: Marek Szyprowski 
CC: Inki Dae 
Reported-by: Lorenzo Pieralisi 
Signed-off-by: Robin Murphy 

I don't know much about PCI and their IOMMU integration, but can't we
use
the direct mapping region feature of iommu core for it? There are
already
iommu_get_dm_regions(), iommu_put_dm_regions() and
iommu_request_dm_for_dev()
functions for handling them...

It's rather the opposite problem - in the direct-mapping case, we're
making sure the iommu_domain has translations installed for the given
IOVAs (which are also the corresponding physical address) before it goes
live, whereas what we need to do here is make sure the these addresses
never get used as IOVAs at all, because any attempt to do so them will
likely go wrong. Thus we carve them out of the iova_domain such that
they will never get near an actual IOMMU API call.

This is a slightly generalised equivalent of e.g. amd_iommu.c's
init_reserved_iova_ranges().

Hmmm. Each dm_region have protection parameter. Can't we reuse them to
create prohibited/reserved regions by setting it to 0 (no read / no write)
and mapping to physical 0 address? That's just a quick idea, because
dm_regions and the proposed code for pci looks a bit similar for me...

It might look similar, but at different levels (iommu_domain vs.
iova_domain) and with the opposite intent. The dm_region prot flag is
just the standard flag as passed to iommu_map() - trying to map a region
with no access in an empty pagetable isn't going to achieve anything
anyway (it's effectively unmapping addresses that are already unmapped).

But for this case, even if you _did_ map something in the pagetable
(i.e. the iommu_domain), it wouldn't make any difference, because the
thing we're mitigating against is handing out addresses which are going
to cause a device's accesses to blow up inside the PCI root complex
without ever even reaching the IOMMU. In short, dm_regions are about
"these addresses are already being used for DMA, so make sure the IOMMU
API doesn't block them", whereas reserved ranges are about "these
addresses are unusable for DMA, so make sure the DMA API can't allocate
them".


Okay, thanks for the explanation.

Best regards
--
Marek Szyprowski, PhD
Samsung R Institute Poland

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu