[PATCH] iommu/ipmmu-vmsa: Don't truncate ttbr if LPAE is not enabled

2015-12-22 Thread Geert Uytterhoeven
If CONFIG_PHYS_ADDR_T_64BIT=n: drivers/iommu/ipmmu-vmsa.c: In function 'ipmmu_domain_init_context': drivers/iommu/ipmmu-vmsa.c:434:2: warning: right shift count >= width of type ipmmu_ctx_write(domain, IMTTUBR0, ttbr >> 32); ^ As io_pgtable_cfg.arm_lpae_s1_cfg.ttbr[] is an
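The warning above arises because shifting a 32-bit value by 32 is undefined in C. The following is a minimal user-space sketch, not the patch itself: `phys_addr32_t`, `ttbr_upper`, `ttbr_lower` and `truncated_upper` are hypothetical names chosen for illustration. It shows that keeping the table base in a 64-bit variable makes the split well defined, while narrowing it first loses the upper bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for phys_addr_t when CONFIG_PHYS_ADDR_T_64BIT=n. */
typedef uint32_t phys_addr32_t;
static_assert(sizeof(phys_addr32_t) == 4, "sketch assumes a 32-bit type");

/* Upper half of a 64-bit table base; the shift is well defined on uint64_t. */
static inline uint32_t ttbr_upper(uint64_t ttbr)
{
    return (uint32_t)(ttbr >> 32);
}

/* Lower half of the same value. */
static inline uint32_t ttbr_lower(uint64_t ttbr)
{
    return (uint32_t)(ttbr & 0xffffffffu);
}

/* Shows the truncation: once the value is stored in a 32-bit type,
 * the upper bits are gone before any shift can see them. */
static inline uint32_t truncated_upper(uint64_t ttbr)
{
    phys_addr32_t narrow = (phys_addr32_t)ttbr; /* upper 32 bits lost here */
    return (uint32_t)((uint64_t)narrow >> 32);  /* always 0 */
}
```

With a value such as `0x123456789abcdef0`, `ttbr_upper()` yields the high word, whereas `truncated_upper()` can only ever return zero.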

[PATCH 0/6] perf/amd/iommu: Enable multi-IOMMU support

2015-12-22 Thread Suravee Suthikulpanit
This patch series modifies the existing perf_event_amd_iommu driver to support systems with multiple IOMMUs. It introduces new AMD IOMMU APIs, which are used by the AMD IOMMU Perf driver to access performance counters in multiple IOMMUs. In addition, this series should also fix current AMD

[PATCH 03/23] iommu/amd: Introduce bitmap_lock in struct aperture_range

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel This lock only protects the address allocation bitmap in one aperture. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/drivers/iommu/amd_iommu.c

[PATCH 04/23] iommu/amd: Flush IOMMU TLB on __map_single error path

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel There have been present PTEs which in theory could have made it to the IOMMU TLB. Flush the addresses out on the error path to make sure no stale entries remain. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 2 ++ 1 file

[PATCH 08/23] iommu/amd: Move aperture_range.offset to another cache-line

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel Moving it before the pte_pages array puts it into the same cache-line as the spin-lock and the bitmap array pointer. This should save a cache-miss. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 3 +-- 1 file changed, 1

[PATCH 01/23] iommu/amd: Warn only once on unexpected pte value

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel This prevents possible flooding of the kernel log. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index
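A warn-once pattern keeps a repeated condition from flooding the log. The kernel provides macros such as WARN_ON_ONCE for this; the sketch below is only a user-space analogue under stated assumptions: `warn_once`, `check_pte` and `warn_count` are hypothetical names, a zero value stands in for the "unexpected" case, and the code is not thread-safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts messages actually printed, so the behaviour can be observed. */
static int warn_count;

/* Print msg only if *warned is still false; returns true if it printed. */
static bool warn_once(bool *warned, const char *msg)
{
    assert(msg != NULL);
    if (*warned)
        return false;
    *warned = true;
    fprintf(stderr, "warning: %s\n", msg);
    warn_count++;
    return true;
}

/* Hypothetical check: treats a zero value as the unexpected case. */
static void check_pte(unsigned long long pte)
{
    static bool warned; /* one flag per call site */

    if (pte == 0)
        warn_once(&warned, "unexpected pte value");
}
```

Calling `check_pte(0)` repeatedly emits a single message; later hits are silently ignored.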

[PATCH 02/23] iommu/amd: Move 'struct dma_ops_domain' definition to amd_iommu.c

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel It is only used in this file anyway, so keep it there. Same with 'struct aperture_range'. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 40 drivers/iommu/amd_iommu_types.h | 40

[PATCH 21/23] iommu/amd: Make dma_ops_domain->next_index percpu

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel Make this pointer percpu so that we start searching for new addresses in the range where we last stopped, which has a higher probability of still being in the cache. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 39

[PATCH 6/6] perf/amd/iommu: Enable support for multiple IOMMUs

2015-12-22 Thread Suravee Suthikulpanit
The current amd_iommu_pc_get_set_reg_val() does not support multi-IOMMU systems. This patch replaces amd_iommu_pc_get_set_reg_val() with amd_iommu_pc_set_reg_val() and amd_iommu_pc_[set|get]_cnt_vals(). This implementation makes an assumption that the counters on all IOMMUs will be programmed the

[PATCH 2/6] perf/amd/iommu: Modify functions to query max banks and counters

2015-12-22 Thread Suravee Suthikulpanit
Currently, amd_iommu_pc_get_max_[banks|counters]() require devid, which should not be the case. Also, these don't properly support multi-IOMMU systems. Current and future AMD systems with IOMMUs that support perf counters would likely contain homogeneous IOMMUs where multiple IOMMUs are available.

[PATCH 5/6] perf/amd/iommu: Introduce get_iommu_bnk_cnt_evt_idx

2015-12-22 Thread Suravee Suthikulpanit
Introduce a helper function to calculate the bit-index for performance counter assignment. Signed-off-by: Suravee Suthikulpanit --- arch/x86/kernel/cpu/perf_event_amd_iommu.c | 20 +++- 1 file changed, 15 insertions(+), 5 deletions(-) diff

[PATCH 4/6] perf/amd/iommu: Introduce data structure for tracking prev count.

2015-12-22 Thread Suravee Suthikulpanit
To enable the AMD IOMMU PMU to support multiple IOMMUs, this patch introduces a new data structure, perf_amd_iommu.prev_cnts, to track previous counts of IOMMU performance counters in a multi-IOMMU environment. Also, this patch allocates perf_iommu_cnts for internal use when managing counters.

[PATCH 3/6] iommu/amd: Introduce amd_iommu_get_num_iommus()

2015-12-22 Thread Suravee Suthikulpanit
This patch introduces amd_iommu_get_num_iommus(). Initially, this is intended to be used by the Perf AMD IOMMU driver. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd_iommu_init.c | 16 include/linux/perf/perf_event_amd_iommu.h |

[PATCH 1/6] perf/amd/iommu: Consolidate and move perf_event_amd_iommu header

2015-12-22 Thread Suravee Suthikulpanit
This patch consolidates "arch/x86/kernel/cpu/perf_event_amd_iommu.h" and "drivers/iommu/amd_iommu_proto.h", which contain duplicate function declarations, into "include/linux/perf/perf_event_amd_iommu.h" Signed-off-by: Suravee Suthikulpanit ---

[PATCH 06/23] iommu/amd: Pass correct shift to iommu_area_alloc()

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel The page-offset of the aperture must be passed instead of 0. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c

[PATCH 10/23] iommu/amd: Flush iommu tlb in dma_ops_aperture_alloc()

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel Since the allocator wraparound happens in this function now, flush the iommu tlb there too. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 21 - 1 file changed, 16 insertions(+), 5 deletions(-) diff --git

[PATCH 22/23] iommu/amd: Use trylock to aquire bitmap_lock

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel First search for a non-contended aperture with trylock before spinning. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git
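The idea of trying every lock without blocking before settling for a contended one can be sketched as follows. This is an illustrative user-space version using C11 `atomic_flag` as a simple spinlock; `struct range`, `range_trylock` and `lock_any_range` are hypothetical names and do not come from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical aperture stand-in holding only its spinlock. */
struct range {
    atomic_flag busy;
};

static void range_init(struct range *r)
{
    atomic_flag_clear(&r->busy);
}

/* Returns nonzero if the lock was taken without waiting. */
static int range_trylock(struct range *r)
{
    return !atomic_flag_test_and_set_explicit(&r->busy, memory_order_acquire);
}

static void range_lock(struct range *r)
{
    while (!range_trylock(r))
        ; /* spin until the holder releases the lock */
}

static void range_unlock(struct range *r)
{
    atomic_flag_clear_explicit(&r->busy, memory_order_release);
}

/* First pass: take any uncontended range, starting at `start`.
 * Fallback: spin on the starting range. Returns the locked index. */
static size_t lock_any_range(struct range *r, size_t n, size_t start)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        size_t idx = (start + i) % n;

        if (range_trylock(&r[idx]))
            return idx;
    }
    range_lock(&r[start % n]);
    return start % n;
}
```

A caller pairs `lock_any_range()` with `range_unlock()` on the returned index. The fallback only spins when every range is held at the same time.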

[PATCH 11/23] iommu/amd: Remove 'start' parameter from dma_ops_area_alloc

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel Parameter is not needed because the value is part of the already passed in struct dma_ops_domain. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 10 -- 1 file changed, 4 insertions(+), 6 deletions(-) diff --git

[PATCH 09/23] iommu/amd: Retry address allocation within one aperture

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel Instead of skipping to the next aperture, first try again in the current one. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 29 +++-- 1 file changed, 19 insertions(+), 10 deletions(-) diff --git

[PATCH 15/23] iommu/amd: Remove need_flush from struct dma_ops_domain

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel The flushing of iommu tlbs is now done on a per-range basis. So there is no need anymore for domain-wide flush tracking. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 30 ++ 1 file changed, 6

[PATCH 07/23] iommu/amd: Add dma_ops_aperture_alloc() function

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel Make this a wrapper around iommu_ops_area_alloc() for now and add more logic to this function later on. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 37 + 1 file changed, 25 insertions(+),

[PATCH 18/23] iommu/amd: Build io page-tables with cmpxchg64

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel This allows building up the page-tables without holding any locks. As a consequence it removes the need to pre-populate dma_ops page-tables. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 16 +--- 1 file changed, 13
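Installing a page-table level with a compare-and-exchange means that concurrent mappers need no lock: whoever loses the race discards its freshly allocated page and uses the winner's. Below is a minimal user-space sketch with C11 atomics in place of the kernel's cmpxchg64; `struct table`, `ENTRIES` and `get_or_install` are hypothetical and simplified to a single level.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define ENTRIES 512

/* Hypothetical table: each slot is 0 or a pointer to a child table. */
struct table {
    _Atomic uintptr_t slot[ENTRIES];
};

/* Return the child at idx, allocating and installing it lock-free
 * if the slot is still empty. Returns NULL on allocation failure. */
static struct table *get_or_install(struct table *parent, unsigned idx)
{
    assert(idx < ENTRIES);

    uintptr_t cur = atomic_load(&parent->slot[idx]);
    if (cur)
        return (struct table *)cur;

    struct table *fresh = calloc(1, sizeof(*fresh));
    if (!fresh)
        return NULL;

    uintptr_t expected = 0;
    if (atomic_compare_exchange_strong(&parent->slot[idx], &expected,
                                       (uintptr_t)fresh))
        return fresh;

    /* Lost the race: another thread installed a table first. */
    free(fresh);
    return (struct table *)expected;
}
```

Because a slot only ever changes from empty to populated, a second call for the same index returns the table installed by the first.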

[PATCH 19/23] iommu/amd: Initialize new aperture range before making it visible

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel Make sure the aperture range is fully initialized before it is visible to the address allocator. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 33 - 1 file changed, 20 insertions(+), 13
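Publishing a structure only after it is fully initialized is commonly done with a release store paired with an acquire load. The sketch below illustrates that ordering in user space with C11 atomics; `struct range`, `ranges`, `add_range` and `get_range` are hypothetical names, and the patch's actual mechanism is not visible in the excerpt.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_RANGES 8

/* Hypothetical aperture descriptor. */
struct range {
    unsigned long offset;
    unsigned long *bitmap;
};

/* Slots read by the allocator; NULL means "not yet visible". */
static struct range *_Atomic ranges[MAX_RANGES];

/* Fully initialize a range, then publish it with release semantics. */
static int add_range(unsigned idx, unsigned long offset, size_t words)
{
    assert(idx < MAX_RANGES);

    struct range *r = malloc(sizeof(*r));
    if (!r)
        return -1;

    r->bitmap = calloc(words, sizeof(*r->bitmap));
    if (!r->bitmap) {
        free(r);
        return -1;
    }
    r->offset = offset;

    /* All writes above happen-before any acquire load that sees r. */
    atomic_store_explicit(&ranges[idx], r, memory_order_release);
    return 0;
}

/* Reader side: sees either NULL or a completely initialized range. */
static struct range *get_range(unsigned idx)
{
    assert(idx < MAX_RANGES);
    return atomic_load_explicit(&ranges[idx], memory_order_acquire);
}
```

A reader that obtains a non-NULL pointer from `get_range()` can rely on `offset` and `bitmap` being set.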

[PATCH 16/23] iommu/amd: Optimize dma_ops_free_addresses

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel Don't flush the iommu tlb when we free something behind the current next_bit pointer. Update the next_bit pointer instead and let the flush happen on the next wraparound in the allocation path. Signed-off-by: Joerg Roedel ---

[PATCH 20/23] iommu/amd: Relax locking in dma_ops path

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel Remove the long holding times of the domain->lock and rely on the bitmap_lock instead. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 70 --- 1 file changed, 11 insertions(+), 59

[PATCH 12/23] iommu/amd: Rename dma_ops_domain->next_address to next_index

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel It points to the next aperture index to allocate from. We don't need the full address anymore because this is now tracked in struct aperture_range. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 26 +- 1

[PATCH 05/23] iommu/amd: Flush the IOMMU TLB before the addresses are freed

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel This allows the bitmap_lock to be held only for a very short period of time. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/amd_iommu.c

[PATCH 13/23] iommu/amd: Flush iommu tlb in dma_ops_free_addresses

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel Instead of setting need_flush, do the flush directly in dma_ops_free_addresses. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/amd_iommu.c

[PATCH 17/23] iommu/amd: Allocate new aperture ranges in dma_ops_alloc_addresses

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel It really belongs there and not in __map_single. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 29 ++--- 1 file changed, 10 insertions(+), 19 deletions(-) diff --git a/drivers/iommu/amd_iommu.c

[PATCH 23/23] iommu/amd: Preallocate dma_ops apertures based on dma_mask

2015-12-22 Thread Joerg Roedel
From: Joerg Roedel Preallocate between 4 and 8 apertures when a device gets its dma_mask. With more apertures we reduce the lock contention of the domain lock significantly. Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 60

Re: [GIT PULL] iommu/arm-smmu: Updates for 4.5

2015-12-22 Thread Joerg Roedel
On Mon, Dec 21, 2015 at 06:23:50PM +, Will Deacon wrote: > git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git > for-joerg/arm-smmu/updates Pulled, thanks Will.