Re: [PATCH v2 0/4] Optimise 64-bit IOVA allocations

2017-08-08 Thread Leizhen (ThunderTown)


On 2017/8/9 11:24, Ganapatrao Kulkarni wrote:
> On Wed, Aug 9, 2017 at 7:12 AM, Leizhen (ThunderTown)
>  wrote:
>>
>>
>> On 2017/8/8 20:03, Ganapatrao Kulkarni wrote:
>>> On Wed, Jul 26, 2017 at 4:47 PM, Leizhen (ThunderTown)
>>>  wrote:


 On 2017/7/26 19:08, Joerg Roedel wrote:
> Hi Robin.
>
> On Fri, Jul 21, 2017 at 12:41:57PM +0100, Robin Murphy wrote:
>> Hi all,
>>
>> In the wake of the ARM SMMU optimisation efforts, it seems that certain
>> workloads (e.g. storage I/O with large scatterlists) probably remain 
>> quite
>> heavily influenced by IOVA allocation performance. Separately, Ard also
>> reported massive performance drops for a graphical desktop on AMD Seattle
>> when enabling SMMUs via IORT, which we traced to dma_32bit_pfn in the DMA
>> ops domain getting initialised differently for ACPI vs. DT, and exposing
>> the overhead of the rbtree slow path. Whilst we could go around trying to
>> close up all the little gaps that lead to hitting the slowest case, it
>> seems a much better idea to simply make said slowest case a lot less 
>> slow.
>
> Do you have some numbers here? How big was the impact before these
> patches and how is it with the patches?
 Here are some numbers:

 (before)$ iperf -s
 
 Server listening on TCP port 5001
 TCP window size: 85.3 KByte (default)
 
 [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35898
 [ ID] Interval   Transfer Bandwidth
 [  4]  0.0-10.2 sec  7.88 MBytes  6.48 Mbits/sec
 [  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35900
 [  5]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec
 [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35902
 [  4]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec

 (after)$ iperf -s
 
 Server listening on TCP port 5001
 TCP window size: 85.3 KByte (default)
 
 [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36330
 [ ID] Interval   Transfer Bandwidth
 [  4]  0.0-10.0 sec  1.09 GBytes   933 Mbits/sec
 [  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36332
 [  5]  0.0-10.0 sec  1.10 GBytes   939 Mbits/sec
 [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36334
 [  4]  0.0-10.0 sec  1.10 GBytes   938 Mbits/sec

>>>
>>> Is this testing done on Host or on Guest/VM?
>> Host
> 
> As per your log, iperf throughput is improved to 938 Mbits/sec
> from  6.43 Mbits/sec.
> IMO, this seems to be unrealistic, some thing wrong with the testing?
For 64bits non-pci devices, the iova allocation is always searched from the 
last rb-tree node.
When many iovas allocated and keep a long time, the search process should check 
many rb nodes
then find a suitable free space. As my tracking, the average times exceeds 10K.
[free-space][free][used][...][used]
  ^ ^  ^
  | |  |-rb_last
  | |- maybe more than 10K allocated iova nodes
  |--- for 32bits devices, cached32_node remember the 
lastest freed node, which can help us reduce check times

This patch series add a new member "cached_node" to service for 64bits devices, 
like cached32_node service for 32bits devices.

> 
>>
>>>
>
>
>   Joerg
>
>
> .
>

 --
 Thanks!
 BestRegards


 ___
 linux-arm-kernel mailing list
 linux-arm-ker...@lists.infradead.org
 http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>
>>> thanks
>>> Ganapat
>>>
>>> .
>>>
>>
>> --
>> Thanks!
>> BestRegards
>>
> 
> thanks
> Ganapat
> 
> .
> 

-- 
Thanks!
BestRegards

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/4] Optimise 64-bit IOVA allocations

2017-08-08 Thread Ganapatrao Kulkarni
On Wed, Aug 9, 2017 at 7:12 AM, Leizhen (ThunderTown)
 wrote:
>
>
> On 2017/8/8 20:03, Ganapatrao Kulkarni wrote:
>> On Wed, Jul 26, 2017 at 4:47 PM, Leizhen (ThunderTown)
>>  wrote:
>>>
>>>
>>> On 2017/7/26 19:08, Joerg Roedel wrote:
 Hi Robin.

 On Fri, Jul 21, 2017 at 12:41:57PM +0100, Robin Murphy wrote:
> Hi all,
>
> In the wake of the ARM SMMU optimisation efforts, it seems that certain
> workloads (e.g. storage I/O with large scatterlists) probably remain quite
> heavily influenced by IOVA allocation performance. Separately, Ard also
> reported massive performance drops for a graphical desktop on AMD Seattle
> when enabling SMMUs via IORT, which we traced to dma_32bit_pfn in the DMA
> ops domain getting initialised differently for ACPI vs. DT, and exposing
> the overhead of the rbtree slow path. Whilst we could go around trying to
> close up all the little gaps that lead to hitting the slowest case, it
> seems a much better idea to simply make said slowest case a lot less slow.

 Do you have some numbers here? How big was the impact before these
 patches and how is it with the patches?
>>> Here are some numbers:
>>>
>>> (before)$ iperf -s
>>> 
>>> Server listening on TCP port 5001
>>> TCP window size: 85.3 KByte (default)
>>> 
>>> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35898
>>> [ ID] Interval   Transfer Bandwidth
>>> [  4]  0.0-10.2 sec  7.88 MBytes  6.48 Mbits/sec
>>> [  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35900
>>> [  5]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec
>>> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35902
>>> [  4]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec
>>>
>>> (after)$ iperf -s
>>> 
>>> Server listening on TCP port 5001
>>> TCP window size: 85.3 KByte (default)
>>> 
>>> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36330
>>> [ ID] Interval   Transfer Bandwidth
>>> [  4]  0.0-10.0 sec  1.09 GBytes   933 Mbits/sec
>>> [  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36332
>>> [  5]  0.0-10.0 sec  1.10 GBytes   939 Mbits/sec
>>> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36334
>>> [  4]  0.0-10.0 sec  1.10 GBytes   938 Mbits/sec
>>>
>>
>> Is this testing done on Host or on Guest/VM?
> Host

As per your log, iperf throughput is improved to 938 Mbits/sec
from  6.43 Mbits/sec.
IMO, this seems to be unrealistic, some thing wrong with the testing?

>
>>


   Joerg


 .

>>>
>>> --
>>> Thanks!
>>> BestRegards
>>>
>>>
>>> ___
>>> linux-arm-kernel mailing list
>>> linux-arm-ker...@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>
>> thanks
>> Ganapat
>>
>> .
>>
>
> --
> Thanks!
> BestRegards
>

thanks
Ganapat
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/4] Optimise 64-bit IOVA allocations

2017-08-08 Thread Leizhen (ThunderTown)


On 2017/8/8 20:03, Ganapatrao Kulkarni wrote:
> On Wed, Jul 26, 2017 at 4:47 PM, Leizhen (ThunderTown)
>  wrote:
>>
>>
>> On 2017/7/26 19:08, Joerg Roedel wrote:
>>> Hi Robin.
>>>
>>> On Fri, Jul 21, 2017 at 12:41:57PM +0100, Robin Murphy wrote:
 Hi all,

 In the wake of the ARM SMMU optimisation efforts, it seems that certain
 workloads (e.g. storage I/O with large scatterlists) probably remain quite
 heavily influenced by IOVA allocation performance. Separately, Ard also
 reported massive performance drops for a graphical desktop on AMD Seattle
 when enabling SMMUs via IORT, which we traced to dma_32bit_pfn in the DMA
 ops domain getting initialised differently for ACPI vs. DT, and exposing
 the overhead of the rbtree slow path. Whilst we could go around trying to
 close up all the little gaps that lead to hitting the slowest case, it
 seems a much better idea to simply make said slowest case a lot less slow.
>>>
>>> Do you have some numbers here? How big was the impact before these
>>> patches and how is it with the patches?
>> Here are some numbers:
>>
>> (before)$ iperf -s
>> 
>> Server listening on TCP port 5001
>> TCP window size: 85.3 KByte (default)
>> 
>> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35898
>> [ ID] Interval   Transfer Bandwidth
>> [  4]  0.0-10.2 sec  7.88 MBytes  6.48 Mbits/sec
>> [  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35900
>> [  5]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec
>> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35902
>> [  4]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec
>>
>> (after)$ iperf -s
>> 
>> Server listening on TCP port 5001
>> TCP window size: 85.3 KByte (default)
>> 
>> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36330
>> [ ID] Interval   Transfer Bandwidth
>> [  4]  0.0-10.0 sec  1.09 GBytes   933 Mbits/sec
>> [  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36332
>> [  5]  0.0-10.0 sec  1.10 GBytes   939 Mbits/sec
>> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36334
>> [  4]  0.0-10.0 sec  1.10 GBytes   938 Mbits/sec
>>
> 
> Is this testing done on Host or on Guest/VM?
Host

> 
>>>
>>>
>>>   Joerg
>>>
>>>
>>> .
>>>
>>
>> --
>> Thanks!
>> BestRegards
>>
>>
>> ___
>> linux-arm-kernel mailing list
>> linux-arm-ker...@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 
> thanks
> Ganapat
> 
> .
> 

-- 
Thanks!
BestRegards

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 4/4] iommu/vt-d: Hooks to invalidate iotlb/devtlb when using supervisor PASID's.

2017-08-08 Thread Ashok Raj
When a kernel client uses intel_svm_bind_mm() and requests a supervisor
PASID, IOMMU needs to track changes to these addresses. Otherwise the device
tlb will be stale compared to what's on the cpu for kernel mappings. This
is similar to what's done for user space registrations via
mmu_notifier_register() api's.

To: linux-ker...@vger.kernel.org
To: Joerg Roedel 
Cc: Ashok Raj 
Cc: Dave Hansen 
Cc: Huang Ying 
Cc: CQ Tang 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: H. Peter Anvin 
Cc: Andy Lutomirski 
Cc: Rik van Riel 
Cc: Kees Cook 
Cc: Andrew Morton 
Cc: Michal Hocko 
Cc: "Paul E. McKenney" 
Cc: Vegard Nossum 
Cc: x...@kernel.org
Cc: linux...@kvack.org
Cc: iommu@lists.linux-foundation.org
Cc: David Woodhouse 
CC: Jean-Phillipe Brucker 

Signed-off-by: Ashok Raj 
---
 drivers/iommu/intel-svm.c   | 29 +++--
 include/linux/intel-iommu.h |  5 -
 2 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 0c9f077..1758814 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -292,6 +292,26 @@ static const struct mmu_notifier_ops intel_mmuops = {
 
 static DEFINE_MUTEX(pasid_mutex);
 
+static int intel_init_mm_inval_range(struct notifier_block *nb,
+   unsigned long action, void *data)
+{
+   struct kernel_mmu_address_range *range;
+   struct intel_svm *svm = container_of(nb, struct intel_svm, init_mm_nb);
+   unsigned long start, end;
+   struct intel_iommu *iommu;
+
+   if (action == KERNEL_MMU_INVALIDATE_RANGE) {
+   range = data;
+   start = range->start;
+   end = range->end;
+   iommu = svm->iommu;
+
+   intel_flush_svm_range(svm, start,
+   (end - start + PAGE_SIZE - 1) >> VTD_PAGE_SHIFT, 0, 0);
+   }
+   return 0;
+}
+
 int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct 
svm_dev_ops *ops)
 {
struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
@@ -391,12 +411,12 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int 
flags, struct svm_dev_
goto out;
}
svm->pasid = ret;
-   svm->notifier.ops = _mmuops;
svm->mm = mm;
svm->flags = flags;
INIT_LIST_HEAD_RCU(>devs);
ret = -ENOMEM;
if (mm) {
+   svm->notifier.ops = _mmuops;
ret = mmu_notifier_register(>notifier, mm);
if (ret) {
idr_remove(>iommu->pasid_idr, svm->pasid);
@@ -405,8 +425,11 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int 
flags, struct svm_dev_
goto out;
}
iommu->pasid_table[svm->pasid].val = (u64)__pa(mm->pgd) 
| 1;
-   } else
+   } else {
+   svm->init_mm_nb.notifier_call = 
intel_init_mm_inval_range;
+   kernel_mmu_notifier_register(>init_mm_nb);
iommu->pasid_table[svm->pasid].val = 
(u64)__pa(init_mm.pgd) | 1 | (1ULL << 11);
+   }
wmb();
/* In caching mode, we still have to flush with PASID 0 when
 * a PASID table entry becomes present. Not entirely clear
@@ -471,6 +494,8 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
idr_remove(>iommu->pasid_idr, 
svm->pasid);
if (svm->mm)

mmu_notifier_unregister(>notifier, svm->mm);
+   else
+   
kernel_mmu_notifier_unregister(>init_mm_nb);
 
/* We mandate that no page faults may 
be outstanding
 * for the PASID when 
intel_svm_unbind_mm() is called.
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 485a5b4..d6019b4 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -477,7 +477,10 @@ struct intel_svm_dev {
 };
 
 struct intel_svm {
-   struct mmu_notifier notifier;
+   union {
+   struct mmu_notifier notifier;
+   struct notifier_block init_mm_nb;
+   };
struct mm_struct *mm;
struct intel_iommu *iommu;
int flags;
-- 
2.7.4

___
iommu 

[PATCH 3/4] mm: Add kernel MMU notifier to manage remote TLB

2017-08-08 Thread Ashok Raj
From: Huang Ying 

Shared Virtual Memory (SVM) devices have TLBs that cache entries from
the CPU's page tables.  We need SVM device drivers to flush them at
the same time that we flush the CPU TLBs.  We can use the existing MMU
notifiers for userspace updates, but we lack a mechanism to get
notified when kernel page tables are updated.

To implement the MMU notification mechanism for the kernel address
space, a kernel MMU notifier chain is defined, and will be called when
the CPU TLB is flushed for the kernel address space.  The IOMMU SVM
driver can register on the notifier chain to flush the device TLBs
when necessary.

To: linux-ker...@vger.kernel.org
To: Joerg Roedel 
Cc: Ashok Raj 
Cc: Dave Hansen 
Cc: CQ Tang 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: H. Peter Anvin 
Cc: Andy Lutomirski 
Cc: Rik van Riel 
Cc: Kees Cook 
Cc: Andrew Morton 
Cc: "Kirill A. Shutemov" 
Cc: Michal Hocko 
Cc: "Paul E. McKenney" 
Cc: Vegard Nossum 
Cc: x...@kernel.org
Cc: linux...@kvack.org
Cc: iommu@lists.linux-foundation.org
Cc: David Woodhouse 
CC: Jean-Phillipe Brucker 
Signed-off-by: "Huang, Ying" 
---
 arch/x86/include/asm/tlbflush.h |  1 +
 arch/x86/mm/tlb.c   |  1 +
 include/linux/mmu_notifier.h| 33 +
 mm/mmu_notifier.c   | 25 +
 4 files changed, 60 insertions(+)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 50ea348..f5fd0b8 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 
 #include 
 #include 
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 014d07a..6dea8e9 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -314,6 +314,7 @@ void flush_tlb_kernel_range(unsigned long start, unsigned 
long end)
info.end = end;
on_each_cpu(do_kernel_range_flush, , 1);
}
+   kernel_mmu_notifier_invalidate_range(start, end);
 }
 
 void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index c91b3bc..4a96089 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -418,6 +418,25 @@ extern void mmu_notifier_call_srcu(struct rcu_head *rcu,
   void (*func)(struct rcu_head *rcu));
 extern void mmu_notifier_synchronize(void);
 
+struct kernel_mmu_address_range {
+   unsigned long start;
+   unsigned long end;
+};
+
+/*
+ * Before the virtual address range managed by kernel (vmalloc/kmap)
+ * is reused, That is, remapped to the new physical addresses, the
+ * kernel MMU notifier will be called with KERNEL_MMU_INVALIDATE_RANGE
+ * and struct kernel_mmu_address_range as parameters.  This is used to
+ * manage the remote TLB.
+ */
+#define KERNEL_MMU_INVALIDATE_RANGE1
+extern int kernel_mmu_notifier_register(struct notifier_block *nb);
+extern int kernel_mmu_notifier_unregister(struct notifier_block *nb);
+
+extern int kernel_mmu_notifier_invalidate_range(unsigned long start,
+   unsigned long end);
+
 #else /* CONFIG_MMU_NOTIFIER */
 
 static inline void mmu_notifier_release(struct mm_struct *mm)
@@ -479,6 +498,20 @@ static inline void mmu_notifier_mm_destroy(struct 
mm_struct *mm)
 #define pudp_huge_clear_flush_notify pudp_huge_clear_flush
 #define set_pte_at_notify set_pte_at
 
+static inline int kernel_mmu_notifier_register(struct notifier_block *nb)
+{
+   return 0;
+}
+
+static inline int kernel_mmu_notifier_unregister(struct notifier_block *nb)
+{
+   return 0;
+}
+
+static inline void kernel_mmu_notifier_invalidate_range(unsigned long start,
+   unsigned long end)
+{
+}
 #endif /* CONFIG_MMU_NOTIFIER */
 
 #endif /* _LINUX_MMU_NOTIFIER_H */
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 54ca545..a919038 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -400,3 +400,28 @@ void mmu_notifier_unregister_no_release(struct 
mmu_notifier *mn,
mmdrop(mm);
 }
 EXPORT_SYMBOL_GPL(mmu_notifier_unregister_no_release);
+
+static ATOMIC_NOTIFIER_HEAD(kernel_mmu_notifier_list);
+
+int kernel_mmu_notifier_register(struct notifier_block *nb)
+{
+   return atomic_notifier_chain_register(_mmu_notifier_list, nb);
+}
+
+int kernel_mmu_notifier_unregister(struct notifier_block *nb)
+{
+   return atomic_notifier_chain_unregister(_mmu_notifier_list, nb);
+}
+
+int 

[PATCH 2/4] iommu/vt-d: Avoid calling virt_to_phys() on null pointer

2017-08-08 Thread Ashok Raj
New kernels with debug show panic() from __phys_addr() checks. Avoid
calling virt_to_phys()  when pasid_state_tbl pointer is null

To: Joerg Roedel 
To: linux-ker...@vger.kernel.org>
Cc: iommu@lists.linux-foundation.org
Cc: David Woodhouse 
Cc: Jacob Pan 
Cc: Ashok Raj 

Signed-off-by: Ashok Raj 
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 687f18f..5c6118d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5341,7 +5341,8 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, 
struct intel_svm_dev *sd
sdev->sid = PCI_DEVID(info->bus, info->devfn);
 
if (!(ctx_lo & CONTEXT_PASIDE)) {
-   context[1].hi = (u64)virt_to_phys(iommu->pasid_state_table);
+   if (iommu->pasid_state_table)
+   context[1].hi = 
(u64)virt_to_phys(iommu->pasid_state_table);
context[1].lo = (u64)virt_to_phys(iommu->pasid_table) |
intel_iommu_get_pts(iommu);
 
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/4] iommu/vt-d: IOMMU Page Request needs to check if address is canonical.

2017-08-08 Thread Ashok Raj
Page Request from devices that support device-tlb would request translation
to pre-cache them in device to avoid overhead of IOMMU lookups.

IOMMU needs to check for canonicallity of the address before performing
page-fault processing.

To: Joerg Roedel 
To: linux-ker...@vger.kernel.org>
Cc: iommu@lists.linux-foundation.org
Cc: David Woodhouse 
Cc: Jacob Pan 
Cc: Ashok Raj 

Signed-off-by: Ashok Raj 
Reported-by: Sudeep Dutt 
---
 drivers/iommu/intel-svm.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index f167c0d..0c9f077 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static irqreturn_t prq_event_thread(int irq, void *d);
 
@@ -555,6 +556,14 @@ static bool access_error(struct vm_area_struct *vma, 
struct page_req_dsc *req)
return (requested & ~vma->vm_flags) != 0;
 }
 
+static bool is_canonical_address(u64 addr)
+{
+   int shift = 64 - (__VIRTUAL_MASK_SHIFT + 1);
+   long saddr = (long) addr;
+
+   return (((saddr << shift) >> shift) == saddr);
+}
+
 static irqreturn_t prq_event_thread(int irq, void *d)
 {
struct intel_iommu *iommu = d;
@@ -612,6 +621,11 @@ static irqreturn_t prq_event_thread(int irq, void *d)
/* If the mm is already defunct, don't handle faults. */
if (!mmget_not_zero(svm->mm))
goto bad_req;
+
+   /* If address is not canonical, return invalid response */
+   if (!is_canonical_address(address))
+   goto bad_req;
+
down_read(>mm->mmap_sem);
vma = find_extend_vma(svm->mm, address);
if (!vma || address < vma->vm_start)
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 0/4] Patches to support ring0 SVM and devtlb

2017-08-08 Thread Ashok Raj
Hi

Sorry for resending.. iommu list email was mistyped :-(

The first 2 patches in the series fix some simple bugs in Intel vt-d driver.
The 3rd patch Adds support for kmem notify required to support ring0 SVM.
4th patch uses the hooks to perform device tlb invalidations.

Ashok Raj (3):
  iommu/vt-d: IOMMU Page Request needs to check if address is canonical.
  iommu/vt-d: Avoid calling virt_to_phys() on null pointer
  iommu/vt-d: Hooks to invalidate iotlb/devtlb when using supervisor
PASID's.

Huang Ying (1):
  mm: Add kernel MMU notifier to manage remote TLB

 arch/x86/include/asm/tlbflush.h |  1 +
 arch/x86/mm/tlb.c   |  1 +
 drivers/iommu/intel-iommu.c |  3 ++-
 drivers/iommu/intel-svm.c   | 43 +++--
 include/linux/intel-iommu.h |  5 -
 include/linux/mmu_notifier.h| 33 +++
 mm/mmu_notifier.c   | 25 
 7 files changed, 107 insertions(+), 4 deletions(-)

-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 2/2] iommu/arm-smmu: Add system PM support

2017-08-08 Thread Robin Murphy
With all our hardware state tracked in such a way that we can naturally
restore it as part of the necessary reset, resuming is trivial, and
there's nothing to do on suspend at all.

Signed-off-by: Robin Murphy 
---

v2: Whitespace fixes

 drivers/iommu/arm-smmu.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 8f2f345ab7b1..7f7eccab120a 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -2359,10 +2359,21 @@ static int arm_smmu_device_remove(struct 
platform_device *pdev)
return 0;
 }
 
+static int __maybe_unused arm_smmu_pm_resume(struct device *dev)
+{
+   struct arm_smmu_device *smmu = dev_get_drvdata(dev);
+
+   arm_smmu_device_reset(smmu);
+   return 0;
+}
+
+static SIMPLE_DEV_PM_OPS(arm_smmu_pm_ops, NULL, arm_smmu_pm_resume);
+
 static struct platform_driver arm_smmu_driver = {
.driver = {
.name   = "arm-smmu",
.of_match_table = of_match_ptr(arm_smmu_of_match),
+   .pm = _smmu_pm_ops,
},
.probe  = arm_smmu_device_probe,
.remove = arm_smmu_device_remove,
-- 
2.13.4.dirty

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 1/2] iommu/arm-smmu: Track context bank state

2017-08-08 Thread Robin Murphy
Echoing what we do for Stream Map Entries, maintain a software shadow
state for context bank configuration. With this in place, we are mere
moments away from blissfully easy suspend/resume support.

Reviewed-by: Sricharan R 
Signed-off-by: Robin Murphy 
---

v2:
 - Don't forget ASIDs for stage 1 LPAE
 - Avoid touching TTBR1 for stage 2 (even if it is techinically OK)
 - Handle disabling (i.e. cfg == NULL) much more cleanly

 drivers/iommu/arm-smmu.c | 156 +--
 1 file changed, 97 insertions(+), 59 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index bc89b4d6c043..8f2f345ab7b1 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -338,6 +338,13 @@ struct arm_smmu_smr {
boolvalid;
 };
 
+struct arm_smmu_cb {
+   u64 ttbr[2];
+   u32 tcr[2];
+   u32 mair[2];
+   struct arm_smmu_cfg *cfg;
+};
+
 struct arm_smmu_master_cfg {
struct arm_smmu_device  *smmu;
s16 smendx[];
@@ -380,6 +387,7 @@ struct arm_smmu_device {
u32 num_context_banks;
u32 num_s2_context_banks;
DECLARE_BITMAP(context_map, ARM_SMMU_MAX_CBS);
+   struct arm_smmu_cb  *cbs;
atomic_tirptndx;
 
u32 num_mapping_groups;
@@ -768,17 +776,74 @@ static irqreturn_t arm_smmu_global_fault(int irq, void 
*dev)
 static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
   struct io_pgtable_cfg *pgtbl_cfg)
 {
-   u32 reg, reg2;
-   u64 reg64;
-   bool stage1;
struct arm_smmu_cfg *cfg = _domain->cfg;
-   struct arm_smmu_device *smmu = smmu_domain->smmu;
+   struct arm_smmu_cb *cb = _domain->smmu->cbs[cfg->cbndx];
+   bool stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS;
+
+   cb->cfg = cfg;
+
+   /* TTBCR */
+   if (stage1) {
+   if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_S) {
+   cb->tcr[0] = pgtbl_cfg->arm_v7s_cfg.tcr;
+   } else {
+   cb->tcr[0] = pgtbl_cfg->arm_lpae_s1_cfg.tcr;
+   cb->tcr[1] = pgtbl_cfg->arm_lpae_s1_cfg.tcr >> 32;
+   cb->tcr[1] |= TTBCR2_SEP_UPSTREAM;
+   if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64)
+   cb->tcr[1] |= TTBCR2_AS;
+   }
+   } else {
+   cb->tcr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
+   }
+
+   /* TTBRs */
+   if (stage1) {
+   if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_S) {
+   cb->ttbr[0] = pgtbl_cfg->arm_v7s_cfg.ttbr[0];
+   cb->ttbr[1] = pgtbl_cfg->arm_v7s_cfg.ttbr[1];
+   } else {
+   cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
+   cb->ttbr[0] |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
+   cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[1];
+   cb->ttbr[1] |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
+   }
+   } else {
+   cb->ttbr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
+   }
+
+   /* MAIRs (stage-1 only) */
+   if (stage1) {
+   if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_S) {
+   cb->mair[0] = pgtbl_cfg->arm_v7s_cfg.prrr;
+   cb->mair[1] = pgtbl_cfg->arm_v7s_cfg.nmrr;
+   } else {
+   cb->mair[0] = pgtbl_cfg->arm_lpae_s1_cfg.mair[0];
+   cb->mair[1] = pgtbl_cfg->arm_lpae_s1_cfg.mair[1];
+   }
+   }
+}
+
+static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
+{
+   u32 reg;
+   bool stage1;
+   struct arm_smmu_cb *cb = >cbs[idx];
+   struct arm_smmu_cfg *cfg = cb->cfg;
void __iomem *cb_base, *gr1_base;
 
+   cb_base = ARM_SMMU_CB(smmu, idx);
+
+   /* Unassigned context banks only need disabling */
+   if (!cfg) {
+   writel_relaxed(0, cb_base + ARM_SMMU_CB_SCTLR);
+   return;
+   }
+
gr1_base = ARM_SMMU_GR1(smmu);
stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS;
-   cb_base = ARM_SMMU_CB(smmu, cfg->cbndx);
 
+   /* CBA2R */
if (smmu->version > ARM_SMMU_V1) {
if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64)
reg = CBA2R_RW64_64BIT;
@@ -788,7 +853,7 @@ static void arm_smmu_init_context_bank(struct 
arm_smmu_domain *smmu_domain,
if (smmu->features & ARM_SMMU_FEAT_VMID16)
reg |= cfg->vmid << CBA2R_VMID_SHIFT;
 
-   writel_relaxed(reg, gr1_base + 

Re: [PATCH v2 0/4] Optimise 64-bit IOVA allocations

2017-08-08 Thread Ganapatrao Kulkarni
On Wed, Jul 26, 2017 at 4:47 PM, Leizhen (ThunderTown)
 wrote:
>
>
> On 2017/7/26 19:08, Joerg Roedel wrote:
>> Hi Robin.
>>
>> On Fri, Jul 21, 2017 at 12:41:57PM +0100, Robin Murphy wrote:
>>> Hi all,
>>>
>>> In the wake of the ARM SMMU optimisation efforts, it seems that certain
>>> workloads (e.g. storage I/O with large scatterlists) probably remain quite
>>> heavily influenced by IOVA allocation performance. Separately, Ard also
>>> reported massive performance drops for a graphical desktop on AMD Seattle
>>> when enabling SMMUs via IORT, which we traced to dma_32bit_pfn in the DMA
>>> ops domain getting initialised differently for ACPI vs. DT, and exposing
>>> the overhead of the rbtree slow path. Whilst we could go around trying to
>>> close up all the little gaps that lead to hitting the slowest case, it
>>> seems a much better idea to simply make said slowest case a lot less slow.
>>
>> Do you have some numbers here? How big was the impact before these
>> patches and how is it with the patches?
> Here are some numbers:
>
> (before)$ iperf -s
> 
> Server listening on TCP port 5001
> TCP window size: 85.3 KByte (default)
> 
> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35898
> [ ID] Interval   Transfer Bandwidth
> [  4]  0.0-10.2 sec  7.88 MBytes  6.48 Mbits/sec
> [  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35900
> [  5]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec
> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35902
> [  4]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec
>
> (after)$ iperf -s
> 
> Server listening on TCP port 5001
> TCP window size: 85.3 KByte (default)
> 
> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36330
> [ ID] Interval   Transfer Bandwidth
> [  4]  0.0-10.0 sec  1.09 GBytes   933 Mbits/sec
> [  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36332
> [  5]  0.0-10.0 sec  1.10 GBytes   939 Mbits/sec
> [  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36334
> [  4]  0.0-10.0 sec  1.10 GBytes   938 Mbits/sec
>

Is this testing done on Host or on Guest/VM?

>>
>>
>>   Joerg
>>
>>
>> .
>>
>
> --
> Thanks!
> BestRegards
>
>
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

thanks
Ganapat
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/2] iommu/arm-smmu: Add system PM support

2017-08-08 Thread Robin Murphy
On 08/08/17 12:18, Will Deacon wrote:
> On Tue, Jul 18, 2017 at 01:44:42PM +0100, Robin Murphy wrote:
>> With all our hardware state tracked in such a way that we can naturally
>> restore it as part of the necessary reset, resuming is trivial, and
>> there's nothing to do on suspend at all.
>>
>> Signed-off-by: Robin Murphy 
>> ---
>>  drivers/iommu/arm-smmu.c | 12 
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index 86897b7b81d8..0f5f06e9abfa 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -2356,10 +2356,22 @@ static int arm_smmu_device_remove(struct 
>> platform_device *pdev)
>>  return 0;
>>  }
>>  
>> +static int __maybe_unused arm_smmu_pm_resume(struct device *dev)
>> +{
> 
> Did you actually get a warning here without the __maybe_unused annotation?
> It looks like some other drivers just guard the thing with CONFIG_PM_SLEEP.

I'm under the impression that the annotation is preferred over #ifdefs
for new code (for the sake of coverage, I guess).

>> +struct arm_smmu_device *smmu = dev_get_drvdata(dev);
>> +
>> +arm_smmu_device_reset(smmu);
>> +return 0;
>> +}
>> +
>> +
>> +static SIMPLE_DEV_PM_OPS(arm_smmu_pm_ops, NULL, arm_smmu_pm_resume);
>> +
>>  static struct platform_driver arm_smmu_driver = {
>>  .driver = {
>>  .name   = "arm-smmu",
>>  .of_match_table = of_match_ptr(arm_smmu_of_match),
>> +.pm = _smmu_pm_ops,
> 
> Cosmetic: can you tab-align this assignment please?

Oops, I missed that - will do.

Robin.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/arm-smmu: Track context bank state

2017-08-08 Thread Robin Murphy
On 08/08/17 12:11, Will Deacon wrote:
> Hi Robin,
> 
> I like the idea, but I think there are a few minor issues with the patch.
> Comments below.
> 
> On Tue, Jul 18, 2017 at 01:44:41PM +0100, Robin Murphy wrote:
>> Echoing what we do for Stream Map Entries, maintain a software shadow
>> state for context bank configuration. With this in place, we are mere
>> moments away from blissfully easy suspend/resume support.
>>
>> Signed-off-by: Robin Murphy 
>> ---
>>
>> Since the runtime PM discussion has come back again, I figured it was
>> probably about time to finish off my plan for system PM. Lightly tested
>> on Juno (MMU-401) with hibernation.
>>
>> Robin.
>>
>>  drivers/iommu/arm-smmu.c | 159 
>> +--
>>  1 file changed, 97 insertions(+), 62 deletions(-)
> 
> [...]
> 
>> +static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int 
>> idx)
>> +{
>> +u32 reg;
>> +bool stage1;
>> +struct arm_smmu_cb *cb = >cbs[idx];
>> +struct arm_smmu_cfg *cfg = cb->cfg;
>> +struct arm_smmu_cfg default_cfg = {0};
>>  void __iomem *cb_base, *gr1_base;
>>  
>> +if (!cfg)
>> +cfg = _cfg;
>> +
>>  gr1_base = ARM_SMMU_GR1(smmu);
>>  stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS;
>> -cb_base = ARM_SMMU_CB(smmu, cfg->cbndx);
>> +cb_base = ARM_SMMU_CB(smmu, idx);
>>  
>> +/* CBA2R */
>>  if (smmu->version > ARM_SMMU_V1) {
>>  if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64)
>>  reg = CBA2R_RW64_64BIT;
>> @@ -788,7 +848,7 @@ static void arm_smmu_init_context_bank(struct 
>> arm_smmu_domain *smmu_domain,
>>  if (smmu->features & ARM_SMMU_FEAT_VMID16)
>>  reg |= cfg->vmid << CBA2R_VMID_SHIFT;
>>  
>> -writel_relaxed(reg, gr1_base + ARM_SMMU_GR1_CBA2R(cfg->cbndx));
>> +writel_relaxed(reg, gr1_base + ARM_SMMU_GR1_CBA2R(idx));
>>  }
>>  
>>  /* CBAR */
>> @@ -807,72 +867,43 @@ static void arm_smmu_init_context_bank(struct 
>> arm_smmu_domain *smmu_domain,
>>  /* 8-bit VMIDs live in CBAR */
>>  reg |= cfg->vmid << CBAR_VMID_SHIFT;
>>  }
>> -writel_relaxed(reg, gr1_base + ARM_SMMU_GR1_CBAR(cfg->cbndx));
>> +writel_relaxed(reg, gr1_base + ARM_SMMU_GR1_CBAR(idx));
>>  
>>  /*
>>   * TTBCR
>>   * We must write this before the TTBRs, since it determines the
>>   * access behaviour of some fields (in particular, ASID[15:8]).
>>   */
>> -if (stage1) {
>> -if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_S) {
>> -reg = pgtbl_cfg->arm_v7s_cfg.tcr;
>> -reg2 = 0;
>> -} else {
>> -reg = pgtbl_cfg->arm_lpae_s1_cfg.tcr;
>> -reg2 = pgtbl_cfg->arm_lpae_s1_cfg.tcr >> 32;
>> -reg2 |= TTBCR2_SEP_UPSTREAM;
>> -if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64)
>> -reg2 |= TTBCR2_AS;
>> -}
>> -if (smmu->version > ARM_SMMU_V1)
>> -writel_relaxed(reg2, cb_base + ARM_SMMU_CB_TTBCR2);
>> -} else {
>> -reg = pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
>> -}
>> -writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBCR);
>> +if (stage1 && smmu->version > ARM_SMMU_V1)
>> +writel_relaxed(cb->tcr[1], cb_base + ARM_SMMU_CB_TTBCR2);
>> +writel_relaxed(cb->tcr[0], cb_base + ARM_SMMU_CB_TTBCR);
>>  
>>  /* TTBRs */
>> -if (stage1) {
>> -if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_S) {
>> -reg = pgtbl_cfg->arm_v7s_cfg.ttbr[0];
>> -writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBR0);
>> -reg = pgtbl_cfg->arm_v7s_cfg.ttbr[1];
>> -writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBR1);
>> -writel_relaxed(cfg->asid, cb_base + 
>> ARM_SMMU_CB_CONTEXTIDR);
>> -} else {
>> -reg64 = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
>> -reg64 |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
>> -writeq_relaxed(reg64, cb_base + ARM_SMMU_CB_TTBR0);
>> -reg64 = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[1];
>> -reg64 |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
>> -writeq_relaxed(reg64, cb_base + ARM_SMMU_CB_TTBR1);
>> -}
>> +if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_S) {
>> +writel_relaxed(cfg->asid, cb_base + ARM_SMMU_CB_CONTEXTIDR);
>> +writel_relaxed(cb->ttbr[0], cb_base + ARM_SMMU_CB_TTBR0);
>> +writel_relaxed(cb->ttbr[1], cb_base + ARM_SMMU_CB_TTBR1);
>>  } else {
>> -reg64 = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
>> -writeq_relaxed(reg64, cb_base + ARM_SMMU_CB_TTBR0);
>> +writeq_relaxed(cb->ttbr[0], cb_base + ARM_SMMU_CB_TTBR0);
>> +writeq_relaxed(cb->ttbr[1], cb_base + 

Re: [PATCH v2] iommu/arm-smmu: fix null-pointer dereference in arm_smmu_add_device

2017-08-08 Thread Will Deacon
[+ Joerg]

On Tue, Aug 08, 2017 at 11:37:40AM +0100, Robin Murphy wrote:
> On 08/08/17 11:26, Artem Savkov wrote:
> > Commit c54451a "iommu/arm-smmu: Fix the error path in arm_smmu_add_device"
> > removed fwspec assignment in legacy_binding path as redundant which is
> > wrong. It needs to be updated after fwspec initialisation in
> > arm_smmu_register_legacy_master() as it is dereferenced later. Without
> > this there is a NULL-pointer dereference panic during boot on some hosts.
> 
> Reviewed-by: Robin Murphy 
> 
> Thanks for fixing it up, and sorry for failing to document the
> unfortunately subtle logic in the first place!

Well, I was the one that messed it up:

Acked-by: Will Deacon 

Joerg, can you pick this up as a fix for 4.13, please?

Will

> > Signed-off-by: Artem Savkov 
> > ---
> >  drivers/iommu/arm-smmu.c | 7 +++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index b97188a..2d80fa8 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -1519,6 +1519,13 @@ static int arm_smmu_add_device(struct device *dev)
> >  
> > if (using_legacy_binding) {
> > ret = arm_smmu_register_legacy_master(dev, );
> > +
> > +   /*
> > +* If dev->iommu_fwspec is initally NULL, 
> > arm_smmu_register_legacy_master()
> > +* will allocate/initialise a new one. Thus we need to update 
> > fwspec for
> > +* later use.
> > +*/
> > +   fwspec = dev->iommu_fwspec;
> > if (ret)
> > goto out_free;
> > } else if (fwspec && fwspec->ops == _smmu_ops) {
> > 
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/2] iommu/arm-smmu: Add system PM support

2017-08-08 Thread Will Deacon
On Tue, Jul 18, 2017 at 01:44:42PM +0100, Robin Murphy wrote:
> With all our hardware state tracked in such a way that we can naturally
> restore it as part of the necessary reset, resuming is trivial, and
> there's nothing to do on suspend at all.
> 
> Signed-off-by: Robin Murphy 
> ---
>  drivers/iommu/arm-smmu.c | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 86897b7b81d8..0f5f06e9abfa 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -2356,10 +2356,22 @@ static int arm_smmu_device_remove(struct 
> platform_device *pdev)
>   return 0;
>  }
>  
> +static int __maybe_unused arm_smmu_pm_resume(struct device *dev)
> +{

Did you actually get a warning here without the __maybe_unused annotation?
It looks like some other drivers just guard the thing with CONFIG_PM_SLEEP.

> + struct arm_smmu_device *smmu = dev_get_drvdata(dev);
> +
> + arm_smmu_device_reset(smmu);
> + return 0;
> +}
> +
> +
> +static SIMPLE_DEV_PM_OPS(arm_smmu_pm_ops, NULL, arm_smmu_pm_resume);
> +
>  static struct platform_driver arm_smmu_driver = {
>   .driver = {
>   .name   = "arm-smmu",
>   .of_match_table = of_match_ptr(arm_smmu_of_match),
> + .pm = _smmu_pm_ops,

Cosmetic: can you tab-align this assignment please?

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/arm-smmu: Track context bank state

2017-08-08 Thread Will Deacon
Hi Robin,

I like the idea, but I think there are a few minor issues with the patch.
Comments below.

On Tue, Jul 18, 2017 at 01:44:41PM +0100, Robin Murphy wrote:
> Echoing what we do for Stream Map Entries, maintain a software shadow
> state for context bank configuration. With this in place, we are mere
> moments away from blissfully easy suspend/resume support.
> 
> Signed-off-by: Robin Murphy 
> ---
> 
> Since the runtime PM discussion has come back again, I figured it was
> probably about time to finish off my plan for system PM. Lightly tested
> on Juno (MMU-401) with hibernation.
> 
> Robin.
> 
>  drivers/iommu/arm-smmu.c | 159 
> +--
>  1 file changed, 97 insertions(+), 62 deletions(-)

[...]

> +static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int 
> idx)
> +{
> + u32 reg;
> + bool stage1;
> + struct arm_smmu_cb *cb = >cbs[idx];
> + struct arm_smmu_cfg *cfg = cb->cfg;
> + struct arm_smmu_cfg default_cfg = {0};
>   void __iomem *cb_base, *gr1_base;
>  
> + if (!cfg)
> + cfg = _cfg;
> +
>   gr1_base = ARM_SMMU_GR1(smmu);
>   stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS;
> - cb_base = ARM_SMMU_CB(smmu, cfg->cbndx);
> + cb_base = ARM_SMMU_CB(smmu, idx);
>  
> + /* CBA2R */
>   if (smmu->version > ARM_SMMU_V1) {
>   if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64)
>   reg = CBA2R_RW64_64BIT;
> @@ -788,7 +848,7 @@ static void arm_smmu_init_context_bank(struct 
> arm_smmu_domain *smmu_domain,
>   if (smmu->features & ARM_SMMU_FEAT_VMID16)
>   reg |= cfg->vmid << CBA2R_VMID_SHIFT;
>  
> - writel_relaxed(reg, gr1_base + ARM_SMMU_GR1_CBA2R(cfg->cbndx));
> + writel_relaxed(reg, gr1_base + ARM_SMMU_GR1_CBA2R(idx));
>   }
>  
>   /* CBAR */
> @@ -807,72 +867,43 @@ static void arm_smmu_init_context_bank(struct 
> arm_smmu_domain *smmu_domain,
>   /* 8-bit VMIDs live in CBAR */
>   reg |= cfg->vmid << CBAR_VMID_SHIFT;
>   }
> - writel_relaxed(reg, gr1_base + ARM_SMMU_GR1_CBAR(cfg->cbndx));
> + writel_relaxed(reg, gr1_base + ARM_SMMU_GR1_CBAR(idx));
>  
>   /*
>* TTBCR
>* We must write this before the TTBRs, since it determines the
>* access behaviour of some fields (in particular, ASID[15:8]).
>*/
> - if (stage1) {
> - if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_S) {
> - reg = pgtbl_cfg->arm_v7s_cfg.tcr;
> - reg2 = 0;
> - } else {
> - reg = pgtbl_cfg->arm_lpae_s1_cfg.tcr;
> - reg2 = pgtbl_cfg->arm_lpae_s1_cfg.tcr >> 32;
> - reg2 |= TTBCR2_SEP_UPSTREAM;
> - if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64)
> - reg2 |= TTBCR2_AS;
> - }
> - if (smmu->version > ARM_SMMU_V1)
> - writel_relaxed(reg2, cb_base + ARM_SMMU_CB_TTBCR2);
> - } else {
> - reg = pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
> - }
> - writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBCR);
> + if (stage1 && smmu->version > ARM_SMMU_V1)
> + writel_relaxed(cb->tcr[1], cb_base + ARM_SMMU_CB_TTBCR2);
> + writel_relaxed(cb->tcr[0], cb_base + ARM_SMMU_CB_TTBCR);
>  
>   /* TTBRs */
> - if (stage1) {
> - if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_S) {
> - reg = pgtbl_cfg->arm_v7s_cfg.ttbr[0];
> - writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBR0);
> - reg = pgtbl_cfg->arm_v7s_cfg.ttbr[1];
> - writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBR1);
> - writel_relaxed(cfg->asid, cb_base + 
> ARM_SMMU_CB_CONTEXTIDR);
> - } else {
> - reg64 = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
> - reg64 |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
> - writeq_relaxed(reg64, cb_base + ARM_SMMU_CB_TTBR0);
> - reg64 = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[1];
> - reg64 |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
> - writeq_relaxed(reg64, cb_base + ARM_SMMU_CB_TTBR1);
> - }
> + if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_S) {
> + writel_relaxed(cfg->asid, cb_base + ARM_SMMU_CB_CONTEXTIDR);
> + writel_relaxed(cb->ttbr[0], cb_base + ARM_SMMU_CB_TTBR0);
> + writel_relaxed(cb->ttbr[1], cb_base + ARM_SMMU_CB_TTBR1);
>   } else {
> - reg64 = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
> - writeq_relaxed(reg64, cb_base + ARM_SMMU_CB_TTBR0);
> + writeq_relaxed(cb->ttbr[0], cb_base + ARM_SMMU_CB_TTBR0);
> + writeq_relaxed(cb->ttbr[1], cb_base + ARM_SMMU_CB_TTBR1);

This doesn't look right to me. For the LPAE S1 case, we don't set the ASID

Re: [PATCH v2] iommu/arm-smmu: fix null-pointer dereference in arm_smmu_add_device

2017-08-08 Thread Robin Murphy
On 08/08/17 11:26, Artem Savkov wrote:
> Commit c54451a "iommu/arm-smmu: Fix the error path in arm_smmu_add_device"
> removed fwspec assignment in legacy_binding path as redundant which is
> wrong. It needs to be updated after fwspec initialisation in
> arm_smmu_register_legacy_master() as it is dereferenced later. Without
> this there is a NULL-pointer dereference panic during boot on some hosts.

Reviewed-by: Robin Murphy 

Thanks for fixing it up, and sorry for failing to document the
unfortunately subtle logic in the first place!

Robin.

> Signed-off-by: Artem Savkov 
> ---
>  drivers/iommu/arm-smmu.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index b97188a..2d80fa8 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -1519,6 +1519,13 @@ static int arm_smmu_add_device(struct device *dev)
>  
>   if (using_legacy_binding) {
>   ret = arm_smmu_register_legacy_master(dev, );
> +
> + /*
> +  * If dev->iommu_fwspec is initally NULL, 
> arm_smmu_register_legacy_master()
> +  * will allocate/initialise a new one. Thus we need to update 
> fwspec for
> +  * later use.
> +  */
> + fwspec = dev->iommu_fwspec;
>   if (ret)
>   goto out_free;
>   } else if (fwspec && fwspec->ops == _smmu_ops) {
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2] iommu/arm-smmu: fix null-pointer dereference in arm_smmu_add_device

2017-08-08 Thread Artem Savkov
Commit c54451a "iommu/arm-smmu: Fix the error path in arm_smmu_add_device"
removed fwspec assignment in legacy_binding path as redundant which is
wrong. It needs to be updated after fwspec initialisation in
arm_smmu_register_legacy_master() as it is dereferenced later. Without
this there is a NULL-pointer dereference panic during boot on some hosts.

Signed-off-by: Artem Savkov 
---
 drivers/iommu/arm-smmu.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index b97188a..2d80fa8 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1519,6 +1519,13 @@ static int arm_smmu_add_device(struct device *dev)
 
if (using_legacy_binding) {
ret = arm_smmu_register_legacy_master(dev, );
+
+   /*
+* If dev->iommu_fwspec is initally NULL, 
arm_smmu_register_legacy_master()
+* will allocate/initialise a new one. Thus we need to update 
fwspec for
+* later use.
+*/
+   fwspec = dev->iommu_fwspec;
if (ret)
goto out_free;
} else if (fwspec && fwspec->ops == _smmu_ops) {
-- 
2.7.5

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v5 1/2] ACPI/IORT: Add ITS address regions reservation helper

2017-08-08 Thread Shameerali Kolothum Thodi


> -Original Message-
> From: Lorenzo Pieralisi [mailto:lorenzo.pieral...@arm.com]
> Sent: Monday, August 07, 2017 6:09 PM
> To: Shameerali Kolothum Thodi
> Cc: Robin Murphy; marc.zyng...@arm.com; sudeep.ho...@arm.com;
> will.dea...@arm.com; hanjun@linaro.org; Gabriele Paoloni; John Garry;
> Linuxarm; linux-a...@vger.kernel.org; iommu@lists.linux-foundation.org;
> Wangzhou (B); Guohanjun (Hanjun Guo); linux-arm-
> ker...@lists.infradead.org; de...@acpica.org
> Subject: Re: [PATCH v5 1/2] ACPI/IORT: Add ITS address regions reservation
> helper
> 
> On Mon, Aug 07, 2017 at 08:21:40AM +, Shameerali Kolothum Thodi
> wrote:
> >
> >
> > > -Original Message-
> > > From: Robin Murphy [mailto:robin.mur...@arm.com]
> > > Sent: Friday, August 04, 2017 5:57 PM
> > > To: Shameerali Kolothum Thodi; lorenzo.pieral...@arm.com;
> > > marc.zyng...@arm.com; sudeep.ho...@arm.com; will.dea...@arm.com;
> > > hanjun@linaro.org
> > > Cc: Gabriele Paoloni; John Garry; iommu@lists.linux-foundation.org;
> linux-
> > > arm-ker...@lists.infradead.org; linux-a...@vger.kernel.org;
> > > de...@acpica.org; Linuxarm; Wangzhou (B); Guohanjun (Hanjun Guo)
> > > Subject: Re: [PATCH v5 1/2] ACPI/IORT: Add ITS address regions
> reservation
> > > helper
> > >
> > > On 01/08/17 11:49, Shameer Kolothum wrote:
> > > > On some platforms ITS address regions have to be excluded from
> normal
> > > > IOVA allocation in that they are detected and decoded in a HW specific
> > > > way by system components and so they cannot be considered normal
> > > IOVA
> > > > address space.
> > > >
> > > > Add an helper function that retrieves ITS address regions through IORT
> > > > device <-> ITS mappings and reserves it so that these regions will not
> > > > be translated by IOMMU and will be excluded from IOVA allocations.
> > >
> > > I've just realised that we no longer seem to have a check that ensures
> > > the regions are only reserved on platforms that need it - if not, then
> > > we're going to break everything else that does have an ITS behind SMMU
> > > translation as expected.
> >
> > Right. I had this doubt, but then my thinking was that we will have the
> SW_MSI
> > regions for those and will end up  using that. But that doesn’t seems
> > to be the case now.
> >
> > > It feels like IORT should know enough to be able to make that decision
> > > internally, but if not (or if it would be hideous to do so), then I
> > > guess my idea for patch #2 was a bust and we probably do need to go
> back
> > > to calling directly from the SMMU driver based on the SMMU model.
> >
> > It might be possible to do that check inside iort code, but then we have to
> find
> > the  smmu node first and check the model number. I think it will be more
> > cleaner if SMMU driver makes that check and call the
> > iommu_dma_get_msi_resv_regions().
> 
> +1 on this one - we can do it in IORT but I think it is more logical
> to have a flag in the SMMU driver (keeping the DT/ACPI fwnode switch in
> generic IOMMU layer though).

Please find [1] for the generic iommu helper function which will be called
from SMMU driver based on the model.

> Side note I: AFAICS iommu_dma_get_resv_regions() is only used on ARM, if
> it is not patch 2 would break miserably on arches that do not select
> IORT - you should rework the return code on !CONFIG_ACPI_IORT configs.

Yes. But so far it is only used on ARM. In any way this function is not going 
to be
changed and will introduce a new iommu_dma_get_msi_resv_regions() fn  as
proposed in [1].

Please take a look and let me know. I will submit the series again if this is 
fine.

Thanks,
Shameer

[1] This will replace patch 2 and will be followed by smmu driver patch to 
invoke
iommu_dma_get_msi_resv_regions() based on SMMU model.

-->8--
Subject: [PATCH] iommu/dma: Add a helper  function to reserve HW MSI address 
regions for IOMMU drivers

IOMMU drivers can use this to implement their .get_resv_regions callback
for HW MSI specific reservations(e.g. ARM GICv3 ITS MSI region).

Signed-off-by: Shameer Kolothum 
---
 drivers/iommu/dma-iommu.c | 19 +++
 include/linux/dma-iommu.h |  7 +++
 2 files changed, 26 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 9d1cebe..952ecdd 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -19,6 +19,7 @@
  * along with this program.  If not, see .
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -198,6 +199,24 @@ void iommu_dma_get_resv_regions(struct device *dev, struct 
list_head *list)
 }
 EXPORT_SYMBOL(iommu_dma_get_resv_regions);
 
+/**
+ * iommu_dma_get_msi_resv_regions - Reserved region driver helper
+ * @dev: Device from iommu_get_resv_regions()
+ * @list: Reserved region list from iommu_get_resv_regions()
+ *
+ * IOMMU drivers can use this to implement their .get_resv_regions
+ * callback for HW MSI specific reservations. For now, 

Re: [PATCH] iommu/arm-smmu: fix null-pointer dereference in arm_smmu_add_device

2017-08-08 Thread Will Deacon
Hi Artem,

Thanks for the patch.

On Tue, Aug 08, 2017 at 10:58:01AM +0200, Artem Savkov wrote:
> Commit c54451a "iommu/arm-smmu: Fix the error path in arm_smmu_add_device"
> removed fwspec assignment in legacy_binding path as redundant which is
> wrong. It needs to be updated after fwspec initialisation in
> arm_smmu_register_legacy_master() as it is dereferenced later. Without
> this there is a NULL-pointer dereference panic during boot on some hosts.
> 
> Signed-off-by: Artem Savkov 
> ---
>  drivers/iommu/arm-smmu.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index b97188a..95f1c86 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -1519,6 +1519,7 @@ static int arm_smmu_add_device(struct device *dev)
>  
>   if (using_legacy_binding) {
>   ret = arm_smmu_register_legacy_master(dev, );
> + fwspec = dev->iommu_fwspec;
>   if (ret)
>   goto out_free;
>   } else if (fwspec && fwspec->ops == _smmu_ops) {

Damn, you're completely right! Robin and I bashed our heads against this for
a while and couldn't remember why the code was structured like it was, but
that explains it. Can you add a comment saying that
arm_smmu_register_legacy_master will allocate an fwspec if its initially NULL,
please?

Cheers,

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu/arm-smmu: fix null-pointer dereference in arm_smmu_add_device

2017-08-08 Thread Artem Savkov
Commit c54451a "iommu/arm-smmu: Fix the error path in arm_smmu_add_device"
removed fwspec assignment in legacy_binding path as redundant which is
wrong. It needs to be updated after fwspec initialisation in
arm_smmu_register_legacy_master() as it is dereferenced later. Without
this there is a NULL-pointer dereference panic during boot on some hosts.

Signed-off-by: Artem Savkov 
---
 drivers/iommu/arm-smmu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index b97188a..95f1c86 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1519,6 +1519,7 @@ static int arm_smmu_add_device(struct device *dev)
 
if (using_legacy_binding) {
ret = arm_smmu_register_legacy_master(dev, );
+   fwspec = dev->iommu_fwspec;
if (ret)
goto out_free;
} else if (fwspec && fwspec->ops == _smmu_ops) {
-- 
2.7.5

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu