[PATCH v2] iommu/vt-d: avoid unnecessary panic if iommu init fails in tboot system

2020-11-09 Thread Zhenzhong Duan
The "intel_iommu=off" command line option is used to disable the IOMMU, but
the IOMMU is force-enabled in a tboot system for security reasons.

However, for better performance on high-speed network devices, a new option
"intel_iommu=tboot_noforce" was introduced to disable that forcing.

By default the kernel should panic if IOMMU initialization fails in tboot for
security reasons, but the panic is unnecessary if we use
"intel_iommu=tboot_noforce,off".

Fix the code setting force_on, and move the intel_iommu_tboot_noforce check
from the tboot code into the Intel IOMMU code.

Fixes: 7304e8f28bb2 ("iommu/vt-d: Correctly disable Intel IOMMU force on")
Signed-off-by: Zhenzhong Duan 
---
v2: move check of intel_iommu_tboot_noforce into iommu code per Baolu.
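
For clarity, a minimal stand-alone C sketch of the decision this patch
implements (illustrative only; the real change is the one-liner in
intel_iommu_init() below, and tboot_forces/platform_optin stand in for
tboot_force_iommu()/platform_optin_force_iommu()):

#include <stdbool.h>

static bool compute_force_on(bool tboot_noforce, bool tboot_forces,
			     bool platform_optin)
{
	/*
	 * "intel_iommu=tboot_noforce" suppresses only the tboot-driven
	 * forcing, so "intel_iommu=tboot_noforce,off" no longer panics
	 * when IOMMU initialization fails.
	 */
	return (!tboot_noforce && tboot_forces) || platform_optin;
}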

 arch/x86/kernel/tboot.c | 3 ---
 drivers/iommu/intel/iommu.c | 5 +++--
 include/linux/intel-iommu.h | 1 -
 3 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index 992fb14..420be87 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -514,9 +514,6 @@ int tboot_force_iommu(void)
if (!tboot_enabled())
return 0;
 
-   if (intel_iommu_tboot_noforce)
-   return 1;
-
if (no_iommu || swiotlb || dmar_disabled)
pr_warn("Forcing Intel-IOMMU to enabled\n");
 
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 1b1ca63..4d9b298 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -179,7 +179,7 @@ static inline unsigned long virt_to_dma_pfn(void *p)
  * (used when kernel is launched w/ TXT)
  */
 static int force_on = 0;
-int intel_iommu_tboot_noforce;
+static int intel_iommu_tboot_noforce;
 static int no_platform_optin;
 
 #define ROOT_ENTRY_NR (VTD_PAGE_SIZE/sizeof(struct root_entry))
@@ -4885,7 +4885,8 @@ int __init intel_iommu_init(void)
 * Intel IOMMU is required for a TXT/tboot launch or platform
 * opt in, so enforce that.
 */
-   force_on = tboot_force_iommu() || platform_optin_force_iommu();
+   force_on = (!intel_iommu_tboot_noforce && tboot_force_iommu()) ||
+   platform_optin_force_iommu();
 
if (iommu_init_mempool()) {
if (force_on)
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index fbf5b3e..d956987 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -798,7 +798,6 @@ struct context_entry *iommu_context_addr(struct intel_iommu 
*iommu, u8 bus,
 extern int iommu_calculate_max_sagaw(struct intel_iommu *iommu);
 extern int dmar_disabled;
 extern int intel_iommu_enabled;
-extern int intel_iommu_tboot_noforce;
 extern int intel_iommu_gfx_mapped;
 #else
 static inline int iommu_calculate_agaw(struct intel_iommu *iommu)
-- 
1.8.3.1



Re: [PATCH v3 19/35] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-09 Thread Qian Cai
On Sat, 2020-10-24 at 22:35 +0100, David Woodhouse wrote:
> From: Thomas Gleixner 
> 
> 'trigger' and 'polarity' are used throughout the I/O-APIC code for handling
> the trigger type (edge/level) and the active low/high configuration. While
> there are defines for initializing these variables and struct members, they
> are not used consistently and the meaning of 'trigger' and 'polarity' is
> opaque and confusing at best.
> 
> Rename them to 'is_level' and 'active_low' and make them boolean in various
> structs so it's entirely clear what the meaning is.
> 
> Signed-off-by: Thomas Gleixner 
> Signed-off-by: David Woodhouse 
> ---
>  arch/x86/include/asm/hw_irq.h   |   6 +-
>  arch/x86/kernel/apic/io_apic.c  | 244 +---
>  arch/x86/pci/intel_mid_pci.c|   8 +-
>  drivers/iommu/amd/iommu.c   |  10 +-
>  drivers/iommu/intel/irq_remapping.c |   9 +-
>  5 files changed, 130 insertions(+), 147 deletions(-)

Reverting the rest of the patchset up to this commit on next-20201109 fixed an
endless soft-lockup issue when booting the AMD server below. I noticed that the
failed boots always have this IOMMU IO_PAGE_FAULT before the soft-lockups:

[ 3404.093354][T1] AMD-Vi: Interrupt remapping enabled
[ 3404.098593][T1] AMD-Vi: Virtual APIC enabled
[ 3404.107783][  T340] pci 0000:00:14.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x address=0xfffdf8020200 flags=0x0008]
[ 3404.120644][T1] AMD-Vi: Lazy IO/TLB flushing enabled
[ 3404.126011][T1] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[ 3404.133173][T1] software IO TLB: mapped [mem 
0x68dcf000-0x6cdcf000] (64MB)

.config (if ever matters):
https://cailca.coding.net/public/linux/mm/git/files/master/x86.config

good boot dmesg (with the commits reverted):
http://people.redhat.com/qcai/dmesg.txt

== system info ==
Dell PowerEdge R6415
AMD EPYC 7401P 24-Core Processor
24576 MB memory, 239 GB disk space

[  OK  ] Started Flush Journal to Persistent Storage.
[  OK  ] Started udev Kernel Device Manager.
[  OK  ] Started udev Coldplug all Devices.
[  OK  ] Started Monitoring of LVM2 mirrors,…sing dmeventd or progress polling.
[  OK  ] Reached target Local File Systems (Pre).
 Mounting /boot...
[  OK  ] Created slice system-lvm2\x2dpvscan.slice.
[ 3740.376500][ T1058] XFS (sda1): Mounting V5 Filesystem
[ 3740.438474][ T1058] XFS (sda1): Ending clean mount
[ 3765.159433][C0] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! 
[systemd:1]
[ 3765.166929][C0] Modules linked in: acpi_cpufreq(+) ip_tables x_tables 
sd_mod ahci libahci tg3 bnxt_en megaraid_sas libata firmware_class libphy 
dm_mirror dm_region_hash dm_log dm_mod
[ 3765.183576][C0] irq event stamp: 26230104
[ 3765.187954][C0] hardirqs last  enabled at (26230103): 
[] asm_common_interrupt+0x1e/0x40
[ 3765.197873][C0] hardirqs last disabled at (26230104): 
[] sysvec_apic_timer_interrupt+0xa/0xa0
[ 3765.208303][C0] softirqs last  enabled at (26202664): 
[] __do_softirq+0x61b/0x95d
[ 3765.217699][C0] softirqs last disabled at (26202591): 
[] asm_call_irq_on_stack+0x12/0x20
[ 3765.227702][C0] CPU: 0 PID: 1 Comm: systemd Not tainted 
5.10.0-rc2-next-20201109+ #2
[ 3765.235793][C0] Hardware name: Dell Inc. PowerEdge R6415/07YXFK, BIOS 
1.9.3 06/25/2019
[ 3765.244065][C0] RIP: 0010:lock_acquire+0x1f4/0x820
lock_acquire at kernel/locking/lockdep.c:5404
[ 3765.249211][C0] Code: ff ff ff 48 83 c4 20 65 0f c1 05 a7 ba 9e 7e 83 f8 
01 4c 8b 54 24 08 0f 85 60 04 00 00 41 52 9d 48 b8 00 00 00 00 00 fc ff df <48> 
01 c3 c7 03 00 00 00 00 c7 43 08 00 00 00 00 48 8b0
[ 3765.268657][C0] RSP: 0018:c906f9f8 EFLAGS: 0246
[ 3765.274587][C0] RAX: dc00 RBX: 1920df42 RCX: 
1920df28
[ 3765.282420][C0] RDX: 111020645769 RSI:  RDI: 
0001
[ 3765.290256][C0] RBP: 0001 R08: fbfff164cb10 R09: 
fbfff164cb10
[ 3765.298090][C0] R10: 0246 R11: fbfff164cb0f R12: 
88812be555b0
[ 3765.305922][C0] R13:  R14:  R15: 

[ 3765.313750][C0] FS:  7f12bb8c59c0() GS:8881b700() 
knlGS:
[ 3765.322537][C0] CS:  0010 DS:  ES:  CR0: 80050033
[ 3765.328985][C0] CR2: 7f0c2d828fd0 CR3: 00011868a000 CR4: 
003506f0
[ 3765.336820][C0] Call Trace:
[ 3765.339979][C0]  ? rcu_read_unlock+0x40/0x40
[ 3765.344609][C0]  __d_move+0x2a2/0x16f0
__seqprop_spinlock_assert at include/linux/seqlock.h:277
(inlined by) __d_move at fs/dcache.c:2861
[ 3765.348711][C0]  ? d_move+0x47/0x70
[ 3765.352560][C0]  ? _raw_spin_unlock+0x1a/0x30
[ 3765.357275][C0]  d_move+0x47/0x70
write_seqcount_t_end at include/linux/seqlock.h:560
(inlined by) write_sequnlock at include/linux/seqlock.h:901
(inlined by) d_move at 

[PATCH AUTOSEL 4.4 08/10] iommu/amd: Increase interrupt remapping table limit to 512 entries

2020-11-09 Thread Sasha Levin
From: Suravee Suthikulpanit 

[ Upstream commit 73db2fc595f358460ce32bcaa3be1f0cce4a2db1 ]

Certain device drivers allocate IO queues on a per-CPU basis.
On the AMD EPYC platform, which can support up to 256 CPU threads,
this can exceed the current MAX_IRQS_PER_TABLE limit of 256,
and result in the error message:

AMD-Vi: Failed to allocate IRTE

This has been observed with certain NVMe devices.

AMD IOMMU hardware can actually support up to 512 interrupt
remapping table entries. Therefore, update the driver to
match the hardware limit.

Please note that this also increases the size of the interrupt
remapping table to 8 KB per device when using the 128-bit IRTE format.
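
A quick sanity check of that size, as a stand-alone C illustration (not part
of the patch): 512 entries at 16 bytes each (the 128-bit IRTE format) is 8 KB.

#include <stdio.h>

int main(void)
{
	const unsigned int max_irqs_per_table = 512;	/* new MAX_IRQS_PER_TABLE */
	const unsigned int irte_128_bytes = 16;		/* 128-bit IRTE = 16 bytes */

	/* 512 * 16 = 8192 bytes = 8 KB per device */
	printf("table size: %u KB\n", (max_irqs_per_table * irte_128_bytes) / 1024);
	return 0;
}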

Signed-off-by: Suravee Suthikulpanit 
Link: 
https://lore.kernel.org/r/20201015025002.87997-1-suravee.suthikulpa...@amd.com
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/amd_iommu_types.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 695d4e235438c..90832bf00538e 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -351,7 +351,11 @@ extern bool amd_iommu_np_cache;
 /* Only true if all IOMMUs support device IOTLBs */
 extern bool amd_iommu_iotlb_sup;
 
-#define MAX_IRQS_PER_TABLE 256
+/*
+ * AMD IOMMU hardware only supports 512 IRTEs despite
+ * the architectural limitation of 2048 entries.
+ */
+#define MAX_IRQS_PER_TABLE 512
 #define IRQ_TABLE_ALIGNMENT 128
 
 struct irq_remap_table {
-- 
2.27.0



[PATCH AUTOSEL 4.9 10/12] iommu/amd: Increase interrupt remapping table limit to 512 entries

2020-11-09 Thread Sasha Levin
From: Suravee Suthikulpanit 

[ Upstream commit 73db2fc595f358460ce32bcaa3be1f0cce4a2db1 ]

Certain device drivers allocate IO queues on a per-CPU basis.
On the AMD EPYC platform, which can support up to 256 CPU threads,
this can exceed the current MAX_IRQS_PER_TABLE limit of 256,
and result in the error message:

AMD-Vi: Failed to allocate IRTE

This has been observed with certain NVMe devices.

AMD IOMMU hardware can actually support up to 512 interrupt
remapping table entries. Therefore, update the driver to
match the hardware limit.

Please note that this also increases the size of the interrupt
remapping table to 8 KB per device when using the 128-bit IRTE format.

Signed-off-by: Suravee Suthikulpanit 
Link: 
https://lore.kernel.org/r/20201015025002.87997-1-suravee.suthikulpa...@amd.com
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/amd_iommu_types.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index da3fbf82d1cf4..e19c05d9e84ba 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -383,7 +383,11 @@ extern bool amd_iommu_np_cache;
 /* Only true if all IOMMUs support device IOTLBs */
 extern bool amd_iommu_iotlb_sup;
 
-#define MAX_IRQS_PER_TABLE 256
+/*
+ * AMD IOMMU hardware only supports 512 IRTEs despite
+ * the architectural limitation of 2048 entries.
+ */
+#define MAX_IRQS_PER_TABLE 512
 #define IRQ_TABLE_ALIGNMENT 128
 
 struct irq_remap_table {
-- 
2.27.0



[PATCH AUTOSEL 4.14 11/14] iommu/amd: Increase interrupt remapping table limit to 512 entries

2020-11-09 Thread Sasha Levin
From: Suravee Suthikulpanit 

[ Upstream commit 73db2fc595f358460ce32bcaa3be1f0cce4a2db1 ]

Certain device drivers allocate IO queues on a per-CPU basis.
On the AMD EPYC platform, which can support up to 256 CPU threads,
this can exceed the current MAX_IRQS_PER_TABLE limit of 256,
and result in the error message:

AMD-Vi: Failed to allocate IRTE

This has been observed with certain NVMe devices.

AMD IOMMU hardware can actually support up to 512 interrupt
remapping table entries. Therefore, update the driver to
match the hardware limit.

Please note that this also increases the size of the interrupt
remapping table to 8 KB per device when using the 128-bit IRTE format.

Signed-off-by: Suravee Suthikulpanit 
Link: 
https://lore.kernel.org/r/20201015025002.87997-1-suravee.suthikulpa...@amd.com
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/amd_iommu_types.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 74c8638aac2b9..ac3cac052af9d 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -404,7 +404,11 @@ extern bool amd_iommu_np_cache;
 /* Only true if all IOMMUs support device IOTLBs */
 extern bool amd_iommu_iotlb_sup;
 
-#define MAX_IRQS_PER_TABLE 256
+/*
+ * AMD IOMMU hardware only supports 512 IRTEs despite
+ * the architectural limitation of 2048 entries.
+ */
+#define MAX_IRQS_PER_TABLE 512
 #define IRQ_TABLE_ALIGNMENT 128
 
 struct irq_remap_table {
-- 
2.27.0



[PATCH AUTOSEL 4.19 14/21] iommu/amd: Increase interrupt remapping table limit to 512 entries

2020-11-09 Thread Sasha Levin
From: Suravee Suthikulpanit 

[ Upstream commit 73db2fc595f358460ce32bcaa3be1f0cce4a2db1 ]

Certain device drivers allocate IO queues on a per-CPU basis.
On the AMD EPYC platform, which can support up to 256 CPU threads,
this can exceed the current MAX_IRQS_PER_TABLE limit of 256,
and result in the error message:

AMD-Vi: Failed to allocate IRTE

This has been observed with certain NVMe devices.

AMD IOMMU hardware can actually support up to 512 interrupt
remapping table entries. Therefore, update the driver to
match the hardware limit.

Please note that this also increases the size of the interrupt
remapping table to 8 KB per device when using the 128-bit IRTE format.

Signed-off-by: Suravee Suthikulpanit 
Link: 
https://lore.kernel.org/r/20201015025002.87997-1-suravee.suthikulpa...@amd.com
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/amd_iommu_types.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 859b06424e5c4..df6f3cc958e5e 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -410,7 +410,11 @@ extern bool amd_iommu_np_cache;
 /* Only true if all IOMMUs support device IOTLBs */
 extern bool amd_iommu_iotlb_sup;
 
-#define MAX_IRQS_PER_TABLE 256
+/*
+ * AMD IOMMU hardware only supports 512 IRTEs despite
+ * the architectural limitation of 2048 entries.
+ */
+#define MAX_IRQS_PER_TABLE 512
 #define IRQ_TABLE_ALIGNMENT 128
 
 struct irq_remap_table {
-- 
2.27.0



[PATCH AUTOSEL 5.4 29/42] iommu/amd: Increase interrupt remapping table limit to 512 entries

2020-11-09 Thread Sasha Levin
From: Suravee Suthikulpanit 

[ Upstream commit 73db2fc595f358460ce32bcaa3be1f0cce4a2db1 ]

Certain device drivers allocate IO queues on a per-CPU basis.
On the AMD EPYC platform, which can support up to 256 CPU threads,
this can exceed the current MAX_IRQS_PER_TABLE limit of 256,
and result in the error message:

AMD-Vi: Failed to allocate IRTE

This has been observed with certain NVMe devices.

AMD IOMMU hardware can actually support up to 512 interrupt
remapping table entries. Therefore, update the driver to
match the hardware limit.

Please note that this also increases the size of the interrupt
remapping table to 8 KB per device when using the 128-bit IRTE format.

Signed-off-by: Suravee Suthikulpanit 
Link: 
https://lore.kernel.org/r/20201015025002.87997-1-suravee.suthikulpa...@amd.com
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/amd_iommu_types.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 0679896b9e2e1..3ec090adcdae7 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -406,7 +406,11 @@ extern bool amd_iommu_np_cache;
 /* Only true if all IOMMUs support device IOTLBs */
 extern bool amd_iommu_iotlb_sup;
 
-#define MAX_IRQS_PER_TABLE 256
+/*
+ * AMD IOMMU hardware only supports 512 IRTEs despite
+ * the architectural limitation of 2048 entries.
+ */
+#define MAX_IRQS_PER_TABLE 512
 #define IRQ_TABLE_ALIGNMENT 128
 
 struct irq_remap_table {
-- 
2.27.0



[PATCH AUTOSEL 5.9 37/55] iommu/amd: Increase interrupt remapping table limit to 512 entries

2020-11-09 Thread Sasha Levin
From: Suravee Suthikulpanit 

[ Upstream commit 73db2fc595f358460ce32bcaa3be1f0cce4a2db1 ]

Certain device drivers allocate IO queues on a per-CPU basis.
On the AMD EPYC platform, which can support up to 256 CPU threads,
this can exceed the current MAX_IRQS_PER_TABLE limit of 256,
and result in the error message:

AMD-Vi: Failed to allocate IRTE

This has been observed with certain NVMe devices.

AMD IOMMU hardware can actually support up to 512 interrupt
remapping table entries. Therefore, update the driver to
match the hardware limit.

Please note that this also increases the size of the interrupt
remapping table to 8 KB per device when using the 128-bit IRTE format.

Signed-off-by: Suravee Suthikulpanit 
Link: 
https://lore.kernel.org/r/20201015025002.87997-1-suravee.suthikulpa...@amd.com
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/amd/amd_iommu_types.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 30a5d412255a4..427484c455891 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -406,7 +406,11 @@ extern bool amd_iommu_np_cache;
 /* Only true if all IOMMUs support device IOTLBs */
 extern bool amd_iommu_iotlb_sup;
 
-#define MAX_IRQS_PER_TABLE 256
+/*
+ * AMD IOMMU hardware only supports 512 IRTEs despite
+ * the architectural limitation of 2048 entries.
+ */
+#define MAX_IRQS_PER_TABLE 512
 #define IRQ_TABLE_ALIGNMENT 128
 
 struct irq_remap_table {
-- 
2.27.0



Re: [RFC PATCH v1 3/3] drm/msm: Improve the a6xx page fault handler

2020-11-09 Thread Rob Clark
On Mon, Nov 9, 2020 at 2:23 PM Jordan Crouse  wrote:
>
> Use the new adreno-smmu-priv fault info function to get more SMMU
> debug registers and print the current TTBR0 to debug per-instance
> pagetables and figure out which GPU block generated the request.
>
> Signed-off-by: Jordan Crouse 
> ---
>
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c |  4 +-
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 76 +--
>  drivers/gpu/drm/msm/msm_iommu.c   | 11 +++-
>  drivers/gpu/drm/msm/msm_mmu.h |  4 +-
>  4 files changed, 87 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index d6804a802355..ed4cb81af874 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -933,7 +933,7 @@ bool a5xx_idle(struct msm_gpu *gpu, struct msm_ringbuffer 
> *ring)
> return true;
>  }
>
> -static int a5xx_fault_handler(void *arg, unsigned long iova, int flags)
> +static int a5xx_fault_handler(void *arg, unsigned long iova, int flags, void 
> *data)
>  {
> struct msm_gpu *gpu = arg;
> pr_warn_ratelimited("*** gpu fault: iova=%08lx, flags=%d 
> (%u,%u,%u,%u)\n",
> @@ -943,7 +943,7 @@ static int a5xx_fault_handler(void *arg, unsigned long 
> iova, int flags)
> gpu_read(gpu, REG_A5XX_CP_SCRATCH_REG(6)),
> gpu_read(gpu, REG_A5XX_CP_SCRATCH_REG(7)));
>
> -   return -EFAULT;
> +   return 0;
>  }
>
>  static void a5xx_cp_err_irq(struct msm_gpu *gpu)
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 948f3656c20c..ac6e8cd5cf1a 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -905,18 +905,88 @@ static void a6xx_recover(struct msm_gpu *gpu)
> msm_gpu_hw_init(gpu);
>  }
>
> -static int a6xx_fault_handler(void *arg, unsigned long iova, int flags)
> +static const char *a6xx_uche_fault_block(struct msm_gpu *gpu, u32 mid)
> +{
> +   static const char *uche_clients[7] = {
> +   "VFD", "SP", "VSC", "VPC", "HLSQ", "PC", "LRZ",
> +   };
> +   u32 val;
> +
> +   if (mid < 1 || mid > 3)
> +   return "UNKNOWN";
> +
> +   /*
> +* The source of the data depends on the mid ID read from FSYNR1
> +* and the client ID read from the UCHE block.
> +*/
> +   val = gpu_read(gpu, REG_A6XX_UCHE_CLIENT_PF);
> +
> +   /* mid = 3 is most precise and refers to only one block per client */
> +   if (mid == 3)
> +   return uche_clients[val & 7];
> +
> +   /* For mid=2 the source is TP or VFD except when the client id is 0 */
> +   if (mid == 2)
> +   return ((val & 7) == 0) ? "TP" : "TP|VFD";
> +
> +   /* For mid=1 just return "UCHE" as a catchall for everything else */
> +   return "UCHE";
> +}
> +
> +static const char *a6xx_fault_block(struct msm_gpu *gpu, u32 id)
> +{
> +   if (id == 0)
> +   return "CP";
> +   else if (id == 4)
> +   return "CCU";
> +   else if (id == 6)
> +   return "CDP Prefetch";
> +
> +   return a6xx_uche_fault_block(gpu, id);
> +}
> +
> +#define ARM_SMMU_FSR_TF BIT(1)
> +#define ARM_SMMU_FSR_PF BIT(3)
> +#define ARM_SMMU_FSR_EF BIT(4)
> +
> +static int a6xx_fault_handler(void *arg, unsigned long iova, int flags, void 
> *data)
>  {
> struct msm_gpu *gpu = arg;
> +   struct adreno_smmu_fault_info *info = data;
> +   const char *type = "UNKNOWN";
>
> -   pr_warn_ratelimited("*** gpu fault: iova=%08lx, flags=%d 
> (%u,%u,%u,%u)\n",
> +   /*
> +* Print a default message if we couldn't get the data from the
> +* adreno-smmu-priv
> +*/
> +   if (!info) {
> +   pr_warn_ratelimited("*** gpu fault: iova=%.16lx flags=%d 
> (%u,%u,%u,%u)\n",
> iova, flags,
> gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(4)),
> gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(5)),
> gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(6)),
> gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(7)));
>
> -   return -EFAULT;
> +   return 0;
> +   }
> +
> +   if (info->fsr & ARM_SMMU_FSR_TF)
> +   type = "TRANSLATION";
> +   else if (info->fsr & ARM_SMMU_FSR_PF)
> +   type = "PERMISSION";
> +   else if (info->fsr & ARM_SMMU_FSR_EF)
> +   type = "EXTERNAL";
> +
> +   pr_warn_ratelimited("*** gpu fault: ttbr0=%.16llx iova=%.16lx dir=%s 
> type=%s source=%s (%u,%u,%u,%u)\n",
> +   info->ttbr0, iova,
> +   flags & IOMMU_FAULT_WRITE ? "WRITE" : "READ", type,
> +   a6xx_fault_block(gpu, info->fsynr1 & 0xff),
> +   gpu_read(gpu, 

Re: [RFC PATCH v1 2/3] drm/msm: Add an adreno-smmu-priv callback to get pagefault info

2020-11-09 Thread Rob Clark
On Mon, Nov 9, 2020 at 2:23 PM Jordan Crouse  wrote:
>
> Add a callback in adreno-smmu-priv to read interesting SMMU
> registers to provide an opportunity for a richer debug experience
> in the GPU driver.
>
> Signed-off-by: Jordan Crouse 
> ---
>
>  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 19 +
>  drivers/iommu/arm/arm-smmu/arm-smmu.h  |  2 ++
>  include/linux/adreno-smmu-priv.h   | 31 +-
>  3 files changed, 51 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> index d0636c803a36..367a267324a2 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> @@ -32,6 +32,24 @@ static void qcom_adreno_smmu_write_sctlr(struct 
> arm_smmu_device *smmu, int idx,
> arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
>  }
>
> +static void qcom_adreno_smmu_get_fault_info(const void *cookie,
> +   struct adreno_smmu_fault_info *info)
> +{
> +   struct arm_smmu_domain *smmu_domain = (void *)cookie;
> +   struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> +   struct arm_smmu_device *smmu = smmu_domain->smmu;
> +
> +   info->fsr = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_FSR);
> +   /* FIXME: return error here if we aren't really in a fault? */
> +
> +   info->fsynr0 = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_FSYNR0);
> +   info->fsynr1 = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_FSYNR1);
> +   info->far = arm_smmu_cb_readq(smmu, cfg->cbndx, ARM_SMMU_CB_FAR);
> +   info->cbfrsynra = arm_smmu_gr1_read(smmu, 
> ARM_SMMU_GR1_CBFRSYNRA(cfg->cbndx));
> +   info->ttbr0 = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_TTBR0);
> +   info->contextidr = arm_smmu_cb_read(smmu, cfg->cbndx, 
> ARM_SMMU_CB_CONTEXTIDR);
> +}
> +
>  #define QCOM_ADRENO_SMMU_GPU_SID 0
>
>  static bool qcom_adreno_smmu_is_gpu_device(struct device *dev)
> @@ -156,6 +174,7 @@ static int qcom_adreno_smmu_init_context(struct 
> arm_smmu_domain *smmu_domain,
> priv->cookie = smmu_domain;
> priv->get_ttbr1_cfg = qcom_adreno_smmu_get_ttbr1_cfg;
> priv->set_ttbr0_cfg = qcom_adreno_smmu_set_ttbr0_cfg;
> +   priv->get_fault_info = qcom_adreno_smmu_get_fault_info;
>
> return 0;
>  }
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h 
> b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> index 04288b6fc619..fe511540a6bf 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> @@ -224,6 +224,8 @@ enum arm_smmu_cbar_type {
>  #define ARM_SMMU_CB_FSYNR0 0x68
>  #define ARM_SMMU_FSYNR0_WNR BIT(4)
>
> +#define ARM_SMMU_CB_FSYNR1 0x6c
> +
>  #define ARM_SMMU_CB_S1_TLBIVA  0x600
>  #define ARM_SMMU_CB_S1_TLBIASID 0x610
>  #define ARM_SMMU_CB_S1_TLBIVAL 0x620
> diff --git a/include/linux/adreno-smmu-priv.h 
> b/include/linux/adreno-smmu-priv.h
> index a889f28afb42..fc2592ebb9ba 100644
> --- a/include/linux/adreno-smmu-priv.h
> +++ b/include/linux/adreno-smmu-priv.h
> @@ -8,6 +8,32 @@
>
>  #include 
>
> +/**
> + * struct adreno_smmu_fault_info - container for key fault information
> + *
> + * @far: The faulting IOVA from ARM_SMMU_CB_FAR
> + * @ttbr0: The current TTBR0 pagetable from ARM_SMMU_CB_TTBR0
> + * @contextidr: The value of ARM_SMMU_CB_CONTEXTIDR
> + * @fsr: The fault status from ARM_SMMU_CB_FSR
> + * @fsynr0: The value of FSYNR0 from ARM_SMMU_CB_FSYNR0
> + * @fsynr1: The value of FSYNR1 from ARM_SMMU_CB_FSYNR1
> + * @cbfrsynra: The value of CBFRSYNRA from ARM_SMMU_GR1_CBFRSYNRA(idx)
> + *
> + * This struct passes back key page fault information to the GPU driver
> + * through the get_fault_info function pointer.
> + * The GPU driver can use this information to print informative
> + * log messages and provide deeper GPU specific insight into the fault.
> + */
> +struct adreno_smmu_fault_info {
> +   u64 far;
> +   u64 ttbr0;
> +   u32 contextidr;
> +   u32 fsr;
> +   u32 fsynr0;
> +   u32 fsynr1;
> +   u32 cbfrsynra;
> +};
> +
>  /**
>   * struct adreno_smmu_priv - private interface between adreno-smmu and GPU
>   *
> @@ -17,6 +43,8 @@
>   * @set_ttbr0_cfg: Set the TTBR0 config for the GPUs context bank.  A
>   * NULL config disables TTBR0 translation, otherwise
>   * TTBR0 translation is enabled with the specified cfg
> + * @get_fault_info: Call a helper function in the GPU driver to process a
> + * pagefault

This description isn't quite right, since it is call*ed* by the GPU
driver.  (And the helper aspect is irrelevant to the adreno/smmu
private interface).  Maybe something like:

"Called by the GPU driver fault handler to retrieve information about
a pagefault"

?

BR,
-R

>   *
>   * The GPU driver (drm/msm) and adreno-smmu work together for controlling
>   * the GPU's SMMU instance.  This is by 

[RFC PATCH v1 2/3] drm/msm: Add an adreno-smmu-priv callback to get pagefault info

2020-11-09 Thread Jordan Crouse
Add a callback in adreno-smmu-priv to read interesting SMMU
registers to provide an opportunity for a richer debug experience
in the GPU driver.

Signed-off-by: Jordan Crouse 
---

 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 19 +
 drivers/iommu/arm/arm-smmu/arm-smmu.h  |  2 ++
 include/linux/adreno-smmu-priv.h   | 31 +-
 3 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index d0636c803a36..367a267324a2 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -32,6 +32,24 @@ static void qcom_adreno_smmu_write_sctlr(struct 
arm_smmu_device *smmu, int idx,
arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
 }
 
+static void qcom_adreno_smmu_get_fault_info(const void *cookie,
+   struct adreno_smmu_fault_info *info)
+{
+   struct arm_smmu_domain *smmu_domain = (void *)cookie;
+   struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
+
+   info->fsr = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_FSR);
+   /* FIXME: return error here if we aren't really in a fault? */
+
+   info->fsynr0 = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_FSYNR0);
+   info->fsynr1 = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_FSYNR1);
+   info->far = arm_smmu_cb_readq(smmu, cfg->cbndx, ARM_SMMU_CB_FAR);
+   info->cbfrsynra = arm_smmu_gr1_read(smmu, 
ARM_SMMU_GR1_CBFRSYNRA(cfg->cbndx));
+   info->ttbr0 = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_TTBR0);
+   info->contextidr = arm_smmu_cb_read(smmu, cfg->cbndx, 
ARM_SMMU_CB_CONTEXTIDR);
+}
+
 #define QCOM_ADRENO_SMMU_GPU_SID 0
 
 static bool qcom_adreno_smmu_is_gpu_device(struct device *dev)
@@ -156,6 +174,7 @@ static int qcom_adreno_smmu_init_context(struct 
arm_smmu_domain *smmu_domain,
priv->cookie = smmu_domain;
priv->get_ttbr1_cfg = qcom_adreno_smmu_get_ttbr1_cfg;
priv->set_ttbr0_cfg = qcom_adreno_smmu_set_ttbr0_cfg;
+   priv->get_fault_info = qcom_adreno_smmu_get_fault_info;
 
return 0;
 }
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h 
b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index 04288b6fc619..fe511540a6bf 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -224,6 +224,8 @@ enum arm_smmu_cbar_type {
 #define ARM_SMMU_CB_FSYNR0 0x68
 #define ARM_SMMU_FSYNR0_WNR BIT(4)
 
+#define ARM_SMMU_CB_FSYNR1 0x6c
+
 #define ARM_SMMU_CB_S1_TLBIVA  0x600
 #define ARM_SMMU_CB_S1_TLBIASID 0x610
 #define ARM_SMMU_CB_S1_TLBIVAL 0x620
diff --git a/include/linux/adreno-smmu-priv.h b/include/linux/adreno-smmu-priv.h
index a889f28afb42..fc2592ebb9ba 100644
--- a/include/linux/adreno-smmu-priv.h
+++ b/include/linux/adreno-smmu-priv.h
@@ -8,6 +8,32 @@
 
 #include 
 
+/**
+ * struct adreno_smmu_fault_info - container for key fault information
+ *
+ * @far: The faulting IOVA from ARM_SMMU_CB_FAR
+ * @ttbr0: The current TTBR0 pagetable from ARM_SMMU_CB_TTBR0
+ * @contextidr: The value of ARM_SMMU_CB_CONTEXTIDR
+ * @fsr: The fault status from ARM_SMMU_CB_FSR
+ * @fsynr0: The value of FSYNR0 from ARM_SMMU_CB_FSYNR0
+ * @fsynr1: The value of FSYNR1 from ARM_SMMU_CB_FSYNR1
+ * @cbfrsynra: The value of CBFRSYNRA from ARM_SMMU_GR1_CBFRSYNRA(idx)
+ *
+ * This struct passes back key page fault information to the GPU driver
+ * through the get_fault_info function pointer.
+ * The GPU driver can use this information to print informative
+ * log messages and provide deeper GPU specific insight into the fault.
+ */
+struct adreno_smmu_fault_info {
+   u64 far;
+   u64 ttbr0;
+   u32 contextidr;
+   u32 fsr;
+   u32 fsynr0;
+   u32 fsynr1;
+   u32 cbfrsynra;
+};
+
 /**
  * struct adreno_smmu_priv - private interface between adreno-smmu and GPU
  *
@@ -17,6 +43,8 @@
  * @set_ttbr0_cfg: Set the TTBR0 config for the GPUs context bank.  A
  * NULL config disables TTBR0 translation, otherwise
  * TTBR0 translation is enabled with the specified cfg
+ * @get_fault_info: Call a helper function in the GPU driver to process a
+ * pagefault
  *
  * The GPU driver (drm/msm) and adreno-smmu work together for controlling
  * the GPU's SMMU instance.  This is by necessity, as the GPU is directly
@@ -31,6 +59,7 @@ struct adreno_smmu_priv {
 const void *cookie;
 const struct io_pgtable_cfg *(*get_ttbr1_cfg)(const void *cookie);
 int (*set_ttbr0_cfg)(const void *cookie, const struct io_pgtable_cfg *cfg);
+void (*get_fault_info)(const void *cookie, struct adreno_smmu_fault_info 
*info);
 };
 
-#endif /* __ADRENO_SMMU_PRIV_H */
\ No newline at end of file
+#endif /* __ADRENO_SMMU_PRIV_H */
-- 
2.25.1


[RFC PATCH v1 3/3] drm/msm: Improve the a6xx page fault handler

2020-11-09 Thread Jordan Crouse
Use the new adreno-smmu-priv fault info function to get more SMMU
debug registers and print the current TTBR0 to debug per-instance
pagetables and figure out which GPU block generated the request.

Signed-off-by: Jordan Crouse 
---

 drivers/gpu/drm/msm/adreno/a5xx_gpu.c |  4 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 76 +--
 drivers/gpu/drm/msm/msm_iommu.c   | 11 +++-
 drivers/gpu/drm/msm/msm_mmu.h |  4 +-
 4 files changed, 87 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index d6804a802355..ed4cb81af874 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -933,7 +933,7 @@ bool a5xx_idle(struct msm_gpu *gpu, struct msm_ringbuffer 
*ring)
return true;
 }
 
-static int a5xx_fault_handler(void *arg, unsigned long iova, int flags)
+static int a5xx_fault_handler(void *arg, unsigned long iova, int flags, void 
*data)
 {
struct msm_gpu *gpu = arg;
pr_warn_ratelimited("*** gpu fault: iova=%08lx, flags=%d 
(%u,%u,%u,%u)\n",
@@ -943,7 +943,7 @@ static int a5xx_fault_handler(void *arg, unsigned long 
iova, int flags)
gpu_read(gpu, REG_A5XX_CP_SCRATCH_REG(6)),
gpu_read(gpu, REG_A5XX_CP_SCRATCH_REG(7)));
 
-   return -EFAULT;
+   return 0;
 }
 
 static void a5xx_cp_err_irq(struct msm_gpu *gpu)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 948f3656c20c..ac6e8cd5cf1a 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -905,18 +905,88 @@ static void a6xx_recover(struct msm_gpu *gpu)
msm_gpu_hw_init(gpu);
 }
 
-static int a6xx_fault_handler(void *arg, unsigned long iova, int flags)
+static const char *a6xx_uche_fault_block(struct msm_gpu *gpu, u32 mid)
+{
+   static const char *uche_clients[7] = {
+   "VFD", "SP", "VSC", "VPC", "HLSQ", "PC", "LRZ",
+   };
+   u32 val;
+
+   if (mid < 1 || mid > 3)
+   return "UNKNOWN";
+
+   /*
+* The source of the data depends on the mid ID read from FSYNR1
+* and the client ID read from the UCHE block.
+*/
+   val = gpu_read(gpu, REG_A6XX_UCHE_CLIENT_PF);
+
+   /* mid = 3 is most precise and refers to only one block per client */
+   if (mid == 3)
+   return uche_clients[val & 7];
+
+   /* For mid=2 the source is TP or VFD except when the client id is 0 */
+   if (mid == 2)
+   return ((val & 7) == 0) ? "TP" : "TP|VFD";
+
+   /* For mid=1 just return "UCHE" as a catchall for everything else */
+   return "UCHE";
+}
+
+static const char *a6xx_fault_block(struct msm_gpu *gpu, u32 id)
+{
+   if (id == 0)
+   return "CP";
+   else if (id == 4)
+   return "CCU";
+   else if (id == 6)
+   return "CDP Prefetch";
+
+   return a6xx_uche_fault_block(gpu, id);
+}
+
+#define ARM_SMMU_FSR_TF BIT(1)
+#define ARM_SMMU_FSR_PF BIT(3)
+#define ARM_SMMU_FSR_EF BIT(4)
+
+static int a6xx_fault_handler(void *arg, unsigned long iova, int flags, void 
*data)
 {
struct msm_gpu *gpu = arg;
+   struct adreno_smmu_fault_info *info = data;
+   const char *type = "UNKNOWN";
 
-   pr_warn_ratelimited("*** gpu fault: iova=%08lx, flags=%d 
(%u,%u,%u,%u)\n",
+   /*
+* Print a default message if we couldn't get the data from the
+* adreno-smmu-priv
+*/
+   if (!info) {
+   pr_warn_ratelimited("*** gpu fault: iova=%.16lx flags=%d 
(%u,%u,%u,%u)\n",
iova, flags,
gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(4)),
gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(5)),
gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(6)),
gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(7)));
 
-   return -EFAULT;
+   return 0;
+   }
+
+   if (info->fsr & ARM_SMMU_FSR_TF)
+   type = "TRANSLATION";
+   else if (info->fsr & ARM_SMMU_FSR_PF)
+   type = "PERMISSION";
+   else if (info->fsr & ARM_SMMU_FSR_EF)
+   type = "EXTERNAL";
+
+   pr_warn_ratelimited("*** gpu fault: ttbr0=%.16llx iova=%.16lx dir=%s 
type=%s source=%s (%u,%u,%u,%u)\n",
+   info->ttbr0, iova,
+   flags & IOMMU_FAULT_WRITE ? "WRITE" : "READ", type,
+   a6xx_fault_block(gpu, info->fsynr1 & 0xff),
+   gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(4)),
+   gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(5)),
+   gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(6)),
+   gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(7)));
+
+   return 0;
 }
 
 static void a6xx_cp_hw_err_irq(struct msm_gpu *gpu)

[RFC PATCH v1 0/3] iommu/arm-smmu: adreno-smmu page fault handling

2020-11-09 Thread Jordan Crouse
This is an RFC to add an Adreno GPU specific handler for pagefaults. The first
patch starts by wiring up report_iommu_fault for arm-smmu. The next patch adds
an adreno-smmu-priv function hook to capture a handful of important debugging
registers such as TTBR0, CONTEXTIDR, FSYNR0 and others. This is used by the
third patch to print more detailed information on page faults, such as the TTBR0
for the pagetable that caused the fault and the source of the fault as
determined by a combination of the FSYNR1 register and an internal GPU
register.

This code provides a solid base that we can expand on later for even more
extensive GPU side page fault debugging capabilities.

Jordan Crouse (3):
  iommu/arm-smmu: Add support for driver IOMMU fault handlers
  drm/msm: Add an adreno-smmu-priv callback to get pagefault info
  drm/msm: Improve the a6xx page fault handler

 drivers/gpu/drm/msm/adreno/a5xx_gpu.c  |  4 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c  | 76 +-
 drivers/gpu/drm/msm/msm_iommu.c| 11 +++-
 drivers/gpu/drm/msm/msm_mmu.h  |  4 +-
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 19 ++
 drivers/iommu/arm/arm-smmu/arm-smmu.c  | 16 -
 drivers/iommu/arm/arm-smmu/arm-smmu.h  |  2 +
 include/linux/adreno-smmu-priv.h   | 31 -
 8 files changed, 151 insertions(+), 12 deletions(-)

-- 
2.25.1



[RFC PATCH v1 1/3] iommu/arm-smmu: Add support for driver IOMMU fault handlers

2020-11-09 Thread Jordan Crouse
Call report_iommu_fault() to allow upper-level drivers to register their
own fault handlers.
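
For reference, a rough sketch of the consumer side; the handler body and the
'token' are illustrative, but iommu_set_fault_handler() and the handler
signature are the existing generic IOMMU API that report_iommu_fault() below
dispatches to:

#include <linux/iommu.h>

/*
 * Return 0 to mark the fault handled (arm-smmu then clears FSR and
 * resumes); returning -ENOSYS falls back to the default
 * "Unhandled context fault" print, per the logic in the diff below.
 */
static int my_fault_handler(struct iommu_domain *domain, struct device *dev,
			    unsigned long iova, int flags, void *token)
{
	dev_warn_ratelimited(dev, "iova %#lx faulted (%s)\n", iova,
			     flags & IOMMU_FAULT_WRITE ? "write" : "read");
	return 0;
}

/* Registered once, e.g. at domain attach time:
 *	iommu_set_fault_handler(domain, my_fault_handler, token);
 */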

Signed-off-by: Jordan Crouse 
---

 drivers/iommu/arm/arm-smmu/arm-smmu.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 0f28a8614da3..7fd18bbda8f5 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -427,6 +427,7 @@ static irqreturn_t arm_smmu_context_fault(int irq, void 
*dev)
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_device *smmu = smmu_domain->smmu;
int idx = smmu_domain->cfg.cbndx;
+   int ret;
 
fsr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSR);
if (!(fsr & ARM_SMMU_FSR_FAULT))
@@ -436,11 +437,20 @@ static irqreturn_t arm_smmu_context_fault(int irq, void 
*dev)
iova = arm_smmu_cb_readq(smmu, idx, ARM_SMMU_CB_FAR);
cbfrsynra = arm_smmu_gr1_read(smmu, ARM_SMMU_GR1_CBFRSYNRA(idx));
 
-   dev_err_ratelimited(smmu->dev,
-   "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, 
cbfrsynra=0x%x, cb=%d\n",
+   ret = report_iommu_fault(domain, dev, iova,
+   fsynr & ARM_SMMU_FSYNR0_WNR ? IOMMU_FAULT_WRITE : 
IOMMU_FAULT_READ);
+
+   if (ret == -ENOSYS)
+   dev_err_ratelimited(smmu->dev,
+   "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, 
cbfrsynra=0x%x, cb=%d\n",
fsr, iova, fsynr, cbfrsynra, idx);
 
-   arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_FSR, fsr);
+   /*
+* If the iommu fault handler returned an error (other than -ENOSYS),
+* assume that it will handle resuming on its own
+*/
+   if (!ret || ret == -ENOSYS)
+   arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_FSR, fsr);
return IRQ_HANDLED;
 }
 
-- 
2.25.1



[PATCH v19 3/4] dt-bindings: arm-smmu: Add compatible string for Adreno GPU SMMU

2020-11-09 Thread Jordan Crouse
Every Qcom Adreno GPU has an embedded SMMU for its own use. These
devices depend on unique features such as split pagetables,
different stall/halt requirements, and other settings. Mark them
with a compatible string so that they can be identified in the
arm-smmu implementation-specific code.

Signed-off-by: Jordan Crouse 
Reviewed-by: Rob Herring 
Signed-off-by: Rob Clark 
Reviewed-by: Bjorn Andersson 
---

 Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml 
b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
index 503160a7b9a0..3b63f2ae24db 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
@@ -28,8 +28,6 @@ properties:
   - enum:
   - qcom,msm8996-smmu-v2
   - qcom,msm8998-smmu-v2
-  - qcom,sc7180-smmu-v2
-  - qcom,sdm845-smmu-v2
   - const: qcom,smmu-v2
 
   - description: Qcom SoCs implementing "arm,mmu-500"
@@ -40,6 +38,13 @@ properties:
   - qcom,sm8150-smmu-500
   - qcom,sm8250-smmu-500
   - const: arm,mmu-500
+  - description: Qcom Adreno GPUs implementing "arm,smmu-v2"
+items:
+  - enum:
+  - qcom,sc7180-smmu-v2
+  - qcom,sdm845-smmu-v2
+  - const: qcom,adreno-smmu
+  - const: qcom,smmu-v2
   - description: Marvell SoCs implementing "arm,mmu-500"
 items:
   - const: marvell,ap806-smmu-500
-- 
2.25.1



[PATCH v19 4/4] arm: dts: qcom: sdm845: Set the compatible string for the GPU SMMU

2020-11-09 Thread Jordan Crouse
Set the qcom,adreno-smmu compatible string for the GPU SMMU to enable
split pagetables and per-instance pagetables for drm/msm.

Signed-off-by: Jordan Crouse 
Signed-off-by: Rob Clark 
Reviewed-by: Bjorn Andersson 
---

 arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi | 9 +
 arch/arm64/boot/dts/qcom/sdm845.dtsi   | 2 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi 
b/arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi
index 64fc1bfd66fa..39f23cdcbd02 100644
--- a/arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi
@@ -633,6 +633,15 @@ &mdss_mdp {
status = "okay";
 };
 
+/*
+ * Cheza fw does not properly program the GPU aperture to allow the
+ * GPU to update the SMMU pagetables for context switches.  Work
+ * around this by dropping the "qcom,adreno-smmu" compat string.
+ */
+&adreno_smmu {
+   compatible = "qcom,sdm845-smmu-v2", "qcom,smmu-v2";
+};
+
&mss_pil {
	iommus = <&apps_smmu 0x781 0x0>,
		 <&apps_smmu 0x724 0x3>;
diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index 40e8c11f23ab..0508e86140bd 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -4103,7 +4103,7 @@ opp-25700 {
};
 
adreno_smmu: iommu@504 {
-   compatible = "qcom,sdm845-smmu-v2", "qcom,smmu-v2";
+   compatible = "qcom,sdm845-smmu-v2", "qcom,adreno-smmu", 
"qcom,smmu-v2";
reg = <0 0x504 0 0x1>;
#iommu-cells = <1>;
#global-interrupts = <2>;
-- 
2.25.1



[PATCH v19 1/4] iommu/arm-smmu-qcom: Add implementation for the adreno GPU SMMU

2020-11-09 Thread Jordan Crouse
Add a special implementation for the SMMU attached to most Adreno GPU
targets, triggered by the qcom,adreno-smmu compatible string.

The new Adreno SMMU implementation will enable split pagetables
(TTBR1) for the domain attached to the GPU device (SID 0) and
hard-code it to context bank 0 so the GPU hardware can implement
per-instance pagetables.

Co-developed-by: Rob Clark 
Signed-off-by: Jordan Crouse 
Signed-off-by: Rob Clark 
Reviewed-by: Bjorn Andersson 
---

 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c |   3 +
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 151 -
 drivers/iommu/arm/arm-smmu/arm-smmu.h  |   1 +
 3 files changed, 153 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
index 336f36ed9ed7..7fed89c9d18a 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
@@ -220,6 +220,9 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
arm_smmu_device *smmu)
of_device_is_compatible(np, "qcom,sm8250-smmu-500"))
return qcom_smmu_impl_init(smmu);
 
+   if (of_device_is_compatible(smmu->dev->of_node, "qcom,adreno-smmu"))
+   return qcom_adreno_smmu_impl_init(smmu);
+
if (of_device_is_compatible(np, "marvell,ap806-smmu-500"))
smmu->impl = &mrvl_mmu500_impl;
 
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index 86f0d182703e..b5384c4d92c8 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -3,6 +3,7 @@
  * Copyright (c) 2019, The Linux Foundation. All rights reserved.
  */
 
+#include 
 #include 
 #include 
 
@@ -19,6 +20,134 @@ static struct qcom_smmu *to_qcom_smmu(struct 
arm_smmu_device *smmu)
return container_of(smmu, struct qcom_smmu, smmu);
 }
 
+#define QCOM_ADRENO_SMMU_GPU_SID 0
+
+static bool qcom_adreno_smmu_is_gpu_device(struct device *dev)
+{
+   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   int i;
+
+   /*
+* The GPU will always use SID 0 so that is a handy way to uniquely
+* identify it and configure it for per-instance pagetables
+*/
+   for (i = 0; i < fwspec->num_ids; i++) {
+   u16 sid = FIELD_GET(ARM_SMMU_SMR_ID, fwspec->ids[i]);
+
+   if (sid == QCOM_ADRENO_SMMU_GPU_SID)
+   return true;
+   }
+
+   return false;
+}
+
+static const struct io_pgtable_cfg *qcom_adreno_smmu_get_ttbr1_cfg(
+   const void *cookie)
+{
+   struct arm_smmu_domain *smmu_domain = (void *)cookie;
+   struct io_pgtable *pgtable =
+   io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
+   return &pgtable->cfg;
+}
+
+/*
+ * Local implementation to configure TTBR0 with the specified pagetable config.
+ * The GPU driver will call this to enable TTBR0 when per-instance pagetables
+ * are active
+ */
+
+static int qcom_adreno_smmu_set_ttbr0_cfg(const void *cookie,
+   const struct io_pgtable_cfg *pgtbl_cfg)
+{
+   struct arm_smmu_domain *smmu_domain = (void *)cookie;
+   struct io_pgtable *pgtable = 
io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
+   struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+   struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
+
+   /* The domain must have split pagetables already enabled */
+   if (cb->tcr[0] & ARM_SMMU_TCR_EPD1)
+   return -EINVAL;
+
+   /* If the pagetable config is NULL, disable TTBR0 */
+   if (!pgtbl_cfg) {
+   /* Do nothing if it is already disabled */
+   if ((cb->tcr[0] & ARM_SMMU_TCR_EPD0))
+   return -EINVAL;
+
+   /* Set TCR to the original configuration */
+   cb->tcr[0] = arm_smmu_lpae_tcr(&pgtable->cfg);
+   cb->ttbr[0] = FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
+   } else {
+   u32 tcr = cb->tcr[0];
+
+   /* Don't call this again if TTBR0 is already enabled */
+   if (!(cb->tcr[0] & ARM_SMMU_TCR_EPD0))
+   return -EINVAL;
+
+   tcr |= arm_smmu_lpae_tcr(pgtbl_cfg);
+   tcr &= ~(ARM_SMMU_TCR_EPD0 | ARM_SMMU_TCR_EPD1);
+
+   cb->tcr[0] = tcr;
+   cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
+   cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
+   }
+
+   arm_smmu_write_context_bank(smmu_domain->smmu, cb->cfg->cbndx);
+
+   return 0;
+}
+
+static int qcom_adreno_smmu_alloc_context_bank(struct arm_smmu_domain 
*smmu_domain,
+  struct arm_smmu_device *smmu,
+  struct device *dev, int start)
+{
+   int count;
+
+   /*
+* Assign context bank 0 to the GPU device so the GPU hardware can
+* switch pagetables
+*/
+ 

[PATCH v19 2/4] iommu/arm-smmu: Add a way for implementations to influence SCTLR

2020-11-09 Thread Jordan Crouse
From: Rob Clark 

For the Adreno GPU's SMMU, we want SCTLR.HUPCF set to ensure that
pending translations are not terminated on iova fault.  Otherwise
a terminated CP read could hang the GPU by returning invalid
command-stream data. Add a hook to for the implementation to modify
the sctlr value if it wishes.

Co-developed-by: Jordan Crouse 
Signed-off-by: Rob Clark 
Signed-off-by: Jordan Crouse 
---

 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 13 +
 drivers/iommu/arm/arm-smmu/arm-smmu.c  |  5 -
 drivers/iommu/arm/arm-smmu/arm-smmu.h  |  2 ++
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index b5384c4d92c8..d0636c803a36 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -20,6 +20,18 @@ static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device 
*smmu)
return container_of(smmu, struct qcom_smmu, smmu);
 }
 
+static void qcom_adreno_smmu_write_sctlr(struct arm_smmu_device *smmu, int idx,
+   u32 reg)
+{
+   /*
+* On the GPU device we want to process subsequent transactions after a
+* fault to keep the GPU from hanging
+*/
+   reg |= ARM_SMMU_SCTLR_HUPCF;
+
+   arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
+}
+
 #define QCOM_ADRENO_SMMU_GPU_SID 0
 
 static bool qcom_adreno_smmu_is_gpu_device(struct device *dev)
@@ -289,6 +301,7 @@ static const struct arm_smmu_impl qcom_adreno_smmu_impl = {
.def_domain_type = qcom_smmu_def_domain_type,
.reset = qcom_smmu500_reset,
.alloc_context_bank = qcom_adreno_smmu_alloc_context_bank,
+   .write_sctlr = qcom_adreno_smmu_write_sctlr,
 };
 
 static struct arm_smmu_device *qcom_smmu_create(struct arm_smmu_device *smmu,
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index bcbacf22331d..0f28a8614da3 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -617,7 +617,10 @@ void arm_smmu_write_context_bank(struct arm_smmu_device 
*smmu, int idx)
if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
reg |= ARM_SMMU_SCTLR_E;
 
-   arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
+   if (smmu->impl && smmu->impl->write_sctlr)
+   smmu->impl->write_sctlr(smmu, idx, reg);
+   else
+   arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
 }
 
 static int arm_smmu_alloc_context_bank(struct arm_smmu_domain *smmu_domain,
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h 
b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index 9a5eb6782918..04288b6fc619 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -144,6 +144,7 @@ enum arm_smmu_cbar_type {
 #define ARM_SMMU_CB_SCTLR  0x0
 #define ARM_SMMU_SCTLR_S1_ASIDPNE  BIT(12)
 #define ARM_SMMU_SCTLR_CFCFG   BIT(7)
+#define ARM_SMMU_SCTLR_HUPCF   BIT(8)
 #define ARM_SMMU_SCTLR_CFIE BIT(6)
 #define ARM_SMMU_SCTLR_CFRE BIT(5)
 #define ARM_SMMU_SCTLR_E   BIT(4)
@@ -437,6 +438,7 @@ struct arm_smmu_impl {
  struct arm_smmu_device *smmu,
  struct device *dev, int start);
void (*write_s2cr)(struct arm_smmu_device *smmu, int idx);
+   void (*write_sctlr)(struct arm_smmu_device *smmu, int idx, u32 reg);
 };
 
 #define INVALID_SMENDX -1
-- 
2.25.1



[PATCH v19 0/4] iommu/arm-smmu: Add adreno-smmu implementation and bindings

2020-11-09 Thread Jordan Crouse
This short series adds support for the adreno-smmu implementation of the
arm-smmu driver and the device-tree bindings to turn on the implementation
for the sdm845 and sc7180 GPUs. These changes are the last ones needed to enable
per-instance pagetables in the drm/msm driver.

v19: Rebase to kernel/git/will/linux.git for-joerg/arm-smmu/updates to pick up
 system cache patches and devm_realloc() updates. Use a function hook to
 modify / write sctlr
v18: No deltas in this patchset since the last go-around for 5.10 [1].

[1] https://patchwork.freedesktop.org/series/81393/

Jordan Crouse (3):
  iommu/arm-smmu-qcom: Add implementation for the adreno GPU SMMU
  dt-bindings: arm-smmu: Add compatible string for Adreno GPU SMMU
  arm: dts: qcom: sdm845: Set the compatible string for the GPU SMMU

Rob Clark (1):
  iommu/arm-smmu: Add a way for implementations to influence SCTLR

 .../devicetree/bindings/iommu/arm,smmu.yaml   |   9 +-
 arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi|   9 +
 arch/arm64/boot/dts/qcom/sdm845.dtsi  |   2 +-
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c|   3 +
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c| 164 +-
 drivers/iommu/arm/arm-smmu/arm-smmu.c |   5 +-
 drivers/iommu/arm/arm-smmu/arm-smmu.h |   3 +
 7 files changed, 189 insertions(+), 6 deletions(-)

-- 
2.25.1



Re: [RFC PATCH 15/15] nvme-pci: Allow mmaping the CMB in userspace

2020-11-09 Thread Logan Gunthorpe



On 2020-11-09 8:03 a.m., Keith Busch wrote:
> On Fri, Nov 06, 2020 at 10:00:36AM -0700, Logan Gunthorpe wrote:
>> Allow userspace to obtain CMB memory by mmaping the controller's
>> char device. The mmap call allocates and returns a hunk of CMB memory,
>> (the offset is ignored) so userspace does not have control over the
>> address within the CMB.
>>
>> A VMA allocated in this way will only be usable by drivers that set
>> FOLL_PCI_P2PDMA when calling GUP. And inter-device support will be
>> checked the first time the pages are mapped for DMA.
>>
>> Currently this is only supported by O_DIRECT to a PCI NVMe device
>> or through the NVMe passthrough IOCTL.
> 
> Rather than make this be specific to nvme, could pci p2pdma create an
> mmap'able file for any resource registered with it?

It's certainly possible. However, other people have been arguing that
more of this should be specific to NVMe as some use cases do not want to
use the genalloc inside p2pdma.

Logan


Re: [RFC PATCH 04/15] lib/scatterlist: Add flag for indicating P2PDMA segments in an SGL

2020-11-09 Thread Logan Gunthorpe



On 2020-11-09 2:12 a.m., Christoph Hellwig wrote:
> On Fri, Nov 06, 2020 at 10:00:25AM -0700, Logan Gunthorpe wrote:
>> We make use of the top bit of the dma_length to indicate a P2PDMA
>> segment.
> 
> I don't think "we" can.  There is nothing limiting the size of a SGL
> segment.

Yes, I expected this would be the unacceptable part. Any alternative ideas?

Logan
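
For readers following the thread, a sketch of the encoding being debated
(illustrative only: the flag name is invented, it requires
CONFIG_NEED_SG_DMA_LENGTH so that sg->dma_length exists, and, as noted above,
nothing actually bounds a segment's DMA length to the remaining 31 bits):

#include <linux/scatterlist.h>

#define SG_DMA_P2PDMA_FLAG	(1U << 31)	/* invented name */

static inline void sg_dma_mark_p2pdma(struct scatterlist *sg)
{
	sg->dma_length |= SG_DMA_P2PDMA_FLAG;
}

static inline bool sg_dma_is_p2pdma(struct scatterlist *sg)
{
	return sg->dma_length & SG_DMA_P2PDMA_FLAG;
}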


Re: [PATCH 8/8] WIP: add a dma_alloc_contiguous API

2020-11-09 Thread Ricardo Ribalda
Hi Christoph

I have now started giving your patchset a try. Sorry for the delay.

For uvc I have prepared this patch:
https://github.com/ribalda/linux/commit/9094fe223fe38f8c8ff21366d893b43cbbdf0113

I have tested it successfully on an x86_64 notebook... yes, I know there
is no change for that platform :).
I am trying to get hold of an arm device that can run the latest
kernel from upstream.

In the meantime, if you could take a look at the patch to verify that
this is the way you expect drivers to use your API, I would
appreciate it.

Thanks
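
For context, a rough driver-side sketch of the flow being discussed. The
entry points and signatures here (dma_alloc_noncontiguous() and
dma_free_noncontiguous()) are assumptions for illustration only; the quoted
WIP diff below shows just the dma-iommu internals:

/* Hypothetical usage; names and signatures are assumed, not from the patch. */
struct page **pages;
dma_addr_t dma;

pages = dma_alloc_noncontiguous(dev, size, &dma, GFP_KERNEL, 0);
if (!pages)
	return -ENOMEM;

/*
 * The device DMAs via 'dma'. There is no kernel mapping, so the driver
 * maps 'pages' itself where needed, e.g. vm_map_pages() for userspace.
 */

dma_free_noncontiguous(dev, size, pages, dma);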



On Wed, Oct 14, 2020 at 3:20 PM Tomasz Figa  wrote:
>
> +CC Ricardo who will be looking into using this in the USB stack (UVC
> camera driver).
>
> On Wed, Sep 30, 2020 at 6:09 PM Christoph Hellwig  wrote:
> >
> > Add a new API that returns a virtually non-contiguous array of pages
> > and a DMA address.  This API is only implemented for dma-iommu and will
> > not be implemented for non-iommu DMA API instances that have to allocate
> > contiguous memory.  It is up to the caller to check if the API is
> > available.
> >
> > The intent is that media drivers can use this API if either:
> >
> >  - no kernel mapping or only temporary kernel mappings are required.
> >That is as a better replacement for DMA_ATTR_NO_KERNEL_MAPPING
> >  - a kernel mapping is required for cached and DMA mapped pages, but
> >the driver also needs the pages to e.g. map them to userspace.
> >In that sense it is a replacement for some aspects of the recently
> >removed and never fully implemented DMA_ATTR_NON_CONSISTENT
> >
> > Signed-off-by: Christoph Hellwig 
> > ---
> >  drivers/iommu/dma-iommu.c   | 73 +
> >  include/linux/dma-mapping.h |  9 +
> >  kernel/dma/mapping.c| 35 ++
> >  3 files changed, 93 insertions(+), 24 deletions(-)
> >
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index 7922f545cd5eef..158026a856622c 100644
> > --- a/drivers/iommu/dma-iommu.c
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -565,23 +565,12 @@ static struct page **__iommu_dma_alloc_pages(struct 
> > device *dev,
> > return pages;
> >  }
> >
> > -/**
> > - * iommu_dma_alloc_remap - Allocate and map a buffer contiguous in IOVA 
> > space
> > - * @dev: Device to allocate memory for. Must be a real device
> > - *  attached to an iommu_dma_domain
> > - * @size: Size of buffer in bytes
> > - * @dma_handle: Out argument for allocated DMA handle
> > - * @gfp: Allocation flags
> > - * @prot: pgprot_t to use for the remapped mapping
> > - * @attrs: DMA attributes for this allocation
> > - *
> > - * If @size is less than PAGE_SIZE, then a full CPU page will be allocated,
> > +/*
> > + * If size is less than PAGE_SIZE, then a full CPU page will be allocated,
> >   * but an IOMMU which supports smaller pages might not map the whole thing.
> > - *
> > - * Return: Mapped virtual address, or NULL on failure.
> >   */
> > -static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
> > -   dma_addr_t *dma_handle, gfp_t gfp, pgprot_t prot,
> > +static struct page **__iommu_dma_alloc_noncontiguous(struct device *dev,
> > +   size_t size, dma_addr_t *dma_handle, gfp_t gfp, pgprot_t prot,
> > unsigned long attrs)
> >  {
> > struct iommu_domain *domain = iommu_get_dma_domain(dev);
> > @@ -593,7 +582,6 @@ static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
> > struct page **pages;
> > struct sg_table sgt;
> > dma_addr_t iova;
> > -   void *vaddr;
> >
> > *dma_handle = DMA_MAPPING_ERROR;
> >
> > @@ -636,17 +624,10 @@ static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
> > < size)
> > goto out_free_sg;
> >
> > -   vaddr = dma_common_pages_remap(pages, size, prot,
> > -   __builtin_return_address(0));
> > -   if (!vaddr)
> > -   goto out_unmap;
> > -
> > *dma_handle = iova;
> > sg_free_table(&sgt);
> > -   return vaddr;
> > +   return pages;
> >
> > -out_unmap:
> > -   __iommu_dma_unmap(dev, iova, size);
> >  out_free_sg:
> > sg_free_table(&sgt);
> >  out_free_iova:
> > @@ -656,6 +637,46 @@ static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
> > return NULL;
> >  }
> >
> > +static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
> > +   dma_addr_t *dma_handle, gfp_t gfp, pgprot_t prot,
> > +   unsigned long attrs)
> > +{
> > +   struct page **pages;
> > +   void *vaddr;
> > +
> > +   pages = __iommu_dma_alloc_noncontiguous(dev, size, dma_handle, gfp,
> > +   prot, attrs);
> > +   if (!pages)
> > +   return NULL;
> > +   vaddr = dma_common_pages_remap(pages, size, prot,
> > +   __builtin_return_address(0));
> > +   

RE: [Devel] Re: [RFC PATCH 2/4] ACPI/IORT: Add support for RMR node parsing

2020-11-09 Thread Sami Mujawar
Hi,

-Original Message-
From: David E. Box  
Sent: 28 October 2020 06:44 PM
To: Shameer Kolothum ; 
linux-arm-ker...@lists.infradead.org; linux-a...@vger.kernel.org; 
iommu@lists.linux-foundation.org; de...@acpica.org
Cc: linux...@huawei.com; Lorenzo Pieralisi ; 
j...@8bytes.org; Robin Murphy ; wanghuiqi...@huawei.com; 
jonathan.came...@huawei.com
Subject: [Devel] Re: [RFC PATCH 2/4] ACPI/IORT: Add support for RMR node parsing

Hi,

On Tue, 2020-10-27 at 11:26 +, Shameer Kolothum wrote:

...

> @@ -1647,6 +1667,100 @@ static void __init iort_enable_acs(struct
> acpi_iort_node *iort_node)
>  #else
>  static inline void iort_enable_acs(struct acpi_iort_node *iort_node)
> { }
>  #endif
> +static int iort_rmr_desc_valid(struct acpi_iort_rmr_desc *desc)
> +{
> + struct iort_rmr_entry *e;
> + u64 end, start = desc->base_address, length = desc->length;
> +
> + if ((!IS_ALIGNED(start, SZ_64K)) || (length % SZ_64K != 0))

You could just do:

if ((!IS_ALIGNED(start, SZ_64K)) || (length % SZ_64K))

[SAMI] In my opinion, the following may be better:
if (!IS_ALIGNED(start, SZ_64K) || !IS_ALIGNED(length, SZ_64K)) 
[/SAMI]
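
Put together, the helper with that form applied would read something
like this sketch (the RFC's later overlap checks are omitted):

static int iort_rmr_desc_valid(struct acpi_iort_rmr_desc *desc)
{
	u64 start = desc->base_address;
	u64 length = desc->length;

	/* Both base and length must be 64K aligned. */
	if (!IS_ALIGNED(start, SZ_64K) || !IS_ALIGNED(length, SZ_64K))
		return -EINVAL;

	return 0;
}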

Regards,

Sami Mujawar

David


Re: [RFC PATCH 15/15] nvme-pci: Allow mmaping the CMB in userspace

2020-11-09 Thread Keith Busch
On Fri, Nov 06, 2020 at 10:00:36AM -0700, Logan Gunthorpe wrote:
> Allow userspace to obtain CMB memory by mmaping the controller's
> char device. The mmap call allocates and returns a hunk of CMB memory,
> (the offset is ignored) so userspace does not have control over the
> address within the CMB.
> 
> A VMA allocated in this way will only be usable by drivers that set
> FOLL_PCI_P2PDMA when calling GUP. And inter-device support will be
> checked the first time the pages are mapped for DMA.
> 
> Currently this is only supported by O_DIRECT to a PCI NVMe device
> or through the NVMe passthrough IOCTL.

Rather than make this specific to nvme, could pci p2pdma create an
mmap'able file for any resource registered with it?
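
(For concreteness, the proposed interface from userspace — a sketch;
the char-device path is an assumption, and the mmap offset is ignored
per the commit message:)

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/dev/nvme0", O_RDWR);	/* controller char device */
	if (fd < 0)
		return 1;

	/* The offset is ignored: the kernel chooses where in the CMB
	 * the allocation lands. */
	void *cmb = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED,
			 fd, 0);
	close(fd);
	if (cmb == MAP_FAILED)
		return 1;

	/* Pages here are only usable for DMA via FOLL_PCI_P2PDMA-aware
	 * paths, e.g. O_DIRECT reads/writes to an NVMe device. */
	munmap(cmb, 4096);
	return 0;
}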


Re: [RFC PATCH 04/15] lib/scatterlist: Add flag for indicating P2PDMA segments in an SGL

2020-11-09 Thread Robin Murphy

On 2020-11-09 09:12, Christoph Hellwig wrote:

On Fri, Nov 06, 2020 at 10:00:25AM -0700, Logan Gunthorpe wrote:

We make use of the top bit of the dma_length to indicate a P2PDMA
segment.


I don't think "we" can.  There is nothing limiting the size of a SGL
segment.


Right, the story behind ab2cbeb0ed30 ("iommu/dma: Handle SG length 
overflow better") comes immediately to mind, for one. If all the P2P 
users can agree to be in on the game then by all means implement this in 
the P2P code, but I don't think it belongs in the generic top-level 
scatterlist API.


Robin.


Re: [PATCH] dma-pool: no need to check return value of debugfs_create functions

2020-11-09 Thread Robin Murphy

On 2020-11-07 10:03, Tiezhu Yang wrote:

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.


Well, the only difference in behaviour is that it won't attempt to call 
further debugfs functions if they're definitely going to fail anyway, so 
no "real" logic is affected. AFAICS it's not possible for 
debugfs_create_dir() to return NULL, so this check makes no practical 
difference; it just means that if it did ever fail, we would save a bit 
of unnecessary work by not subsequently calling all the way down to the 
"if (IS_ERR(parent))" check in start_creating() several times.


So the given justification is a little overdramatic for this particular 
situation, when it's really just that it's not worth optimising an 
unexpected failure case, but the change itself is obviously fine.
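
To make the propagation concrete, the idiomatic pattern looks like this
sketch:

#include <linux/debugfs.h>

static unsigned long example_stat;

static void __init example_debugfs_init(void)
{
	struct dentry *root;

	/* No error check needed: debugfs_create_dir() never returns
	 * NULL, and if it returns an ERR_PTR the child call below
	 * detects it via the IS_ERR(parent) check in start_creating()
	 * and quietly does nothing. */
	root = debugfs_create_dir("example", NULL);
	debugfs_create_ulong("example_stat", 0400, root, &example_stat);
}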


Reviewed-by: Robin Murphy 


Signed-off-by: Tiezhu Yang 
---
  kernel/dma/pool.c | 3 ---
  1 file changed, 3 deletions(-)

diff --git a/kernel/dma/pool.c b/kernel/dma/pool.c
index d4637f7..5f84e6c 100644
--- a/kernel/dma/pool.c
+++ b/kernel/dma/pool.c
@@ -38,9 +38,6 @@ static void __init dma_atomic_pool_debugfs_init(void)
struct dentry *root;
  
  	root = debugfs_create_dir("dma_pools", NULL);

-   if (IS_ERR_OR_NULL(root))
-   return;
-
debugfs_create_ulong("pool_size_dma", 0400, root, _size_dma);
debugfs_create_ulong("pool_size_dma32", 0400, root, _size_dma32);
debugfs_create_ulong("pool_size_kernel", 0400, root, _size_kernel);




Re: [PATCH v5 2/2] iommu/iova: Free global iova rcache on iova alloc failure

2020-11-09 Thread John Garry

On 03/11/2020 15:59, Robin Murphy wrote:

alloc failure even after retry, as the global rcache is holding the
IOVAs, which can cause fragmentation.
So, free the global IOVA rcache as well and then go for the
retry.




If we do clear all the CPU rcaches, it would be nice to have something 
immediately available to replenish them, i.e. use the global rcache 
instead of flushing it, if that is not required...


If we've reached the point of clearing *any* caches, though, I think any 
hope of maintaining performance is already long gone. We've walked the 
rbtree for the entire address space and found that it's still too full 
to allocate from; we're teetering on the brink of hard failure and this 
is a last-ditch attempt to claw back as much as possible in the hope 
that it gives us a usable space.
TBH I'm not entirely sure what allocation pattern was expected by the 
original code such that purging only some of the caches made sense,


I'd say that the assumption is that once the CPU rcaches are flushed, 
then we should have space again. No need to go any further.


nor 
what kind of pattern leads to lots of smaller IOVAs being allocated, 
freed, and never reused to the point of blocking larger allocations, but 
either way the reasoning does at least seem to hold up in abstract.


Ok, but I'd like to see that hard failure (if you get my meaning). 
Flushing the depot rcache may be papering over some other bug.


Either way, I don't feel too strongly, so if you're happy then I won't 
try to block [apart from the comment below]:

Acked-by: John Garry 



This looks reasonable to me - it's mildly annoying that we end up 
with so many similar-looking functions,


Well I did add a function to clear all CPU rcaches here, if you would 
like to check:


https://lore.kernel.org/linux-iommu/1603733501-211004-2-git-send-email-john.ga...@huawei.com/ 



I was thinking more of the way free_iova_rcaches(), 
free_cpu_cached_iovas(), and free_global_cached_iovas() all look pretty 
much the same shape at a glance.


but the necessary differences are right down in the middle of the 
loops so nothing can reasonably be factored out :(


Reviewed-by: Robin Murphy 


Signed-off-by: Vijayanand Jitta 
---
  drivers/iommu/iova.c | 23 +++
  1 file changed, 23 insertions(+)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index c3a1a8e..faf9b13 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -25,6 +25,7 @@ static void init_iova_rcaches(struct iova_domain *iovad);

  static void free_iova_rcaches(struct iova_domain *iovad);
  static void fq_destroy_all_entries(struct iova_domain *iovad);
  static void fq_flush_timeout(struct timer_list *t);
+static void free_global_cached_iovas(struct iova_domain *iovad);


a thought: It would be great if the file could be rearranged at some 
point so that we don't require so many forward declarations.



  void
  init_iova_domain(struct iova_domain *iovad, unsigned long granule,
@@ -442,6 +443,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,

  flush_rcache = false;
  for_each_online_cpu(cpu)
  free_cpu_cached_iovas(cpu, iovad);
+    free_global_cached_iovas(iovad);
  goto retry;
  }
@@ -1057,5 +1059,26 @@ void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad)

  }
  }
+/*
+ * free all the IOVA ranges of global cache
+ */
+static void free_global_cached_iovas(struct iova_domain *iovad)
+{
+    struct iova_rcache *rcache;
+    unsigned long flags;
+    int i, j;
+
+    for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
+    rcache = &iovad->rcaches[i];
+    spin_lock_irqsave(&rcache->lock, flags);
+    for (j = 0; j < rcache->depot_size; ++j) {
+    iova_magazine_free_pfns(rcache->depot[j], iovad);
+    iova_magazine_free(rcache->depot[j]);
+    rcache->depot[j] = NULL;


I don't think that NULLifying is strictly necessary


True, we don't explicitly clear depot entries in __iova_rcache_get() for 
normal operation, so there's not much point in doing so here.


Right, so I think that it would be nice not to NULLify, for 
consistency.




Robin.



Re: [RFC PATCH 04/15] lib/scatterlist: Add flag for indicating P2PDMA segments in an SGL

2020-11-09 Thread Christoph Hellwig
On Fri, Nov 06, 2020 at 10:00:25AM -0700, Logan Gunthorpe wrote:
> We make use of the top bit of the dma_length to indicate a P2PDMA
> segment.

I don't think "we" can.  There is nothing limiting the size of a SGL
segment.
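
(For concreteness, the scheme in question amounts to the following
sketch — the flag name is invented here, and sg->dma_length assumes a
CONFIG_NEED_SG_DMA_LENGTH build:)

#define SG_DMA_P2PDMA_FLAG	(1U << 31)

/* Mark a mapped segment as peer-to-peer DMA. */
static inline void sg_dma_mark_p2pdma(struct scatterlist *sg)
{
	sg->dma_length |= SG_DMA_P2PDMA_FLAG;
}

/* The fragile part: any ordinary segment of 2 GiB or more would be
 * indistinguishable from a P2PDMA one. */
static inline bool sg_dma_is_p2pdma(struct scatterlist *sg)
{
	return sg->dma_length & SG_DMA_P2PDMA_FLAG;
}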


Re: [RFC PATCH 01/15] PCI/P2PDMA: Don't sleep in upstream_bridge_distance_warn()

2020-11-09 Thread Christoph Hellwig
On Fri, Nov 06, 2020 at 10:00:22AM -0700, Logan Gunthorpe wrote:
> In order to call this function from a dma_map function, it must not sleep.
> The only reason it sleeps is to allocate the seqbuf used to print
> which devices are within the ACS path.
> 
> Switch the kmalloc call to use GFP_NOWAIT and simply not print that
> message if the buffer fails to be allocated.

Please pass in the actual gfp_t.  Especially from an I/O path
GFP_NOWAIT is not the right gfp_t anyway; you probably want GFP_ATOMIC
there.  But for the paths where we can sleep, we should allow that too.
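
Something along these lines, i.e. (a sketch of the suggestion;
signature abridged from the RFC):

/* Thread the caller's gfp_t through instead of hardcoding GFP_NOWAIT. */
static void upstream_bridge_distance_warn(struct pci_dev *provider,
					  struct pci_dev *client, gfp_t gfp)
{
	char *buf;

	/* GFP_KERNEL where sleeping is allowed, GFP_ATOMIC from I/O paths */
	buf = kmalloc(PAGE_SIZE, gfp);
	if (!buf)
		return;		/* just skip printing the ACS path */

	/* ... build the seq_buf and print the warning as before ... */
	kfree(buf);
}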


Re: [PATCH] iommu/vt-d: avoid unnecessory panic if iommu init fail in tboot

2020-11-09 Thread Zhenzhong Duan
Hi Baolu,

On Mon, Nov 9, 2020 at 11:15 AM Lu Baolu  wrote:
>
> Hi Zhenzhong,
>
> On 11/9/20 10:27 AM, Zhenzhong Duan wrote:
> > +intel iommu maintainers,
> >
> > Can anyone help review this patch? Thanks
> >
> > Zhenzhong
> >
> > On Wed, Nov 4, 2020 at 4:15 PM Zhenzhong Duan  
> > wrote:
> >>
> >> "intel_iommu=off" command line is used to disable iommu and iommu is force
> >> enabled in a tboot system. Meanwhile "intel_iommu=tboot_noforce,off"
> >> could be used to force disable iommu in tboot for better performance.
> >>
> >> By default the kernel should panic if IOMMU init fails in tboot for security
> >> reasons, but it's unnecessary if we use "intel_iommu=tboot_noforce,off".
> >>
> >> Fix it by returning 0 in tboot_force_iommu() so that force_on is not set.
> >>
> >> Fixes: 7304e8f28bb2 ("iommu/vt-d: Correctly disable Intel IOMMU force on")
> >> Signed-off-by: Zhenzhong Duan 
> >> ---
> >>   arch/x86/kernel/tboot.c | 5 +
> >>   1 file changed, 1 insertion(+), 4 deletions(-)
> >>
> >> diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
> >> index 992fb1415c0f..9fd4d162b331 100644
> >> --- a/arch/x86/kernel/tboot.c
> >> +++ b/arch/x86/kernel/tboot.c
> >> @@ -511,12 +511,9 @@ struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tb
> >>
> >>   int tboot_force_iommu(void)
> >>   {
> >> -   if (!tboot_enabled())
> >> +   if (!tboot_enabled() || intel_iommu_tboot_noforce)
> >>  return 0;
> >>
> >> -   if (intel_iommu_tboot_noforce)
> >> -   return 1;
>
> This was obviously wrong. It should return false, right?

I guess you missed the above change: "if (!tboot_enabled() ||
intel_iommu_tboot_noforce)".
It does return false.
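
With the change applied, the early exit is a single guard, i.e.
roughly:

int tboot_force_iommu(void)
{
	/* Nothing to force if tboot is off or the user opted out. */
	if (!tboot_enabled() || intel_iommu_tboot_noforce)
		return 0;

	/* ... force the IOMMU on as before ... */
	return 1;
}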

>
> It looks odd that intel_iommu_tboot_noforce is defined in the Intel IOMMU
> implementation but is used here. How about moving it back to the iommu
> driver?

That's better, will do. Thanks for your suggestion.

Zhenzhong