[PATCH 1/1] iommu/vt-d: Fix NULL pointer dereference in dev_iommu_priv_set()

2020-09-02 Thread Lu Baolu
dev_iommu_priv_set() must be called after probe_device(). This fixes
a NULL pointer dereference bug when booting a system with kernel cmdline
"intel_iommu=on,igfx_off", where dev_iommu_priv_set() is abused.

The following stacktrace was produced:

[0.00] Command line: BOOT_IMAGE=/isolinux/bzImage console=tty1 
intel_iommu=on,igfx_off
...
[3.341682] DMAR: Host address width 39
[3.341684] DMAR: DRHD base: 0x00fed9 flags: 0x0
[3.341702] DMAR: dmar0: reg_base_addr fed9 ver 1:0 cap 1cc40660462 
ecap 19e2ff0505e
[3.341705] DMAR: DRHD base: 0x00fed91000 flags: 0x1
[3.341711] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 
ecap f050da
[3.341713] DMAR: RMRR base: 0x009aa9f000 end: 0x009aabefff
[3.341716] DMAR: RMRR base: 0x009d00 end: 0x009f7f
[3.341726] DMAR: No ATSR found
[3.341772] BUG: kernel NULL pointer dereference, address: 0038
[3.341774] #PF: supervisor write access in kernel mode
[3.341776] #PF: error_code(0x0002) - not-present page
[3.341777] PGD 0 P4D 0
[3.341780] Oops: 0002 [#1] SMP PTI
[3.341783] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.9.0-devel+ #2
[3.341785] Hardware name: LENOVO 20HGS0TW00/20HGS0TW00, BIOS N1WET46S 
(1.25s ) 03/30/2018
[3.341790] RIP: 0010:intel_iommu_init+0xed0/0x1136
[3.341792] Code: fe e9 61 02 00 00 bb f4 ff ff ff e9 57 02 00 00 48 63 d1 
48 c1 e2 04 48
 03 50 20 48 8b 12 48 85 d2 74 0b 48 8b 92 d0 02 00 00 48 
89 7a 38 ff c1
 e9 15 f5 ff ff 48 c7 c7 60 99 ac a7 49 c7 c7 a0
[3.341796] RSP: :96d180073dd0 EFLAGS: 00010282
[3.341798] RAX: 8c91037a7d20 RBX:  RCX: 
[3.341800] RDX:  RSI:  RDI: 
[3.341802] RBP: 96d180073e90 R08: 0001 R09: 8c91039fe3c0
[3.341804] R10: 0226 R11: 0226 R12: 000b
[3.341806] R13: 8c910367c650 R14: a8426d60 R15: 
[3.341808] FS:  () GS:8c910748() 
knlGS:
[3.341810] CS:  0010 DS:  ES:  CR0: 80050033
[3.341812] CR2: 0038 CR3: 0004b100a001 CR4: 003706e0
[3.341814] Call Trace:
[3.341820]  ? _raw_spin_unlock_irqrestore+0x1f/0x30
[3.341824]  ? call_rcu+0x10e/0x320
[3.341828]  ? trace_hardirqs_on+0x2c/0xd0
[3.341831]  ? rdinit_setup+0x2c/0x2c
[3.341834]  ? e820__memblock_setup+0x8b/0x8b
[3.341836]  pci_iommu_init+0x16/0x3f
[3.341839]  do_one_initcall+0x46/0x1e4
[3.341842]  kernel_init_freeable+0x169/0x1b2
[3.341845]  ? rest_init+0x9f/0x9f
[3.341847]  kernel_init+0xa/0x101
[3.341849]  ret_from_fork+0x22/0x30
[3.341851] Modules linked in:
[3.341854] CR2: 0038
[3.341860] ---[ end trace 3653722a6f936f18 ]---

Fixes: 01b9d4e21148c ("iommu/vt-d: Use dev_iommu_priv_get/set()")
Reported-by: Torsten Hilbrich 
Reported-by: Wendy Wang 
Link: https://lore.kernel.org/linux-iommu/96717683-70be-7388-3d2f-61131070a...@secunet.com/
Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel/iommu.c | 100 
 1 file changed, 55 insertions(+), 45 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 50431c7b2e71..777b9be60a0e 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -366,7 +366,6 @@ static int iommu_skip_te_disable;
 int intel_iommu_gfx_mapped;
 EXPORT_SYMBOL_GPL(intel_iommu_gfx_mapped);
 
-#define DUMMY_DEVICE_DOMAIN_INFO ((struct device_domain_info *)(-1))
 #define DEFER_DEVICE_DOMAIN_INFO ((struct device_domain_info *)(-2))
 struct device_domain_info *get_domain_info(struct device *dev)
 {
@@ -376,8 +375,7 @@ struct device_domain_info *get_domain_info(struct device *dev)
 		return NULL;
 
 	info = dev_iommu_priv_get(dev);
-	if (unlikely(info == DUMMY_DEVICE_DOMAIN_INFO ||
-		     info == DEFER_DEVICE_DOMAIN_INFO))
+	if (unlikely(info == DEFER_DEVICE_DOMAIN_INFO))
 		return NULL;
 
 	return info;
@@ -773,11 +771,6 @@ struct context_entry *iommu_context_addr(struct intel_iommu *iommu, u8 bus,
 	return &context[devfn];
 }
 
-static int iommu_dummy(struct device *dev)
-{
-	return dev_iommu_priv_get(dev) == DUMMY_DEVICE_DOMAIN_INFO;
-}
-
 static bool attach_deferred(struct device *dev)
 {
 	return dev_iommu_priv_get(dev) == DEFER_DEVICE_DOMAIN_INFO;
@@ -810,6 +803,53 @@ is_downstream_to_pci_bridge(struct device *dev, struct device *bridge)
 	return false;
 }
 
+static bool quirk_ioat_snb_local_iommu(struct pci_dev *pdev)
+{
+   struct dmar_drhd_unit *drhd;
+   u32 vtbar;
+   int rc;
+
+   /* We know that this device on this chipset has its own IOMMU.
+* If we find it under a different IOMMU, then the BIOS is lying
+   

Re: [PATCH] iommu: Allocate dev_iommu before accessing priv data

2020-09-02 Thread Lu Baolu

Hi Robin,

On 9/2/20 7:31 PM, Robin Murphy wrote:

On 2020-09-02 06:32, Torsten Hilbrich wrote:

After updating from v5.8 to v5.9-rc2 I noticed some problems when
booting a system with kernel cmdline "intel_iommu=on,igfx_off".

The following stacktrace was produced:

<6>[    0.00] Command line: BOOT_IMAGE=/isolinux/bzImage 
console=tty1 intel_iommu=on,igfx_off

...
<6>[    3.341682] DMAR: Host address width 39
<6>[    3.341684] DMAR: DRHD base: 0x00fed9 flags: 0x0
<6>[    3.341702] DMAR: dmar0: reg_base_addr fed9 ver 1:0 cap 
1cc40660462 ecap 19e2ff0505e

<6>[    3.341705] DMAR: DRHD base: 0x00fed91000 flags: 0x1
<6>[    3.341711] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap 
d2008c40660462 ecap f050da

<6>[    3.341713] DMAR: RMRR base: 0x009aa9f000 end: 0x009aabefff
<6>[    3.341716] DMAR: RMRR base: 0x009d00 end: 0x009f7f
<6>[    3.341726] DMAR: No ATSR found
<1>[    3.341772] BUG: kernel NULL pointer dereference, address: 
0038

<1>[    3.341774] #PF: supervisor write access in kernel mode
<1>[    3.341776] #PF: error_code(0x0002) - not-present page
<6>[    3.341777] PGD 0 P4D 0
<4>[    3.341780] Oops: 0002 [#1] SMP PTI
<4>[    3.341783] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 
5.9.0-devel+ #2
<4>[    3.341785] Hardware name: LENOVO 20HGS0TW00/20HGS0TW00, BIOS 
N1WET46S (1.25s ) 03/30/2018

<4>[    3.341790] RIP: 0010:intel_iommu_init+0xed0/0x1136
<4>[    3.341792] Code: fe e9 61 02 00 00 bb f4 ff ff ff e9 57 02 00 
00 48 63 d1 48 c1 e2 04 48 03 50 20 48 8b 12 48 85 d2 74 0b 48 8b 92 
d0 02 00 00 <48> 89 7a 38 ff c1 e9 15 f5 ff ff 48 c7 c7 60 99 ac a7 49 
c7 c7 a0

<4>[    3.341796] RSP: :96d180073dd0 EFLAGS: 00010282
<4>[    3.341798] RAX: 8c91037a7d20 RBX:  RCX: 

<4>[    3.341800] RDX:  RSI:  RDI: 

<4>[    3.341802] RBP: 96d180073e90 R08: 0001 R09: 
8c91039fe3c0
<4>[    3.341804] R10: 0226 R11: 0226 R12: 
000b
<4>[    3.341806] R13: 8c910367c650 R14: a8426d60 R15: 

<4>[    3.341808] FS:  () 
GS:8c910748() knlGS:

<4>[    3.341810] CS:  0010 DS:  ES:  CR0: 80050033
<4>[    3.341812] CR2: 0038 CR3: 0004b100a001 CR4: 
003706e0

<4>[    3.341814] Call Trace:
<4>[    3.341820]  ? _raw_spin_unlock_irqrestore+0x1f/0x30
<4>[    3.341824]  ? call_rcu+0x10e/0x320
<4>[    3.341828]  ? trace_hardirqs_on+0x2c/0xd0
<4>[    3.341831]  ? rdinit_setup+0x2c/0x2c
<4>[    3.341834]  ? e820__memblock_setup+0x8b/0x8b
<4>[    3.341836]  pci_iommu_init+0x16/0x3f
<4>[    3.341839]  do_one_initcall+0x46/0x1e4
<4>[    3.341842]  kernel_init_freeable+0x169/0x1b2
<4>[    3.341845]  ? rest_init+0x9f/0x9f
<4>[    3.341847]  kernel_init+0xa/0x101
<4>[    3.341849]  ret_from_fork+0x22/0x30
<4>[    3.341851] Modules linked in:
<4>[    3.341854] CR2: 0038
<4>[    3.341860] ---[ end trace 3653722a6f936f18 ]---

I could track the problem down to the dev_iommu_priv_set call in the
function init_no_remapping_devices in the path where !dmar_map_gfx. It
turned out that the dev->iommu entry is NULL at this time.

Lu Baolu suggested that dev_iommu_priv_set automatically allocate the
iommu entry by using the function dev_iommu_get to retrieve that
pointer. This function allocates the entry if needed.

Fixes: 01b9d4e21148 ("iommu/vt-d: Use dev_iommu_priv_get/set()")
Signed-off-by: Torsten Hilbrich 
Tested-by: Torsten Hilbrich 
Link: https://lists.linuxfoundation.org/pipermail/iommu/2020-August/048098.html

---
  drivers/iommu/iommu.c | 22 ++
  include/linux/iommu.h | 11 ++-
  2 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 609bd25bf154..3edca2a31296 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2849,3 +2849,25 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
  return ops->sva_get_pasid(handle);
  }
  EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
+
+void *dev_iommu_priv_get(struct device *dev)
+{
+	struct dev_iommu *param = dev_iommu_get(dev);
+
+	if (WARN_ON(!param))
+		return ERR_PTR(-ENOMEM);
+
+	return param->priv;
+}
+EXPORT_SYMBOL_GPL(dev_iommu_priv_get);


Hmm, I'm not convinced by this - it looks like it would only paper over real 
driver bugs. If the driver's calling dev_iommu_priv_get(), it presumably 
wants to actually *do* something with its private data - if it somehow 
manages to make that call before it's processed ->probe_device(), it 
can't possibly get *meaningful* data, so even if we stop that call from 
crashing how can it result in correct behaviour?


And if the device isn't managed by that IOMMU driver, then it shouldn't 
be calling dev_iommu_priv_get() blindly in the first place (and 
allocating redundant structures would just be a w

Re: [PATCH v2 1/3] swiotlb: Use %pa to print phys_addr_t variables

2020-09-02 Thread Fabio Estevam
On Wed, Sep 2, 2020 at 2:31 PM Andy Shevchenko
 wrote:
>
> There is an extension to a %p to print phys_addr_t type of variables.
> Use it here.
>
> Signed-off-by: Andy Shevchenko 
> ---
> v2: dropped bytes replacement (Fabio)

Reviewed-by: Fabio Estevam 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v11 07/11] device-mapping: Introduce DMA range map, supplanting dma_pfn_offset

2020-09-02 Thread Nathan Chancellor
On Wed, Sep 02, 2020 at 05:36:29PM -0700, Florian Fainelli wrote:
> 
> 
> On 9/2/2020 3:38 PM, Nathan Chancellor wrote:
> [snip]
> > > Hello Nathan,
> > > 
> > > Can you tell me how much memory your RPI has and if all of it is
> > 
> > This is the 4GB version.
> > 
> > > accessible by the PCIe device?  Could you also please include the DTS
> > > of the PCIe node?  IIRC, the RPI firmware does some mangling of the
> > > PCIe DT before Linux boots -- could you describe what is going on
> > > there?
> > 
> > Unfortunately, I am not familiar with how to get this information. If
> > you could provide some instructions for how to do so, I am more than
> > happy to. I am not very knowledgeable about the inner workings of the Pi,
> > I mainly use it as a test platform for making sure that LLVM does not
> > cause problems on real devices.
> 
> Can you bring the dtc application to your Pi root filesystem, and if so, can
> you run the following:
> 
> dtc -I fs -O dtb /proc/device-tree -f > /tmp/device.dtb

Sure, the result is attached.

> or cat /sys/firmware/fdt > device.dtb
> 
> and attach the resulting file?
> 
> > 
> > > Finally, can you attach the text of the full boot log?
> > 
> > I have attached a working and broken boot log. Thank you for the quick
> > response!
> 
> Is it possible for you to rebuild your kernel with CONFIG_MMC_DEBUG by any
> chance?

Of course. A new log is attached with the debug output from that config.

> I have a suspicion that this part of the DTS for the bcm2711.dtsi platform
> is at fault:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/bcm2711.dtsi#n264
> 
> and the resulting dma-ranges parsing is just not working for reasons to be
> determined.
> --
> Florian

Let me know if you need anything else out of me.

Cheers,
Nathan


device.dtb
Description: Binary data
[0.00] Booting Linux on physical CPU 0x00 [0x410fd083]
[0.00] Linux version 5.9.0-rc3-next-20200902-dirty 
(nathan@ubuntu-n2-xlarge-x86) (ClangBuiltLinux clang version 12.0.0 
(https://github.com/llvm/llvm-project.git 
b21ddded8f04fee925bbf9e6458347104b5b99eb), LLD 12.0.0 
(https://github.com/llvm/llvm-project.git 
b21ddded8f04fee925bbf9e6458347104b5b99eb)) #1 SMP PREEMPT Wed Sep 2 17:41:42 
MST 2020
[0.00] Machine model: Raspberry Pi 4 Model B Rev 1.2
[0.00] efi: UEFI not found.
[0.00] Reserved memory: created CMA memory pool at 0x3740, 
size 64 MiB
[0.00] OF: reserved mem: initialized node linux,cma, compatible id 
shared-dma-pool
[0.00] NUMA: No NUMA configuration found
[0.00] NUMA: Faking a node at [mem 
0x-0xfbff]
[0.00] NUMA: NODE_DATA [mem 0xfb81f100-0xfb820fff]
[0.00] Zone ranges:
[0.00]   DMA  [mem 0x-0x3fff]
[0.00]   DMA32[mem 0x4000-0xfbff]
[0.00]   Normal   empty
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x-0x3b3f]
[0.00]   node   0: [mem 0x4000-0xfbff]
[0.00] Initmem setup node 0 [mem 0x-0xfbff]
[0.00] percpu: Embedded 23 pages/cpu s54168 r8192 d31848 u94208
[0.00] Detected PIPT I-cache on CPU0
[0.00] CPU features: detected: EL2 vector hardening
[0.00] CPU features: kernel page table isolation forced ON by KASLR
[0.00] CPU features: detected: Kernel page table isolation (KPTI)
[0.00] ARM_SMCCC_ARCH_WORKAROUND_1 missing from firmware
[0.00] CPU features: detected: ARM errata 1165522, 1319367, or 1530923
[0.00] Built 1 zonelists, mobility grouping on.  Total pages: 996912
[0.00] Policy zone: DMA32
[0.00] Kernel command line:  dma.dmachans=0x71f5 
bcm2709.boardrev=0xc03112 bcm2709.serial=0xb78d398 bcm2709.uart_clock=4800 
bcm2709.disk_led_gpio=42 bcm2709.disk_led_active_low=0 
smsc95xx.macaddr=DC:A6:32:60:6C:87 vc_mem.mem_base=0x3ec0 
vc_mem.mem_size=0x4000  console=ttyS1,115200 console=tty1 
root=PARTUUID=45a8dd8a-02 rootfstype=ext4 elevator=deadline fsck.repair=yes 
rootwait plymouth.ignore-serial-consoles
[0.00] Kernel parameter elevator= does not have any effect anymore.
[0.00] Please use sysfs to set IO scheduler for individual devices.
[0.00] Dentry cache hash table entries: 524288 (order: 10, 4194304 
bytes, linear)
[0.00] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes, 
linear)
[0.00] mem auto-init: stack:off, heap alloc:off, heap free:off
[0.00] software IO TLB: mapped [mem 0x3340-0x3740] (64

Re: [PATCH v11 07/11] device-mapping: Introduce DMA range map, supplanting dma_pfn_offset

2020-09-02 Thread Florian Fainelli




On 9/2/2020 3:38 PM, Nathan Chancellor wrote:
[snip]

Hello Nathan,

Can you tell me how much memory your RPI has and if all of it is


This is the 4GB version.


accessible by the PCIe device?  Could you also please include the DTS
of the PCIe node?  IIRC, the RPI firmware does some mangling of the
PCIe DT before Linux boots -- could you describe what is going on
there?


Unfortunately, I am not familiar with how to get this information. If
you could provide some instructions for how to do so, I am more than
happy to. I am not very knowledgeable about the inner workings of the Pi,
I mainly use it as a test platform for making sure that LLVM does not
cause problems on real devices.


Can you bring the dtc application to your Pi root filesystem, and if so, 
can you run the following:


dtc -I fs -O dtb /proc/device-tree -f > /tmp/device.dtb

or cat /sys/firmware/fdt > device.dtb

and attach the resulting file?




Finally, can you attach the text of the full boot log?


I have attached a working and broken boot log. Thank you for the quick
response!


Is it possible for you to rebuild your kernel with CONFIG_MMC_DEBUG by 
any chance?


I have a suspicion that this part of the DTS for the bcm2711.dtsi 
platform is at fault:


https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/bcm2711.dtsi#n264

and the resulting dma-ranges parsing is just not working for reasons to 
be determined.

--
Florian


Re: [PATCH v11 07/11] device-mapping: Introduce DMA range map, supplanting dma_pfn_offset

2020-09-02 Thread Nathan Chancellor
On Wed, Sep 02, 2020 at 06:11:08PM -0400, Jim Quinlan wrote:
> On Wed, Sep 2, 2020 at 5:53 PM Nathan Chancellor
>  wrote:
> >
> > On Mon, Aug 24, 2020 at 03:30:20PM -0400, Jim Quinlan wrote:
> > > The new field 'dma_range_map' in struct device is used to facilitate the
> > > use of single or multiple offsets between mapping regions of cpu addrs and
> > > dma addrs.  It subsumes the role of "dev->dma_pfn_offset" which was only
> > > capable of holding a single uniform offset and had no region bounds
> > > checking.
> > >
> > > The function of_dma_get_range() has been modified so that it takes a 
> > > single
> > > argument -- the device node -- and returns a map, NULL, or an error code.
> > > The map is an array that holds the information regarding the DMA regions.
> > > Each range entry contains the address offset, the cpu_start address, the
> > > dma_start address, and the size of the region.
> > >
> > > of_dma_configure() is the typical manner to set range offsets but there 
> > > are
> > > a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel
> > > driver code.  These cases now invoke the function
> > > dma_attach_offset_range(dev, cpu_addr, dma_addr, size).
> > >
> > > Signed-off-by: Jim Quinlan 
> > > ---
> > >  arch/arm/include/asm/dma-mapping.h| 10 +--
> > >  arch/arm/mach-keystone/keystone.c | 17 +++--
> > >  arch/sh/drivers/pci/pcie-sh7786.c |  9 +--
> > >  arch/x86/pci/sta2x11-fixup.c  |  7 +-
> > >  drivers/acpi/arm64/iort.c |  5 +-
> > >  drivers/base/core.c   |  2 +
> > >  drivers/gpu/drm/sun4i/sun4i_backend.c |  5 +-
> > >  drivers/iommu/io-pgtable-arm.c|  2 +-
> > >  .../platform/sunxi/sun4i-csi/sun4i_csi.c  |  5 +-
> > >  .../platform/sunxi/sun6i-csi/sun6i_csi.c  |  4 +-
> > >  drivers/of/address.c  | 72 +--
> > >  drivers/of/device.c   | 43 ++-
> > >  drivers/of/of_private.h   | 10 +--
> > >  drivers/of/unittest.c | 34 ++---
> > >  drivers/remoteproc/remoteproc_core.c  |  8 ++-
> > >  .../staging/media/sunxi/cedrus/cedrus_hw.c|  7 +-
> > >  drivers/usb/core/message.c|  9 ++-
> > >  drivers/usb/core/usb.c|  7 +-
> > >  include/linux/device.h|  4 +-
> > >  include/linux/dma-direct.h|  8 +--
> > >  include/linux/dma-mapping.h   | 36 ++
> > >  kernel/dma/coherent.c | 10 +--
> > >  kernel/dma/mapping.c  | 66 +
> > >  23 files changed, 265 insertions(+), 115 deletions(-)
> >
> > Apologies if this has already been reported or is known but this commit
> > is now in next-20200902 and it causes my Raspberry Pi 4 to no longer
> > make it to userspace, instead spewing mmc errors:
> >
> > [0.00] Booting Linux on physical CPU 0x00 [0x410fd083]
> > [0.00] Linux version 5.9.0-rc3-4-geef520b232c6-dirty 
> > (nathan@ubuntu-n2-xlarge-x86) (ClangBuiltLinux clang version 12.0.0 
> > (https://github.com/llvm/llvm-project.git 
> > b21ddded8f04fee925bbf9e6458347104b5b99eb), LLD 12.0.0 
> > (https://github.com/llvm/llvm-project.git 
> > b21ddded8f04fee925bbf9e6458347104b5b99eb)) #1 SMP PREEMPT Wed Sep 2 
> > 13:48:49 MST 2020
> > [0.00] Machine model: Raspberry Pi 4 Model B Rev 1.2
> > ...
> > [1.459752] raspberrypi-firmware soc:firmware: Attached to firmware from 
> > 2020-08-24T18:50:56
> > [1.57] dwc2 fe98.usb: supply vusb_d not found, using dummy 
> > regulator
> > [1.507454] dwc2 fe98.usb: supply vusb_a not found, using dummy 
> > regulator
> > [1.615547] dwc2 fe98.usb: EPs: 8, dedicated fifos, 4080 entries in 
> > SPRAM
> > [1.627537] sdhci-iproc fe30.sdhci: allocated mmc-pwrseq
> > [1.665497] mmc0: SDHCI controller on fe30.sdhci [fe30.sdhci] 
> > using PIO
> > [1.690601] mmc0: queuing unknown CIS tuple 0x80 (2 bytes)
> > [1.697892] mmc0: queuing unknown CIS tuple 0x80 (3 bytes)
> > [1.705173] mmc0: queuing unknown C

Re: [PATCH v11 07/11] device-mapping: Introduce DMA range map, supplanting dma_pfn_offset

2020-09-02 Thread Jim Quinlan via iommu
On Wed, Sep 2, 2020 at 5:53 PM Nathan Chancellor
 wrote:
>
> On Mon, Aug 24, 2020 at 03:30:20PM -0400, Jim Quinlan wrote:
> > The new field 'dma_range_map' in struct device is used to facilitate the
> > use of single or multiple offsets between mapping regions of cpu addrs and
> > dma addrs.  It subsumes the role of "dev->dma_pfn_offset" which was only
> > capable of holding a single uniform offset and had no region bounds
> > checking.
> >
> > The function of_dma_get_range() has been modified so that it takes a single
> > argument -- the device node -- and returns a map, NULL, or an error code.
> > The map is an array that holds the information regarding the DMA regions.
> > Each range entry contains the address offset, the cpu_start address, the
> > dma_start address, and the size of the region.
> >
> > of_dma_configure() is the typical manner to set range offsets but there are
> > a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel
> > driver code.  These cases now invoke the function
> > dma_attach_offset_range(dev, cpu_addr, dma_addr, size).
> >
> > Signed-off-by: Jim Quinlan 
> > ---
> >  arch/arm/include/asm/dma-mapping.h| 10 +--
> >  arch/arm/mach-keystone/keystone.c | 17 +++--
> >  arch/sh/drivers/pci/pcie-sh7786.c |  9 +--
> >  arch/x86/pci/sta2x11-fixup.c  |  7 +-
> >  drivers/acpi/arm64/iort.c |  5 +-
> >  drivers/base/core.c   |  2 +
> >  drivers/gpu/drm/sun4i/sun4i_backend.c |  5 +-
> >  drivers/iommu/io-pgtable-arm.c|  2 +-
> >  .../platform/sunxi/sun4i-csi/sun4i_csi.c  |  5 +-
> >  .../platform/sunxi/sun6i-csi/sun6i_csi.c  |  4 +-
> >  drivers/of/address.c  | 72 +--
> >  drivers/of/device.c   | 43 ++-
> >  drivers/of/of_private.h   | 10 +--
> >  drivers/of/unittest.c | 34 ++---
> >  drivers/remoteproc/remoteproc_core.c  |  8 ++-
> >  .../staging/media/sunxi/cedrus/cedrus_hw.c|  7 +-
> >  drivers/usb/core/message.c|  9 ++-
> >  drivers/usb/core/usb.c|  7 +-
> >  include/linux/device.h|  4 +-
> >  include/linux/dma-direct.h|  8 +--
> >  include/linux/dma-mapping.h   | 36 ++
> >  kernel/dma/coherent.c | 10 +--
> >  kernel/dma/mapping.c  | 66 +
> >  23 files changed, 265 insertions(+), 115 deletions(-)
>
> Apologies if this has already been reported or is known but this commit
> is now in next-20200902 and it causes my Raspberry Pi 4 to no longer
> make it to userspace, instead spewing mmc errors:
>
> [0.00] Booting Linux on physical CPU 0x00 [0x410fd083]
> [0.00] Linux version 5.9.0-rc3-4-geef520b232c6-dirty 
> (nathan@ubuntu-n2-xlarge-x86) (ClangBuiltLinux clang version 12.0.0 
> (https://github.com/llvm/llvm-project.git 
> b21ddded8f04fee925bbf9e6458347104b5b99eb), LLD 12.0.0 
> (https://github.com/llvm/llvm-project.git 
> b21ddded8f04fee925bbf9e6458347104b5b99eb)) #1 SMP PREEMPT Wed Sep 2 13:48:49 
> MST 2020
> [0.00] Machine model: Raspberry Pi 4 Model B Rev 1.2
> ...
> [1.459752] raspberrypi-firmware soc:firmware: Attached to firmware from 
> 2020-08-24T18:50:56
> [1.57] dwc2 fe98.usb: supply vusb_d not found, using dummy 
> regulator
> [1.507454] dwc2 fe98.usb: supply vusb_a not found, using dummy 
> regulator
> [1.615547] dwc2 fe98.usb: EPs: 8, dedicated fifos, 4080 entries in 
> SPRAM
> [1.627537] sdhci-iproc fe30.sdhci: allocated mmc-pwrseq
> [1.665497] mmc0: SDHCI controller on fe30.sdhci [fe30.sdhci] 
> using PIO
> [1.690601] mmc0: queuing unknown CIS tuple 0x80 (2 bytes)
> [1.697892] mmc0: queuing unknown CIS tuple 0x80 (3 bytes)
> [1.705173] mmc0: queuing unknown CIS tuple 0x80 (3 bytes)
> [1.713788] mmc0: queuing unknown CIS tuple 0x80 (7 bytes)
> [1.721228] mmc0: queuing unknown CIS tuple 0x80 (3 bytes)
> [1.732062] mmc1: SDHCI controller on fe34.emmc2 [fe34.emmc2] 
> using ADMA
> [1.741828] ALSA device list:
> [1.744885]   No soundcards found.
> [1.748540] Waiting for root device PARTUUID=45a8dd8a-02...
> [1.788865] random: fast init done
> [1.793489] mmc1: unrecognised SCR

Re: [PATCH v11 07/11] device-mapping: Introduce DMA range map, supplanting dma_pfn_offset

2020-09-02 Thread Nathan Chancellor
On Mon, Aug 24, 2020 at 03:30:20PM -0400, Jim Quinlan wrote:
> The new field 'dma_range_map' in struct device is used to facilitate the
> use of single or multiple offsets between mapping regions of cpu addrs and
> dma addrs.  It subsumes the role of "dev->dma_pfn_offset" which was only
> capable of holding a single uniform offset and had no region bounds
> checking.
> 
> The function of_dma_get_range() has been modified so that it takes a single
> argument -- the device node -- and returns a map, NULL, or an error code.
> The map is an array that holds the information regarding the DMA regions.
> Each range entry contains the address offset, the cpu_start address, the
> dma_start address, and the size of the region.
> 
> of_dma_configure() is the typical manner to set range offsets but there are
> a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel
> driver code.  These cases now invoke the function
> dma_attach_offset_range(dev, cpu_addr, dma_addr, size).
> 
> Signed-off-by: Jim Quinlan 
> ---
>  arch/arm/include/asm/dma-mapping.h| 10 +--
>  arch/arm/mach-keystone/keystone.c | 17 +++--
>  arch/sh/drivers/pci/pcie-sh7786.c |  9 +--
>  arch/x86/pci/sta2x11-fixup.c  |  7 +-
>  drivers/acpi/arm64/iort.c |  5 +-
>  drivers/base/core.c   |  2 +
>  drivers/gpu/drm/sun4i/sun4i_backend.c |  5 +-
>  drivers/iommu/io-pgtable-arm.c|  2 +-
>  .../platform/sunxi/sun4i-csi/sun4i_csi.c  |  5 +-
>  .../platform/sunxi/sun6i-csi/sun6i_csi.c  |  4 +-
>  drivers/of/address.c  | 72 +--
>  drivers/of/device.c   | 43 ++-
>  drivers/of/of_private.h   | 10 +--
>  drivers/of/unittest.c | 34 ++---
>  drivers/remoteproc/remoteproc_core.c  |  8 ++-
>  .../staging/media/sunxi/cedrus/cedrus_hw.c|  7 +-
>  drivers/usb/core/message.c|  9 ++-
>  drivers/usb/core/usb.c|  7 +-
>  include/linux/device.h|  4 +-
>  include/linux/dma-direct.h|  8 +--
>  include/linux/dma-mapping.h   | 36 ++
>  kernel/dma/coherent.c | 10 +--
>  kernel/dma/mapping.c  | 66 +
>  23 files changed, 265 insertions(+), 115 deletions(-)

Apologies if this has already been reported or is known but this commit
is now in next-20200902 and it causes my Raspberry Pi 4 to no longer
make it to userspace, instead spewing mmc errors:

[0.00] Booting Linux on physical CPU 0x00 [0x410fd083]
[0.00] Linux version 5.9.0-rc3-4-geef520b232c6-dirty 
(nathan@ubuntu-n2-xlarge-x86) (ClangBuiltLinux clang version 12.0.0 
(https://github.com/llvm/llvm-project.git 
b21ddded8f04fee925bbf9e6458347104b5b99eb), LLD 12.0.0 
(https://github.com/llvm/llvm-project.git 
b21ddded8f04fee925bbf9e6458347104b5b99eb)) #1 SMP PREEMPT Wed Sep 2 13:48:49 
MST 2020
[0.00] Machine model: Raspberry Pi 4 Model B Rev 1.2
...
[1.459752] raspberrypi-firmware soc:firmware: Attached to firmware from 
2020-08-24T18:50:56
[1.57] dwc2 fe98.usb: supply vusb_d not found, using dummy regulator
[1.507454] dwc2 fe98.usb: supply vusb_a not found, using dummy regulator
[1.615547] dwc2 fe98.usb: EPs: 8, dedicated fifos, 4080 entries in SPRAM
[1.627537] sdhci-iproc fe30.sdhci: allocated mmc-pwrseq
[1.665497] mmc0: SDHCI controller on fe30.sdhci [fe30.sdhci] using 
PIO
[1.690601] mmc0: queuing unknown CIS tuple 0x80 (2 bytes)
[1.697892] mmc0: queuing unknown CIS tuple 0x80 (3 bytes)
[1.705173] mmc0: queuing unknown CIS tuple 0x80 (3 bytes)
[1.713788] mmc0: queuing unknown CIS tuple 0x80 (7 bytes)
[1.721228] mmc0: queuing unknown CIS tuple 0x80 (3 bytes)
[1.732062] mmc1: SDHCI controller on fe34.emmc2 [fe34.emmc2] using 
ADMA
[1.741828] ALSA device list:
[1.744885]   No soundcards found.
[1.748540] Waiting for root device PARTUUID=45a8dd8a-02...
[1.788865] random: fast init done
[1.793489] mmc1: unrecognised SCR structure version 4
[1.798814] mmc1: error -22 whilst initialising SD card
[1.813969] mmc0: new high speed SDIO card at address 0001
[1.883178] mmc1: unrecognised SCR structure version 2
[1.888423] mmc1: error -22 whilst initialising SD card
[1.964069] mmc1: unrecognised SCR structure version 4
[1.969314] mmc1: error -22 whilst initialising SD card
[2.061225] mmc1: unrecognised SCR structure version 4
[2.066470] mmc1: error -22 whilst initialising

Re: [PATCH v2 3/9] iommu/ioasid: Introduce ioasid_set APIs

2020-09-02 Thread Jacob Pan
On Mon, 24 Aug 2020 11:34:29 -0700
Randy Dunlap  wrote:

> On 8/24/20 11:28 AM, Jean-Philippe Brucker wrote:
> >> +/**
> >> + * struct ioasid_set - Meta data about ioasid_set
> >> + * @type: Token types and other features  
> > nit: doesn't follow struct order
> >   
> >> + * @token:Unique to identify an IOASID set
> >> + * @xa:   XArray to store ioasid_set private IDs, can be used for
> >> + *guest-host IOASID mapping, or just a private IOASID 
> >> namespace.
> >> + * @quota:Max number of IOASIDs can be allocated within the set
> >> + * @nr_ioasidsNumber of IOASIDs currently allocated in the set  
> 
>  * @nr_ioasids: Number of IOASIDs currently allocated in the set
> 
got it. thanks!

> >> + * @sid:  ID of the set
> >> + * @ref:  Reference count of the users
> >> + */
> >>  struct ioasid_set {
> >> -  int dummy;
> >> +  void *token;
> >> +  struct xarray xa;
> >> +  int type;
> >> +  int quota;
> >> +  int nr_ioasids;
> >> +  int sid;
> >> +  refcount_t ref;
> >> +  struct rcu_head rcu;
> >>  };  
> 
> 

[Jacob Pan]


Re: [PATCH v2 3/9] iommu/ioasid: Introduce ioasid_set APIs

2020-09-02 Thread Jacob Pan
On Mon, 24 Aug 2020 11:30:47 -0700
Randy Dunlap  wrote:

> On 8/24/20 11:28 AM, Jean-Philippe Brucker wrote:
> >> +/**
> >> + * struct ioasid_data - Meta data about ioasid
> >> + *
> >> + * @id:   Unique ID
> >> + * @users Number of active users
> >> + * @state Track state of the IOASID
> >> + * @set   Meta data of the set this IOASID belongs to
> >> + * @private   Private data associated with the IOASID
> >> + * @rcu   For free after RCU grace period  
> > nit: it would be nicer to follow the struct order  
> 
> and use a ':' after each struct member name, as is done for @id:
> 
Got it, thanks.


Re: [PATCH 22/28] sgiseeq: convert from dma_cache_sync to dma_sync_single_for_device

2020-09-02 Thread Thomas Bogendoerfer
On Tue, Sep 01, 2020 at 07:38:10PM +0200, Thomas Bogendoerfer wrote:
> On Tue, Sep 01, 2020 at 07:16:27PM +0200, Christoph Hellwig wrote:
> > Well, if IP22 doesn't speculate (which I'm pretty sure is the case),
> > dma_sync_single_for_cpu should indeed be a no-op.  But then there
> > also shouldn't be anything in the cache, as the previous
> > dma_sync_single_for_device should have invalidated it.  So it seems like
> > we are missing one (or more) ownership transfers to the device.  I'll
> > try to look at the ownership management in a little more detail
> > tomorrow.
> 
> this is the problem:
> 
>/* Always check for received packets. */
> sgiseeq_rx(dev, sp, hregs, sregs);
> 
> so the driver will look at the rx descriptor on every interrupt, so
> we cache the rx descriptor on the first interrupt and if there was
> no rx packet, we will only see it if the cache line gets flushed for
> some other reason. kick_tx() does a busy loop checking tx descriptors,
> with just sync_desc_cpu...

the patch below fixes the problem.

Thomas.


diff --git a/drivers/net/ethernet/seeq/sgiseeq.c b/drivers/net/ethernet/seeq/sgiseeq.c
index 8507ff242014..876e3700a0e4 100644
--- a/drivers/net/ethernet/seeq/sgiseeq.c
+++ b/drivers/net/ethernet/seeq/sgiseeq.c
@@ -112,14 +112,18 @@ struct sgiseeq_private {
 
 static inline void dma_sync_desc_cpu(struct net_device *dev, void *addr)
 {
-   dma_cache_sync(dev->dev.parent, addr, sizeof(struct sgiseeq_rx_desc),
-  DMA_FROM_DEVICE);
+   struct sgiseeq_private *sp = netdev_priv(dev);
+
+   dma_sync_single_for_device(dev->dev.parent, VIRT_TO_DMA(sp, addr),
+   sizeof(struct sgiseeq_rx_desc), DMA_FROM_DEVICE);
 }
 
 static inline void dma_sync_desc_dev(struct net_device *dev, void *addr)
 {
-   dma_cache_sync(dev->dev.parent, addr, sizeof(struct sgiseeq_rx_desc),
-  DMA_TO_DEVICE);
+   struct sgiseeq_private *sp = netdev_priv(dev);
+
+   dma_sync_single_for_device(dev->dev.parent, VIRT_TO_DMA(sp, addr),
+   sizeof(struct sgiseeq_rx_desc), DMA_TO_DEVICE);
 }
 
-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.[ RFC1925, 2.3 ]


Re: [PATCH v2 3/9] iommu/ioasid: Introduce ioasid_set APIs

2020-09-02 Thread Jacob Pan
On Mon, 24 Aug 2020 20:28:48 +0200
Jean-Philippe Brucker  wrote:

> On Fri, Aug 21, 2020 at 09:35:12PM -0700, Jacob Pan wrote:
> > ioasid_set was introduced as an arbitrary token that are shared by a
> > group of IOASIDs. For example, if IOASID #1 and #2 are allocated
> > via the same ioasid_set*, they are viewed as to belong to the same
> > set.
> > 
> > For guest SVA usages, system-wide IOASID resources need to be
> > partitioned such that VMs can have its own quota and being managed
> > separately. ioasid_set is the perfect candidate for meeting such
> > requirements. This patch redefines and extends ioasid_set with the
> > following new fields:
> > - Quota
> > - Reference count
> > - Storage of its namespace
> > - The token is stored in the new ioasid_set but with optional types
> > 
> > ioasid_set level APIs are introduced that wires up these new data.
> > Existing users of IOASID APIs are converted where a host IOASID set
> > is allocated for bare-metal usage.
> > 
> > Signed-off-by: Liu Yi L 
> > Signed-off-by: Jacob Pan 
> > ---
> >  drivers/iommu/intel/iommu.c |  27 ++-
> >  drivers/iommu/intel/pasid.h |   1 +
> >  drivers/iommu/intel/svm.c   |   8 +-
> >  drivers/iommu/ioasid.c      | 390 +---
> >  include/linux/ioasid.h      |  82 --
> >  5 files changed, 465 insertions(+), 43 deletions(-)
> > 
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index a3a0b5c8921d..5813eeaa5edb 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -42,6 +42,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -103,6 +104,9 @@
> >   */
> >  #define INTEL_IOMMU_PGSIZES(~0xFFFUL)
> >  
> > +/* PASIDs used by host SVM */
> > +struct ioasid_set *host_pasid_set;
> > +
> >  static inline int agaw_to_level(int agaw)
> >  {
> > return agaw + 2;
> > @@ -3103,8 +3107,8 @@ static void intel_vcmd_ioasid_free(ioasid_t ioasid, void *data)
> >  * Sanity check the ioasid owner is done at upper layer, e.g. VFIO
> >  * We can only free the PASID when all the devices are unbound.
> >  */
> > -   if (ioasid_find(NULL, ioasid, NULL)) {
> > -   pr_alert("Cannot free active IOASID %d\n", ioasid);
> > +   if (IS_ERR(ioasid_find(host_pasid_set, ioasid, NULL))) {
> > +   pr_err("Cannot free IOASID %d, not in system set\n", ioasid);
> > return;
> > }
> > vcmd_free_pasid(iommu, ioasid);
> > @@ -3288,6 +3292,19 @@ static int __init init_dmars(void)
> > if (ret)
> > goto free_iommu;
> >  
> > +   /* PASID is needed for scalable mode irrespective to SVM */
> > +   if (intel_iommu_sm) {
> > +   ioasid_install_capacity(intel_pasid_max_id);
> > +   /* We should not run out of IOASIDs at boot */
> > +   host_pasid_set = ioasid_alloc_set(NULL, PID_MAX_DEFAULT,
> > +                                     IOASID_SET_TYPE_NULL);
> > +   if (IS_ERR_OR_NULL(host_pasid_set)) {
> > +   pr_err("Failed to enable host PASID allocator %lu\n",
> > +   PTR_ERR(host_pasid_set));
> > +   intel_iommu_sm = 0;
> > +   }
> > +   }
> > +
> > /*
> >  * for each drhd
> >  *   enable fault log
> > @@ -5149,7 +5166,7 @@ static void auxiliary_unlink_device(struct dmar_domain *domain,
> > domain->auxd_refcnt--;
> >  
> > if (!domain->auxd_refcnt && domain->default_pasid > 0)
> > -   ioasid_free(domain->default_pasid);
> > +   ioasid_free(host_pasid_set, domain->default_pasid);
> >  }
> >  
> >  static int aux_domain_add_dev(struct dmar_domain *domain,
> > @@ -5167,7 +5184,7 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
> > int pasid;
> >  
> > /* No private data needed for the default pasid */
> > -   pasid = ioasid_alloc(NULL, PASID_MIN,
> > +   pasid = ioasid_alloc(host_pasid_set, PASID_MIN,
> >  pci_max_pasids(to_pci_dev(dev)) - 1,
> >  NULL);
> > if (pasid == INVALID_IOASID) {
> > @@ -5210,7 +5227,7 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
> > spin_unlock(&iommu->lock);
> > spin_unlock_irqrestore(&device_domain_lock, flags);
> > if (!domain->auxd_refcnt && domain->default_pasid > 0)
> > -   ioasid_free(domain->default_pasid);
> > +   ioasid_free(host_pasid_set, domain->default_pasid);
> >  
> > return ret;
> >  }
> > diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h
> > index c9850766c3a9..ccdc23446015 100644
> > --- a/drivers/iommu/intel/pasid.h
> > +++ b/drivers/iommu/intel/pasid.h
> > @@ -99,6 +99,7 @@ static inline bool pasid_pte_is_present(struct pasid_entry *pte)
> >  }
> >  
> >  extern u32 intel_pasid_max_id;
> > +extern struct ioasid_set *host_pasid_set;
> >  int intel_pasid_alloc_id(void *ptr, int start, int end, gfp_t gfp);
> >  void intel_pasid_free_id(int pasid);
> >

[PATCH v2 3/3] swiotlb: Mark max_segment with static keyword

2020-09-02 Thread Andy Shevchenko
Sparse is not happy about max_segment declaration:

  CHECK   kernel/dma/swiotlb.c
  kernel/dma/swiotlb.c:96:14: warning: symbol 'max_segment' was not declared. Should it be static?

Mark it static as suggested.

Signed-off-by: Andy Shevchenko 
---
v2: no change
 kernel/dma/swiotlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 6499bda8f0b8..465a567678d9 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -93,7 +93,7 @@ static unsigned int io_tlb_index;
  * Max segment that we can provide which (if pages are contingous) will
  * not be bounced (unless SWIOTLB_FORCE is set).
  */
-unsigned int max_segment;
+static unsigned int max_segment;
 
 /*
  * We need to save away the original address corresponding to a mapped entry
-- 
2.28.0



[PATCH v2 1/3] swiotlb: Use %pa to print phys_addr_t variables

2020-09-02 Thread Andy Shevchenko
There is an extension to %p to print phys_addr_t type of variables.
Use it here.

Signed-off-by: Andy Shevchenko 
---
v2: dropped bytes replacement (Fabio)
 kernel/dma/swiotlb.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index c19379fabd20..6499bda8f0b8 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -172,9 +172,7 @@ void swiotlb_print_info(void)
return;
}
 
-   pr_info("mapped [mem %#010llx-%#010llx] (%luMB)\n",
-  (unsigned long long)io_tlb_start,
-  (unsigned long long)io_tlb_end,
+   pr_info("mapped [mem %pa-%pa] (%luMB)\n", &io_tlb_start, &io_tlb_end,
   bytes >> 20);
 }
 
-- 
2.28.0



[PATCH v2 2/3] swiotlb: Declare swiotlb_late_init_with_default_size() in header

2020-09-02 Thread Andy Shevchenko
Compiler is not happy about one function prototype:

  CC  kernel/dma/swiotlb.o
  kernel/dma/swiotlb.c:275:1: warning: no previous prototype for ‘swiotlb_late_init_with_default_size’ [-Wmissing-prototypes]
  275 | swiotlb_late_init_with_default_size(size_t default_size)
  | ^~~

Since it's used outside of the module, move its declaration to the header
from the user.

Signed-off-by: Andy Shevchenko 
---
v2: no change
 arch/x86/pci/sta2x11-fixup.c | 1 -
 include/linux/swiotlb.h  | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
index c313d784efab..11c0e80b9ed4 100644
--- a/arch/x86/pci/sta2x11-fixup.c
+++ b/arch/x86/pci/sta2x11-fixup.c
@@ -15,7 +15,6 @@
 #include 
 
 #define STA2X11_SWIOTLB_SIZE (4*1024*1024)
-extern int swiotlb_late_init_with_default_size(size_t default_size);
 
 /*
  * We build a list of bus numbers that are under the ConneXt. The
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 046bb94bd4d6..513913ff7486 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -34,6 +34,7 @@ int swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose);
 extern unsigned long swiotlb_nr_tbl(void);
 unsigned long swiotlb_size_or_default(void);
 extern int swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs);
+extern int swiotlb_late_init_with_default_size(size_t default_size);
 extern void __init swiotlb_update_mem_attributes(void);
 
 /*
-- 
2.28.0


Re: [PATCH 1/2] iommu: amd: Restore IRTE.RemapEn bit after programming IRTE

2020-09-02 Thread Joao Martins
On 9/2/20 5:51 AM, Suravee Suthikulpanit wrote:
> Currently, the RemapEn (valid) bit is accidentally cleared when
> programming IRTE w/ guestMode=0. It should be restored to
> the prior state.
> 
Probably requires:

 Fixes: b9fc6b56f478 ("iommu/amd: Implements irq_set_vcpu_affinity() hook to setup vapic mode for pass-through devices")

?

> Signed-off-by: Suravee Suthikulpanit 

FWIW,

 Reviewed-by: Joao Martins 


Re: [PATCH 2/2] iommu: amd: Use cmpxchg_double() when updating 128-bit IRTE

2020-09-02 Thread Joao Martins
On 9/2/20 5:51 AM, Suravee Suthikulpanit wrote:
> When using 128-bit interrupt-remapping table entry (IRTE) (a.k.a GA mode),
> current driver disables interrupt remapping when it updates the IRTE
> so that the upper and lower 64-bit values can be updated safely.
> 
> However, this creates a small window, where the interrupt could
> arrive and result in IO_PAGE_FAULT (for interrupt) as shown below.
> 
>   IOMMU Driver            Device IRQ
>   ============            ===========
>   irte.RemapEn=0
>        ...
>    change IRTE            IRQ from device ==> IO_PAGE_FAULT !!
>        ...
>   irte.RemapEn=1
> 
> This scenario has been observed when changing irq affinity on a system
> running I/O-intensive workload, in which the destination APIC ID
> in the IRTE is updated.
> 
> Instead, use cmpxchg_double() to update the 128-bit IRTE at once without
> disabling the interrupt remapping. However, this means several features,
> which require GA (128-bit IRTE) support will also be affected if cmpxchg16b
> is not supported (which is unprecedented for AMD processors w/ IOMMU).
> 
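
The window described in the quoted commit message can be illustrated with a single-threaded userspace model. This is a sketch under loud assumptions: the entry layout, the RemapEn bit position, and all toy_* names are hypothetical, and toy_cmpxchg_double() only mimics the compare-and-exchange semantics (the real primitive is a hardware-atomic 128-bit operation). An observer callback stands in for an interrupt arriving at each externally visible state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy 128-bit entry; the RemapEn bit position is an assumption. */
struct toy_irte { uint64_t lo, hi; };
#define TOY_REMAP_EN 1ULL

/* Called once for every state a concurrent interrupt could observe. */
typedef void (*toy_observer)(const struct toy_irte *e, void *ctx);

/* Observer that counts states in which remapping is disabled. */
static void toy_count_disabled(const struct toy_irte *e, void *ctx)
{
	if (!(e->lo & TOY_REMAP_EN))
		(*(int *)ctx)++;
}

/* Old scheme: clear RemapEn, write both halves, then set RemapEn again. */
static void toy_update_with_disable(struct toy_irte *e, struct toy_irte new_val,
				    toy_observer obs, void *ctx)
{
	assert(e && obs);
	e->lo &= ~TOY_REMAP_EN;
	obs(e, ctx);
	e->hi = new_val.hi;
	obs(e, ctx);
	e->lo = new_val.lo & ~TOY_REMAP_EN;
	obs(e, ctx);
	e->lo |= TOY_REMAP_EN;
	obs(e, ctx);
}

/* New scheme: replace both halves in one step if the entry still holds
 * the value we read earlier; otherwise report failure and change nothing. */
static bool toy_cmpxchg_double(struct toy_irte *e, struct toy_irte old_val,
			       struct toy_irte new_val,
			       toy_observer obs, void *ctx)
{
	assert(e && obs);
	if (e->lo != old_val.lo || e->hi != old_val.hi)
		return false;
	*e = new_val; /* modelled as one indivisible step */
	obs(e, ctx);
	return true;
}
```

In this model the disable/enable sequence exposes three intermediate states with remapping off, while the single-step exchange exposes none.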
Probably requires:

 Fixes: 880ac60e2538 ("iommu/amd: Introduce interrupt remapping ops structure")

?

> Reported-by: Sean Osborne 
> Tested-by: Erik Rockstrom 
> Signed-off-by: Suravee Suthikulpanit 

With the comments below addressed, FWIW:

 Reviewed-by: Joao Martins 

> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> index c652f16eb702..ad30467f6930 100644
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -1511,7 +1511,14 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h)
>   iommu->mmio_phys_end = MMIO_REG_END_OFFSET;
>   else
>   iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET;
> - if (((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0))
> +
> + /*
> +  * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports.
> +  * GAM also requires GA mode. Therefore, we need to
> +  * check cmbxchg16b support before enabling it.
> +  */

s/cmbxchg16b/cmpxchg16b

> + if (!boot_cpu_has(X86_FEATURE_CX16) ||
> + ((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0))
>   amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY;
>   break;
>   case 0x11:
> @@ -1520,8 +1527,18 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h)
>   iommu->mmio_phys_end = MMIO_REG_END_OFFSET;
>   else
>   iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET;
> - if (((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0))
> +
> + /*
> +  * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports.
> +  * XT, GAM also requires GA mode. Therefore, we need to
> +  * check cmbxchg16b support before enabling them.

s/cmbxchg16b/cmpxchg16b

> +  */
> + if (boot_cpu_has(X86_FEATURE_CX16) ||

You probably want !boot_cpu_has(X86_FEATURE_CX16) ?

> + ((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0)) {
>   amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY;
> + break;
> + }
> +
>   /*
>* Note: Since iommu_update_intcapxt() leverages
>* the IOMMU MMIO access to MSI capability block registers
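
The missing negation pointed out above is easy to see as a truth table. The following is a minimal userspace sketch, assuming the intent is "fall back to legacy mode unless both cmpxchg16b and GA are available"; the enum and function names are hypothetical and only mirror the shape of the two conditions in the quoted hunks.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_guest_ir { TOY_IR_LEGACY, TOY_IR_GA };

/* Intended check: legacy unless BOTH cmpxchg16b and GA are available. */
static enum toy_guest_ir toy_pick_mode(bool cpu_has_cx16, bool iommu_has_ga)
{
	if (!cpu_has_cx16 || !iommu_has_ga)
		return TOY_IR_LEGACY;
	return TOY_IR_GA;
}

/* Shape of the condition as posted for the 0x11 case (no negation). */
static enum toy_guest_ir toy_pick_mode_as_posted(bool cpu_has_cx16,
						 bool iommu_has_ga)
{
	if (cpu_has_cx16 || !iommu_has_ga)
		return TOY_IR_LEGACY;
	return TOY_IR_GA;
}
```

With the negation missing, a CPU that has cmpxchg16b is always forced to legacy mode, and a CPU without it is allowed to keep GA mode, which is the opposite of what the comment in the patch describes.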


Re: [PATCH 07/28] 53c700: improve non-coherent DMA handling

2020-09-02 Thread Helge Deller
Hi Willy,

On 01.09.20 18:53, Matthew Wilcox wrote:
> On Tue, Sep 01, 2020 at 06:41:12PM +0200, Helge Deller wrote:
>>> I still have a zoo of machines running for such testing, including a
>>> 715/64 and two 730.
>>> I'm going to test this git tree on the 715/64:
>
> The 715/64 is a 7100LC machine though.  I think you need to boot on
> the 730 to test the non-coherent path.

Just tested the 730, and it works as well.

Helge


Re: [PATCH v11 07/11] device-mapping: Introduce DMA range map, supplanting dma_pfn_offset

2020-09-02 Thread Jim Quinlan via iommu
On Tue, Sep 1, 2020 at 4:24 AM Christoph Hellwig  wrote:
>
> I've applied this to the dma-mapping tree.
>
> I had to resolve a conflict in drivers/of/address.c with a recent
> mainline commit.  I also applied the minor tweaks Andy pointed out
> plus a few more style changes.  A real change is that I changed the
> prototype for dma_copy_dma_range_map to require less boilerplate code.
>
> The result is here:
>
> 
> http://git.infradead.org/users/hch/dma-mapping.git/commitdiff/eef520b232c60e74eb8b33a5a7863ad8f2b4a5c7
>
> please double check that everything works as expected.
Tested-by: Jim Quinlan 

Thanks Christoph
Jim
>
> I can cut a stable branch with this if you need it for other trees, but
> I'd like to wait a few days to see if there is any fallout first.


Re: [PATCH] iommu: Allocate dev_iommu before accessing priv data

2020-09-02 Thread Robin Murphy

On 2020-09-02 06:32, Torsten Hilbrich wrote:

After updating from v5.8 to v5.9-rc2 I noticed some problems when
booting a system with kernel cmdline "intel_iommu=on,igfx_off".

The following stacktrace was produced:

<6>[0.00] Command line: BOOT_IMAGE=/isolinux/bzImage console=tty1 intel_iommu=on,igfx_off
...
<6>[3.341682] DMAR: Host address width 39
<6>[3.341684] DMAR: DRHD base: 0x00fed9 flags: 0x0
<6>[3.341702] DMAR: dmar0: reg_base_addr fed9 ver 1:0 cap 1cc40660462 ecap 19e2ff0505e
<6>[3.341705] DMAR: DRHD base: 0x00fed91000 flags: 0x1
<6>[3.341711] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da
<6>[3.341713] DMAR: RMRR base: 0x009aa9f000 end: 0x009aabefff
<6>[3.341716] DMAR: RMRR base: 0x009d00 end: 0x009f7f
<6>[3.341726] DMAR: No ATSR found
<1>[3.341772] BUG: kernel NULL pointer dereference, address: 0038
<1>[3.341774] #PF: supervisor write access in kernel mode
<1>[3.341776] #PF: error_code(0x0002) - not-present page
<6>[3.341777] PGD 0 P4D 0
<4>[3.341780] Oops: 0002 [#1] SMP PTI
<4>[3.341783] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.9.0-devel+ #2
<4>[3.341785] Hardware name: LENOVO 20HGS0TW00/20HGS0TW00, BIOS N1WET46S (1.25s ) 03/30/2018
<4>[3.341790] RIP: 0010:intel_iommu_init+0xed0/0x1136
<4>[3.341792] Code: fe e9 61 02 00 00 bb f4 ff ff ff e9 57 02 00 00 48 63 d1 48 c1 e2 04 48 03 50 20 48 8b 12 48 85 d2 74 0b 48 8b 92 d0 02 00 00 <48> 89 7a 38 ff c1 e9 15 f5 ff ff 48 c7 c7 60 99 ac a7 49 c7 c7 a0
<4>[3.341796] RSP: :96d180073dd0 EFLAGS: 00010282
<4>[3.341798] RAX: 8c91037a7d20 RBX:  RCX: 
<4>[3.341800] RDX:  RSI:  RDI: 
<4>[3.341802] RBP: 96d180073e90 R08: 0001 R09: 8c91039fe3c0
<4>[3.341804] R10: 0226 R11: 0226 R12: 000b
<4>[3.341806] R13: 8c910367c650 R14: a8426d60 R15: 
<4>[3.341808] FS:  () GS:8c910748() knlGS:
<4>[3.341810] CS:  0010 DS:  ES:  CR0: 80050033
<4>[3.341812] CR2: 0038 CR3: 0004b100a001 CR4: 003706e0
<4>[3.341814] Call Trace:
<4>[3.341820]  ? _raw_spin_unlock_irqrestore+0x1f/0x30
<4>[3.341824]  ? call_rcu+0x10e/0x320
<4>[3.341828]  ? trace_hardirqs_on+0x2c/0xd0
<4>[3.341831]  ? rdinit_setup+0x2c/0x2c
<4>[3.341834]  ? e820__memblock_setup+0x8b/0x8b
<4>[3.341836]  pci_iommu_init+0x16/0x3f
<4>[3.341839]  do_one_initcall+0x46/0x1e4
<4>[3.341842]  kernel_init_freeable+0x169/0x1b2
<4>[3.341845]  ? rest_init+0x9f/0x9f
<4>[3.341847]  kernel_init+0xa/0x101
<4>[3.341849]  ret_from_fork+0x22/0x30
<4>[3.341851] Modules linked in:
<4>[3.341854] CR2: 0038
<4>[3.341860] ---[ end trace 3653722a6f936f18 ]---

I could track the problem down to the dev_iommu_priv_set call in the function
init_no_remapping_devices in the path where !dmar_map_gfx. It turned out that
the dev->iommu entry is NULL at this time.

Lu Baolu  suggested for dev_iommu_priv_set
to automatically allocate the iommu entry by using the function
dev_iommu_get to retrieve that pointer. This function allocates the
entry if needed.

Fixes: 01b9d4e21148 ("iommu/vt-d: Use dev_iommu_priv_get/set()")
Signed-off-by: Torsten Hilbrich 
Tested-by: Torsten Hilbrich 
Link: https://lists.linuxfoundation.org/pipermail/iommu/2020-August/048098.html
---
  drivers/iommu/iommu.c | 22 ++
  include/linux/iommu.h | 11 ++-
  2 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 609bd25bf154..3edca2a31296 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2849,3 +2849,25 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
return ops->sva_get_pasid(handle);
  }
  EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
+
+void *dev_iommu_priv_get(struct device *dev)
+{
+   struct dev_iommu *param = dev_iommu_get(dev);
+
+   if (WARN_ON(!param))
+   return ERR_PTR(-ENOMEM);
+
+return param->priv;
+}
+EXPORT_SYMBOL_GPL(dev_iommu_priv_get);


Hmm, I'm not convinced by this - it looks like it would only paper over real 
driver bugs. If the driver's calling dev_iommu_priv_get(), it presumably 
wants to actually *do* something with its private data - if it somehow 
manages to make that call before it's processed ->probe_device(), it 
can't possibly get *meaningful* data, so even if we stop that call from 
crashing how can it result in correct behaviour?


And if the device isn't managed by that IOMMU driver, then it shouldn't 
be calling dev_iommu_priv_get() blindly in the first place (and 
allocating redundant structures would just be a waste).
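
The ordering constraint argued above can be sketched as a userspace model. Everything here is a hypothetical illustration (toy_* names, the -ENODEV return, the caller-provided storage), not the kernel's API and not a proposed fix: it only shows that before the probe step there is no per-device IOMMU state to attach private data to, and that a checked setter surfaces the misuse instead of hiding it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_dev_iommu { void *priv; };

/* iommu stays NULL until the (modelled) probe step has run. */
struct toy_device { struct toy_dev_iommu *iommu; };

/* Stand-in for the probe step: attach per-device IOMMU state. */
static void toy_probe_device(struct toy_device *dev,
			     struct toy_dev_iommu *storage)
{
	assert(dev && storage);
	storage->priv = NULL;
	dev->iommu = storage;
}

/* Setting private data before probe is reported as an error. */
static int toy_priv_set(struct toy_device *dev, void *priv)
{
	assert(dev);
	if (!dev->iommu)
		return -ENODEV;
	dev->iommu->priv = priv;
	return 0;
}

/* Before probe there is no meaningful data to return. */
static void *toy_priv_get(const struct toy_device *dev)
{
	assert(dev);
	return dev->iommu ? dev->iommu->priv : NULL;
}
```

A caller that sets private data before toy_probe_device() gets an error back in this model, which makes the ordering bug visible at the call site.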



+void dev_iommu_priv_set(struct device *dev, void *pri

[PATCH 0/2] iommu: amd: Fix intremap IO_PAGE_FAULT for VMs

2020-09-02 Thread Suravee Suthikulpanit
Interrupt remapping IO_PAGE_FAULT has been observed on a system w/ a
large number of VMs w/ pass-through devices. This can be reproduced with
64 VMs + 64 pass-through VFs of Mellanox MT28800 Family [ConnectX-5 Ex],
where each VM runs a small-packet netperf test via the pass-through device
to the netserver running on the host. All VMs are running in a reboot loop
to trigger IRTE updates.

In addition, to accelerate the failure, irqbalance is triggered periodically
(e.g. 1-5 sec), which should generate a large number of updates to IRTE.
This setup generally triggers IO_PAGE_FAULT within 3-4 hours.

Investigation has shown that the issue is in the code to update IRTE
while remapping is enabled. Please see patch 2/2 for detailed discussion.

This series has been tested running in the setup mentioned above
up to 96 hours w/o seeing issues.

Thanks,
Suravee

Suravee Suthikulpanit (2):
  iommu: amd: Restore IRTE.RemapEn bit after programming IRTE
  iommu: amd: Use cmpxchg_double() when updating 128-bit IRTE

 drivers/iommu/amd/Kconfig |  2 +-
 drivers/iommu/amd/init.c  | 21 +++--
 drivers/iommu/amd/iommu.c | 19 +++
 3 files changed, 35 insertions(+), 7 deletions(-)

-- 
2.17.1



Re: [PATCH v9 14/32] drm: omapdrm: fix common struct sg_table related issues

2020-09-02 Thread Tomi Valkeinen via iommu
On 01/09/2020 22:33, Robin Murphy wrote:
> On 2020-08-26 07:32, Marek Szyprowski wrote:
>> The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
>> returns the number of the created entries in the DMA address space.
>> However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
>> dma_unmap_sg must be called with the original number of the entries
>> passed to the dma_map_sg().
>>
>> struct sg_table is a common structure used for describing a non-contiguous
>> memory buffer, used commonly in the DRM and graphics subsystems. It
>> consists of a scatterlist with memory pages and DMA addresses (sgl entry),
>> as well as the number of scatterlist entries: CPU pages (orig_nents entry)
>> and DMA mapped pages (nents entry).
>>
>> It turned out that it was a common mistake to misuse nents and orig_nents
>> entries, calling DMA-mapping functions with a wrong number of entries or
>> ignoring the number of mapped entries returned by the dma_map_sg()
>> function.
>>
>> Fix the code to refer to proper nents or orig_nents entries. This driver
>> checks for a buffer contiguity in DMA address space, so it should test
>> sg_table->nents entry.
>>
>> Signed-off-by: Marek Szyprowski 
>> ---
>>   drivers/gpu/drm/omapdrm/omap_gem.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c
>> index ff0c4b0c3fd0..a7a9a0afe2b6 100644
>> --- a/drivers/gpu/drm/omapdrm/omap_gem.c
>> +++ b/drivers/gpu/drm/omapdrm/omap_gem.c
>> @@ -48,7 +48,7 @@ struct omap_gem_object {
>>    *   OMAP_BO_MEM_DMA_API flag set)
>>    *
>>    * - buffers imported from dmabuf (with the OMAP_BO_MEM_DMABUF flag 
>> set)
>> - *   if they are physically contiguous (when sgt->orig_nents == 1)
>> + *   if they are physically contiguous (when sgt->nents == 1)
> 
> Hmm, if this really does mean *physically* contiguous - i.e. if buffers might
> be shared between DMA-translatable and non-DMA-translatable devices - then
> these changes might not be appropriate. If not and it only actually means
> DMA-contiguous, then it would be good to clarify the comments to that effect.
> 
> Can anyone familiar with omapdrm clarify what exactly the case is here? I
> know that IOMMUs might be involved to some degree, and I've skimmed the
> interconnect chapters of enough OMAP TRMs to be scared by the reference to
> the tiler aperture in the context below :)

DSS (like many other IPs in OMAP) does not have any MMU/PAT, and can only use
contiguous buffers (contiguous in the RAM).

There's a special case with TILER (which is not part of DSS but of the memory
subsystem, but it's still handled internally by the omapdrm driver), which has
a PAT. PAT can create a contiguous view of scattered pages, and DSS can then
use this contiguous view ("tiler aperture", which to DSS looks just like normal
contiguous memory).

Note that omapdrm does not use dma_map_sg() & co. mentioned in the patch
description.

If there's no MMU/PAT, is orig_nents always the same as nents? Or can we have
multiple physically contiguous pages listed separately in the sgt (so
orig_nents > 1) but as the pages form one big contiguous area, nents == 1?
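
To make the two counts in the question concrete, here is a userspace sketch of a mapper that merges adjacent entries when their addresses are contiguous. It is an assumption-laden model only: the toy_seg type and toy_map_merge() are hypothetical, and whether any particular kernel mapping path merges entries like this is not something this thread states. The input length plays the role of orig_nents and the return value plays the role of nents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One segment of a buffer: start address and length in bytes. */
struct toy_seg { uint64_t addr; uint64_t len; };

/* Copy 'orig_nents' input segments to 'out', merging each segment into
 * the previous one when it starts exactly where the previous one ends.
 * 'out' must have room for 'orig_nents' entries.
 * Returns the number of segments written (the modelled 'nents'). */
static size_t toy_map_merge(const struct toy_seg *in, size_t orig_nents,
			    struct toy_seg *out)
{
	size_t nents = 0;

	assert(in && out);
	for (size_t i = 0; i < orig_nents; i++) {
		if (nents > 0 &&
		    out[nents - 1].addr + out[nents - 1].len == in[i].addr)
			out[nents - 1].len += in[i].len;
		else
			out[nents++] = in[i];
	}
	return nents;
}
```

In this model, three back-to-back pages give orig_nents == 3 and nents == 1, while a gap between pages keeps the counts equal.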

 Tomi

-- 
Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

[PATCH 1/2] iommu: amd: Restore IRTE.RemapEn bit after programming IRTE

2020-09-02 Thread Suravee Suthikulpanit
Currently, the RemapEn (valid) bit is accidentally cleared when
programming IRTE w/ guestMode=0. It should be restored to
the prior state.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/iommu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index ba9f3dbc5b94..967f4e96d1eb 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3850,6 +3850,7 @@ int amd_iommu_deactivate_guest_mode(void *data)
struct amd_ir_data *ir_data = (struct amd_ir_data *)data;
struct irte_ga *entry = (struct irte_ga *) ir_data->entry;
struct irq_cfg *cfg = ir_data->cfg;
+   u64 valid = entry->lo.fields_remap.valid;
 
if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) ||
!entry || !entry->lo.fields_vapic.guest_mode)
@@ -3858,6 +3859,7 @@ int amd_iommu_deactivate_guest_mode(void *data)
entry->lo.val = 0;
entry->hi.val = 0;
 
+   entry->lo.fields_remap.valid   = valid;
entry->lo.fields_remap.dm  = apic->irq_dest_mode;
entry->lo.fields_remap.int_type= apic->irq_delivery_mode;
entry->hi.fields.vector= cfg->vector;
-- 
2.17.1
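
The save/restore pattern in the patch above can be sketched in userspace as follows. This is a simplified model under stated assumptions: the entry is reduced to one 64-bit word, the valid bit position is hypothetical, and the toy_* names are not the driver's. It contrasts reprogramming that zeroes the word (and loses the valid bit) with reprogramming that restores the prior state of that bit.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_VALID 1ULL /* hypothetical position of the RemapEn/valid bit */

struct toy_irte_lo { uint64_t val; };

/* Zero the word and program the new fields: the valid bit is lost. */
static void toy_reprogram_buggy(struct toy_irte_lo *lo, uint64_t fields)
{
	assert(lo);
	lo->val = 0;
	lo->val |= fields & ~TOY_VALID;
}

/* Save the valid bit first, then restore it after zeroing the word. */
static void toy_reprogram_fixed(struct toy_irte_lo *lo, uint64_t fields)
{
	uint64_t valid;

	assert(lo);
	valid = lo->val & TOY_VALID;
	lo->val = 0;
	lo->val |= valid;
	lo->val |= fields & ~TOY_VALID;
}
```

An entry that was valid before reprogramming stays valid with the second variant, and an entry that was not valid stays not valid.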
