Re: Re: Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-06-30 Thread Yongji Xie
On Wed, Jun 30, 2021 at 5:51 PM Stefan Hajnoczi  wrote:
>
> On Tue, Jun 29, 2021 at 10:59:51AM +0800, Yongji Xie wrote:
> > On Mon, Jun 28, 2021 at 9:02 PM Stefan Hajnoczi  wrote:
> > >
> > > On Tue, Jun 15, 2021 at 10:13:30PM +0800, Xie Yongji wrote:
> > > > +/* ioctls */
> > > > +
> > > > +struct vduse_dev_config {
> > > > + char name[VDUSE_NAME_MAX]; /* vduse device name */
> > > > + __u32 vendor_id; /* virtio vendor id */
> > > > + __u32 device_id; /* virtio device id */
> > > > + __u64 features; /* device features */
> > > > + __u64 bounce_size; /* bounce buffer size for iommu */
> > > > + __u16 vq_size_max; /* the max size of virtqueue */
> > >
> > > The VIRTIO specification allows per-virtqueue sizes. A device can have
> > > two virtqueues, where the first one allows up to 1024 descriptors and
> > > the second one allows only 128 descriptors, for example.
> > >
> >
> > Good point! But it looks like virtio-vdpa/virtio-pci doesn't support
> > that now. All virtqueues have the same maximum size.
>
> I see struct vpda_config_ops only supports a per-device max vq size:
> u16 (*get_vq_num_max)(struct vdpa_device *vdev);
>
> virtio-pci supports per-virtqueue sizes because the struct
> virtio_pci_common_cfg->queue_size register is per-queue (controlled by
> queue_select).
>

Oh, yes. I miss queue_select.

> I guess this is a question for Jason: will vdpa will keep this limitation?
> If yes, then VDUSE can stick to it too without running into problems in
> the future.
>
> > > > + __u16 padding; /* padding */
> > > > + __u32 vq_num; /* the number of virtqueues */
> > > > + __u32 vq_align; /* the allocation alignment of virtqueue's 
> > > > metadata */
> > >
> > > I'm not sure what this is?
> > >
> >
> >  This will be used by vring_create_virtqueue() too.
>
> If there is no official definition for the meaning of this value then
> "/* same as vring_create_virtqueue()'s vring_align parameter */" would
> be clearer. That way the reader knows what to research in order to
> understand how this field works.
>

OK.

> I don't remember but maybe it was used to support vrings when the
> host/guest have non-4KB page sizes. I wonder if anyone has an official
> definition for this value?

Not sure. Maybe we might need some alignment which is less than
PAGE_SIZE sometimes.

Thanks,
Yongji
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v4] swiotlb: fix implicit debugfs declarations

2021-06-30 Thread Claire Chang
Factor out the debugfs bits from rmem_swiotlb_device_init() into a separate
rmem_swiotlb_debugfs_init() to fix the implicit debugfs declarations.

Fixes: 461021875c50 ("swiotlb: Add restricted DMA pool initialization")
Reported-by: kernel test robot 
Signed-off-by: Claire Chang 
---
 kernel/dma/swiotlb.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 0ffbaae9fba2..b7f76bca89bf 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -712,6 +712,21 @@ late_initcall(swiotlb_create_default_debugfs);
 #endif
 
 #ifdef CONFIG_DMA_RESTRICTED_POOL
+
+#ifdef CONFIG_DEBUG_FS
+static void rmem_swiotlb_debugfs_init(struct reserved_mem *rmem)
+{
+   struct io_tlb_mem *mem = rmem->priv;
+
+   mem->debugfs = debugfs_create_dir(rmem->name, debugfs_dir);
+   swiotlb_create_debugfs_files(mem);
+}
+#else
+static void rmem_swiotlb_debugfs_init(struct reserved_mem *rmem)
+{
+}
+#endif
+
 struct page *swiotlb_alloc(struct device *dev, size_t size)
 {
struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
@@ -766,11 +781,7 @@ static int rmem_swiotlb_device_init(struct reserved_mem 
*rmem,
 
rmem->priv = mem;
 
-   if (IS_ENABLED(CONFIG_DEBUG_FS)) {
-   mem->debugfs =
-   debugfs_create_dir(rmem->name, debugfs_dir);
-   swiotlb_create_debugfs_files(mem);
-   }
+   rmem_swiotlb_debugfs_init(rmem);
}
 
dev->dma_io_tlb_mem = mem;
-- 
2.32.0.93.g670b81a890-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH] iommu/arm: Cleanup resources in case of probe error path

2021-06-30 Thread Krishna Reddy
> +
> +err_pm_disable:
> +   pm_runtime_disable(dev);
>  return ret;
>   }

Should it be pm_runtime_force_suspend()?  qcom_iommu_device_remove() doesn't 
use pm_runtime_disable(dev).

875 static int qcom_iommu_device_remove(struct platform_device *pdev)
876 {
...
881 >---pm_runtime_force_suspend(&pdev->dev);
882 >---platform_set_drvdata(pdev, NULL);
883 >---iommu_device_sysfs_remove(&qcom_iommu->iommu);
884 >---iommu_device_unregister(&qcom_iommu->iommu);
...
887 }

-KR

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH rdma-next v1 1/2] lib/scatterlist: Fix wrong update of orig_nents

2021-06-30 Thread Leon Romanovsky
On Wed, Jun 30, 2021 at 07:44:21PM +0200, Thierry Reding wrote:
> On Wed, Jun 30, 2021 at 01:12:26PM +0200, Marek Szyprowski wrote:
> > Hi Leon,
> > 
> > On 29.06.2021 10:40, Leon Romanovsky wrote:
> > > From: Maor Gottlieb 
> > >
> > > orig_nents should represent the number of entries with pages,
> > > but __sg_alloc_table_from_pages sets orig_nents as the number of
> > > total entries in the table. This is wrong when the API is used for
> > > dynamic allocation where not all the table entries are mapped with
> > > pages. It wasn't observed until now, since RDMA umem who uses this
> > > API in the dynamic form doesn't use orig_nents implicit or explicit
> > > by the scatterlist APIs.
> > >
> > > Fix it by:
> > > 1. Set orig_nents as number of entries with pages also in
> > > __sg_alloc_table_from_pages.
> > > 2. Add a new field total_nents to reflect the total number of entries
> > > in the table. This is required for the release flow (sg_free_table).
> > > This filed should be used internally only by scatterlist.
> > >
> > > Fixes: 07da1223ec93 ("lib/scatterlist: Add support in dynamic allocation 
> > > of SG table from pages")
> > > Signed-off-by: Maor Gottlieb 
> > > Signed-off-by: Leon Romanovsky 
> > 
> > This patch landed in linux-next 20210630 as commit a52724456928 
> > ("lib/scatterlist: Fix wrong update of orig_nents"). It causes serious 
> > regression in DMA-IOMMU integration, which can be observed for example 
> > on ARM Juno board during boot:
> > 
> > Unable to handle kernel paging request at virtual address 00376f42a6e40454
> > Mem abort info:
> >    ESR = 0x9604
> >    EC = 0x25: DABT (current EL), IL = 32 bits
> >    SET = 0, FnV = 0
> >    EA = 0, S1PTW = 0
> >    FSC = 0x04: level 0 translation fault
> > Data abort info:
> >    ISV = 0, ISS = 0x0004
> >    CM = 0, WnR = 0
> > [00376f42a6e40454] address between user and kernel address ranges
> > Internal error: Oops: 9604 [#1] PREEMPT SMP
> > Modules linked in:
> > CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.13.0-next-20210630+ #3585
> > Hardware name: ARM Juno development board (r1) (DT)
> > pstate: 8005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> > pc : __sg_free_table+0x60/0xa0
> > lr : __sg_free_table+0x7c/0xa0
> > ..
> > Call trace:
> >   __sg_free_table+0x60/0xa0
> >   sg_free_table+0x1c/0x28
> >   iommu_dma_alloc+0xc8/0x388
> >   dma_alloc_attrs+0xcc/0xf0
> >   dmam_alloc_attrs+0x68/0xb8
> >   sil24_port_start+0x60/0xe0
> >   ata_host_start.part.32+0xbc/0x208
> >   ata_host_activate+0x64/0x150
> >   sil24_init_one+0x1e8/0x268
> >   local_pci_probe+0x3c/0xa0
> >   pci_device_probe+0x128/0x1c8
> >   really_probe+0x138/0x2d0
> >   __driver_probe_device+0x78/0xd8
> >   driver_probe_device+0x40/0x110
> >   __driver_attach+0xcc/0x118
> >   bus_for_each_dev+0x68/0xc8
> >   driver_attach+0x20/0x28
> >   bus_add_driver+0x168/0x1f8
> >   driver_register+0x60/0x110
> >   __pci_register_driver+0x5c/0x68
> >   sil24_pci_driver_init+0x20/0x28
> >   do_one_initcall+0x84/0x450
> >   kernel_init_freeable+0x31c/0x38c
> >   kernel_init+0x20/0x120
> >   ret_from_fork+0x10/0x18
> > Code: d37be885 6b01007f 5284 54a2 (f8656813)
> > ---[ end trace 4ba4f0c9c48711a1 ]---
> > Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b
> > 
> > It looks that some changes to the scatterlist structures are missing 
> > outside of the lib/scatterlist.c.
> > 
> > For now I would suggest to revert this change.
> 
> I see a very similar crash on Tegra during the HDA driver's probe.
> 
> Leon, let me know if you need a tester for a replacement patch.

With a great pleasure, I'll contact you offline when we prepare it.

For now, this patch will be dropped.

Thanks

> 
> Thanks,
> Thierry


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH rdma-next v1 1/2] lib/scatterlist: Fix wrong update of orig_nents

2021-06-30 Thread Thierry Reding
On Wed, Jun 30, 2021 at 01:12:26PM +0200, Marek Szyprowski wrote:
> Hi Leon,
> 
> On 29.06.2021 10:40, Leon Romanovsky wrote:
> > From: Maor Gottlieb 
> >
> > orig_nents should represent the number of entries with pages,
> > but __sg_alloc_table_from_pages sets orig_nents as the number of
> > total entries in the table. This is wrong when the API is used for
> > dynamic allocation where not all the table entries are mapped with
> > pages. It wasn't observed until now, since RDMA umem who uses this
> > API in the dynamic form doesn't use orig_nents implicit or explicit
> > by the scatterlist APIs.
> >
> > Fix it by:
> > 1. Set orig_nents as number of entries with pages also in
> > __sg_alloc_table_from_pages.
> > 2. Add a new field total_nents to reflect the total number of entries
> > in the table. This is required for the release flow (sg_free_table).
> > This filed should be used internally only by scatterlist.
> >
> > Fixes: 07da1223ec93 ("lib/scatterlist: Add support in dynamic allocation of 
> > SG table from pages")
> > Signed-off-by: Maor Gottlieb 
> > Signed-off-by: Leon Romanovsky 
> 
> This patch landed in linux-next 20210630 as commit a52724456928 
> ("lib/scatterlist: Fix wrong update of orig_nents"). It causes serious 
> regression in DMA-IOMMU integration, which can be observed for example 
> on ARM Juno board during boot:
> 
> Unable to handle kernel paging request at virtual address 00376f42a6e40454
> Mem abort info:
>    ESR = 0x9604
>    EC = 0x25: DABT (current EL), IL = 32 bits
>    SET = 0, FnV = 0
>    EA = 0, S1PTW = 0
>    FSC = 0x04: level 0 translation fault
> Data abort info:
>    ISV = 0, ISS = 0x0004
>    CM = 0, WnR = 0
> [00376f42a6e40454] address between user and kernel address ranges
> Internal error: Oops: 9604 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.13.0-next-20210630+ #3585
> Hardware name: ARM Juno development board (r1) (DT)
> pstate: 8005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> pc : __sg_free_table+0x60/0xa0
> lr : __sg_free_table+0x7c/0xa0
> ..
> Call trace:
>   __sg_free_table+0x60/0xa0
>   sg_free_table+0x1c/0x28
>   iommu_dma_alloc+0xc8/0x388
>   dma_alloc_attrs+0xcc/0xf0
>   dmam_alloc_attrs+0x68/0xb8
>   sil24_port_start+0x60/0xe0
>   ata_host_start.part.32+0xbc/0x208
>   ata_host_activate+0x64/0x150
>   sil24_init_one+0x1e8/0x268
>   local_pci_probe+0x3c/0xa0
>   pci_device_probe+0x128/0x1c8
>   really_probe+0x138/0x2d0
>   __driver_probe_device+0x78/0xd8
>   driver_probe_device+0x40/0x110
>   __driver_attach+0xcc/0x118
>   bus_for_each_dev+0x68/0xc8
>   driver_attach+0x20/0x28
>   bus_add_driver+0x168/0x1f8
>   driver_register+0x60/0x110
>   __pci_register_driver+0x5c/0x68
>   sil24_pci_driver_init+0x20/0x28
>   do_one_initcall+0x84/0x450
>   kernel_init_freeable+0x31c/0x38c
>   kernel_init+0x20/0x120
>   ret_from_fork+0x10/0x18
> Code: d37be885 6b01007f 5284 54a2 (f8656813)
> ---[ end trace 4ba4f0c9c48711a1 ]---
> Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b
> 
> It looks that some changes to the scatterlist structures are missing 
> outside of the lib/scatterlist.c.
> 
> For now I would suggest to revert this change.

I see a very similar crash on Tegra during the HDA driver's probe.

Leon, let me know if you need a tester for a replacement patch.

Thanks,
Thierry


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v15 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing

2021-06-30 Thread Nathan Chancellor
Hi Will and Claire,

On Wed, Jun 30, 2021 at 12:43:48PM +0100, Will Deacon wrote:
> On Wed, Jun 30, 2021 at 05:17:27PM +0800, Claire Chang wrote:
> > On Wed, Jun 30, 2021 at 9:43 AM Nathan Chancellor  wrote:
> > >
> > > On Thu, Jun 24, 2021 at 11:55:20PM +0800, Claire Chang wrote:
> > > > Propagate the swiotlb_force into io_tlb_default_mem->force_bounce and
> > > > use it to determine whether to bounce the data or not. This will be
> > > > useful later to allow for different pools.
> > > >
> > > > Signed-off-by: Claire Chang 
> > > > Reviewed-by: Christoph Hellwig 
> > > > Tested-by: Stefano Stabellini 
> > > > Tested-by: Will Deacon 
> > > > Acked-by: Stefano Stabellini 
> > >
> > > This patch as commit af452ec1b1a3 ("swiotlb: Use is_swiotlb_force_bounce
> > > for swiotlb data bouncing") causes my Ryzen 3 4300G system to fail to
> > > get to an X session consistently (although not every single time),
> > > presumably due to a crash in the AMDGPU driver that I see in dmesg.
> > >
> > > I have attached logs at af452ec1b1a3 and f127c9556a8e and I am happy
> > > to provide any further information, debug, or test patches as necessary.
> > 
> > Are you using swiotlb=force? or the swiotlb_map is called because of
> > !dma_capable? 
> > (https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/kernel/dma/direct.h#n93)
> 
> The command line is in the dmesg:
> 
>   | Kernel command line: initrd=\amd-ucode.img 
> initrd=\initramfs-linux-next-llvm.img 
> root=PARTUUID=8680aa0c-cf09-4a69-8cf3-970478040ee7 rw intel_pstate=no_hwp 
> irqpoll
> 
> but I worry that this looks _very_ similar to the issue reported by Qian
> Cai which we thought we had fixed. Nathan -- is the failure deterministic?

Yes, for the most part. It does not happen every single boot so when I
was bisecting, I did a series of seven boots and only considered the
revision good when all seven of them made it to LightDM's greeter. My
results that I notated show most bad revisions failed anywhere from four
to six times.

> > `BUG: unable to handle page fault for address: 003a8290` and
> > the fact it crashed at `_raw_spin_lock_irqsave` look like the memory
> > (maybe dev->dma_io_tlb_mem) was corrupted?
> > The dev->dma_io_tlb_mem should be set here
> > (https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/pci/probe.c#n2528)
> > through device_initialize.
> 
> I'm less sure about this. 'dma_io_tlb_mem' should be pointing at
> 'io_tlb_default_mem', which is a page-aligned allocation from memblock.
> The spinlock is at offset 0x24 in that structure, and looking at the
> register dump from the crash:
> 
> Jun 29 18:28:42 hp-4300G kernel: RSP: 0018:adb4013db9e8 EFLAGS: 00010006
> Jun 29 18:28:42 hp-4300G kernel: RAX: 003a8290 RBX:  
> RCX: 8900572ad580
> Jun 29 18:28:42 hp-4300G kernel: RDX: 89005653f024 RSI: 000c 
> RDI: 1d17
> Jun 29 18:28:42 hp-4300G kernel: RBP: 0a20d000 R08: 000c 
> R09: 
> Jun 29 18:28:42 hp-4300G kernel: R10: 0a20d000 R11: 89005653f000 
> R12: 0212
> Jun 29 18:28:42 hp-4300G kernel: R13: 1000 R14: 0002 
> R15: 0020
> Jun 29 18:28:42 hp-4300G kernel: FS:  7f1f8898ea40() 
> GS:89005728() knlGS:
> Jun 29 18:28:42 hp-4300G kernel: CS:  0010 DS:  ES:  CR0: 
> 80050033
> Jun 29 18:28:42 hp-4300G kernel: CR2: 003a8290 CR3: 0001020d 
> CR4: 00350ee0
> Jun 29 18:28:42 hp-4300G kernel: Call Trace:
> Jun 29 18:28:42 hp-4300G kernel:  _raw_spin_lock_irqsave+0x39/0x50
> Jun 29 18:28:42 hp-4300G kernel:  swiotlb_tbl_map_single+0x12b/0x4c0
> 
> Then that correlates with R11 holding the 'dma_io_tlb_mem' pointer and
> RDX pointing at the spinlock. Yet RAX is holding junk :/
> 
> I agree that enabling KASAN would be a good idea, but I also think we
> probably need to get some more information out of swiotlb_tbl_map_single()
> to see see what exactly is going wrong in there.

I can certainly enable KASAN and if there is any debug print I can add
or dump anything, let me know!

Cheers,
Nathan
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 3/3] iommu: dart: Add DART iommu driver

2021-06-30 Thread Alyssa Rosenzweig
Looks really good! Just a few minor comments. With them addressed,

Reviewed-by: Alyssa Rosenzweig 

> +   Say Y here if you are using an Apple SoC with a DART IOMMU.

Nit: Do we need to spell out "with a DART IOMMU"? Don't all the apple
socs need DART?

> +/*
> + * This structure is used to identify a single stream attached to a domain.
> + * It's used as a list inside that domain to be able to attach multiple
> + * streams to a single domain. Since multiple devices can use a single stream
> + * it additionally keeps track of how many devices are represented by this
> + * stream. Once that number reaches zero it is detached from the IOMMU domain
> + * and all translations from this stream are disabled.
> + *
> + * @dart: DART instance to which this stream belongs
> + * @sid: stream id within the DART instance
> + * @num_devices: count of devices attached to this stream
> + * @stream_head: list head for the next stream
> + */
> +struct apple_dart_stream {
> + struct apple_dart *dart;
> + u32 sid;
> +
> + u32 num_devices;
> +
> + struct list_head stream_head;
> +};

It wasn't obvious to me why we can get away without reference counting.
Looking ahead it looks like we assert locks in each case. Maybe add
that to the comment?

```
> +static void apple_dart_hw_set_ttbr(struct apple_dart *dart, u16 sid, u16 idx,
> +phys_addr_t paddr)
> +{
> + writel(DART_TTBR_VALID | (paddr >> DART_TTBR_SHIFT),
> +dart->regs + DART_TTBR(sid, idx));
> +}
```

Should we be checking alignment here? Something like

BUG_ON(paddr & ((1 << DART_TTBR_SHIFT) - 1));
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 1/3] iommu: io-pgtable: add DART pagetable format

2021-06-30 Thread Alyssa Rosenzweig
Reviewed-by: Alyssa Rosenzweig 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 2/3] dt-bindings: iommu: add DART iommu bindings

2021-06-30 Thread Alyssa Rosenzweig
Reviewed-by: Alyssa Rosenzweig 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm: Cleanup resources in case of probe error path

2021-06-30 Thread Robin Murphy

On 2021-06-30 14:48, Marek Szyprowski wrote:

On 30.06.2021 14:59, Will Deacon wrote:

On Wed, Jun 30, 2021 at 02:48:15PM +0200, Marek Szyprowski wrote:

On 08.06.2021 18:45, Amey Narkhede wrote:

If device registration fails, remove sysfs attribute
and if setting bus callbacks fails, unregister the device
and cleanup the sysfs attribute.

Signed-off-by: Amey Narkhede 

This patch landed in linux-next some time ago as commit 249c9dc6aa0d
("iommu/arm: Cleanup resources in case of probe error path"). After
bisecting and some manual searching I finally found that it is
responsible for breaking s2idle on DragonBoard 410c. Here is the log
(captured with no_console_suspend):

# time rtcwake -s10 -mmem
rtcwake: wakeup from "mem" using /dev/rtc0 at Thu Jan  1 00:02:13 1970
PM: suspend entry (s2idle)
Filesystems sync: 0.002 seconds
Freezing user space processes ... (elapsed 0.006 seconds) done.
OOM killer disabled.
Freezing remaining freezable tasks ... (elapsed 0.004 seconds) done.
Unable to handle kernel NULL pointer dereference at virtual address
0070
Mem abort info:
     ESR = 0x9606
     EC = 0x25: DABT (current EL), IL = 32 bits
     SET = 0, FnV = 0
     EA = 0, S1PTW = 0
     FSC = 0x06: level 2 translation fault
Data abort info:
     ISV = 0, ISS = 0x0006
     CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=8ad08000
[0070] pgd=080085c3c003, p4d=080085c3c003,
pud=080088dcf003, pmd=
Internal error: Oops: 9606 [#1] PREEMPT SMP
Modules linked in: bluetooth ecdh_generic ecc rfkill ipv6 ax88796b
venus_enc venus_dec videobuf2_dma_contig asix crct10dif_ce adv7511
snd_soc_msm8916_analog qcom_spmi_temp_alarm rtc_pm8xxx qcom_pon
qcom_camss qcom_spmi_vadc videobuf2_dma_sg qcom_vadc_common msm
venus_core v4l2_fwnode v4l2_async snd_soc_msm8916_digital
videobuf2_memops snd_soc_lpass_apq8016 snd_soc_lpass_cpu v4l2_mem2mem
snd_soc_lpass_platform snd_soc_apq8016_sbc videobuf2_v4l2
snd_soc_qcom_common qcom_rng videobuf2_common i2c_qcom_cci qnoc_msm8916
videodev mc icc_smd_rpm mdt_loader socinfo display_connector rmtfs_mem
CPU: 1 PID: 1522 Comm: rtcwake Not tainted 5.13.0-next-20210629 #3592
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
pstate: 8005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
pc : msm_runtime_suspend+0x1c/0x60 [msm]
lr : msm_pm_suspend+0x18/0x38 [msm]
...
Call trace:
    msm_runtime_suspend+0x1c/0x60 [msm]
    msm_pm_suspend+0x18/0x38 [msm]
    dpm_run_callback+0x84/0x378

I wonder if we're missing a pm_runtime_disable() call on the failure path?
i.e. something like the diff below...


I've checked and it doesn't fix anything.


What's happened previously? Has an IOMMU actually failed to probe, or is 
this a fiddly "code movement unveils latent bug elsewhere" kind of 
thing? There doesn't look to be much capable of going wrong in 
msm_runtime_suspend() itself, so is the DRM driver also in a broken 
half-probed state where it's left its pm_runtime_ops behind without its 
drvdata being valid?


Robin.




Will

--->8

diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c 
b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index 25ed444ff94d..ce8f354755d0 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -836,14 +836,14 @@ static int qcom_iommu_device_probe(struct platform_device 
*pdev)
  ret = devm_of_platform_populate(dev);
  if (ret) {
  dev_err(dev, "Failed to populate iommu contexts\n");
-   return ret;
+   goto err_pm_disable;
  }
   
  ret = iommu_device_sysfs_add(&qcom_iommu->iommu, dev, NULL,

   dev_name(dev));
  if (ret) {
  dev_err(dev, "Failed to register iommu in sysfs\n");
-   return ret;
+   goto err_pm_disable;
  }
   
  ret = iommu_device_register(&qcom_iommu->iommu, &qcom_iommu_ops, dev);

@@ -869,6 +869,9 @@ static int qcom_iommu_device_probe(struct platform_device 
*pdev)
   
   err_sysfs_remove:

  iommu_device_sysfs_remove(&qcom_iommu->iommu);
+
+err_pm_disable:
+   pm_runtime_disable(dev);
  return ret;
   }


Best regards


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3] swiotlb: fix implicit debugfs_create_dir declaration

2021-06-30 Thread Robin Murphy

On 2021-06-30 05:06, Claire Chang wrote:

Factor out the debugfs bits from rmem_swiotlb_device_init() into a separate
rmem_swiotlb_debugfs_init().

Fixes: 461021875c50 ("swiotlb: Add restricted DMA pool initialization")
Reported-by: kernel test robot 
Signed-off-by: Claire Chang 
---
  kernel/dma/swiotlb.c | 17 -
  1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 0ffbaae9fba2..355dc98d381c 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -709,6 +709,16 @@ static int __init swiotlb_create_default_debugfs(void)
  
  late_initcall(swiotlb_create_default_debugfs);
  
+static void rmem_swiotlb_debugfs_init(const char *name, struct io_tlb_mem *mem)


Nit: maybe just pass in the reserved_mem, then dereference rmem->name 
and rmem->priv inside here?



+{
+   mem->debugfs = debugfs_create_dir(name, debugfs_dir);
+   swiotlb_create_debugfs_files(mem);
+}
+#else
+__maybe_unused static void rmem_swiotlb_debugfs_init(const char *name,
+struct io_tlb_mem *mem)
+{
+}
  #endif
  
  #ifdef CONFIG_DMA_RESTRICTED_POOL

@@ -766,11 +776,8 @@ static int rmem_swiotlb_device_init(struct reserved_mem 
*rmem,
  
  		rmem->priv = mem;
  
-		if (IS_ENABLED(CONFIG_DEBUG_FS)) {

-   mem->debugfs =
-   debugfs_create_dir(rmem->name, debugfs_dir);
-   swiotlb_create_debugfs_files(mem);
-   }
+   if (IS_ENABLED(CONFIG_DEBUG_FS))
+   rmem_swiotlb_debugfs_init(rmem->name, mem);


Just make the call unconditional (and drop the __maybe_unused annotation 
above) - the stub already does nothing for the DEBUG_FS=n case, so the 
IS_ENABLED() is pointless.


Robin.


}
  
  	dev->dma_io_tlb_mem = mem;



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm: Cleanup resources in case of probe error path

2021-06-30 Thread Marek Szyprowski
On 30.06.2021 14:59, Will Deacon wrote:
> On Wed, Jun 30, 2021 at 02:48:15PM +0200, Marek Szyprowski wrote:
>> On 08.06.2021 18:45, Amey Narkhede wrote:
>>> If device registration fails, remove sysfs attribute
>>> and if setting bus callbacks fails, unregister the device
>>> and cleanup the sysfs attribute.
>>>
>>> Signed-off-by: Amey Narkhede 
>> This patch landed in linux-next some time ago as commit 249c9dc6aa0d
>> ("iommu/arm: Cleanup resources in case of probe error path"). After
>> bisecting and some manual searching I finally found that it is
>> responsible for breaking s2idle on DragonBoard 410c. Here is the log
>> (captured with no_console_suspend):
>>
>> # time rtcwake -s10 -mmem
>> rtcwake: wakeup from "mem" using /dev/rtc0 at Thu Jan  1 00:02:13 1970
>> PM: suspend entry (s2idle)
>> Filesystems sync: 0.002 seconds
>> Freezing user space processes ... (elapsed 0.006 seconds) done.
>> OOM killer disabled.
>> Freezing remaining freezable tasks ... (elapsed 0.004 seconds) done.
>> Unable to handle kernel NULL pointer dereference at virtual address
>> 0070
>> Mem abort info:
>>     ESR = 0x9606
>>     EC = 0x25: DABT (current EL), IL = 32 bits
>>     SET = 0, FnV = 0
>>     EA = 0, S1PTW = 0
>>     FSC = 0x06: level 2 translation fault
>> Data abort info:
>>     ISV = 0, ISS = 0x0006
>>     CM = 0, WnR = 0
>> user pgtable: 4k pages, 48-bit VAs, pgdp=8ad08000
>> [0070] pgd=080085c3c003, p4d=080085c3c003,
>> pud=080088dcf003, pmd=
>> Internal error: Oops: 9606 [#1] PREEMPT SMP
>> Modules linked in: bluetooth ecdh_generic ecc rfkill ipv6 ax88796b
>> venus_enc venus_dec videobuf2_dma_contig asix crct10dif_ce adv7511
>> snd_soc_msm8916_analog qcom_spmi_temp_alarm rtc_pm8xxx qcom_pon
>> qcom_camss qcom_spmi_vadc videobuf2_dma_sg qcom_vadc_common msm
>> venus_core v4l2_fwnode v4l2_async snd_soc_msm8916_digital
>> videobuf2_memops snd_soc_lpass_apq8016 snd_soc_lpass_cpu v4l2_mem2mem
>> snd_soc_lpass_platform snd_soc_apq8016_sbc videobuf2_v4l2
>> snd_soc_qcom_common qcom_rng videobuf2_common i2c_qcom_cci qnoc_msm8916
>> videodev mc icc_smd_rpm mdt_loader socinfo display_connector rmtfs_mem
>> CPU: 1 PID: 1522 Comm: rtcwake Not tainted 5.13.0-next-20210629 #3592
>> Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
>> pstate: 8005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
>> pc : msm_runtime_suspend+0x1c/0x60 [msm]
>> lr : msm_pm_suspend+0x18/0x38 [msm]
>> ...
>> Call trace:
>>    msm_runtime_suspend+0x1c/0x60 [msm]
>>    msm_pm_suspend+0x18/0x38 [msm]
>>    dpm_run_callback+0x84/0x378
> I wonder if we're missing a pm_runtime_disable() call on the failure path?
> i.e. something like the diff below...

I've checked and it doesn't fix anything.

> Will
>
> --->8
>
> diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c 
> b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
> index 25ed444ff94d..ce8f354755d0 100644
> --- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
> +++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
> @@ -836,14 +836,14 @@ static int qcom_iommu_device_probe(struct 
> platform_device *pdev)
>  ret = devm_of_platform_populate(dev);
>  if (ret) {
>  dev_err(dev, "Failed to populate iommu contexts\n");
> -   return ret;
> +   goto err_pm_disable;
>  }
>   
>  ret = iommu_device_sysfs_add(&qcom_iommu->iommu, dev, NULL,
>   dev_name(dev));
>  if (ret) {
>  dev_err(dev, "Failed to register iommu in sysfs\n");
> -   return ret;
> +   goto err_pm_disable;
>  }
>   
>  ret = iommu_device_register(&qcom_iommu->iommu, &qcom_iommu_ops, 
> dev);
> @@ -869,6 +869,9 @@ static int qcom_iommu_device_probe(struct platform_device 
> *pdev)
>   
>   err_sysfs_remove:
>  iommu_device_sysfs_remove(&qcom_iommu->iommu);
> +
> +err_pm_disable:
> +   pm_runtime_disable(dev);
>  return ret;
>   }
>
Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] iommu/arm: Cleanup resources in case of probe error path

2021-06-30 Thread Will Deacon
On Wed, Jun 30, 2021 at 02:48:15PM +0200, Marek Szyprowski wrote:
> On 08.06.2021 18:45, Amey Narkhede wrote:
> > If device registration fails, remove sysfs attribute
> > and if setting bus callbacks fails, unregister the device
> > and cleanup the sysfs attribute.
> >
> > Signed-off-by: Amey Narkhede 
> 
> This patch landed in linux-next some time ago as commit 249c9dc6aa0d 
> ("iommu/arm: Cleanup resources in case of probe error path"). After 
> bisecting and some manual searching I finally found that it is 
> responsible for breaking s2idle on DragonBoard 410c. Here is the log 
> (captured with no_console_suspend):
> 
> # time rtcwake -s10 -mmem
> rtcwake: wakeup from "mem" using /dev/rtc0 at Thu Jan  1 00:02:13 1970
> PM: suspend entry (s2idle)
> Filesystems sync: 0.002 seconds
> Freezing user space processes ... (elapsed 0.006 seconds) done.
> OOM killer disabled.
> Freezing remaining freezable tasks ... (elapsed 0.004 seconds) done.
> Unable to handle kernel NULL pointer dereference at virtual address 
> 0070
> Mem abort info:
>    ESR = 0x9606
>    EC = 0x25: DABT (current EL), IL = 32 bits
>    SET = 0, FnV = 0
>    EA = 0, S1PTW = 0
>    FSC = 0x06: level 2 translation fault
> Data abort info:
>    ISV = 0, ISS = 0x0006
>    CM = 0, WnR = 0
> user pgtable: 4k pages, 48-bit VAs, pgdp=8ad08000
> [0070] pgd=080085c3c003, p4d=080085c3c003, 
> pud=080088dcf003, pmd=
> Internal error: Oops: 9606 [#1] PREEMPT SMP
> Modules linked in: bluetooth ecdh_generic ecc rfkill ipv6 ax88796b 
> venus_enc venus_dec videobuf2_dma_contig asix crct10dif_ce adv7511 
> snd_soc_msm8916_analog qcom_spmi_temp_alarm rtc_pm8xxx qcom_pon 
> qcom_camss qcom_spmi_vadc videobuf2_dma_sg qcom_vadc_common msm 
> venus_core v4l2_fwnode v4l2_async snd_soc_msm8916_digital 
> videobuf2_memops snd_soc_lpass_apq8016 snd_soc_lpass_cpu v4l2_mem2mem 
> snd_soc_lpass_platform snd_soc_apq8016_sbc videobuf2_v4l2 
> snd_soc_qcom_common qcom_rng videobuf2_common i2c_qcom_cci qnoc_msm8916 
> videodev mc icc_smd_rpm mdt_loader socinfo display_connector rmtfs_mem
> CPU: 1 PID: 1522 Comm: rtcwake Not tainted 5.13.0-next-20210629 #3592
> Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> pstate: 8005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> pc : msm_runtime_suspend+0x1c/0x60 [msm]
> lr : msm_pm_suspend+0x18/0x38 [msm]
> ...
> Call trace:
>   msm_runtime_suspend+0x1c/0x60 [msm]
>   msm_pm_suspend+0x18/0x38 [msm]
>   dpm_run_callback+0x84/0x378

I wonder if we're missing a pm_runtime_disable() call on the failure path?
i.e. something like the diff below...

Will

--->8

diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c 
b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index 25ed444ff94d..ce8f354755d0 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -836,14 +836,14 @@ static int qcom_iommu_device_probe(struct platform_device 
*pdev)
ret = devm_of_platform_populate(dev);
if (ret) {
dev_err(dev, "Failed to populate iommu contexts\n");
-   return ret;
+   goto err_pm_disable;
}
 
ret = iommu_device_sysfs_add(&qcom_iommu->iommu, dev, NULL,
 dev_name(dev));
if (ret) {
dev_err(dev, "Failed to register iommu in sysfs\n");
-   return ret;
+   goto err_pm_disable;
}
 
ret = iommu_device_register(&qcom_iommu->iommu, &qcom_iommu_ops, dev);
@@ -869,6 +869,9 @@ static int qcom_iommu_device_probe(struct platform_device 
*pdev)
 
 err_sysfs_remove:
iommu_device_sysfs_remove(&qcom_iommu->iommu);
+
+err_pm_disable:
+   pm_runtime_disable(dev);
return ret;
 }
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm: Cleanup resources in case of probe error path

2021-06-30 Thread Marek Szyprowski
Hi,

On 08.06.2021 18:45, Amey Narkhede wrote:
> If device registration fails, remove sysfs attribute
> and if setting bus callbacks fails, unregister the device
> and cleanup the sysfs attribute.
>
> Signed-off-by: Amey Narkhede 

This patch landed in linux-next some time ago as commit 249c9dc6aa0d 
("iommu/arm: Cleanup resources in case of probe error path"). After 
bisecting and some manual searching I finally found that it is 
responsible for breaking s2idle on DragonBoard 410c. Here is the log 
(captured with no_console_suspend):

# time rtcwake -s10 -mmem
rtcwake: wakeup from "mem" using /dev/rtc0 at Thu Jan  1 00:02:13 1970
PM: suspend entry (s2idle)
Filesystems sync: 0.002 seconds
Freezing user space processes ... (elapsed 0.006 seconds) done.
OOM killer disabled.
Freezing remaining freezable tasks ... (elapsed 0.004 seconds) done.
Unable to handle kernel NULL pointer dereference at virtual address 
0070
Mem abort info:
   ESR = 0x9606
   EC = 0x25: DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
   FSC = 0x06: level 2 translation fault
Data abort info:
   ISV = 0, ISS = 0x0006
   CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=8ad08000
[0070] pgd=080085c3c003, p4d=080085c3c003, 
pud=080088dcf003, pmd=
Internal error: Oops: 9606 [#1] PREEMPT SMP
Modules linked in: bluetooth ecdh_generic ecc rfkill ipv6 ax88796b 
venus_enc venus_dec videobuf2_dma_contig asix crct10dif_ce adv7511 
snd_soc_msm8916_analog qcom_spmi_temp_alarm rtc_pm8xxx qcom_pon 
qcom_camss qcom_spmi_vadc videobuf2_dma_sg qcom_vadc_common msm 
venus_core v4l2_fwnode v4l2_async snd_soc_msm8916_digital 
videobuf2_memops snd_soc_lpass_apq8016 snd_soc_lpass_cpu v4l2_mem2mem 
snd_soc_lpass_platform snd_soc_apq8016_sbc videobuf2_v4l2 
snd_soc_qcom_common qcom_rng videobuf2_common i2c_qcom_cci qnoc_msm8916 
videodev mc icc_smd_rpm mdt_loader socinfo display_connector rmtfs_mem
CPU: 1 PID: 1522 Comm: rtcwake Not tainted 5.13.0-next-20210629 #3592
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
pstate: 8005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
pc : msm_runtime_suspend+0x1c/0x60 [msm]
lr : msm_pm_suspend+0x18/0x38 [msm]
...
Call trace:
  msm_runtime_suspend+0x1c/0x60 [msm]
  msm_pm_suspend+0x18/0x38 [msm]
  dpm_run_callback+0x84/0x378
  __device_suspend+0x118/0x680
  dpm_suspend+0x150/0x4f0
  dpm_suspend_start+0x98/0xa0
  suspend_devices_and_enter+0xfc/0xaf0
  pm_suspend+0x2b0/0x3d0
  state_store+0x84/0x108
  kobj_attr_store+0x14/0x28
  sysfs_kf_write+0x60/0x70
  kernfs_fop_write_iter+0x124/0x1a8
  new_sync_write+0xe8/0x1b0
  vfs_write+0x1e8/0x450
  ksys_write+0x64/0xf0
  __arm64_sys_write+0x14/0x20
  invoke_syscall+0x40/0xf8
  el0_svc_common+0x60/0x100
  do_el0_svc_compat+0x1c/0x48
  el0_svc_compat+0x20/0x30
  el0t_32_sync_handler+0xec/0x140
  el0t_32_sync+0x168/0x16c
Code: 910003fd f9000bf3 f9403c02 52800040 (f9403842)
---[ end trace 215b72fcd7026947 ]---

Reverting it on top of linux-next fixes s2idle oepration on that board.

> ---
>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 14 --
>   drivers/iommu/arm/arm-smmu/arm-smmu.c   | 15 ---
>   drivers/iommu/arm/arm-smmu/qcom_iommu.c | 13 +++--
>   3 files changed, 35 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 54b2f27b81d4..de2499754025 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -3669,10 +3669,20 @@ static int arm_smmu_device_probe(struct 
> platform_device *pdev)
>   ret = iommu_device_register(&smmu->iommu, &arm_smmu_ops, dev);
>   if (ret) {
>   dev_err(dev, "Failed to register iommu\n");
> - return ret;
> + goto err_sysfs_remove;
>   }
>
> - return arm_smmu_set_bus_ops(&arm_smmu_ops);
> + ret = arm_smmu_set_bus_ops(&arm_smmu_ops);
> + if (ret)
> + goto err_unregister_device;
> +
> + return 0;
> +
> +err_unregister_device:
> + iommu_device_unregister(&smmu->iommu);
> +err_sysfs_remove:
> + iommu_device_sysfs_remove(&smmu->iommu);
> + return ret;
>   }
>
>   static int arm_smmu_device_remove(struct platform_device *pdev)
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index 6f72c4d208ca..88a3023676ce 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -2164,7 +2164,7 @@ static int arm_smmu_device_probe(struct platform_device 
> *pdev)
>   err = iommu_device_register(&smmu->iommu, &arm_smmu_ops, dev);
>   if (err) {
>   dev_err(dev, "Failed to register iommu\n");
> - return err;
> + goto err_sysfs_remove;
>   }
>
>   platform_set_drvdata(pdev, smmu);
> @@ -2187,10 +2187,19 @@ static int arm_smmu_device_probe(struct 
> pla

Re: [PATCH v15 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing

2021-06-30 Thread Will Deacon
On Wed, Jun 30, 2021 at 05:17:27PM +0800, Claire Chang wrote:
> On Wed, Jun 30, 2021 at 9:43 AM Nathan Chancellor  wrote:
> >
> > On Thu, Jun 24, 2021 at 11:55:20PM +0800, Claire Chang wrote:
> > > Propagate the swiotlb_force into io_tlb_default_mem->force_bounce and
> > > use it to determine whether to bounce the data or not. This will be
> > > useful later to allow for different pools.
> > >
> > > Signed-off-by: Claire Chang 
> > > Reviewed-by: Christoph Hellwig 
> > > Tested-by: Stefano Stabellini 
> > > Tested-by: Will Deacon 
> > > Acked-by: Stefano Stabellini 
> >
> > This patch as commit af452ec1b1a3 ("swiotlb: Use is_swiotlb_force_bounce
> > for swiotlb data bouncing") causes my Ryzen 3 4300G system to fail to
> > get to an X session consistently (although not every single time),
> > presumably due to a crash in the AMDGPU driver that I see in dmesg.
> >
> > I have attached logs at af452ec1b1a3 and f127c9556a8e and I am happy
> > to provide any further information, debug, or test patches as necessary.
> 
> Are you using swiotlb=force? or the swiotlb_map is called because of
> !dma_capable? 
> (https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/kernel/dma/direct.h#n93)

The command line is in the dmesg:

  | Kernel command line: initrd=\amd-ucode.img 
initrd=\initramfs-linux-next-llvm.img 
root=PARTUUID=8680aa0c-cf09-4a69-8cf3-970478040ee7 rw intel_pstate=no_hwp 
irqpoll

but I worry that this looks _very_ similar to the issue reported by Qian
Cai which we thought we had fixed. Nathan -- is the failure deterministic?

> `BUG: unable to handle page fault for address: 003a8290` and
> the fact it crashed at `_raw_spin_lock_irqsave` look like the memory
> (maybe dev->dma_io_tlb_mem) was corrupted?
> The dev->dma_io_tlb_mem should be set here
> (https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/pci/probe.c#n2528)
> through device_initialize.

I'm less sure about this. 'dma_io_tlb_mem' should be pointing at
'io_tlb_default_mem', which is a page-aligned allocation from memblock.
The spinlock is at offset 0x24 in that structure, and looking at the
register dump from the crash:

Jun 29 18:28:42 hp-4300G kernel: RSP: 0018:adb4013db9e8 EFLAGS: 00010006
Jun 29 18:28:42 hp-4300G kernel: RAX: 003a8290 RBX:  
RCX: 8900572ad580
Jun 29 18:28:42 hp-4300G kernel: RDX: 89005653f024 RSI: 000c 
RDI: 1d17
Jun 29 18:28:42 hp-4300G kernel: RBP: 0a20d000 R08: 000c 
R09: 
Jun 29 18:28:42 hp-4300G kernel: R10: 0a20d000 R11: 89005653f000 
R12: 0212
Jun 29 18:28:42 hp-4300G kernel: R13: 1000 R14: 0002 
R15: 0020
Jun 29 18:28:42 hp-4300G kernel: FS:  7f1f8898ea40() 
GS:89005728() knlGS:
Jun 29 18:28:42 hp-4300G kernel: CS:  0010 DS:  ES:  CR0: 
80050033
Jun 29 18:28:42 hp-4300G kernel: CR2: 003a8290 CR3: 0001020d 
CR4: 00350ee0
Jun 29 18:28:42 hp-4300G kernel: Call Trace:
Jun 29 18:28:42 hp-4300G kernel:  _raw_spin_lock_irqsave+0x39/0x50
Jun 29 18:28:42 hp-4300G kernel:  swiotlb_tbl_map_single+0x12b/0x4c0

Then that correlates with R11 holding the 'dma_io_tlb_mem' pointer and
RDX pointing at the spinlock. Yet RAX is holding junk :/

I agree that enabling KASAN would be a good idea, but I also think we
probably need to get some more information out of swiotlb_tbl_map_single()
to see see what exactly is going wrong in there.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH rdma-next v1 1/2] lib/scatterlist: Fix wrong update of orig_nents

2021-06-30 Thread Leon Romanovsky
On Wed, Jun 30, 2021 at 01:12:26PM +0200, Marek Szyprowski wrote:
> Hi Leon,
> 
> On 29.06.2021 10:40, Leon Romanovsky wrote:
> > From: Maor Gottlieb 
> >
> > orig_nents should represent the number of entries with pages,
> > but __sg_alloc_table_from_pages sets orig_nents as the number of
> > total entries in the table. This is wrong when the API is used for
> > dynamic allocation where not all the table entries are mapped with
> > pages. It wasn't observed until now, since RDMA umem who uses this
> > API in the dynamic form doesn't use orig_nents implicit or explicit
> > by the scatterlist APIs.
> >
> > Fix it by:
> > 1. Set orig_nents as number of entries with pages also in
> > __sg_alloc_table_from_pages.
> > 2. Add a new field total_nents to reflect the total number of entries
> > in the table. This is required for the release flow (sg_free_table).
> > This filed should be used internally only by scatterlist.
> >
> > Fixes: 07da1223ec93 ("lib/scatterlist: Add support in dynamic allocation of 
> > SG table from pages")
> > Signed-off-by: Maor Gottlieb 
> > Signed-off-by: Leon Romanovsky 

<...>

> For now I would suggest to revert this change.

Thanks for the report, we will drop this patch.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH rdma-next v1 1/2] lib/scatterlist: Fix wrong update of orig_nents

2021-06-30 Thread Marek Szyprowski
Hi Leon,

On 29.06.2021 10:40, Leon Romanovsky wrote:
> From: Maor Gottlieb 
>
> orig_nents should represent the number of entries with pages,
> but __sg_alloc_table_from_pages sets orig_nents as the number of
> total entries in the table. This is wrong when the API is used for
> dynamic allocation where not all the table entries are mapped with
> pages. It wasn't observed until now, since RDMA umem who uses this
> API in the dynamic form doesn't use orig_nents implicit or explicit
> by the scatterlist APIs.
>
> Fix it by:
> 1. Set orig_nents as number of entries with pages also in
> __sg_alloc_table_from_pages.
> 2. Add a new field total_nents to reflect the total number of entries
> in the table. This is required for the release flow (sg_free_table).
> This filed should be used internally only by scatterlist.
>
> Fixes: 07da1223ec93 ("lib/scatterlist: Add support in dynamic allocation of 
> SG table from pages")
> Signed-off-by: Maor Gottlieb 
> Signed-off-by: Leon Romanovsky 

This patch landed in linux-next 20210630 as commit a52724456928 
("lib/scatterlist: Fix wrong update of orig_nents"). It causes serious 
regression in DMA-IOMMU integration, which can be observed for example 
on ARM Juno board during boot:

Unable to handle kernel paging request at virtual address 00376f42a6e40454
Mem abort info:
   ESR = 0x9604
   EC = 0x25: DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
   FSC = 0x04: level 0 translation fault
Data abort info:
   ISV = 0, ISS = 0x0004
   CM = 0, WnR = 0
[00376f42a6e40454] address between user and kernel address ranges
Internal error: Oops: 9604 [#1] PREEMPT SMP
Modules linked in:
CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.13.0-next-20210630+ #3585
Hardware name: ARM Juno development board (r1) (DT)
pstate: 8005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
pc : __sg_free_table+0x60/0xa0
lr : __sg_free_table+0x7c/0xa0
..
Call trace:
  __sg_free_table+0x60/0xa0
  sg_free_table+0x1c/0x28
  iommu_dma_alloc+0xc8/0x388
  dma_alloc_attrs+0xcc/0xf0
  dmam_alloc_attrs+0x68/0xb8
  sil24_port_start+0x60/0xe0
  ata_host_start.part.32+0xbc/0x208
  ata_host_activate+0x64/0x150
  sil24_init_one+0x1e8/0x268
  local_pci_probe+0x3c/0xa0
  pci_device_probe+0x128/0x1c8
  really_probe+0x138/0x2d0
  __driver_probe_device+0x78/0xd8
  driver_probe_device+0x40/0x110
  __driver_attach+0xcc/0x118
  bus_for_each_dev+0x68/0xc8
  driver_attach+0x20/0x28
  bus_add_driver+0x168/0x1f8
  driver_register+0x60/0x110
  __pci_register_driver+0x5c/0x68
  sil24_pci_driver_init+0x20/0x28
  do_one_initcall+0x84/0x450
  kernel_init_freeable+0x31c/0x38c
  kernel_init+0x20/0x120
  ret_from_fork+0x10/0x18
Code: d37be885 6b01007f 5284 54a2 (f8656813)
---[ end trace 4ba4f0c9c48711a1 ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b

It looks that some changes to the scatterlist structures are missing 
outside of the lib/scatterlist.c.

For now I would suggest to revert this change.

> ---
>   include/linux/scatterlist.h |  8 ++--
>   lib/scatterlist.c   | 32 
>   2 files changed, 14 insertions(+), 26 deletions(-)
>
> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
> index 6f70572b2938..1c889141eb91 100644
> --- a/include/linux/scatterlist.h
> +++ b/include/linux/scatterlist.h
> @@ -35,8 +35,12 @@ struct scatterlist {
>   
>   struct sg_table {
>   struct scatterlist *sgl;/* the list */
> - unsigned int nents; /* number of mapped entries */
> - unsigned int orig_nents;/* original size of list */
> + unsigned int nents; /* number of DMA mapped entries */
> + unsigned int orig_nents;/* number of CPU mapped entries */
> + /* The fields below should be used internally only by
> +  * scatterlist implementation.
> +  */
> + unsigned int total_nents;   /* number of total entries in the table 
> */
>   };
>   
>   /*
> diff --git a/lib/scatterlist.c b/lib/scatterlist.c
> index a59778946404..6db70a1e7dd0 100644
> --- a/lib/scatterlist.c
> +++ b/lib/scatterlist.c
> @@ -192,33 +192,26 @@ static void sg_kfree(struct scatterlist *sg, unsigned 
> int nents)
>   void __sg_free_table(struct sg_table *table, unsigned int max_ents,
>unsigned int nents_first_chunk, sg_free_fn *free_fn)
>   {
> - struct scatterlist *sgl, *next;
> + struct scatterlist *sgl, *next = NULL;
>   unsigned curr_max_ents = nents_first_chunk ?: max_ents;
>   
>   if (unlikely(!table->sgl))
>   return;
>   
>   sgl = table->sgl;
> - while (table->orig_nents) {
> - unsigned int alloc_size = table->orig_nents;
> - unsigned int 

Re: [PATCH] iommu/arm-smmu-v3: Add default domain quirk for Arm Fast Models

2021-06-30 Thread Robin Murphy

On 2021-06-30 09:56, Marc Zyngier wrote:

On Tue, 29 Jun 2021 18:34:40 +0100,
Will Deacon  wrote:


On Fri, Jun 18, 2021 at 05:24:49PM +0100, Robin Murphy wrote:

Arm Fast Models are still implementing legacy virtio-pci devices behind
the SMMU, which continue to be problematic as "real hardware" devices
(from the point of view of the simulated system) without the mitigating
VIRTIO_F_ACCESS_PLATFORM feature.

Since we now have the ability to force passthrough on a device-specific
basis, let's use it to work around this particular oddity so that people
who just want to boot Linux on a model don't have to fiddle around with
things to avoid the SMMU getting in the way by default (and I don't have
to keep telling them which things...)

Signed-off-by: Robin Murphy 
---
  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++
  1 file changed, 15 insertions(+)


Any chance of getting the fastmodels updated instead? It feels like it
has to happen *eventually*, and then there would be no need for this bodge.


That'd be ideal. What are the chances of that happening before the Sun
turns into a black hole?


We'll try making some noise again internally and see where that goes, 
but given the progress over the last 4 years I'd be inclined to posit a 
new theory of the universe eventually ending in one giant event 0x10.



diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 54b2f27b81d4..13cf16e8f45b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2649,6 +2649,20 @@ static int arm_smmu_dev_disable_feature(struct device 
*dev,
}
  }
  
+static int arm_smmu_def_domain_type(struct device *dev)

+{
+   if (dev_is_pci(dev)) {
+   struct pci_dev *pdev = to_pci_dev(dev);
+
+   /* Legacy virtio-block devices on Arm Fast Models */
+   if (pdev->vendor == 0x1af4 && pdev->device == 0x1001 &&
+   pdev->subsystem_vendor == 0x00ff && pdev->subsystem_device 
== 0x0002)
+   return IOMMU_DOMAIN_IDENTITY;
+   }
+
+   return 0;
+}


Could this be expressed as a PCI quirk instead? It would at least keep
the ID matching out of the SMMU driver...


I don't think so - getting the information from the PCI layer to the 
IOMMU layer at the right place and time to influence the default domain 
choice would seemingly need some uglier and even less popular 
infrastructure adding. It's something that only the IOMMU layer has any 
need to be aware of, and the def_domain_type callback essentially exists 
for this kind of special-casing - compare intel-iommu's 
device_def_domain_type() for example. If we accept a workaround at all, 
I do believe this is the least-worst option.


Robin.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/3] iommu/io-pgtable-arm: Add IOMMU_LLC page protection flag

2021-06-30 Thread Sai Prakash Ranjan

Hi Will,

On 2021-03-25 23:03, Will Deacon wrote:

On Tue, Mar 09, 2021 at 12:10:44PM +0530, Sai Prakash Ranjan wrote:

On 2021-02-05 17:38, Sai Prakash Ranjan wrote:
> On 2021-02-04 03:16, Will Deacon wrote:
> > On Tue, Feb 02, 2021 at 11:56:27AM +0530, Sai Prakash Ranjan wrote:
> > > On 2021-02-01 23:50, Jordan Crouse wrote:
> > > > On Mon, Feb 01, 2021 at 08:20:44AM -0800, Rob Clark wrote:
> > > > > On Mon, Feb 1, 2021 at 3:16 AM Will Deacon  wrote:
> > > > > > On Fri, Jan 29, 2021 at 03:12:59PM +0530, Sai Prakash Ranjan wrote:
> > > > > > > On 2021-01-29 14:35, Will Deacon wrote:
> > > > > > > > On Mon, Jan 11, 2021 at 07:45:04PM +0530, Sai Prakash Ranjan 
wrote:
> > > > > > > > > +#define IOMMU_LLC(1 << 6)
> > > > > > > >
> > > > > > > > On reflection, I'm a bit worried about exposing this because I 
think it
> > > > > > > > will
> > > > > > > > introduce a mismatched virtual alias with the CPU (we don't 
even have a
> > > > > > > > MAIR
> > > > > > > > set up for this memory type). Now, we also have that issue for 
the PTW,
> > > > > > > > but
> > > > > > > > since we always use cache maintenance (i.e. the streaming API) 
for
> > > > > > > > publishing the page-tables to a non-coheren walker, it works 
out.
> > > > > > > > However,
> > > > > > > > if somebody expects IOMMU_LLC to be coherent with a DMA API 
coherent
> > > > > > > > allocation, then they're potentially in for a nasty surprise 
due to the
> > > > > > > > mismatched outer-cacheability attributes.
> > > > > > > >
> > > > > > >
> > > > > > > Can't we add the syscached memory type similar to what is done on 
android?
> > > > > >
> > > > > > Maybe. How does the GPU driver map these things on the CPU side?
> > > > >
> > > > > Currently we use writecombine mappings for everything, although there
> > > > > are some cases that we'd like to use cached (but have not merged
> > > > > patches that would give userspace a way to flush/invalidate)
> > > > >
> > > >
> > > > LLC/system cache doesn't have a relationship with the CPU cache.  Its
> > > > just a
> > > > little accelerator that sits on the connection from the GPU to DDR and
> > > > caches
> > > > accesses. The hint that Sai is suggesting is used to mark the buffers as
> > > > 'no-write-allocate' to prevent GPU write operations from being cached in
> > > > the LLC
> > > > which a) isn't interesting and b) takes up cache space for read
> > > > operations.
> > > >
> > > > Its easiest to think of the LLC as a bonus accelerator that has no cost
> > > > for
> > > > us to use outside of the unfortunate per buffer hint.
> > > >
> > > > We do have to worry about the CPU cache w.r.t I/O coherency (which is a
> > > > different hint) and in that case we have all of concerns that Will
> > > > identified.
> > > >
> > >
> > > For mismatched outer cacheability attributes which Will
> > > mentioned, I was
> > > referring to [1] in android kernel.
> >
> > I've lost track of the conversation here :/
> >
> > When the GPU has a buffer mapped with IOMMU_LLC, is the buffer also
> > mapped
> > into the CPU and with what attributes? Rob said "writecombine for
> > everything" -- does that mean ioremap_wc() / MEMREMAP_WC?
> >
>
> Rob answered this.
>
> > Finally, we need to be careful when we use the word "hint" as
> > "allocation
> > hint" has a specific meaning in the architecture, and if we only
> > mismatch on
> > those then we're actually ok. But I think IOMMU_LLC is more than
> > just a
> > hint, since it actually drives eviction policy (i.e. it enables
> > writeback).
> >
> > Sorry for the pedantry, but I just want to make sure we're all talking
> > about the same things!
> >
>
> Sorry for the confusion which probably was caused by my mentioning of
> android, NWA(no write allocate) is an allocation hint which we can
> ignore
> for now as it is not introduced yet in upstream.
>

Any chance of taking this forward? We do not want to miss out on small 
fps

gain when the product gets released.


Do we have a solution to the mismatched virtual alias?



Sorry for the long delay on this thread.

For mismatched virtual alias question, wasn't this already discussed in 
stretch
when initial support for system cache [1] (which was reverted by you) 
was added?


Excerpt from there,

"As seen in downstream kernels there are few non-coherent devices which
would not want to allocate in system cache, and therefore would want
Inner/Outer non-cached memory. So, we may want to either override the
attributes per-device, or as you suggested we may want to introduce
another memory type 'sys-cached' that can be added with its separate
infra."

As for DMA API usage, we do not have any upstream users (video will be
one if they decide to upstream that).

[1] 
https://patchwork.kernel.org/project/linux-arm-msm/patch/20180615105329.26800-1-vivek.gau...@codeaurora.org/


Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation

Re: Re: [PATCH v8 10/10] Documentation: Add documentation for VDUSE

2021-06-30 Thread Stefan Hajnoczi
On Tue, Jun 29, 2021 at 01:43:11PM +0800, Yongji Xie wrote:
> On Mon, Jun 28, 2021 at 9:02 PM Stefan Hajnoczi  wrote:
> > On Tue, Jun 15, 2021 at 10:13:31PM +0800, Xie Yongji wrote:
> > > + static void *iova_to_va(int dev_fd, uint64_t iova, uint64_t *len)
> > > + {
> > > + int fd;
> > > + void *addr;
> > > + size_t size;
> > > + struct vduse_iotlb_entry entry;
> > > +
> > > + entry.start = iova;
> > > + entry.last = iova + 1;
> >
> > Why +1?
> >
> > I expected the request to include *len so that VDUSE can create a bounce
> > buffer for the full iova range, if necessary.
> >
> 
> The function is used to translate iova to va. And the *len is not
> specified by the caller. Instead, it's used to tell the caller the
> length of the contiguous iova region from the specified iova. And the
> ioctl VDUSE_IOTLB_GET_FD will get the file descriptor to the first
> overlapped iova region. So using iova + 1 should be enough here.

Does the entry.last field have any purpose with VDUSE_IOTLB_GET_FD? I
wonder why userspace needs to assign a value at all if it's always +1.

> 
> > > + fd = ioctl(dev_fd, VDUSE_IOTLB_GET_FD, &entry);
> > > + if (fd < 0)
> > > + return NULL;
> > > +
> > > + size = entry.last - entry.start + 1;
> > > + *len = entry.last - iova + 1;
> > > + addr = mmap(0, size, perm_to_prot(entry.perm), MAP_SHARED,
> > > + fd, entry.offset);
> > > + close(fd);
> > > + if (addr == MAP_FAILED)
> > > + return NULL;
> > > +
> > > + /* do something to cache this iova region */
> >
> > How is userspace expected to manage iotlb mmaps? When should munmap(2)
> > be called?
> >
> 
> The simple way is using a list to store the iotlb mappings. And we
> should call the munmap(2) for the old mappings when VDUSE_UPDATE_IOTLB
> or VDUSE_STOP_DATAPLANE message is received.

Thanks for explaining. It would be helpful to have a description of
IOTLB operation in this document.

> > Should userspace expect VDUSE_IOTLB_GET_FD to return a full chunk of
> > guest RAM (e.g. multiple gigabytes) that can be cached permanently or
> > will it return just enough pages to cover [start, last)?
> >
> 
> It should return one iotlb mapping that covers [start, last). In
> vhost-vdpa cases, it might be a full chunk of guest RAM. In
> virtio-vdpa cases, it might be the whole bounce buffer or one coherent
> mapping (produced by dma_alloc_coherent()).

Great, thanks. Adding something about this to the documentation would
help others implementing VDUSE devices or libraries.

> > > +
> > > + return addr + iova - entry.start;
> > > + }
> > > +
> > > +- VDUSE_DEV_GET_FEATURES: Get the negotiated features
> >
> > Are these VIRTIO feature bits? Please explain how feature negotiation
> > works. There must be a way for userspace to report the device's
> > supported feature bits to the kernel.
> >
> 
> Yes, these are VIRTIO feature bits. Userspace will specify the
> device's supported feature bits when creating a new VDUSE device with
> ioctl(VDUSE_CREATE_DEV).

Can the VDUSE device influence feature bit negotiation? For example, if
the VDUSE virtio-blk device does not implement discard/write-zeroes, how
does QEMU or the guest find out about this?

> > > +- VDUSE_DEV_UPDATE_CONFIG: Update the configuration space and inject a 
> > > config interrupt
> >
> > Does this mean the contents of the configuration space are cached by
> > VDUSE?
> 
> Yes, but the kernel will also store the same contents.
> 
> > The downside is that the userspace code cannot generate the
> > contents on demand. Most devices doin't need to generate the contents
> > on demand, so I think this is okay but I had expected a different
> > interface:
> >
> > kernel->userspace VDUSE_DEV_GET_CONFIG
> > userspace->kernel VDUSE_DEV_INJECT_CONFIG_IRQ
> >
> 
> The problem is how to handle the failure of VDUSE_DEV_GET_CONFIG. We
> will need lots of modification of virtio codes to support that. So to
> make it simple, we choose this way:
> 
> userspace -> kernel VDUSE_DEV_SET_CONFIG
> userspace -> kernel VDUSE_DEV_INJECT_CONFIG_IRQ
> 
> > I think you can leave it the way it is, but I wanted to mention this in
> > case someone thinks it's important to support generating the contents of
> > the configuration space on demand.
> >
> 
> Sorry, I didn't get you here. Can't VDUSE_DEV_SET_CONFIG and
> VDUSE_DEV_INJECT_CONFIG_IRQ achieve that?

If the contents of the configuration space change continuously, then the
VDUSE_DEV_SET_CONFIG approach is inefficient and might have race
conditions. For example, imagine a device where the driver can read a
timer from the configuration space. I think the VIRTIO device model
allows that although I'm not aware of any devices that do something like
it today. The problem is that VDUSE_DEV_SET_CONFIG would have to be
called f

Re: Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-06-30 Thread Stefan Hajnoczi
On Tue, Jun 29, 2021 at 10:59:51AM +0800, Yongji Xie wrote:
> On Mon, Jun 28, 2021 at 9:02 PM Stefan Hajnoczi  wrote:
> >
> > On Tue, Jun 15, 2021 at 10:13:30PM +0800, Xie Yongji wrote:
> > > +/* ioctls */
> > > +
> > > +struct vduse_dev_config {
> > > + char name[VDUSE_NAME_MAX]; /* vduse device name */
> > > + __u32 vendor_id; /* virtio vendor id */
> > > + __u32 device_id; /* virtio device id */
> > > + __u64 features; /* device features */
> > > + __u64 bounce_size; /* bounce buffer size for iommu */
> > > + __u16 vq_size_max; /* the max size of virtqueue */
> >
> > The VIRTIO specification allows per-virtqueue sizes. A device can have
> > two virtqueues, where the first one allows up to 1024 descriptors and
> > the second one allows only 128 descriptors, for example.
> >
> 
> Good point! But it looks like virtio-vdpa/virtio-pci doesn't support
> that now. All virtqueues have the same maximum size.

I see struct vpda_config_ops only supports a per-device max vq size:
u16 (*get_vq_num_max)(struct vdpa_device *vdev);

virtio-pci supports per-virtqueue sizes because the struct
virtio_pci_common_cfg->queue_size register is per-queue (controlled by
queue_select).

I guess this is a question for Jason: will vdpa will keep this limitation?
If yes, then VDUSE can stick to it too without running into problems in
the future.

> > > + __u16 padding; /* padding */
> > > + __u32 vq_num; /* the number of virtqueues */
> > > + __u32 vq_align; /* the allocation alignment of virtqueue's metadata 
> > > */
> >
> > I'm not sure what this is?
> >
> 
>  This will be used by vring_create_virtqueue() too.

If there is no official definition for the meaning of this value then
"/* same as vring_create_virtqueue()'s vring_align parameter */" would
be clearer. That way the reader knows what to research in order to
understand how this field works.

I don't remember but maybe it was used to support vrings when the
host/guest have non-4KB page sizes. I wonder if anyone has an official
definition for this value?


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 12/12] iommu: Do not allow IOMMU passthrough with Secure Launch

2021-06-30 Thread Ross Philipson

On 6/21/21 5:15 PM, Andy Lutomirski wrote:

On Mon, Jun 21, 2021 at 10:51 AM Ross Philipson
 wrote:


On 6/18/21 2:32 PM, Robin Murphy wrote:

On 2021-06-18 17:12, Ross Philipson wrote:

The IOMMU should always be set to default translated type after
the PMRs are disabled to protect the MLE from DMA.

Signed-off-by: Ross Philipson 
---
   drivers/iommu/intel/iommu.c | 5 +
   drivers/iommu/iommu.c   | 6 +-
   2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index be35284..4f0256d 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -41,6 +41,7 @@
   #include 
   #include 
   #include 
+#include 
   #include 
   #include 
   #include 
@@ -2877,6 +2878,10 @@ static bool device_is_rmrr_locked(struct device
*dev)
*/
   static int device_def_domain_type(struct device *dev)
   {
+/* Do not allow identity domain when Secure Launch is configured */
+if (slaunch_get_flags() & SL_FLAG_ACTIVE)
+return IOMMU_DOMAIN_DMA;


Is this specific to Intel? It seems like it could easily be done
commonly like the check for untrusted external devices.


It is currently Intel only but that will change. I will look into what
you suggest.




+
   if (dev_is_pci(dev)) {
   struct pci_dev *pdev = to_pci_dev(dev);
   diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 808ab70d..d49b7dd 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -23,6 +23,7 @@
   #include 
   #include 
   #include 
+#include 
   #include 
 static struct kset *iommu_group_kset;
@@ -2761,7 +2762,10 @@ void iommu_set_default_passthrough(bool cmd_line)
   {
   if (cmd_line)
   iommu_cmd_line |= IOMMU_CMD_LINE_DMA_API;
-iommu_def_domain_type = IOMMU_DOMAIN_IDENTITY;
+
+/* Do not allow identity domain when Secure Launch is configured */
+if (!(slaunch_get_flags() & SL_FLAG_ACTIVE))
+iommu_def_domain_type = IOMMU_DOMAIN_IDENTITY;


Quietly ignoring the setting and possibly leaving iommu_def_domain_type
uninitialised (note that 0 is not actually a usable type) doesn't seem
great. AFAICS this probably warrants similar treatment to the


Ok so I guess it would be better to set it to IOMMU_DOMAIN_DMA event
though passthrough was requested. Or perhaps something more is needed here?


mem_encrypt_active() case - there doesn't seem a great deal of value in
trying to save users from themselves if they care about measured boot
yet explicitly pass options which may compromise measured boot. If you
really want to go down that route there's at least the sysfs interface
you'd need to nobble as well, not to mention the various ways of
completely disabling IOMMUs...


Doing a secure launch with the kernel is not a general purpose user use
case. A lot of work is done to secure the environment. Allowing
passthrough mode would leave the secure launch kernel exposed to DMA. I
think what we are trying to do here is what we intend though there may
be a better way or perhaps it is incomplete as you suggest.



I don't really like all these special cases.  Generically, what you're
trying to do is (AFAICT) to get the kernel to run in a mode in which
it does its best not to trust attached devices.  Nothing about this is
specific to Secure Launch.  There are plenty of scenarios in which
this the case:

  - Virtual devices in a VM host outside the TCB, e.g. VDUSE, Xen
device domains (did I get the name right), whatever tricks QEMU has,
etc.
  - SRTM / DRTM technologies (including but not limited to Secure
Launch -- plain old Secure Boot can work like this too).
  - Secure guest technologies, including but not limited to TDX and SEV.
  - Any computer with a USB-C port or other external DMA-capable port.
  - Regular computers in which the admin wants to enable this mode for
whatever reason.

Can you folks all please agree on a coordinated way for a Linux kernel
to configure itself appropriately?  Or to be configured via initramfs,
boot option, or some other trusted source of configuration supplied at
boot time?  We don't need a whole bunch of if (TDX), if (SEV), if
(secure launch), if (I have a USB-C port with PCIe exposed), if
(running on Xen), and similar checks all over the place.



I replied to Robin Murphy in another thread. As far as the IOMMU is 
concerned, I think we need to rethink our approach. As to the other 
technologies you mention here, we have not considered special casing 
anything at this point.


Thanks
Ross
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 12/12] iommu: Do not allow IOMMU passthrough with Secure Launch

2021-06-30 Thread Ross Philipson

On 6/22/21 7:06 AM, Robin Murphy wrote:

On 2021-06-21 18:51, Ross Philipson wrote:

On 6/18/21 2:32 PM, Robin Murphy wrote:

On 2021-06-18 17:12, Ross Philipson wrote:

The IOMMU should always be set to default translated type after
the PMRs are disabled to protect the MLE from DMA.

Signed-off-by: Ross Philipson 
---
   drivers/iommu/intel/iommu.c | 5 +
   drivers/iommu/iommu.c   | 6 +-
   2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index be35284..4f0256d 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -41,6 +41,7 @@
   #include 
   #include 
   #include 
+#include 
   #include 
   #include 
   #include 
@@ -2877,6 +2878,10 @@ static bool device_is_rmrr_locked(struct device
*dev)
    */
   static int device_def_domain_type(struct device *dev)
   {
+    /* Do not allow identity domain when Secure Launch is 
configured */

+    if (slaunch_get_flags() & SL_FLAG_ACTIVE)
+    return IOMMU_DOMAIN_DMA;


Is this specific to Intel? It seems like it could easily be done
commonly like the check for untrusted external devices.


It is currently Intel only but that will change. I will look into what
you suggest.


Yeah, it's simple and unobtrusive enough that I reckon it's worth going 
straight to the common version if it's worth doing at all.



+
   if (dev_is_pci(dev)) {
   struct pci_dev *pdev = to_pci_dev(dev);
   diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 808ab70d..d49b7dd 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -23,6 +23,7 @@
   #include 
   #include 
   #include 
+#include 
   #include 
     static struct kset *iommu_group_kset;
@@ -2761,7 +2762,10 @@ void iommu_set_default_passthrough(bool 
cmd_line)

   {
   if (cmd_line)
   iommu_cmd_line |= IOMMU_CMD_LINE_DMA_API;
-    iommu_def_domain_type = IOMMU_DOMAIN_IDENTITY;
+
+    /* Do not allow identity domain when Secure Launch is 
configured */

+    if (!(slaunch_get_flags() & SL_FLAG_ACTIVE))
+    iommu_def_domain_type = IOMMU_DOMAIN_IDENTITY;


Quietly ignoring the setting and possibly leaving iommu_def_domain_type
uninitialised (note that 0 is not actually a usable type) doesn't seem
great. AFAICS this probably warrants similar treatment to the


Ok so I guess it would be better to set it to IOMMU_DOMAIN_DMA event
though passthrough was requested. Or perhaps something more is needed 
here?



mem_encrypt_active() case - there doesn't seem a great deal of value in
trying to save users from themselves if they care about measured boot
yet explicitly pass options which may compromise measured boot. If you
really want to go down that route there's at least the sysfs interface
you'd need to nobble as well, not to mention the various ways of
completely disabling IOMMUs...


Doing a secure launch with the kernel is not a general purpose user use
case. A lot of work is done to secure the environment. Allowing
passthrough mode would leave the secure launch kernel exposed to DMA. I
think what we are trying to do here is what we intend though there may
be a better way or perhaps it is incomplete as you suggest.


On second thoughts this is overkill anyway - if you do hook 
iommu_get_def_domain_type(), you're done (in terms of the kernel-managed 
setting, at least); it doesn't matter what iommu_def_domain_type gets 
set to if will never get used. However, since this isn't really a 
per-device thing, it might be more semantically appropriate to leave 
that alone and instead only massage the default type in 
iommu_subsys_init(), as for memory encryption.


When you say "secure the environment", what's the actual threat model 
here, i.e. who's securing what against whom? If it's a device lockdown 
type thing where the system owner wants to defend against the end user 
trying to mess with the software stack or gain access to parts they 
shouldn't, then possibly you can trust the command line, but there are 
definitely other places which need consideration. If on the other hand 
it's more about giving the end user confidence that their choice of 
software stack isn't being interfered with by a malicious host or 
external third parties, then it probably leans towards the opposite 
being true...


If the command line *is* within the threat model, consider "iommu=off" 
and/or "intel_iommu=off" for example: I don't know how PMRs work, but I 
can only imagine that that's liable to leave things either wide open, or 
blocked to the point of no DMA working at all, neither of which seems to 
be what you want. I'm guessing "intel_iommu=tboot_noforce" might have 
some relevant implications too.



Thank you for you suggestions and feedback. Sorry we did not get back 
sooner. After the comments from you and Andy Lutomirski we decided we 
needed to re-imagine what we are trying to accomplish here and how else 
we might approach it.


Ross




It might be reasonable to make IOMMU_DEFAULT_P

Re: [PATCH v15 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing

2021-06-30 Thread Claire Chang
On Wed, Jun 30, 2021 at 9:43 AM Nathan Chancellor  wrote:
>
> On Thu, Jun 24, 2021 at 11:55:20PM +0800, Claire Chang wrote:
> > Propagate the swiotlb_force into io_tlb_default_mem->force_bounce and
> > use it to determine whether to bounce the data or not. This will be
> > useful later to allow for different pools.
> >
> > Signed-off-by: Claire Chang 
> > Reviewed-by: Christoph Hellwig 
> > Tested-by: Stefano Stabellini 
> > Tested-by: Will Deacon 
> > Acked-by: Stefano Stabellini 
>
> This patch as commit af452ec1b1a3 ("swiotlb: Use is_swiotlb_force_bounce
> for swiotlb data bouncing") causes my Ryzen 3 4300G system to fail to
> get to an X session consistently (although not every single time),
> presumably due to a crash in the AMDGPU driver that I see in dmesg.
>
> I have attached logs at af452ec1b1a3 and f127c9556a8e and I am happy
> to provide any further information, debug, or test patches as necessary.

Are you using swiotlb=force? or the swiotlb_map is called because of
!dma_capable? 
(https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/kernel/dma/direct.h#n93)

`BUG: unable to handle page fault for address: 003a8290` and
the fact it crashed at `_raw_spin_lock_irqsave` look like the memory
(maybe dev->dma_io_tlb_mem) was corrupted?
The dev->dma_io_tlb_mem should be set here
(https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/pci/probe.c#n2528)
through device_initialize.

I can't tell what happened from the logs, but maybe we could try KASAN
to see if it provides more clue.

Thanks,
Claire

>
> Cheers,
> Nathan
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH 3/6] vfio: remove the unused mdev iommu hook

2021-06-30 Thread Tian, Kevin
> From: Joerg Roedel 
> Sent: Monday, May 17, 2021 11:35 PM
> 
> On Mon, May 17, 2021 at 10:35:00AM -0300, Jason Gunthorpe wrote:
> > Well, I'm sorry, but there is a huge other thread talking about the
> > IOASID design in great detail and why this is all needed. Jumping into
> > this thread without context and basically rejecting all the
> > conclusions that were reached over the last several weeks is really
> > not helpful - especially since your objection is not technical.
> >
> > I think you should wait for Intel to put together the /dev/ioasid uAPI
> > proposal and the example use cases it should address then you can give
> > feedback there, with proper context.
> 
> Yes, I think the next step is that someone who read the whole thread
> writes up the conclusions and a rough /dev/ioasid API proposal, also
> mentioning the use-cases it addresses. Based on that we can discuss the
> implications this needs to have for IOMMU-API and code.
> 
> From the use-cases I know the mdev concept is just fine. But if there is
> a more generic one we can talk about it.
> 

Although /dev/iommu v2 proposal is still in progress, I think there are 
enough background gathered in v1 to resume this discussion now.

In a nutshell /dev/iommu requires two sets of services from the iommu
layer:

-   for an kernel-managed I/O page table via map/unmap;
-   for an user-managed I/O page table via bind/invalidate and nested on 
a kernel-managed parent I/O page table;

Each I/O page table could be attached by multiple devices. /dev/iommu
maintains device specific routing information (RID, or RID+PASID) for
where to install the I/O page table in the IOMMU for each attached device.

Kernel-managed page table is represented by iommu domain. Existing 
IOMMU-API allows /dev/iommu to attach a RID device to iommu domain. 
A new interface is required, e.g. iommu_attach_device_pasid(domain, dev, 
pasid), to cover (RID+PASID) attaching. Once attaching succeeds, no change
to following map/unmap which are domain-wide thus applied to both RID 
and RID+PASID. In case of dev_iotlb invalidation is required, the iommu 
driver is responsible for handling it for every attached RID or RID+PASID
if ats is enabled.

to take one example, the parent (RID1) has three work queues. WQ1 is 
for parent's own DMA-API usage, with WQ2 (PASID-x) assigned to VM1 
and WQ3 (PASID-y) assigned to VM2. VM2 is also assigned with another 
device (RID2). In this case there are three kernel-managed I/O page 
tables (IOVA in kernel, GPA for VM1 and GPA for VM2), thus RID1 is 
attached to three domains:

RID1 --- domain1 (default, IOVA)
 |  |
 |  |-- [RID1]
 |
 |-- domain2 (vm1, GPA)
 |  |
 |  |-- [RID1, PASID-x]
 |
 |-- domain3 (vm2, GPA)
 |  |
 |  |-- [RID1, PASID-y]
 |  |
 |  |-- [RID2]

The iommu layer should maintain above attaching status per device and per
iommu domain. There is no mdev/subdev concept in the iommu layer. It's
just about RID or PASID.

User-manage I/O page table might be represented by a new object which 
describes:

- routing information (RID or RID+PASID)
- pointer to iommu_domain of the parent I/O page table (inherit the
  domain ID in iotlb due to nesting)
- address of the I/O page table

There might be chance to share the structure with native SVA which also
has page table managed outside of iommu subsystem. But we can leave 
it and figure out until coding.

And a new set of IOMMU-API:

- iommu_{un}bind_pgtable(domain, dev, addr);
- iommu_{un}bind_pgtable_pasid(domain, dev, addr, pasid);
- iommu_cache_invalidate(domain, dev, invalid_info);
- and APIs for registering fault handler and completing faults;
(here 'domain' is the one representing the parent I/O page table)

Because nesting essentially creates a new reference to the parent I/O 
page table, iommu_bind_pgtable_pasid() implicitly calls __iommu_attach_
device_pasid() to setup the connection between the parent domain and 
the new [RID,PASID]. It's not necessary to do so for iommu_bind_pgtable() 
since the RID is already attached when the parent I/O page table is created.

In consequence the example topology is updated as below, with guest
SVA enabled in both vm1 and vm2:

RID1 --- domain1 (default, IOVA)
 |  |
 |  |-- [RID1]
 |
 |-- domain2 (vm1, GPA)
 |  |
 |  |-- [RID1, PASID-x]
 |  |-- [RID1, PASID-a] // nested for vm1 process1
 |  |-- [RID1, PASID-b] // nested for vm1 process2
 |
 |-- domain3 (vm2, GPA)
 |  |
 |  |-- [RID1, PASID-y]
 |  |-- [RID1, PASID-c] // nested for vm2 process1
 |  |
 |  |-- [RID2]
 |  |-- [RID2, PASID-a] // nested for vm2 process2

Thoughts?

Thanks
Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm-smmu-v3: Add default domain quirk for Arm Fast Models

2021-06-30 Thread Marc Zyngier
On Tue, 29 Jun 2021 18:34:40 +0100,
Will Deacon  wrote:
> 
> On Fri, Jun 18, 2021 at 05:24:49PM +0100, Robin Murphy wrote:
> > Arm Fast Models are still implementing legacy virtio-pci devices behind
> > the SMMU, which continue to be problematic as "real hardware" devices
> > (from the point of view of the simulated system) without the mitigating
> > VIRTIO_F_ACCESS_PLATFORM feature.
> > 
> > Since we now have the ability to force passthrough on a device-specific
> > basis, let's use it to work around this particular oddity so that people
> > who just want to boot Linux on a model don't have to fiddle around with
> > things to avoid the SMMU getting in the way by default (and I don't have
> > to keep telling them which things...)
> > 
> > Signed-off-by: Robin Murphy 
> > ---
> >  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++
> >  1 file changed, 15 insertions(+)
> 
> Any chance of getting the fastmodels updated instead? It feels like it
> has to happen *eventually*, and then there would be no need for this bodge.

That'd be ideal. What are the chances of that happening before the Sun
turns into a black hole?

> Will
> 
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> > b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index 54b2f27b81d4..13cf16e8f45b 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -2649,6 +2649,20 @@ static int arm_smmu_dev_disable_feature(struct 
> > device *dev,
> > }
> >  }
> >  
> > +static int arm_smmu_def_domain_type(struct device *dev)
> > +{
> > +   if (dev_is_pci(dev)) {
> > +   struct pci_dev *pdev = to_pci_dev(dev);
> > +
> > +   /* Legacy virtio-block devices on Arm Fast Models */
> > +   if (pdev->vendor == 0x1af4 && pdev->device == 0x1001 &&
> > +   pdev->subsystem_vendor == 0x00ff && pdev->subsystem_device 
> > == 0x0002)
> > +   return IOMMU_DOMAIN_IDENTITY;
> > +   }
> > +
> > +   return 0;
> > +}

Could this be expressed as a PCI quirk instead? It would at least keep
the ID matching out of the SMMU driver...

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v5 7/8] iommu/arm-smmu: Get associated RMR info and install bypass SMR

2021-06-30 Thread Shameerali Kolothum Thodi



> -Original Message-
> From: Jon Nettleton [mailto:j...@solid-run.com]
> Sent: 29 June 2021 17:26
> To: Robin Murphy 
> Cc: Shameerali Kolothum Thodi ;
> linux-arm-kernel ; ACPI Devel Maling
> List ; iommu@lists.linux-foundation.org; Linuxarm
> ; Steven Price ; Guohanjun
> (Hanjun Guo) ; yangyicong
> ; sami.muja...@arm.com; wanghuiqiang
> 
> Subject: Re: [PATCH v5 7/8] iommu/arm-smmu: Get associated RMR info and
> install bypass SMR
> 

[...]
 
> Shameer,
> 
> Sorry for the delays.  Here is a diff of the changes that should
> address the issues pointed out by Robin,
> I have tested that this works as expected on my HoneyComb LX2160A.

Ok. Thanks for that.

> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index ab7b9db77625..a358bd326d0b 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -2068,29 +2068,21 @@ static void
> arm_smmu_rmr_install_bypass_smr(struct arm_smmu_device *smmu)
> struct list_head rmr_list;
> struct iommu_resv_region *e;
> int i, cnt = 0;
> -   u32 smr;
> u32 reg;
> 
> INIT_LIST_HEAD(&rmr_list);
> if (iommu_dma_get_rmrs(dev_fwnode(smmu->dev), &rmr_list))
> return;
> 
> -   reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sCR0);
> +   /* Rather than trying to look at existing mappings that
> +* are setup by the firmware and then invalidate the ones
> +* that do no have matching RMR entries, just disable the
> +* SMMU until it gets enabled again in the reset routine.
> +*/
> 
> -   if ((reg & ARM_SMMU_sCR0_USFCFG) && !(reg &
> ARM_SMMU_sCR0_CLIENTPD)) {
> -   /*
> -* SMMU is already enabled and disallowing bypass, so
> preserve
> -* the existing SMRs
> -*/
> -   for (i = 0; i < smmu->num_mapping_groups; i++) {
> -   smr = arm_smmu_gr0_read(smmu,
> ARM_SMMU_GR0_SMR(i));
> -   if (!FIELD_GET(ARM_SMMU_SMR_VALID, smr))
> -   continue;
> -   smmu->smrs[i].id =
> FIELD_GET(ARM_SMMU_SMR_ID, smr);
> -   smmu->smrs[i].mask =
> FIELD_GET(ARM_SMMU_SMR_MASK, smr);
> -   smmu->smrs[i].valid = true;
> -   }
> -   }
> +   reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sCR0);
> +   reg &= ~ARM_SMMU_sCR0_CLIENTPD;
> +   arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sCR0, reg);
> 
> list_for_each_entry(e, &rmr_list, list) {
> u32 sid = e->fw_data.rmr.sid;
> @@ -2100,25 +2092,16 @@ static void
> arm_smmu_rmr_install_bypass_smr(struct arm_smmu_device *smmu)
> continue;
> if (smmu->s2crs[i].count == 0) {
> smmu->smrs[i].id = sid;
> -   smmu->smrs[i].mask = ~0;
> +   smmu->smrs[i].mask = 0;
> smmu->smrs[i].valid = true;
> }
> smmu->s2crs[i].count++;
> smmu->s2crs[i].type = S2CR_TYPE_BYPASS;
> smmu->s2crs[i].privcfg = S2CR_PRIVCFG_DEFAULT;
> -   smmu->s2crs[i].cbndx = 0xff;
> 
> cnt++;
> }
> 
> -   if ((reg & ARM_SMMU_sCR0_USFCFG) && !(reg &
> ARM_SMMU_sCR0_CLIENTPD)) {
> -   /* Remove the valid bit for unused SMRs */
> -   for (i = 0; i < smmu->num_mapping_groups; i++) {
> -   if (smmu->s2crs[i].count == 0)
> -   smmu->smrs[i].valid = false;
> -   }
> -   }
> -
> dev_notice(smmu->dev, "\tpreserved %d boot mapping%s\n", cnt,
>cnt == 1 ? "" : "s");
> iommu_dma_put_rmrs(dev_fwnode(smmu->dev), &rmr_list);
> 
> Please include that in your next patch series.  Let me know if you
> want me to send you the patch direct
> off the list.

No problem, I will take this in next.

Thanks,
Shameer
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 01/24] dt-bindings: mediatek: mt8195: Add binding for MM IOMMU

2021-06-30 Thread Yong Wu
On Wed, 2021-06-30 at 08:26 +0200, Krzysztof Kozlowski wrote:
> On 30/06/2021 04:34, Yong Wu wrote:
> > This patch adds descriptions for mt8195 IOMMU which also use ARM
> > Short-Descriptor translation table format.
> > 
> > In mt8195, there are two smi-common HW and IOMMU, one is for vdo(video
> > output), the other is for vpp(video processing pipe). They connects
> > with different smi-larbs, then some setting(larbid_remap) is different.
> > Differentiate them with the compatible string.
> > 
> > Something like this:
> > 
> > IOMMU(VDO)  IOMMU(VPP)
> >|   |
> >   SMI_COMMON_VDO  SMI_COMMON_VPP
> >   --- 
> >   |  |   ...  |  | ...
> > larb0 larb2  ...larb1 larb3...
> > 
> > Another change is that we have a new IOMMU that is for infra master like
> > PCIe and USB. The infra master don't have the larb and ports, thus we
> > rename the port header file to mt8195-memory-port.h rather than
> > mt8195-larb-port.h.
> > 
> > Also, the IOMMU is not only for MM, thus, we don't call it "m4u" which
> > means "MultiMedia Memory Management UNIT". thus, use the "iommu" as the
> > compatiable string.
> > 
> > Signed-off-by: Yong Wu 
> > ---
> >  .../bindings/iommu/mediatek,iommu.yaml|   7 +
> >  .../dt-bindings/memory/mt8195-memory-port.h   | 390 ++
> >  2 files changed, 397 insertions(+)
> >  create mode 100644 include/dt-bindings/memory/mt8195-memory-port.h
> > 
> 
> I understand this will go through IOMMU tree. Do you know about any
> further patches for memory controllers which will need the header?

This header file will only be used in dtsi file. the iommu masters will
use these port definitions in the dtsi.

The SMI driver no need include this header file.

By the way, the mt8195 SMI patches is at:

https://lore.kernel.org/linux-mediatek/20210616114346.18812-1-yong...@mediatek.com/
 

> 
> Acked-by: Krzysztof Kozlowski 

Thanks very much for your quick review.

> 
> 
> Best regards,
> Krzysztof
> 
> ___
> Linux-mediatek mailing list
> linux-media...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-mediatek

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC] /dev/ioasid uAPI proposal

2021-06-30 Thread Christoph Hellwig
On Mon, Jun 07, 2021 at 11:14:24AM -0300, Jason Gunthorpe wrote:
> "non-coherent DMA" is some general euphemism that evokes images of
> embedded platforms that don't have coherent DMA at all and have low
> cost ways to regain coherence. This is not at all what we are talking
> about here at all.

It literally is the same way of working.  And not just low-end embedded
platforms use this, but a lot of older server platforms did as well.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC] /dev/ioasid uAPI proposal

2021-06-30 Thread Christoph Hellwig
On Fri, Jun 04, 2021 at 08:58:05AM -0300, Jason Gunthorpe wrote:
> On Fri, Jun 04, 2021 at 09:11:03AM +0800, Jason Wang wrote:
> > > nor do any virtio drivers implement the required platform specific
> > > cache flushing to make no-snoop TLPs work.
> > 
> > I don't get why virtio drivers needs to do that. I think DMA API should hide
> > those arch/platform specific stuffs from us.
> 
> It is not arch/platform stuff. If the device uses no-snoop then a
> very platform specific recovery is required in the device driver.

Well, the proper way to support NO_SNOOP DMA would be to force the
DMA API into supporting the device as if the platform was not DMA
coherent, probably on a per-call basis.  It is just that no one bothered
to actually do the work an people just kept piling hacks over hacks.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC] /dev/ioasid uAPI proposal

2021-06-30 Thread Christoph Hellwig
On Wed, Jun 09, 2021 at 09:47:42AM -0300, Jason Gunthorpe wrote:
> I can vaugely understand this rational for vfio, but not at all for
> the platform's iommu driver, sorry.

Agreed.  More importantly the dependency is not for the platform iommu
driver but just for the core iommu code, which is always built in if
enabled.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu