Re: [RFC 1/5] RISC-V: Implement arch_sync_dma* functions

2021-07-25 Thread Christoph Hellwig
> +#ifdef CONFIG_RISCV_DMA_NONCOHERENT
> +struct riscv_dma_cache_sync {
> + void (*cache_invalidate)(phys_addr_t paddr, size_t size);
> + void (*cache_clean)(phys_addr_t paddr, size_t size);
> + void (*cache_flush)(phys_addr_t paddr, size_t size);
> +};
> +
> +void riscv_dma_cache_sync_set(struct riscv_dma_cache_sync *ops);
> +#endif

As told a bunch of times before: doing indirect calls here is a
performance nightmare.  Use something that doesn't perform horribly,
like alternatives.  Or even delay implementing that until we need it
and do a plain direct call for now.
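
A minimal sketch of that direct-call shape (keeping the patch's
direction handling, which the next comment addresses separately); the
cache-maintenance helper names are hypothetical stand-ins for whatever
the eventual RISC-V CMO support provides:

	static void __dma_sync(phys_addr_t paddr, size_t size,
			       enum dma_data_direction dir)
	{
		switch (dir) {
		case DMA_TO_DEVICE:
			riscv_cmo_clean(paddr, size);		/* hypothetical */
			break;
		case DMA_FROM_DEVICE:
			riscv_cmo_invalidate(paddr, size);	/* hypothetical */
			break;
		case DMA_BIDIRECTIONAL:
			riscv_cmo_flush(paddr, size);		/* hypothetical */
			break;
		default:
			break;
		}
	}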

> +static void __dma_sync(phys_addr_t paddr, size_t size, enum dma_data_direction dir)
> +{
> + if ((dir == DMA_FROM_DEVICE) && (dma_cache_sync->cache_invalidate))
> + dma_cache_sync->cache_invalidate(paddr, size);
> + else if ((dir == DMA_TO_DEVICE) && (dma_cache_sync->cache_clean))
> + dma_cache_sync->cache_clean(paddr, size);
> + else if ((dir == DMA_BIDIRECTIONAL) && dma_cache_sync->cache_flush)
> + dma_cache_sync->cache_flush(paddr, size);
> +}

The selection of flush types is completely broken.  Take a look at the
comment in arch/arc/mm/dma.c above arch_sync_dma_for_device for a good
explanation.
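
For reference, that comment tabulates the required operations roughly
as follows (writeback == clean; note that the same selection cannot
serve both the for_device and for_cpu paths):

	           |  map == for_device  |  unmap == for_cpu
	TO_DEV     |  writeback          |  none
	FROM_DEV   |  invalidate         |  invalidate (CPU speculation)
	BIDIR      |  writeback+inv      |  invalidate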

> +void arch_dma_prep_coherent(struct page *page, size_t size)
> +{
> + void *flush_addr = page_address(page);
> +
> + memset(flush_addr, 0, size);

arch_dma_prep_coherent is not supposed to modify the content of
the data.
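Typically it only performs cache maintenance so the buffer can then be
used through an uncached mapping.  A minimal sketch, reusing the
hypothetical flush helper from above:

	void arch_dma_prep_coherent(struct page *page, size_t size)
	{
		/* write back and invalidate; do not touch the buffer contents */
		riscv_cmo_flush(page_to_phys(page), size);	/* hypothetical */
	}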


Re: [RFC v2] /dev/iommu uAPI proposal

2021-07-25 Thread David Gibson
On Fri, Jul 09, 2021 at 07:48:44AM +, Tian, Kevin wrote:
> /dev/iommu provides a unified interface for managing I/O page tables for
> devices assigned to userspace. Device passthrough frameworks (VFIO, vDPA,
> etc.) are expected to use this interface instead of creating their own
> logic to isolate untrusted device DMAs initiated by userspace.
> 
> This proposal describes the uAPI of /dev/iommu and also sample sequences 
> with VFIO as example in typical usages. The driver-facing kernel API provided 
> by the iommu layer is still TBD, which can be discussed after consensus is 
> made on this uAPI.
> 
> It's based on a lengthy discussion starting from here:
>   
> https://lore.kernel.org/linux-iommu/20210330132830.go2356...@nvidia.com/ 
> 
> v1 can be found here:
>   
> https://lore.kernel.org/linux-iommu/ph0pr12mb54811863b392c644e5365446dc...@ph0pr12mb5481.namprd12.prod.outlook.com/T/
> 
> This doc is also tracked on github, though it's not very useful for v1->v2 
> given dramatic refactoring:
>   https://github.com/luxis1999/dev_iommu_uapi

Thanks for all your work on this, Kevin.  Apart from the actual
semantic improvements, I'm finding v2 significantly easier to read and
understand than v1.

[snip]
> 1.2. Attach Device to I/O address space
> +++
> 
> Device attach/bind is initiated through passthrough framework uAPI.
> 
> Device attaching is allowed only after a device is successfully bound to
> the IOMMU fd. The user should provide a device cookie when binding the
> device through the VFIO uAPI. This cookie is used when the user queries
> device capability/format, issues per-device iotlb invalidation and
> receives per-device I/O page fault data via the IOMMU fd.
> 
> Successful binding puts the device into a security context which isolates
> its DMA from the rest of the system. VFIO should not allow the user to
> access the device before binding is completed. Similarly, VFIO should
> prevent the user from unbinding the device before user access is withdrawn.
> 
> When a device is in an iommu group which contains multiple devices,
> all devices within the group must enter/exit the security context
> together. Please check {1.3} for more info about group isolation via
> this device-centric design.
> 
> Successful attaching activates an I/O address space in the IOMMU,
> if the device is not purely software mediated. VFIO must provide device
> specific routing information for where to install the I/O page table in 
> the IOMMU for this device. VFIO must also guarantee that the attached 
> device is configured to compose DMAs with the routing information that 
> is provided in the attaching call. When handling DMA requests, the IOMMU
> identifies the target I/O address space according to the routing
> information carried in the request. Misconfiguration breaks DMA
> isolation and could thus lead to severe security vulnerabilities.
> 
> Routing information is per-device and bus specific. For PCI, it is the
> Requester ID (RID) identifying the device plus an optional Process Address
> Space ID (PASID). For ARM, it is the Stream ID (SID) plus an optional
> Sub-Stream ID (SSID). PASID or SSID is used when multiple I/O address
> spaces are enabled on a single device. For simplicity and continuity
> reasons the following text uses RID+PASID, though SID+SSID may be a
> clearer name from the device p.o.v. We can decide the actual naming when
> coding.
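
A hypothetical sketch of the per-device routing record described above
(field names invented; the RFC explicitly leaves the driver-facing API
and exact encoding TBD):

	struct ioas_device_route {
		u32 rid;	/* PCI Requester ID / ARM Stream ID */
		u32 pasid;	/* optional PASID/SSID; used when multiple
				 * I/O address spaces are enabled */
		u64 dev_cookie;	/* user-provided cookie from bind time */
	};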
> 
> Because one I/O address space can be attached by multiple devices,
> per-device routing information (plus the device cookie) is tracked under
> each IOASID and is used accordingly when activating the I/O address
> space in the IOMMU for each attached device.
> 
> The device in the /dev/iommu context always refers to a physical one 
> (pdev) which is identifiable via RID. Physically each pdev can support 
> one default I/O address space (routed via RID) and optionally multiple 
> non-default I/O address spaces (via RID+PASID).
> 
> The device in the VFIO context is a logical concept, being either a
> physical device (pdev) or a mediated device (mdev or subdev). Each vfio
> device is represented by RID+cookie in the IOMMU fd. The user is allowed
> to create one default I/O address space (routed by vRID from the user
> p.o.v) per vfio_device. VFIO decides the routing information for this
> default space based on device type (a sketch follows the list below):
> 
> 1)  pdev, routed via RID;
> 
> 2)  mdev/subdev with IOMMU-enforced DMA isolation, routed via 
> the parent's RID plus the PASID marking this mdev;
> 
> 3)  a purely sw-mediated device (sw mdev), no routing required i.e. no
> need to install the I/O page table in the IOMMU. sw mdev just uses 
> the metadata to assist its internal DMA isolation logic on top of 
> the parent's IOMMU page table;
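
The three cases could be captured along these lines (hypothetical
sketch, names invented):

	enum vfio_default_route {
		VFIO_ROUTE_RID,		/* 1) pdev */
		VFIO_ROUTE_RID_PASID,	/* 2) mdev/subdev with IOMMU-enforced
					 *    DMA isolation */
		VFIO_ROUTE_NONE,	/* 3) sw mdev: nothing installed in
					 *    the IOMMU */
	};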
> 
> In addition, VFIO may allow the user to create additional I/O address
> spaces on a vfio_device based on the hardware capability. In such a case
> the user has its own view of the virtual routing information (vPASID)
> when 

Re: [RFC 3/5] dma-mapping: Enable global non-coherent pool support for RISC-V

2021-07-25 Thread Rob Herring
On Fri, Jul 23, 2021 at 3:40 PM Atish Patra  wrote:
>
> Currently, linux,dma-default is used to reserve a global non-coherent pool
> to allocate memory for DMA operations. This can be useful for RISC-V as
> well, since the ISA specification does not yet specify a method to modify
> PMA attributes or page table entries to define a non-cacheable area.
> A non-cacheable memory window is an alternate option for vendors to
> support non-coherent devices. "dma-ranges" must be used in conjunction
> with the "linux,dma-default" property to define one or more mappings
> between device- and CPU-accessible memory regions.

'dma-ranges' applies to buses. And, well, maybe devices when the bus
is not well defined. It is not a reserved-memory property.

Rob
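
For illustration, the placement being described would look roughly like
this (a sketch with made-up addresses: the default pool lives under
reserved-memory, while dma-ranges belongs on the bus node):

	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;
		ranges;

		dma_pool: dma-pool@90000000 {
			compatible = "shared-dma-pool";
			reg = <0x0 0x90000000 0x0 0x4000000>;
			no-map;
			linux,dma-default;
		};
	};

	soc {
		/* bus-level translation between device and CPU addresses */
		dma-ranges = <0x0 0x90000000 0x0 0x90000000 0x0 0x4000000>;
	};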


Re: [GIT PULL] dma-mapping fix for Linux 5.14

2021-07-25 Thread pr-tracker-bot
The pull request you sent on Sun, 25 Jul 2021 08:03:21 +0200:

> git://git.infradead.org/users/hch/dma-mapping.git tags/dma-mapping-5.14-1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/04ca88d056b44efee1e7635c74c0be3705efc72c

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [GIT PULL] dma-mapping fix for Linux 5.14

2021-07-25 Thread Christoph Hellwig
On Sun, Jul 25, 2021 at 09:50:29AM -0700, Linus Torvalds wrote:
> On Sat, Jul 24, 2021 at 11:03 PM Christoph Hellwig  wrote:
> >
> > dma-mapping fix for Lonux 5.14
> 
> We're calling it "Lonux" now?

Only on weekends :)

> >   - handle vmalloc addresses in dma_common_{mmap,get_sgtable}
> > (Roman Skakun)
> 
> I've pulled this, but my reaction is that we've tried to avoid this in
> the past. Why is Xen using vmalloc'ed addresses and passing those in
> to the dma mapping routines?
> 
> It *smells* to me like a Xen-swiotlb bug, and it would have been
> better to try to fix it there. Was that just too painful?

vmalloc (or rather vmap) addresses actually are the most common
way to provide uncachable mappings on architectures that are not
cache coherent.  The only Xen part here is that swiotlb-xen is a mess
and gets the address from the dma-direct allocator which does vmapping
for arm/arm64, but then uses the common helpers later due to a variety
of issues that will take a while to address.
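
The shape of the fix, paraphrased (not the exact commit): the common
helpers resolve a possibly-vmapped CPU address to a struct page before
building the userspace mapping or sgtable:

	struct page *page;

	if (is_vmalloc_addr(cpu_addr))
		page = vmalloc_to_page(cpu_addr);
	else
		page = virt_to_page(cpu_addr);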


Re: [GIT PULL] dma-mapping fix for Linux 5.14

2021-07-25 Thread Linus Torvalds
On Sat, Jul 24, 2021 at 11:03 PM Christoph Hellwig  wrote:
>
> dma-mapping fix for Lonux 5.14

We're calling it "Lonux" now?

>   - handle vmalloc addresses in dma_common_{mmap,get_sgtable}
> (Roman Skakun)

I've pulled this, but my reaction is that we've tried to avoid this in
the past. Why is Xen using vmalloc'ed addresses and passing those in
to the dma mapping routines?

It *smells* to me like a Xen-swiotlb bug, and it would have been
better to try to fix it there. Was that just too painful?

  Lonus


Re: [PATCH v4 3/3] iommu: dart: Add DART iommu driver

2021-07-25 Thread Sven Peter via iommu

On Mon, Jul 19, 2021, at 20:15, Robin Murphy wrote:
> On 2021-07-15 17:41, Sven Peter via iommu wrote:
> [...]
> >>> + u64 sw_bypass_cpu_start;
> >>> + u64 sw_bypass_dma_start;
> >>> + u64 sw_bypass_len;
> >>> +
> >>> + struct list_head streams;
> >>
> >> I'm starting to think this could just be a bitmap, in a u16 even.
> > 
> > The problem is that these streams may come from two different
> > DART instances. That is required for e.g. the dwc3 controller which
> > has a weird quirk where DMA transactions go through two separate
> > DARTs with no clear pattern (e.g. some xhci control structures use the
> > first dart while other structures use the second one).
> 
> Ah right, I do remember discussing that situation, but I think I 
> misinterpreted dart_domain->dart representing "the DART instance" here 
> to mean we weren't trying to accommodate that just yet.
> 
> > Both of them need to point to the same pagetable.
> > In the device tree the node will have an entry like this:
> > 
> > dwc3_0: usb@38228 {
> > ...
> > iommus = <&dwc3_0_dart_0 0>, <&dwc3_0_dart_1 1>;
> > };
> > 
> > There's no need for a linked list though once I do this properly with
> > groups. I can just use an array allocated when the first device is
> > attached, which just contains apple_dart* and streamid values.
> > 
> > 
> >>
> >>> +
> >>> + spinlock_t lock;
> >>> +
> >>> + struct iommu_domain domain;
> >>> +};
> >>> +
> >>> +/*
> >>> + * This structure is attached to devices with dev_iommu_priv_set() on of_xlate
> >>> + * and contains a list of streams bound to this device as defined in the
> >>> + * device tree. Multiple DART instances can be attached to a single device
> >>> + * and each stream is identified by its stream id.
> >>> + * It's usually referenced by a pointer called *cfg.
> >>> + *
> >>> + * A dynamic array instead of a linked list is used here since in almost
> >>> + * all cases a device will just be attached to a single stream and streams
> >>> + * are never removed after they have been added.
> >>> + *
> >>> + * @num_streams: number of streams attached
> >>> + * @streams: array of structs to identify attached streams and the device link
> >>> + *   to the iommu
> >>> + */
> >>> +struct apple_dart_master_cfg {
> >>> + int num_streams;
> >>> + struct {
> >>> + struct apple_dart *dart;
> >>> + u32 sid;
> >>
> >> Can't you use the fwspec for this?
> > 
> > 
> > I'd be happy to use the fwspec code if that's somehow possible.
> > I'm not sure how though since I need to store both the reference to the DART
> > _and_ to the stream id. As far as I can tell the fwspec code would only 
> > allow
> > to store the stream ids.
> > (see also the previous comment regarding the dwc3 node which requires stream
> > ids from two separate DART instances)
> 
> Hmm, yes, as above I was overlooking that, although there are still 
> various ideas that come to mind; the question becomes whether they're 
> actually worthwhile or just too-clever-for-their-own-good hacks. The 
> exact format of fwspec->ids is not fixed (other than the ACPI IORT code 
> having a common understanding with the Arm SMMU drivers) so in principle 
> you could munge some sort of DART instance index or indeed anything, but 
> if it remains cleaner to manage your own data internally then by all 
> means keep doing that.

Yeah, I can think of some hacks as well (like storing a global
id->apple_dart* map or stuffing the 64bit pointer into two ints) and I've
tried a few of them in the past days but didn't like any of them.

I do like the idea to just put two (struct apple_dart *dart, u16 sidmap)
in there though which will be plenty for all current configurations.
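
Something like this, as a sketch (field names invented):

	struct apple_dart_stream_map {
		struct apple_dart *dart;
		u16 sidmap;	/* bitmap of stream ids on this DART instance */
	};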

> 
> >>> + struct device_link *link;
> >>
> >> Is it necessary to use stateless links, or could you use
> >> DL_FLAG_AUTOREMOVE_SUPPLIER and not have to keep track of them manually?
> > 
> > I'll just use DL_FLAG_AUTOREMOVE_SUPPLIER. No idea why I went for stateless 
> > links.
> > 
> >>
> > [...]
> >>> + /* restore stream identity map */
> >>> + writel(0x03020100, dart->regs + DART_STREAM_REMAP);
> >>> + writel(0x07060504, dart->regs + DART_STREAM_REMAP + 4);
> >>> + writel(0x0b0a0908, dart->regs + DART_STREAM_REMAP + 8);
> >>> + writel(0x0f0e0d0c, dart->regs + DART_STREAM_REMAP + 12);
> >>
> >> Any hint of what the magic numbers mean?
> > 
> > Yes, it's just 0,1,2,3...,0xe,0xf but I can't do 8bit writes to the bus
> > and 32 bit writes then require these slightly awkward "swapped" numbers.
> > I'll add a comment since it's not obvious at first glance.
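
For illustration, the four writes quoted above are equivalent to this
little-endian packing loop (sketch):

	int i;

	for (i = 0; i < 16; i += 4)
		writel((i + 3) << 24 | (i + 2) << 16 | (i + 1) << 8 | i,
		       dart->regs + DART_STREAM_REMAP + i);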
> 
> Sure, I guessed that much from "identity map" - it was more a question 
> of why that means 0x03020100... rather than, say, 0x0c0d0e0f... or 
> 0x76543210..., and perhaps the reason for "restoring" it in the first place.

So what this feature does is to allow the DART to take an incoming DMA stream
tagged with id i and pretend that it's actually been tagged with
readb(dart->regs + 0x