Re: [RFC 20/20] Doc: Add documentation for /dev/iommu

2021-10-29 Thread Jason Gunthorpe via iommu
On Fri, Oct 29, 2021 at 11:15:31AM +1100, David Gibson wrote:

> > +Device must be bound to an iommufd before the attach operation can
> > +be conducted. The binding operation builds the connection between
> > +the devicefd (opened via device-passthrough framework) and IOMMUFD.
> > +IOMMU-protected security context is established when the binding
> > +operation is completed.
> 
> This can't be quite right.  You can't establish a safe security
> context until all devices in the group are bound, but you can only
> bind them one at a time.

When any device is bound the entire group is implicitly adopted to
this iommufd and the whole group enters a safe-for-userspace state.

It is symmetrical with the kernel side which is also device focused,
when any struct device is bound to a kernel driver the entire group is
implicitly adopted to kernel mode.
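
A minimal sketch of that ownership rule, as a plain userspace C model.
Every name here (model_group, model_bind_user, and so on) is
hypothetical and is not a kernel API. Refusing a bind that conflicts
with the group's current owner is an assumption of the sketch, not
something stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; none of these names exist in the kernel. */
enum group_owner { OWNER_NONE, OWNER_KERNEL, OWNER_USER };

struct model_group {
	enum group_owner owner;
	const void *ctx;	/* the iommufd that adopted the group, if any */
	int bound;		/* number of devices currently bound */
};

struct model_device {
	struct model_group *group;
	bool is_bound;
};

/*
 * Bind one device to a userspace context. The first bind adopts the
 * whole group for that context. Returns 0 on success, -1 on conflict
 * (the conflict rule is an assumption of this sketch).
 */
int model_bind_user(struct model_device *dev, const void *ctx)
{
	struct model_group *g = dev->group;

	assert(g != NULL);
	if (dev->is_bound || g->owner == OWNER_KERNEL)
		return -1;
	if (g->owner == OWNER_USER && g->ctx != ctx)
		return -1;
	g->owner = OWNER_USER;
	g->ctx = ctx;
	g->bound++;
	dev->is_bound = true;
	return 0;
}

/* Kernel-side counterpart: the first bind adopts the group to kernel mode. */
int model_bind_kernel(struct model_device *dev)
{
	struct model_group *g = dev->group;

	assert(g != NULL);
	if (dev->is_bound || g->owner == OWNER_USER)
		return -1;
	g->owner = OWNER_KERNEL;
	g->bound++;
	dev->is_bound = true;
	return 0;
}

/* The group returns to the unowned state when its last device unbinds. */
void model_unbind(struct model_device *dev)
{
	struct model_group *g = dev->group;

	if (!dev->is_bound)
		return;
	dev->is_bound = false;
	if (--g->bound == 0) {
		g->owner = OWNER_NONE;
		g->ctx = NULL;
	}
}
```

In this model, binding the first device of a group flips the whole
group to user ownership. A second device in the same group can then be
bound only to the same context, and a kernel driver bind is refused
until every device has been unbound.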

Lu should send a patch series soon that harmonizes how this works; it
is a very nice cleanup.

Jason


Re: [RFC 20/20] Doc: Add documentation for /dev/iommu

2021-10-29 Thread David Gibson
On Sun, Sep 19, 2021 at 02:38:48PM +0800, Liu Yi L wrote:
> Document the /dev/iommu framework for user.
> 
> Open:
> Do we want to document /dev/iommu in Documentation/userspace-api/iommu.rst?
> Existing iommu.rst is for the vSVA interfaces, honestly, may need to rewrite
> this doc entirely.
> 
> Signed-off-by: Kevin Tian 
> Signed-off-by: Liu Yi L 
> ---
>  Documentation/userspace-api/index.rst   |   1 +
>  Documentation/userspace-api/iommufd.rst | 183 
>  2 files changed, 184 insertions(+)
>  create mode 100644 Documentation/userspace-api/iommufd.rst
> 
> diff --git a/Documentation/userspace-api/index.rst 
> b/Documentation/userspace-api/index.rst
> index 0b5eefed027e..54df5a278023 100644
> --- a/Documentation/userspace-api/index.rst
> +++ b/Documentation/userspace-api/index.rst
> @@ -25,6 +25,7 @@ place where this information is gathered.
> ebpf/index
> ioctl/index
> iommu
> +   iommufd
> media/index
> sysfs-platform_profile
>  
> diff --git a/Documentation/userspace-api/iommufd.rst 
> b/Documentation/userspace-api/iommufd.rst
> new file mode 100644
> index ..abffbb47dc02
> --- /dev/null
> +++ b/Documentation/userspace-api/iommufd.rst
> @@ -0,0 +1,183 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +.. iommu:
> +
> +===================
> +IOMMU Userspace API
> +===================
> +
> +Direct device access from userspace has been a critical feature in
> +high performance computing and virtualization usages. Linux now
> +includes multiple device-passthrough frameworks (e.g. VFIO and vDPA)
> +to manage secure device access from the userspace. One critical
> +task of those frameworks is to put the assigned device in a secure,
> +IOMMU-protected context so the device is prevented from doing harm
> +to the rest of the system.
> +
> +Currently those frameworks implement their own logic for managing
> +I/O page tables to isolate user-initiated DMAs. This doesn't scale
> +to support many new IOMMU features, such as PASID-granular DMA
> +remapping, nested translation, I/O page fault, IOMMU dirty bit, etc.
> +
> +The /dev/iommu framework provides a unified interface for managing
> +I/O page tables for passthrough devices. Existing passthrough
> +frameworks are expected to use this interface instead of continuing
> +their ad-hoc implementations.
> +
> +IOMMUFDs, IOASIDs, Devices and Groups
> +-------------------------------------
> +
> +The core concepts in /dev/iommu are IOMMUFDs and IOASIDs. IOMMUFD (by
> +opening /dev/iommu) is the container holding multiple I/O address
> +spaces for a user, while IOASID is the fd-local software handle
> +representing an I/O address space and associated with a single I/O
> +page table. User manages those address spaces through fd operations,
> +e.g. by using vfio type1v2 mapping semantics to manage respective
> +I/O page tables.
> +
> +IOASID is comparable to the container concept in VFIO. The latter
> +is also associated with a single I/O address space. A main difference
> +between them is that multiple IOASIDs in the same IOMMUFD can be
> +nested together (not supported yet) to allow centralized accounting
> +of locked pages, while multiple containers are disconnected thus
> +duplicated accounting is incurred. Typically one IOMMUFD is
> +sufficient for all intended IOMMU usages for a user.
> +
> +An I/O address space takes effect in the IOMMU only after it is
> +attached by a device. One I/O address space can be attached by
> +multiple devices. One device can only be attached to a single I/O
> +address space at this point (on par with current vfio behavior).
> +
> +Device must be bound to an iommufd before the attach operation can
> +be conducted. The binding operation builds the connection between
> +the devicefd (opened via device-passthrough framework) and IOMMUFD.
> +IOMMU-protected security context is established when the binding
> +operation is completed.

This can't be quite right.  You can't establish a safe security
context until all devices in the group are bound, but you can only
bind them one at a time.

>  The passthrough framework must block user
> +access to the assigned device until bind() returns success.
> +
> +The entire /dev/iommu framework adopts a device-centric model w/o
> +carrying any container/group legacy as current vfio does. However
> +the group is the minimum granularity that must be used to ensure
> +secure user access (refer to vfio.rst). This framework relies on
> +the IOMMU core layer to map device-centric model into group-granular
> +isolation.
> +
> +Managing I/O Address Spaces
> +---------------------------
> +
> +When creating an I/O address space (by allocating IOASID), the user
> +must specify the type of underlying I/O page table. Currently only
> +one type (kernel-managed) is supported. In the future other types
> +will be introduced, e.g. to support user-managed I/O page table or
> +a shared I/O page table which is managed by another kernel sub-
> +system (mm, ept, etc.).
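
The allocation rule in that last paragraph can be pictured with a small
userspace C model. This is only a sketch under stated assumptions: the
type names, the fixed-size slot table and the errno values are all
hypothetical and are not taken from the patch or from any kernel uAPI.

```c
#include <errno.h>

/* Hypothetical page-table types; only the kernel-managed one is accepted. */
enum model_pgtable_type {
	MODEL_PGTABLE_NONE = 0,		/* marks a free slot */
	MODEL_PGTABLE_KERNEL,		/* kernel-managed I/O page table */
	MODEL_PGTABLE_USER,		/* user-managed (not supported yet) */
	MODEL_PGTABLE_SHARED,		/* shared with mm, ept, ... (not supported yet) */
};

#define MODEL_MAX_IOASID 8	/* arbitrary limit chosen for the sketch */

/* Stand-in for one open /dev/iommu fd holding several address spaces. */
struct model_iommufd {
	enum model_pgtable_type type[MODEL_MAX_IOASID];
};

/*
 * Allocate an fd-local IOASID for the requested page-table type.
 * Returns the new IOASID (>= 0) or a negative errno value.
 */
int model_ioasid_alloc(struct model_iommufd *fd, enum model_pgtable_type type)
{
	int i;

	if (type != MODEL_PGTABLE_KERNEL)
		return -EOPNOTSUPP;

	for (i = 0; i < MODEL_MAX_IOASID; i++) {
		if (fd->type[i] == MODEL_PGTABLE_NONE) {
			fd->type[i] = type;
			return i;
		}
	}
	return -ENOSPC;
}
```

Because the type is an explicit argument, further page-table types
could later be accepted without changing the shape of the call. Until
then, a request for anything other than the kernel-managed type fails
up front instead of being silently downgraded.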