Re: [RFC 20/20] Doc: Add documentation for /dev/iommu
On Fri, Oct 29, 2021 at 11:15:31AM +1100, David Gibson wrote:

> > +Device must be bound to an iommufd before the attach operation can
> > +be conducted. The binding operation builds the connection between
> > +the devicefd (opened via device-passthrough framework) and IOMMUFD.
> > +IOMMU-protected security context is established when the binding
> > +operation is completed.
>
> This can't be quite right.  You can't establish a safe security
> context until all devices in the group are bound, but you can only
> bind them one at a time.

When any device is bound, the entire group is implicitly adopted to this
iommufd and the whole group enters a safe-for-userspace state.

It is symmetrical with the kernel side, which is also device focused:
when any struct device is bound to a kernel driver, the entire group is
implicitly adopted to kernel mode.

Lu should send a patch series soon that harmonizes how this works; it is
a very nice cleanup.

Jason
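The group-adoption rule described in the reply above can be illustrated with a small toy model. This is only a sketch of the semantics as stated in the thread, not kernel code: `ToyGroup`, `OwnershipConflict`, `bind` and `owner_of` are hypothetical names invented for illustration, and the owner strings ("iommufd0", "kernel") are placeholders.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


class OwnershipConflict(Exception):
    """Raised when a group already adopted by one owner is claimed by another."""


@dataclass
class ToyGroup:
    """Hypothetical stand-in for an IOMMU group (devices that share isolation)."""
    devices: FrozenSet[str]
    owner: Optional[str] = None  # None means nobody has adopted the group yet

    def bind(self, device: str, owner: str) -> None:
        """Bind one device; the whole group is implicitly adopted by `owner`."""
        if device not in self.devices:
            raise ValueError(f"{device} is not a member of this group")
        if self.owner is not None and self.owner != owner:
            raise OwnershipConflict(
                f"group already adopted by {self.owner}, cannot bind to {owner}"
            )
        self.owner = owner

    def owner_of(self, device: str) -> Optional[str]:
        """Ownership is a property of the group, so every member reports the same owner."""
        if device not in self.devices:
            raise ValueError(f"{device} is not a member of this group")
        return self.owner


# Binding a single device is enough to move the whole group to the new owner.
group = ToyGroup(devices=frozenset({"dev_a", "dev_b"}))
group.bind("dev_a", "iommufd0")
```

In this model "dev_b" is already owned by "iommufd0" before it is bound itself, and a later attempt to bind it to a different owner is rejected, which is one way to read the "safe-for-userspace state" of the whole group.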
Re: [RFC 20/20] Doc: Add documentation for /dev/iommu
On Sun, Sep 19, 2021 at 02:38:48PM +0800, Liu Yi L wrote:
> Document the /dev/iommu framework for user.
>
> Open:
> Do we want to document /dev/iommu in Documentation/userspace-api/iommu.rst?
> Existing iommu.rst is for the vSVA interfaces, honestly, may need to rewrite
> this doc entirely.
>
> Signed-off-by: Kevin Tian
> Signed-off-by: Liu Yi L
> ---
>  Documentation/userspace-api/index.rst   |   1 +
>  Documentation/userspace-api/iommufd.rst | 183
>  2 files changed, 184 insertions(+)
>  create mode 100644 Documentation/userspace-api/iommufd.rst
>
> diff --git a/Documentation/userspace-api/index.rst
> b/Documentation/userspace-api/index.rst
> index 0b5eefed027e..54df5a278023 100644
> --- a/Documentation/userspace-api/index.rst
> +++ b/Documentation/userspace-api/index.rst
> @@ -25,6 +25,7 @@ place where this information is gathered.
>     ebpf/index
>     ioctl/index
>     iommu
> +   iommufd
>     media/index
>     sysfs-platform_profile
>
> diff --git a/Documentation/userspace-api/iommufd.rst
> b/Documentation/userspace-api/iommufd.rst
> new file mode 100644
> index ..abffbb47dc02
> --- /dev/null
> +++ b/Documentation/userspace-api/iommufd.rst
> @@ -0,0 +1,183 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +.. iommu:
> +
> +===================
> +IOMMU Userspace API
> +===================
> +
> +Direct device access from userspace has been a critical feature in
> +high performance computing and virtualization usages. Linux now
> +includes multiple device-passthrough frameworks (e.g. VFIO and vDPA)
> +to manage secure device access from the userspace. One critical
> +task of those frameworks is to put the assigned device in a secure,
> +IOMMU-protected context so the device is prevented from doing harm
> +to the rest of the system.
> +
> +Currently those frameworks implement their own logic for managing
> +I/O page tables to isolate user-initiated DMAs. This doesn't scale
> +to support many new IOMMU features, such as PASID-granular DMA
> +remapping, nested translation, I/O page fault, IOMMU dirty bit, etc.
> +
> +The /dev/iommu framework provides a unified interface for managing
> +I/O page tables for passthrough devices. Existing passthrough
> +frameworks are expected to use this interface instead of continuing
> +their ad-hoc implementations.
> +
> +IOMMUFDs, IOASIDs, Devices and Groups
> +-------------------------------------
> +
> +The core concepts in /dev/iommu are IOMMUFDs and IOASIDs. IOMMUFD (by
> +opening /dev/iommu) is the container holding multiple I/O address
> +spaces for a user, while IOASID is the fd-local software handle
> +representing an I/O address space and associated with a single I/O
> +page table. User manages those address spaces through fd operations,
> +e.g. by using vfio type1v2 mapping semantics to manage respective
> +I/O page tables.
> +
> +IOASID is comparable to the container concept in VFIO. The latter
> +is also associated with a single I/O address space. A main difference
> +between them is that multiple IOASIDs in the same IOMMUFD can be
> +nested together (not supported yet) to allow centralized accounting
> +of locked pages, while multiple containers are disconnected thus
> +duplicated accounting is incurred. Typically one IOMMUFD is
> +sufficient for all intended IOMMU usages for a user.
> +
> +An I/O address space takes effect in the IOMMU only after it is
> +attached by a device. One I/O address space can be attached by
> +multiple devices. One device can be only attached to a single I/O
> +address space at this point (on par with current vfio behavior).
> +
> +Device must be bound to an iommufd before the attach operation can
> +be conducted. The binding operation builds the connection between
> +the devicefd (opened via device-passthrough framework) and IOMMUFD.
> +IOMMU-protected security context is established when the binding
> +operation is completed.

This can't be quite right.  You can't establish a safe security
context until all devices in the group are bound, but you can only
bind them one at a time.

> The passthrough framework must block user
> +access to the assigned device until bind() returns success.
> +
> +The entire /dev/iommu framework adopts a device-centric model w/o
> +carrying any container/group legacy as current vfio does. However
> +the group is the minimum granularity that must be used to ensure
> +secure user access (refer to vfio.rst). This framework relies on
> +the IOMMU core layer to map device-centric model into group-granular
> +isolation.
> +
> +Managing I/O Address Spaces
> +---------------------------
> +
> +When creating an I/O address space (by allocating IOASID), the user
> +must specify the type of underlying I/O page table. Currently only
> +one type (kernel-managed) is supported. In the future other types
> +will be introduced, e.g. to support user-managed I/O page table or
> +a shared I/O page table which is managed by another kernel sub-
> +system (mm, ept, etc.).
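As a reading aid for the quoted text, the ordering and cardinality rules it states (bind before attach, one I/O address space may have many devices, one device may attach to only one I/O address space) can be sketched as a toy model. This does not use or imitate the real /dev/iommu uAPI; `ToyIommufd` and its methods are hypothetical names chosen for illustration only.

```python
class ToyIommufd:
    """Hypothetical model of one IOMMUFD holding several I/O address spaces."""

    def __init__(self):
        self._next_ioasid = 0
        self.ioasids = {}    # ioasid -> set of attached device names
        self.bound = set()   # devices bound to this iommufd
        self.attached = {}   # device name -> the single ioasid it is attached to

    def alloc_ioasid(self):
        """Allocate an fd-local handle for a new, empty I/O address space."""
        ioasid = self._next_ioasid
        self._next_ioasid += 1
        self.ioasids[ioasid] = set()
        return ioasid

    def bind(self, device):
        """Connect a device to this iommufd; required before any attach."""
        self.bound.add(device)

    def attach(self, device, ioasid):
        """Attach a bound device to one I/O address space."""
        if device not in self.bound:
            raise PermissionError(f"{device} must be bound before attach")
        if ioasid not in self.ioasids:
            raise KeyError(f"unknown IOASID {ioasid}")
        if device in self.attached:
            raise RuntimeError(f"{device} is already attached to an IOASID")
        self.ioasids[ioasid].add(device)
        self.attached[device] = ioasid


# Two devices share one I/O address space after both are bound.
fd = ToyIommufd()
ioas = fd.alloc_ioasid()
fd.bind("dev_a")
fd.bind("dev_b")
fd.attach("dev_a", ioas)
fd.attach("dev_b", ioas)
```

The model deliberately ignores groups, page-table types and formats; it only shows that an unbound device cannot attach and that a second attach of the same device is refused.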
[RFC 20/20] Doc: Add documentation for /dev/iommu
Document the /dev/iommu framework for user.

Open:
Do we want to document /dev/iommu in Documentation/userspace-api/iommu.rst?
Existing iommu.rst is for the vSVA interfaces, honestly, may need to rewrite
this doc entirely.

Signed-off-by: Kevin Tian
Signed-off-by: Liu Yi L
---
 Documentation/userspace-api/index.rst   |   1 +
 Documentation/userspace-api/iommufd.rst | 183
 2 files changed, 184 insertions(+)
 create mode 100644 Documentation/userspace-api/iommufd.rst

diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index 0b5eefed027e..54df5a278023 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -25,6 +25,7 @@ place where this information is gathered.
    ebpf/index
    ioctl/index
    iommu
+   iommufd
    media/index
    sysfs-platform_profile

diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
new file mode 100644
index ..abffbb47dc02
--- /dev/null
+++ b/Documentation/userspace-api/iommufd.rst
@@ -0,0 +1,183 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. iommu:
+
+===================
+IOMMU Userspace API
+===================
+
+Direct device access from userspace has been a critical feature in
+high performance computing and virtualization usages. Linux now
+includes multiple device-passthrough frameworks (e.g. VFIO and vDPA)
+to manage secure device access from the userspace. One critical
+task of those frameworks is to put the assigned device in a secure,
+IOMMU-protected context so the device is prevented from doing harm
+to the rest of the system.
+
+Currently those frameworks implement their own logic for managing
+I/O page tables to isolate user-initiated DMAs. This doesn't scale
+to support many new IOMMU features, such as PASID-granular DMA
+remapping, nested translation, I/O page fault, IOMMU dirty bit, etc.
+
+The /dev/iommu framework provides a unified interface for managing
+I/O page tables for passthrough devices. Existing passthrough
+frameworks are expected to use this interface instead of continuing
+their ad-hoc implementations.
+
+IOMMUFDs, IOASIDs, Devices and Groups
+-------------------------------------
+
+The core concepts in /dev/iommu are IOMMUFDs and IOASIDs. IOMMUFD (by
+opening /dev/iommu) is the container holding multiple I/O address
+spaces for a user, while IOASID is the fd-local software handle
+representing an I/O address space and associated with a single I/O
+page table. User manages those address spaces through fd operations,
+e.g. by using vfio type1v2 mapping semantics to manage respective
+I/O page tables.
+
+IOASID is comparable to the container concept in VFIO. The latter
+is also associated with a single I/O address space. A main difference
+between them is that multiple IOASIDs in the same IOMMUFD can be
+nested together (not supported yet) to allow centralized accounting
+of locked pages, while multiple containers are disconnected thus
+duplicated accounting is incurred. Typically one IOMMUFD is
+sufficient for all intended IOMMU usages for a user.
+
+An I/O address space takes effect in the IOMMU only after it is
+attached by a device. One I/O address space can be attached by
+multiple devices. One device can be only attached to a single I/O
+address space at this point (on par with current vfio behavior).
+
+Device must be bound to an iommufd before the attach operation can
+be conducted. The binding operation builds the connection between
+the devicefd (opened via device-passthrough framework) and IOMMUFD.
+IOMMU-protected security context is established when the binding
+operation is completed. The passthrough framework must block user
+access to the assigned device until bind() returns success.
+
+The entire /dev/iommu framework adopts a device-centric model w/o
+carrying any container/group legacy as current vfio does. However
+the group is the minimum granularity that must be used to ensure
+secure user access (refer to vfio.rst). This framework relies on
+the IOMMU core layer to map device-centric model into group-granular
+isolation.
+
+Managing I/O Address Spaces
+---------------------------
+
+When creating an I/O address space (by allocating IOASID), the user
+must specify the type of underlying I/O page table. Currently only
+one type (kernel-managed) is supported. In the future other types
+will be introduced, e.g. to support user-managed I/O page table or
+a shared I/O page table which is managed by another kernel sub-
+system (mm, ept, etc.). Kernel-managed I/O page table is currently
+managed via vfio type1v2 equivalent mapping semantics.
+
+The user also needs to specify the format of the I/O page table
+when allocating an IOASID. The format must be compatible with the
+attached devices (or more specifically with the IOMMU which serves
+the DMA from the attached devices). User can query the device IOMMU
+format via IOMMUFD once a device is successfully bound. Attaching a
+device