Re: [PATCH 03/37] iommu/sva: Manage process address spaces
On 24/04/18 18:17, Sinan Kaya wrote:
> On 4/24/2018 5:33 AM, Jean-Philippe Brucker wrote:
>>> Please return pasid when you find an io_mm that is already bound. Something
>>> like
>>> *pasid = io_mm->pasid should do the work here when bond is true.
>> Right. I think we should also keep returning 0, not switch to -EEXIST or
>> similar. So in next version a driver can call bind(devX, mmY) multiple
>> times, but the first unbind() removes the bond.
>
> If we are going to allow multiple binds, then the last unbind should
> remove the bond rather than the first one via reference counting.

Yeah that's probably better. Since a bond belongs to a device driver it
doesn't need multiple bind/unbind, so earlier in this thread (1/37) I
talked about removing the bond->refs. But thinking about it, there still
is a need for it. When mm exits, we now need to call the device driver's
mm_exit handler outside of the spinlock, so we have to take a ref in
order to prevent a concurrent unbind() from freeing the bond.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
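The bind/unbind semantics agreed on above (repeated bind() returns 0 and reports the PASID, only the last unbind() removes the bond) can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct bond, bond_bind() and bond_unbind() are hypothetical names, not part of the patch, and the locking that the real code needs is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct iommu_bond. */
struct bond {
	int pasid;         /* PASID reported back to the driver */
	unsigned int refs; /* number of outstanding bind() calls */
	bool attached;     /* true while the bond is installed */
};

/*
 * bind(): the first call installs the bond, later calls only take a
 * reference. The PASID is always written back and 0 is returned, rather
 * than an error such as -EEXIST.
 */
static int bond_bind(struct bond *b, int new_pasid, int *pasid)
{
	if (b->refs == 0) {
		b->pasid = new_pasid;
		b->attached = true;
	}
	b->refs++;
	*pasid = b->pasid;
	return 0;
}

/* unbind(): only the last call removes the bond. Returns -1 if not bound. */
static int bond_unbind(struct bond *b)
{
	if (b->refs == 0)
		return -1;
	if (--b->refs == 0)
		b->attached = false;
	return 0;
}
```

A second bond_bind() on the same bond ignores new_pasid and hands back the PASID chosen by the first call, which is the behaviour requested in the review.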
Re: [PATCH 03/37] iommu/sva: Manage process address spaces
On 4/24/2018 5:33 AM, Jean-Philippe Brucker wrote:
>> Please return pasid when you find an io_mm that is already bound. Something
>> like
>> *pasid = io_mm->pasid should do the work here when bond is true.
> Right. I think we should also keep returning 0, not switch to -EEXIST or
> similar. So in next version a driver can call bind(devX, mmY) multiple
> times, but the first unbind() removes the bond.

If we are going to allow multiple binds, then the last unbind should
remove the bond rather than the first one via reference counting.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH 03/37] iommu/sva: Manage process address spaces
On 24/04/18 02:32, Sinan Kaya wrote:
> On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
>>  /**
>>   * iommu_sva_device_init() - Initialize Shared Virtual Addressing for a device
>>   * @dev: the device
>> @@ -129,7 +439,10 @@ EXPORT_SYMBOL_GPL(iommu_sva_device_shutdown);
>>  int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>>  			  unsigned long flags, void *drvdata)
>>  {
>> +	int i, ret;
>> +	struct io_mm *io_mm = NULL;
>>  	struct iommu_domain *domain;
>> +	struct iommu_bond *bond = NULL, *tmp;
>>  	struct iommu_param *dev_param = dev->iommu_param;
>>
>>  	domain = iommu_get_domain_for_dev(dev);
>> @@ -145,7 +458,42 @@ int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>>  	if (flags != (IOMMU_SVA_FEAT_PASID | IOMMU_SVA_FEAT_IOPF))
>>  		return -EINVAL;
>>
>> -	return -ENOSYS; /* TODO */
>> +	/* If an io_mm already exists, use it */
>> +	spin_lock(&iommu_sva_lock);
>> +	idr_for_each_entry(&iommu_pasid_idr, io_mm, i) {
>> +		if (io_mm->mm != mm || !io_mm_get_locked(io_mm))
>> +			continue;
>> +
>> +		/* Is it already bound to this device? */
>> +		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
>> +			if (tmp->dev != dev)
>> +				continue;
>> +
>> +			bond = tmp;
>> +			refcount_inc(&bond->refs);
>> +			io_mm_put_locked(io_mm);
>> +			break;
>> +		}
>> +		break;
>> +	}
>> +	spin_unlock(&iommu_sva_lock);
>> +
>> +	if (bond)
>
> Please return pasid when you find an io_mm that is already bound. Something
> like
> *pasid = io_mm->pasid should do the work here when bond is true.

Right. I think we should also keep returning 0, not switch to -EEXIST or
similar. So in next version a driver can call bind(devX, mmY) multiple
times, but the first unbind() removes the bond.

Thanks,
Jean
Re: [PATCH 03/37] iommu/sva: Manage process address spaces
On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
>  /**
>   * iommu_sva_device_init() - Initialize Shared Virtual Addressing for a device
>   * @dev: the device
> @@ -129,7 +439,10 @@ EXPORT_SYMBOL_GPL(iommu_sva_device_shutdown);
>  int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  			  unsigned long flags, void *drvdata)
>  {
> +	int i, ret;
> +	struct io_mm *io_mm = NULL;
>  	struct iommu_domain *domain;
> +	struct iommu_bond *bond = NULL, *tmp;
>  	struct iommu_param *dev_param = dev->iommu_param;
>
>  	domain = iommu_get_domain_for_dev(dev);
> @@ -145,7 +458,42 @@ int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  	if (flags != (IOMMU_SVA_FEAT_PASID | IOMMU_SVA_FEAT_IOPF))
>  		return -EINVAL;
>
> -	return -ENOSYS; /* TODO */
> +	/* If an io_mm already exists, use it */
> +	spin_lock(&iommu_sva_lock);
> +	idr_for_each_entry(&iommu_pasid_idr, io_mm, i) {
> +		if (io_mm->mm != mm || !io_mm_get_locked(io_mm))
> +			continue;
> +
> +		/* Is it already bound to this device? */
> +		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
> +			if (tmp->dev != dev)
> +				continue;
> +
> +			bond = tmp;
> +			refcount_inc(&bond->refs);
> +			io_mm_put_locked(io_mm);
> +			break;
> +		}
> +		break;
> +	}
> +	spin_unlock(&iommu_sva_lock);
> +
> +	if (bond)

Please return pasid when you find an io_mm that is already bound. Something
like
*pasid = io_mm->pasid should do the work here when bond is true.

> +		return 0;

-- 
Sinan Kaya
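The lookup under review walks every io_mm, matches on the mm, then walks that io_mm's device list. A simplified userspace sketch of the same search, including the requested write-back of the PASID, might look as follows. The array-based struct io_mm and the helper find_existing_bond() are assumptions made for the example; the patch itself uses an IDR, a linked list and a spinlock.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEVS 4

/* Hypothetical, simplified stand-in for the patch's struct io_mm. */
struct io_mm {
	const void *mm;             /* the process address space */
	int pasid;                  /* PASID allocated for this mm */
	const void *devs[MAX_DEVS]; /* devices currently bound to it */
	size_t ndevs;
};

/*
 * Returns 1 and writes the PASID if (dev, mm) is already bound,
 * 0 otherwise (in which case *pasid is left untouched).
 */
static int find_existing_bond(const struct io_mm *tbl, size_t n,
			      const void *dev, const void *mm, int *pasid)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].mm != mm)
			continue;

		for (size_t j = 0; j < tbl[i].ndevs; j++) {
			if (tbl[i].devs[j] == dev) {
				/* The fix requested in the review. */
				*pasid = tbl[i].pasid;
				return 1;
			}
		}
		break; /* there is a single io_mm per mm */
	}
	return 0;
}
```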
Re: [PATCH 03/37] iommu/sva: Manage process address spaces
On 10/04/18 19:53, Sinan Kaya wrote:
> On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
>> +static void io_mm_detach_all_locked(struct iommu_bond *bond)
>> +{
>> +	while (!io_mm_detach_locked(bond));
>> +}
>> +
>
> I don't remember if I mentioned this before or not but I think this loop
> needs a little bit of relaxation with yield and maybe an informational message
> which might help if the wait exceeds some time.

Right, at the very least we should have a cpu_relax here. I think this bit
is going away, though, because I want to lift the possibility of calling
bind() for the same dev/mm pair multiple times. It's not useful in my
opinion because that call could only be issued by a given driver.

Thanks,
Jean
Re: [PATCH 03/37] iommu/sva: Manage process address spaces
On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
> +static void io_mm_detach_all_locked(struct iommu_bond *bond)
> +{
> +	while (!io_mm_detach_locked(bond));
> +}
> +

I don't remember if I mentioned this before or not but I think this loop
needs a little bit of relaxation with yield and maybe an informational message
which might help if the wait exceeds some time.

-- 
Sinan Kaya
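As a rough illustration of the suggestion, the loop could relax between polls and warn once when it has polled more often than expected. The sketch below is userspace C, not kernel code: relax_cpu() is an empty stand-in for cpu_relax(), and detach_all() and drop_ref() are hypothetical names. It also assumes that each poll drops one reference and reports true once none remain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Empty stand-in for the kernel's cpu_relax(); an assumption of this sketch. */
static void relax_cpu(void)
{
}

/*
 * Polls try_detach(ctx) until it returns true. Prints one message when the
 * number of unsuccessful polls reaches warn_after. Returns the number of
 * unsuccessful polls.
 */
static unsigned long detach_all(bool (*try_detach)(void *), void *ctx,
				unsigned long warn_after)
{
	unsigned long spins = 0;

	while (!try_detach(ctx)) {
		if (++spins == warn_after)
			fprintf(stderr, "detach still pending after %lu polls\n",
				spins);
		relax_cpu();
	}
	return spins;
}

/* Example predicate: drops one reference per call, true once none remain. */
static bool drop_ref(void *ctx)
{
	unsigned int *refs = ctx;

	if (*refs > 0)
		(*refs)--;
	return *refs == 0;
}
```

With three references held, detach_all(drop_ref, &refs, 100) polls three times and reports two unsuccessful polls.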
Re: [PATCH 03/37] iommu/sva: Manage process address spaces
On 05/03/18 15:28, Sinan Kaya wrote:
> On 2/12/2018 1:33 PM, Jean-Philippe Brucker wrote:
>> +static void io_mm_free(struct io_mm *io_mm)
>> +{
>> +	struct mm_struct *mm;
>> +	void (*release)(struct io_mm *);
>> +
>> +	release = io_mm->release;
>> +	mm = io_mm->mm;
>> +
>> +	release(io_mm);
>
> Is there any reason why you can't call iommu->release()
> here directly? Why do you need the release local variable?

I think I can remove the local variable.

Thanks,
Jean
Re: [PATCH 03/37] iommu/sva: Manage process address spaces
On 01/03/18 08:04, Christian König wrote:
> On 01.03.2018 at 07:52, Lu Baolu wrote:
>> Hi Jean,
>>
>> On 02/13/2018 02:33 AM, Jean-Philippe Brucker wrote:
>>> [SNIP]
>>> +	pasid = idr_alloc_cyclic(&iommu_pasid_idr, io_mm, dev_param->min_pasid,
>>> +				 dev_param->max_pasid + 1, GFP_ATOMIC);
>> Can the pasid management code be moved into a common library?
>> PASID is not tied to SVA. An IOMMU model device could be designed
>> to use PASID for second level translation (classical DMA translation)
>> as well.
>
> Yeah, we have the same problem on amdgpu.
>
> We assign PASIDs to clients even when IOMMU isn't present in the system
> just because we need it for debugging.
>
> E.g. when the hardware detects that some shader program is doing
> something nasty we get the PASID in the interrupt and could for example
> use it to inform the client about the fault.

This seems like a new requirement altogether, and a bit outside the scope
of this series to be honest. Is the client userspace in this context? I
guess it would be mostly for prototyping then? Otherwise you probably
don't want to hand GPU contexts to userspace without an IOMMU isolating
them?

If you don't need mm tracking/sharing or iommu_map/unmap, then maybe an
IDR private to the GPU driver would be enough? If you do need mm tracking,
I suppose we could modify iommu_sva_bind() to allocate and track io_mm
even if the given device doesn't have an IOMMU, but it seems a bit
backward.

Thanks,
Jean
Re: [PATCH 03/37] iommu/sva: Manage process address spaces
On 01/03/18 06:52, Lu Baolu wrote:
> Can the pasid management code be moved into a common library?
> PASID is not tied to SVA. An IOMMU model device could be designed
> to use PASID for second level translation (classical DMA translation)
> as well.

What do you mean by second level translation? Do you see a use-case with
nesting translation within the host?

I agree that PASID + classical DMA is desirable. A device driver would
allocate PASIDs and perform iommu_sva_map(domain, pasid, iova, pa, size,
prot) and iommu_sva_unmap(domain, pasid, iova, size). I'm hoping that we
can also augment the DMA API with PASIDs, and that a driver can use both
shared and private contexts simultaneously. So that it can use a few
PASIDs for management purposes, and assign the rest to userspace.

The intent is for iommu-sva.c to be this common library. Work for
"private" PASID allocation is underway, see Jordan Crouse's series posted
last week https://www.spinics.net/lists/arm-kernel/msg635857.html

Thanks,
Jean
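For readers wondering what a PASID allocator that is independent of SVA could look like, here is a minimal userspace sketch of a cyclic allocator over an inclusive [min, max] range, loosely mirroring the idr_alloc_cyclic() call quoted elsewhere in the thread. All names (struct pasid_pool, pasid_pool_init(), pasid_alloc(), pasid_free()) and the fixed table size are assumptions for illustration only; this is not the interface of iommu-sva.c or of the series linked above.

```c
#include <assert.h>
#include <stdbool.h>

#define PASID_LIMIT 64 /* hypothetical table size for the sketch */

struct pasid_pool {
	int min, max;           /* inclusive range of valid PASIDs */
	int next;               /* where the next search starts */
	bool used[PASID_LIMIT]; /* used[id] is true while id is allocated */
};

static void pasid_pool_init(struct pasid_pool *p, int min, int max)
{
	assert(min >= 0 && min <= max && max < PASID_LIMIT);
	p->min = min;
	p->max = max;
	p->next = min;
	for (int i = 0; i < PASID_LIMIT; i++)
		p->used[i] = false;
}

/*
 * Returns a free PASID in [min, max], or -1 if the range is exhausted.
 * The search resumes after the last allocation, so a PASID that was just
 * freed is not handed out again immediately.
 */
static int pasid_alloc(struct pasid_pool *p)
{
	int span = p->max - p->min + 1;

	for (int i = 0; i < span; i++) {
		int id = p->min + (p->next - p->min + i) % span;

		if (!p->used[id]) {
			p->used[id] = true;
			p->next = p->min + (id - p->min + 1) % span;
			return id;
		}
	}
	return -1;
}

static void pasid_free(struct pasid_pool *p, int id)
{
	if (id >= p->min && id <= p->max)
		p->used[id] = false;
}
```

A driver that only needs IDs for debugging, as in the amdgpu case, could keep such a pool private; sharing it between subsystems would additionally need locking, which the sketch omits.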
Re: [PATCH 03/37] iommu/sva: Manage process address spaces
On 01.03.2018 at 07:52, Lu Baolu wrote:
> Hi Jean,
>
> On 02/13/2018 02:33 AM, Jean-Philippe Brucker wrote:
>> [SNIP]
>> +	pasid = idr_alloc_cyclic(&iommu_pasid_idr, io_mm, dev_param->min_pasid,
>> +				 dev_param->max_pasid + 1, GFP_ATOMIC);
> Can the pasid management code be moved into a common library?
> PASID is not tied to SVA. An IOMMU model device could be designed
> to use PASID for second level translation (classical DMA translation)
> as well.

Yeah, we have the same problem on amdgpu.

We assign PASIDs to clients even when IOMMU isn't present in the system
just because we need it for debugging.

E.g. when the hardware detects that some shader program is doing
something nasty we get the PASID in the interrupt and could for example
use it to inform the client about the fault.

Regards,
Christian.

> Best regards,
> Lu Baolu
Re: [PATCH 03/37] iommu/sva: Manage process address spaces
Hi Jean,

On 02/13/2018 02:33 AM, Jean-Philippe Brucker wrote:
> Introduce boilerplate code for allocating IOMMU mm structures and binding
> them to devices. Four operations are added to IOMMU drivers:
>
> * mm_alloc(): to create an io_mm structure and perform architecture-
>   specific operations required to grab the process (for instance on ARM,
>   pin down the CPU ASID so that the process doesn't get assigned a new
>   ASID on rollover).
>
>   There is a single valid io_mm structure per Linux mm. Future extensions
>   may also use io_mm for kernel-managed address spaces, populated with
>   map()/unmap() calls instead of bound to process address spaces. This
>   patch focuses on "shared" io_mm.
>
> * mm_attach(): attach an mm to a device. The IOMMU driver checks that the
>   device is capable of sharing an address space, and writes the PASID
>   table entry to install the pgd.
>
>   Some IOMMU drivers will have a single PASID table per domain, for
>   convenience. Others can implement it differently but to help these
>   drivers, mm_attach and mm_detach take 'attach_domain' and
>   'detach_domain' parameters, that tell whether they need to set and clear
>   the PASID entry or only send the required TLB invalidations.
>
> * mm_detach(): detach an mm from a device. The IOMMU driver removes the
>   PASID table entry and invalidates the IOTLBs.
>
> * mm_free(): free a structure allocated by mm_alloc(), and let arch
>   release the process.
>
> mm_attach and mm_detach operations are serialized with a spinlock. At the
> moment it is global, but if we try to optimize it, the core should at
> least prevent concurrent attach()/detach() on the same domain (so
> multi-level PASID table code can allocate tables lazily). mm_alloc() can
> sleep, but mm_free must not (because we'll have to call it from call_srcu
> later on.)
>
> At the moment we use an IDR for allocating PASIDs and retrieving contexts.
> We also use a single spinlock. These can be refined and optimized later (a
> custom allocator will be needed for top-down PASID allocation).
>
> Keeping track of address spaces requires the use of MMU notifiers.
> Handling process exit with regard to unbind() is tricky, so it is left for
> another patch and we explicitly fail mm_alloc() for the moment.
>
> Signed-off-by: Jean-Philippe Brucker
> ---
>  drivers/iommu/iommu-sva.c | 382 +-
>  drivers/iommu/iommu.c     |   2 +
>  include/linux/iommu.h     |  25 +++
>  3 files changed, 406 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
> index 593685d891bf..f9af9d66b3ed 100644
> --- a/drivers/iommu/iommu-sva.c
> +++ b/drivers/iommu/iommu-sva.c
> @@ -7,11 +7,321 @@
>   * SPDX-License-Identifier: GPL-2.0
>   */
>
> +#include
>  #include
> +#include
> +#include
> +
> +/**
> + * DOC: io_mm model
> + *
> + * The io_mm keeps track of process address spaces shared between CPU and IOMMU.
> + * The following example illustrates the relation between structures
> + * iommu_domain, io_mm and iommu_bond. An iommu_bond is a link between io_mm and
> + * device. A device can have multiple io_mm and an io_mm may be bound to
> + * multiple devices.
> + *              ___________________________
> + *             |  IOMMU domain A           |
> + *             |  ________________         |
> + *             | |  IOMMU group   |        +------- io_pgtables
> + *             | |                |        |
> + *             | |   dev 00:00.0 ----+------- bond --- io_mm X
> + *             | |________________|   \    |
> + *             |                       '----- bond ---.
> + *             |___________________________|           \
> + *              ___________________________             \
> + *             |  IOMMU domain B           |           io_mm Y
> + *             |  ________________         |           / /
> + *             | |  IOMMU group   |        |          / /
> + *             | |                |        |         / /
> + *             | |   dev 00:01.0 ------------ bond -' /
> + *             | |   dev 00:01.1 ------------ bond --'
> + *             | |________________|        |
> + *             |                           +------- io_pgtables
> + *             |___________________________|
> + *
> + * In this example, device 00:00.0 is in domain A, devices 00:01.* are in domain
> + * B. All devices within the same domain access the same address spaces. Device
> + * 00:00.0 accesses address spaces X and Y, each corresponding to an mm_struct.
> + * Devices 00:01.* only access address space Y. In addition each
> + * IOMMU_DOMAIN_DMA domain has a private address space, io_pgtable, that is
> + * managed with iommu_map()/iommu_unmap(), and isn't shared with the CPU MMU.
> + *
> + * To obtain the above configuration, users would for instance issue the
> + * following calls:
> + *
> + * iommu_sva_bind_device(dev 00:00.0, mm
Re: [PATCH 03/37] iommu/sva: Manage process address spaces
On 22/02/18 09:03, Yisheng Xie wrote:
[...]
>> +		/* Is it already bound to this device? */
>> +		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
>> +			if (tmp->dev != dev)
>> +				continue;
>> +
>> +			bond = tmp;
>> +			refcount_inc(&bond->refs);
>> +			io_mm_put_locked(io_mm);
>
> Should io_mm->pasid still be set to *pasid when the device is already bound,
> so that the driver can always get the right pasid if it binds to an mm
> multiple times, without keeping the pasid itself?

(Assuming you mean set *pasid = io_mm->pasid)

I think it should, I seem to have removed it by accident from this version.

Thanks,
Jean
Re: [PATCH 03/37] iommu/sva: Manage process address spaces
Hi Jean,

On 2018/2/22 14:23, Jean-Philippe Brucker wrote:
> @@ -129,7 +439,10 @@ int iommu_sva_device_shutdown(struct device *dev)
>  int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  			  unsigned long flags, void *drvdata)
>  {
> +	int i, ret;
> +	struct io_mm *io_mm = NULL;
>  	struct iommu_domain *domain;
> +	struct iommu_bond *bond = NULL, *tmp;
>  	struct iommu_param *dev_param = dev->iommu_param;
>
>  	domain = iommu_get_domain_for_dev(dev);
> @@ -145,7 +458,42 @@ int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
>  	if (flags != (IOMMU_SVA_FEAT_PASID | IOMMU_SVA_FEAT_IOPF))
>  		return -EINVAL;
>
> -	return -ENOSYS; /* TODO */
> +	/* If an io_mm already exists, use it */
> +	spin_lock(&iommu_sva_lock);
> +	idr_for_each_entry(&iommu_pasid_idr, io_mm, i) {
> +		if (io_mm->mm != mm || !io_mm_get_locked(io_mm))
> +			continue;
> +
> +		/* Is it already bound to this device? */
> +		list_for_each_entry(tmp, &io_mm->devices, mm_head) {
> +			if (tmp->dev != dev)
> +				continue;
> +
> +			bond = tmp;
> +			refcount_inc(&bond->refs);
> +			io_mm_put_locked(io_mm);

Should io_mm->pasid still be set to *pasid when the device is already bound,
so that the driver can always get the right pasid if it binds to an mm
multiple times, without keeping the pasid itself?

Thanks
Yisheng

> +			break;
> +		}
> +		break;
> +	}
> +	spin_unlock(&iommu_sva_lock);
> +
> +	if (bond)
> +		return 0;
> +
> +	if (!io_mm) {
> +		io_mm = io_mm_alloc(domain, dev, mm);
> +		if (IS_ERR(io_mm))
> +			return PTR_ERR(io_mm);
> +	}
> +
> +	ret = io_mm_attach(domain, dev, io_mm, drvdata);
> +	if (ret)
> +		io_mm_put(io_mm);
> +	else
> +		*pasid = io_mm->pasid;
> +
> +	return ret;
>  }
>  EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
[PATCH 03/37] iommu/sva: Manage process address spaces
Introduce boilerplate code for allocating IOMMU mm structures and binding
them to devices. Four operations are added to IOMMU drivers:

* mm_alloc(): to create an io_mm structure and perform architecture-
  specific operations required to grab the process (for instance on ARM,
  pin down the CPU ASID so that the process doesn't get assigned a new
  ASID on rollover).

  There is a single valid io_mm structure per Linux mm. Future extensions
  may also use io_mm for kernel-managed address spaces, populated with
  map()/unmap() calls instead of bound to process address spaces. This
  patch focuses on "shared" io_mm.

* mm_attach(): attach an mm to a device. The IOMMU driver checks that the
  device is capable of sharing an address space, and writes the PASID
  table entry to install the pgd.

  Some IOMMU drivers will have a single PASID table per domain, for
  convenience. Others can implement it differently but to help these
  drivers, mm_attach and mm_detach take 'attach_domain' and
  'detach_domain' parameters, that tell whether they need to set and clear
  the PASID entry or only send the required TLB invalidations.

* mm_detach(): detach an mm from a device. The IOMMU driver removes the
  PASID table entry and invalidates the IOTLBs.

* mm_free(): free a structure allocated by mm_alloc(), and let arch
  release the process.

mm_attach and mm_detach operations are serialized with a spinlock. At the
moment it is global, but if we try to optimize it, the core should at
least prevent concurrent attach()/detach() on the same domain (so
multi-level PASID table code can allocate tables lazily). mm_alloc() can
sleep, but mm_free must not (because we'll have to call it from call_srcu
later on.)

At the moment we use an IDR for allocating PASIDs and retrieving contexts.
We also use a single spinlock. These can be refined and optimized later (a
custom allocator will be needed for top-down PASID allocation).

Keeping track of address spaces requires the use of MMU notifiers.
Handling process exit with regard to unbind() is tricky, so it is left for
another patch and we explicitly fail mm_alloc() for the moment.

Signed-off-by: Jean-Philippe Brucker
---
 drivers/iommu/iommu-sva.c | 382 +-
 drivers/iommu/iommu.c     |   2 +
 include/linux/iommu.h     |  25 +++
 3 files changed, 406 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 593685d891bf..f9af9d66b3ed 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -7,11 +7,321 @@
  * SPDX-License-Identifier: GPL-2.0
  */

+#include
 #include
+#include
+#include
+
+/**
+ * DOC: io_mm model
+ *
+ * The io_mm keeps track of process address spaces shared between CPU and IOMMU.
+ * The following example illustrates the relation between structures
+ * iommu_domain, io_mm and iommu_bond. An iommu_bond is a link between io_mm and
+ * device. A device can have multiple io_mm and an io_mm may be bound to
+ * multiple devices.
+ *              ___________________________
+ *             |  IOMMU domain A           |
+ *             |  ________________         |
+ *             | |  IOMMU group   |        +------- io_pgtables
+ *             | |                |        |
+ *             | |   dev 00:00.0 ----+------- bond --- io_mm X
+ *             | |________________|   \    |
+ *             |                       '----- bond ---.
+ *             |___________________________|           \
+ *              ___________________________             \
+ *             |  IOMMU domain B           |           io_mm Y
+ *             |  ________________         |           / /
+ *             | |  IOMMU group   |        |          / /
+ *             | |                |        |         / /
+ *             | |   dev 00:01.0 ------------ bond -' /
+ *             | |   dev 00:01.1 ------------ bond --'
+ *             | |________________|        |
+ *             |                           +------- io_pgtables
+ *             |___________________________|
+ *
+ * In this example, device 00:00.0 is in domain A, devices 00:01.* are in domain
+ * B. All devices within the same domain access the same address spaces. Device
+ * 00:00.0 accesses address spaces X and Y, each corresponding to an mm_struct.
+ * Devices 00:01.* only access address space Y. In addition each
+ * IOMMU_DOMAIN_DMA domain has a private address space, io_pgtable, that is
+ * managed with iommu_map()/iommu_unmap(), and isn't shared with the CPU MMU.
+ *
+ * To obtain the above configuration, users would for instance issue the
+ * following calls:
+ *
+ *     iommu_sva_bind_device(dev 00:00.0, mm X, ...) -> PASID 1
+ *     iommu_sva_bind_device(dev 00:00.0, mm Y, ...) -> PASID 2
+ *     iommu_sva_bind_device(dev 00:01.0, mm Y, ...) -> PASID 2
+ *     iommu_sva_bind_device(dev 00:01.1, mm Y, ...) -> PASID 2
+ *
+ * A single Process Address Space ID (PASID) is