RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
> From: 'j...@8bytes.org' [mailto:j...@8bytes.org] > Sent: Wednesday, December 12, 2018 5:54 PM > > Hi Kevin, > > On Wed, Dec 12, 2018 at 09:31:27AM +, Tian, Kevin wrote: > > > From: 'j...@8bytes.org' > > > Sent: Monday, December 10, 2018 4:58 PM > > > These represent whether the device together with the IOMMU support > > > them, > > > basically whether these features are usable via the IOMMU-API. > > > > "device together with IOMMU" or just "IOMMU itself"? > > No, it should mean device together with IOMMU support it. It is a way > for users of the IOMMU-API to check whether they can successfully use > the aux-specific functions. > > > however there is a problem with aux. A device may implement both > > SR-IOV and Scalable IOV capabilities, but at any time only one of them > > can be enabled. Driver will provide interfaces for end user to choose. > > In such case we cannot assume that device-side Scalable-IOV can be > > always enabled while IOMMU is in scalable mode. > > > > It works better if we position those features just representing IOMMU > > support only. In that case, aux is related only to scalable mode of IOMMU > > itself, which is orthogonal to whether device side supports SIOV. > > Yeah, but we don't make that decision in the IOMMU code. Whether the > device exposes SR-IOV or PASID based isolation is decided in PCI code > based on user input (SR-IOV is also enabled in PCI code and IOMMU just > uses the new devices that appear). > > Only if the user enabled scalable mode on the device and the IOMMU > supports it too, the feature-check function returns true for the aux > feature. > Then this is another proof for an enable/disable part in API. :-) Thanks kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi Kevin, On Wed, Dec 12, 2018 at 09:31:27AM +, Tian, Kevin wrote: > > From: 'j...@8bytes.org' > > Sent: Monday, December 10, 2018 4:58 PM > > These represent whether the device together with the IOMMU support > > them, > > basically whether these features are usable via the IOMMU-API. > > "device together with IOMMU" or just "IOMMU itself"? No, it should mean device together with IOMMU support it. It is a way for users of the IOMMU-API to check whether they can successfully use the aux-specific functions. > however there is a problem with aux. A device may implement both > SR-IOV and Scalable IOV capabilities, but at any time only one of them > can be enabled. Driver will provide interfaces for end user to choose. > In such case we cannot assume that device-side Scalable-IOV can be > always enabled while IOMMU is in scalable mode. > > It works better if we position those features just representing IOMMU > support only. In that case, aux is related only to scalable mode of IOMMU > itself, which is orthogonal to whether device side supports SIOV. Yeah, but we don't make that decision in the IOMMU code. Whether the device exposes SR-IOV or PASID based isolation is decided in PCI code based on user input (SR-IOV is also enabled in PCI code and IOMMU just uses the new devices that appear). Only if the user enabled scalable mode on the device and the IOMMU supports it too, the feature-check function returns true for the aux feature. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
> From: 'j...@8bytes.org' > Sent: Monday, December 10, 2018 4:58 PM > > Hi Kevin, > > On Mon, Dec 10, 2018 at 02:06:44AM +, Tian, Kevin wrote: > > Can I interpret above as that you agree with the aux domain concept (i.e. > one > > device can be linked to multiple domains) in general, and now we're just > trying > > to address the remaining open in API level? > > Yes, I thouht about alternatives, but in the end they are all harder to > use than the aux-domain concept. We just need to make sure that we have > a clear definition of what the API extension do and how they impact the > behavior of existing functions. sounds great! > > > > enum iommu_dev_features { > > > /* ... */ > > > IOMMU_DEV_FEAT_AUX, > > > IOMMU_DEV_FEAT_SVA, > > > /* ... */ > > > }; > > > > > > > Does above represent whether a device implements aux/sva features, > > or whether a device has been enabled by driver to support aux/sva > > features? > > These represent whether the device together with the IOMMU support > them, > basically whether these features are usable via the IOMMU-API. "device together with IOMMU" or just "IOMMU itself"? the former might be OK for sva. As Jean replied in another mail, enabling it in both IOMMU and device side just consumes some resources, while not impacting other usages on that device. however there is a problem with aux. A device may implement both SR-IOV and Scalable IOV capabilities, but at any time only one of them can be enabled. Driver will provide interfaces for end user to choose. In such case we cannot assume that device-side Scalable-IOV can be always enabled while IOMMU is in scalable mode. It works better if we position those features just representing IOMMU support only. In that case, aux is related only to scalable mode of IOMMU itself, which is orthogonal to whether device side supports SIOV. > > > > > > > /* Check if a device supports a given feature of the IOMMU-API */ > > > bool iommu_dev_has_feature(struct device *dev, enum > > > iommu_dev_features *feat); > > > > If the latter we also need iommu_dev_set_feature so driver can poke > > it based on its own configuration. > > Do you mean we need an explict enable-API for the features? I thought > about that too, but some features clearly don't need it and I didn't > want to complicate the usage. My assumption was that when the feature is > available, it is also enabled. > > > > /* So we need a iommu_aux_detach_all()? */ > > > > for what scenario? > > Maybe for detaching any PASID users from a device so that it can be > assigned to a VM as a whole. But I am not sure it is needed. > yes, possibly useful in reset path as Jean pointed out. Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
On Tue, Dec 11, 2018 at 01:35:23PM +, Jean-Philippe Brucker wrote: > > /* So we need a iommu_aux_detach_all()? */ > > This could be useful for device drivers that want to do bulk cleanup on > device removal. If they rely on Function Level Reset to disable PASID > states for example, they could also cleanup with a detach_all(). But > most will probably need to clean up individual PASID states (for example > free all mdevs) and therefore don't need detach_all() Yeah, so the more I think about it, the more dangerous it seems to have this function. It creates subtle error cases for the users of SVA and aux-domains because their mapping goes away without notice. It is certainly better to force an orderly shutdown of all users before the device can be re-assigned. > > This concept can also be easily extended for supporting SVA in parallel > > on the same device, with the same constraints regarding the behavior of > > iommu_attach_device()/iommu_detach_device(). > > So this would be without the initial step of attaching an "ext" domain > to the device in order to enable SVA/PASID? No, I don't think anymore we should introduce a special domain type to enable these features. Separate enable/disable functions per feature seem to be a better choice. > If I understood it correctly, I agree with the > iommu_attach/detach_device() behavior for SVA as well. If a device is > bound to an mm, then the device cannot be attached to another domain > with iommu_attach_device(). This prevents leaks and forces the device > driver to clean up PASID states when switching to a different driver > (e.g. vfio-pci) Right, the API should ensure we are safe on that side. Thanks, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
On Tue, Dec 11, 2018 at 06:34:23PM +, Jean-Philippe Brucker wrote: > The cost of enabling those features for a device does seem negligible. > For the SMMU we need to allocate about 70k of additional memory for the > initial PASID table, but enabling the PASID cap shouldn't add any > overhead otherwise. Same for PRI, shouldn't add any overhead. Okay, the memory requirements are a pretty good case for an enable/disable part in the API. > As for ATS, it supposedly makes things faster, although it does mean > more invalidation requests. There currently is a global pci=noats > parameter, but maybe we need something with finer granularity to > enable/disable it per device? Yeah, I don't object against this, but I think this is a matter of PCI code. The IOMMUs should just enable ATS when it is available. PCI core can fail to enable it if requested by the user. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
On 10/12/2018 08:57, 'j...@8bytes.org' wrote: > Hi Kevin, > > On Mon, Dec 10, 2018 at 02:06:44AM +, Tian, Kevin wrote: >> Can I interpret above as that you agree with the aux domain concept (i.e. one >> device can be linked to multiple domains) in general, and now we're just >> trying >> to address the remaining open in API level? > > Yes, I thouht about alternatives, but in the end they are all harder to > use than the aux-domain concept. We just need to make sure that we have > a clear definition of what the API extension do and how they impact the > behavior of existing functions. > >>> enum iommu_dev_features { >>> /* ... */ >>> IOMMU_DEV_FEAT_AUX, >>> IOMMU_DEV_FEAT_SVA, >>> /* ... */ >>> }; >>> >> >> Does above represent whether a device implements aux/sva features, >> or whether a device has been enabled by driver to support aux/sva >> features? > > These represent whether the device together with the IOMMU support them, > basically whether these features are usable via the IOMMU-API. > > >> >>> /* Check if a device supports a given feature of the IOMMU-API */ >>> bool iommu_dev_has_feature(struct device *dev, enum >>> iommu_dev_features *feat); >> >> If the latter we also need iommu_dev_set_feature so driver can poke >> it based on its own configuration. > > Do you mean we need an explict enable-API for the features? I thought > about that too, but some features clearly don't need it and I didn't > want to complicate the usage. My assumption was that when the feature is > available, it is also enabled. The cost of enabling those features for a device does seem negligible. For the SMMU we need to allocate about 70k of additional memory for the initial PASID table, but enabling the PASID cap shouldn't add any overhead otherwise. Same for PRI, shouldn't add any overhead. As for ATS, it supposedly makes things faster, although it does mean more invalidation requests. There currently is a global pci=noats parameter, but maybe we need something with finer granularity to enable/disable it per device? Thanks, Jean ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
On 07/12/2018 10:29, 'j...@8bytes.org' wrote: > Hi, > > On Mon, Nov 26, 2018 at 07:29:45AM +, Tian, Kevin wrote: >> btw Baolu just reminded me one thing which is worthy of noting here. >> 'primary' vs. 'aux' concept makes sense only when we look from a device >> p.o.v. That binding relationship is not (*should not be*) carry-and-forwarded >> cross devices. every domain must be explicitly attached to other devices >> (instead of implicitly attached being above example), and new primary/aux >> attribute on another device will be decided at attach time. > > Okay, so after all the discussions we had I learned a few more things > about the scalable mode feature and thought a bit longer about how to > best support it in the IOMMU-API. > > The concept of sub-domains I initially proposed certainly makes no > sense, but scalable-mode specific attach/detach functions do. So instead > of a sub-domain mode, I'd like to propose device-feature sets. > > The posted patch-set already includes this as device-attributes, but I > don't like this naming as we are really talking about additional > feature sets of a device. So how about we introduce this: > > enum iommu_dev_features { > /* ... */ > IOMMU_DEV_FEAT_AUX, > IOMMU_DEV_FEAT_SVA, > /* ... */ > }; > > /* Check if a device supports a given feature of the IOMMU-API */ > bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features > *feat); > > /* >* Only works if iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX) >* returns true >* >* Also, as long as domains are attached to a device through >* this interface, any trys to call iommu_attach_device() should >* fail (iommu_detach_device() can't fail, so we fail on the >* tryint to re-attach). This should make us safe against a >* device being attached to a guest as a whole while there are >* still pasid users on it (aux and sva). >*/ > int iommu_aux_attach_device(struct iommu_domain *domain, > struct device *dev); > > int iommu_aux_detach_device(struct iommu_domain *domain, > struct device *dev); > /* >* I know we are targeting a system-wide pasid-space, so that >* the pasid would be the same for one domain on all devices, >* let's just keep the option open to have different >* pasid-spaces in one system. Also this way we can use it to >* check whether the domain is attached to this device at all. >* >* returns pasid or <0 if domain has no pasid on that device. >*/ > int iommu_aux_get_pasid(struct iommu_domain *domain, > struct device *dev);> > /* So we need a iommu_aux_detach_all()? */ This could be useful for device drivers that want to do bulk cleanup on device removal. If they rely on Function Level Reset to disable PASID states for example, they could also cleanup with a detach_all(). But most will probably need to clean up individual PASID states (for example free all mdevs) and therefore don't need detach_all() > This concept can also be easily extended for supporting SVA in parallel > on the same device, with the same constraints regarding the behavior of > iommu_attach_device()/iommu_detach_device(). So this would be without the initial step of attaching an "ext" domain to the device in order to enable SVA/PASID? If I understood it correctly, I agree with the iommu_attach/detach_device() behavior for SVA as well. If a device is bound to an mm, then the device cannot be attached to another domain with iommu_attach_device(). This prevents leaks and forces the device driver to clean up PASID states when switching to a different driver (e.g. vfio-pci) Thanks, Jean ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi Lu Baolu, On Mon, Dec 10, 2018 at 10:57:22AM +0800, Lu Baolu wrote: > > /* Check if a device supports a given feature of the IOMMU-API */ > > bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features > > *feat); > > Here we pass in a pointer of "enum iommu_dev_features", do we want > to return anything here? Yeah, this is a mistake on my side, The enum parameter should be passed by-value, so not pointer there. It doesn't need to return anything in this parameter. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi Kevin, On Mon, Dec 10, 2018 at 02:06:44AM +, Tian, Kevin wrote: > Can I interpret above as that you agree with the aux domain concept (i.e. one > device can be linked to multiple domains) in general, and now we're just > trying > to address the remaining open in API level? Yes, I thouht about alternatives, but in the end they are all harder to use than the aux-domain concept. We just need to make sure that we have a clear definition of what the API extension do and how they impact the behavior of existing functions. > > enum iommu_dev_features { > > /* ... */ > > IOMMU_DEV_FEAT_AUX, > > IOMMU_DEV_FEAT_SVA, > > /* ... */ > > }; > > > > Does above represent whether a device implements aux/sva features, > or whether a device has been enabled by driver to support aux/sva > features? These represent whether the device together with the IOMMU support them, basically whether these features are usable via the IOMMU-API. > > > /* Check if a device supports a given feature of the IOMMU-API */ > > bool iommu_dev_has_feature(struct device *dev, enum > > iommu_dev_features *feat); > > If the latter we also need iommu_dev_set_feature so driver can poke > it based on its own configuration. Do you mean we need an explict enable-API for the features? I thought about that too, but some features clearly don't need it and I didn't want to complicate the usage. My assumption was that when the feature is available, it is also enabled. > > /* So we need a iommu_aux_detach_all()? */ > > for what scenario? Maybe for detaching any PASID users from a device so that it can be assigned to a VM as a whole. But I am not sure it is needed. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi Joerg, On 12/7/18 6:29 PM, 'j...@8bytes.org' wrote: Hi, On Mon, Nov 26, 2018 at 07:29:45AM +, Tian, Kevin wrote: btw Baolu just reminded me one thing which is worthy of noting here. 'primary' vs. 'aux' concept makes sense only when we look from a device p.o.v. That binding relationship is not (*should not be*) carry-and-forwarded cross devices. every domain must be explicitly attached to other devices (instead of implicitly attached being above example), and new primary/aux attribute on another device will be decided at attach time. Okay, so after all the discussions we had I learned a few more things about the scalable mode feature and thought a bit longer about how to best support it in the IOMMU-API. Thanks for thinking about this. The concept of sub-domains I initially proposed certainly makes no sense, but scalable-mode specific attach/detach functions do. So instead of a sub-domain mode, I'd like to propose device-feature sets. The posted patch-set already includes this as device-attributes, but I don't like this naming as we are really talking about additional feature sets of a device. So how about we introduce this: enum iommu_dev_features { /* ... */ IOMMU_DEV_FEAT_AUX, IOMMU_DEV_FEAT_SVA, /* ... */ }; /* Check if a device supports a given feature of the IOMMU-API */ bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features *feat); Here we pass in a pointer of "enum iommu_dev_features", do we want to return anything here? Best regards, Lu Baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
> From: 'j...@8bytes.org' [mailto:j...@8bytes.org] > Sent: Friday, December 7, 2018 6:29 PM > > Hi, > > On Mon, Nov 26, 2018 at 07:29:45AM +, Tian, Kevin wrote: > > btw Baolu just reminded me one thing which is worthy of noting here. > > 'primary' vs. 'aux' concept makes sense only when we look from a device > > p.o.v. That binding relationship is not (*should not be*) carry-and- > forwarded > > cross devices. every domain must be explicitly attached to other devices > > (instead of implicitly attached being above example), and new > primary/aux > > attribute on another device will be decided at attach time. > > Okay, so after all the discussions we had I learned a few more things > about the scalable mode feature and thought a bit longer about how to > best support it in the IOMMU-API. Thanks for thinking through this. > > The concept of sub-domains I initially proposed certainly makes no > sense, but scalable-mode specific attach/detach functions do. So instead > of a sub-domain mode, I'd like to propose device-feature sets. Can I interpret above as that you agree with the aux domain concept (i.e. one device can be linked to multiple domains) in general, and now we're just trying to address the remaining open in API level? > > The posted patch-set already includes this as device-attributes, but I > don't like this naming as we are really talking about additional > feature sets of a device. So how about we introduce this: > > enum iommu_dev_features { > /* ... */ > IOMMU_DEV_FEAT_AUX, > IOMMU_DEV_FEAT_SVA, > /* ... */ > }; > Does above represent whether a device implements aux/sva features, or whether a device has been enabled by driver to support aux/sva features? > /* Check if a device supports a given feature of the IOMMU-API */ > bool iommu_dev_has_feature(struct device *dev, enum > iommu_dev_features *feat); If the latter we also need iommu_dev_set_feature so driver can poke it based on its own configuration. > > /* >* Only works if iommu_dev_has_feature(dev, > IOMMU_DEV_FEAT_AUX) >* returns true >* >* Also, as long as domains are attached to a device through >* this interface, any trys to call iommu_attach_device() should >* fail (iommu_detach_device() can't fail, so we fail on the >* tryint to re-attach). This should make us safe against a >* device being attached to a guest as a whole while there are >* still pasid users on it (aux and sva). yes, it makes sense. >*/ > int iommu_aux_attach_device(struct iommu_domain *domain, > struct device *dev); > > int iommu_aux_detach_device(struct iommu_domain *domain, > struct device *dev); > /* >* I know we are targeting a system-wide pasid-space, so that >* the pasid would be the same for one domain on all devices, >* let's just keep the option open to have different >* pasid-spaces in one system. Also this way we can use it to >* check whether the domain is attached to this device at all. >* >* returns pasid or <0 if domain has no pasid on that device. >*/ > int iommu_aux_get_pasid(struct iommu_domain *domain, > struct device *dev); > > /* So we need a iommu_aux_detach_all()? */ for what scenario? > > This concept can also be easily extended for supporting SVA in parallel > on the same device, with the same constraints regarding the behavior of > iommu_attach_device()/iommu_detach_device(). > > So what do you think about that approach? > > Regards, > > Joerg > > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi, On Mon, Nov 26, 2018 at 07:29:45AM +, Tian, Kevin wrote: > btw Baolu just reminded me one thing which is worthy of noting here. > 'primary' vs. 'aux' concept makes sense only when we look from a device > p.o.v. That binding relationship is not (*should not be*) carry-and-forwarded > cross devices. every domain must be explicitly attached to other devices > (instead of implicitly attached being above example), and new primary/aux > attribute on another device will be decided at attach time. Okay, so after all the discussions we had I learned a few more things about the scalable mode feature and thought a bit longer about how to best support it in the IOMMU-API. The concept of sub-domains I initially proposed certainly makes no sense, but scalable-mode specific attach/detach functions do. So instead of a sub-domain mode, I'd like to propose device-feature sets. The posted patch-set already includes this as device-attributes, but I don't like this naming as we are really talking about additional feature sets of a device. So how about we introduce this: enum iommu_dev_features { /* ... */ IOMMU_DEV_FEAT_AUX, IOMMU_DEV_FEAT_SVA, /* ... */ }; /* Check if a device supports a given feature of the IOMMU-API */ bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features *feat); /* * Only works if iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX) * returns true * * Also, as long as domains are attached to a device through * this interface, any trys to call iommu_attach_device() should * fail (iommu_detach_device() can't fail, so we fail on the * tryint to re-attach). This should make us safe against a * device being attached to a guest as a whole while there are * still pasid users on it (aux and sva). */ int iommu_aux_attach_device(struct iommu_domain *domain, struct device *dev); int iommu_aux_detach_device(struct iommu_domain *domain, struct device *dev); /* * I know we are targeting a system-wide pasid-space, so that * the pasid would be the same for one domain on all devices, * let's just keep the option open to have different * pasid-spaces in one system. Also this way we can use it to * check whether the domain is attached to this device at all. * * returns pasid or <0 if domain has no pasid on that device. */ int iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev); /* So we need a iommu_aux_detach_all()? */ This concept can also be easily extended for supporting SVA in parallel on the same device, with the same constraints regarding the behavior of iommu_attach_device()/iommu_detach_device(). So what do you think about that approach? Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
> From: Tian, Kevin > Sent: Monday, November 26, 2018 11:01 AM [...] > > aux-domain is just a normal domain (struct iommu_domain), what > > happens > > when a domain that was used as a primary domain and has bound > > aux-domains to it, is bound with iommu_aux_domain_attach() to another > > domain? > > aux domain is essentially a way to lend partial DMA capability to less- > privileged entity (process or VM), which implies that part of DMA > unmanaged by kernel driver. If we can make such assumption that aux > only applies to UNMANAGED domain (thus disallow BLOCKED/IDENTITY/ > DMA from using aux), then above open doesn't exist, because if primary > domain is UNMANAGED type it essentially means the whole device fully > owned by user space or guest thus no host driver entity to ever create > aux domain. If primary domain is not UNMANAGED type, it will fail in > aux API. with this design, each device will have at most one primary > and multiple aux, i.e. just two layers. no aux-bind-to-aux thing. > > this assumption should be true in discussed usages around aux domain. > but I may overlook something... Jean? > btw Baolu just reminded me one thing which is worthy of noting here. 'primary' vs. 'aux' concept makes sense only when we look from a device p.o.v. That binding relationship is not (*should not be*) carry-and-forwarded cross devices. every domain must be explicitly attached to other devices (instead of implicitly attached being above example), and new primary/aux attribute on another device will be decided at attach time. Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
> From: j...@8bytes.org [mailto:j...@8bytes.org] > Sent: Friday, November 23, 2018 7:21 PM > > On Wed, Nov 21, 2018 at 12:40:44PM +0800, Lu Baolu wrote: > > Can you please elaborate a bit more about the concept of subdomains? > > From my point of view, an aux-domain is a normal un-managed domain > which > > has a PASID and could be attached to any ADIs through the aux-domain > > specific attach/detach APIs. It could also be attached to a device with > > the existing domain_attach/detach_device() APIs at the same time, hence > > mdev and pci devices in a vfio container could share a domain. > > Okay, let's think a bit about having aux-specific attach/detach > functions, in the end I don't insist on my proposal as long as the > IOMMU-API extensions are clean, consistent, and have well defined > semantics. > > If we have aux-domain specific attach/detach functions like > iommu_aux_domain_attach/detach(), what happens when the primary > domain > of the device is changed with iommu_attach/detach()? > > 1) Will the aux-domains stay in place? If yes, how does this > work in setups where the pasid-bound page-tables are > guest-owned and translated through the primary domain > page-tables? > > 2) Will the aux-domains be unbound too? In that case, if the > primary domain is re-used, will the aux-domains be implicitly > bound too when iommu_device_attach() is called? > > 3) Do we just disallow changing the primary domain through that > interface as long as there are aux-domains or mm_structs > bound to the device? 3) sounds like a right option. IMO Primary domain represents the primary ownership of DMA capability of the whole device. The owner (say host driver) may lend partial capability (e.g. allocating some queues) to a sub-owner (through aux-domain to a VM or mm_struct to a process), when ownership is still claimed by this owner. Primary domain switch means ownership change (e.g. from host driver to guest driver if type is changed from DMA-API to UNMANAGED), which usually implies driver load/unload thus all resource usages from previous owner must be cleaned up before actual switch happens, which include things allocated from DMA-API, aux domain API, and SVA APIs. > > Using option 2) or 3) would mean that the aux-domains are still bound to > the primary domain, but that is not reflected in the API. Further, if an Do you suggest some reflection in API interface definition, or specific API implementation? If the former I didn't get how to do it. If the latter, yes it's missed in current RFC. We should add check in necessary code path. > aux-domain is just a normal domain (struct iommu_domain), what > happens > when a domain that was used as a primary domain and has bound > aux-domains to it, is bound with iommu_aux_domain_attach() to another > domain? aux domain is essentially a way to lend partial DMA capability to less- privileged entity (process or VM), which implies that part of DMA unmanaged by kernel driver. If we can make such assumption that aux only applies to UNMANAGED domain (thus disallow BLOCKED/IDENTITY/ DMA from using aux), then above open doesn't exist, because if primary domain is UNMANAGED type it essentially means the whole device fully owned by user space or guest thus no host driver entity to ever create aux domain. If primary domain is not UNMANAGED type, it will fail in aux API. with this design, each device will have at most one primary and multiple aux, i.e. just two layers. no aux-bind-to-aux thing. this assumption should be true in discussed usages around aux domain. but I may overlook something... Jean? > > As you can see, this design decission raises a lot of questions, but > maybe we can work it out and find a solution we all agree on. > Thanks for raising those good opens! Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi Joerg, On 11/23/18 7:21 PM, j...@8bytes.org wrote: On Wed, Nov 21, 2018 at 12:40:44PM +0800, Lu Baolu wrote: Can you please elaborate a bit more about the concept of subdomains? From my point of view, an aux-domain is a normal un-managed domain which has a PASID and could be attached to any ADIs through the aux-domain specific attach/detach APIs. It could also be attached to a device with the existing domain_attach/detach_device() APIs at the same time, hence mdev and pci devices in a vfio container could share a domain. Okay, let's think a bit about having aux-specific attach/detach functions, in the end I don't insist on my proposal as long as the IOMMU-API extensions are clean, consistent, and have well defined semantics. If we have aux-domain specific attach/detach functions like iommu_aux_domain_attach/detach(), what happens when the primary domain of the device is changed with iommu_attach/detach()? 1) Will the aux-domains stay in place? If yes, how does this work in setups where the pasid-bound page-tables are guest-owned and translated through the primary domain page-tables? Are you thinking about guest SVA? In this case, guest SVA will be translated through the aux domain page-tables. So, we can safely change the primary domain with aux-domains staying there. Conceptually, the primary domain means the existing domain which is for all Request ID based (a.k.a. transfers without PASID) translations. Any transfer with PASID will not go through the primary domain. 2) Will the aux-domains be unbound too? In that case, if the primary domain is re-used, will the aux-domains be implicitly bound too when iommu_device_attach() is called? 3) Do we just disallow changing the primary domain through that interface as long as there are aux-domains or mm_structs bound to the device? Using option 2) or 3) would mean that the aux-domains are still bound to the primary domain, but that is not reflected in the API. Further, if an aux-domain is just a normal domain (struct iommu_domain), what happens when a domain that was used as a primary domain and has bound aux-domains to it, is bound with iommu_aux_domain_attach() to another domain? As you can see, this design decission raises a lot of questions, but maybe we can work it out and find a solution we all agree on. Regards, Joerg Best regards, Lu Baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi Jean-Philippe, On Wed, Nov 21, 2018 at 07:05:13PM +, Jean-Philippe Brucker wrote: > For the moment though, I think we should allow device drivers to use the > DMA-API at the same time as SVA. Yeah, that makes sense. > If a device driver has to map a management ring buffer for example, > it's much nicer to use dma_alloc as usual rather than have them write > their own DMA allocation routines. Which is another reason to > implement 2) above: the DMA-API would still act on the default_domain, > and attaching an "extended" domain augments it with SVA features. > Detaching from the device doesn't require copying mappings back to the > default domain. Does that sound OK? Yes, as I just wrote to Kevin, this sounds good. > struct io_mm *io_mm; > struct iommu_domain *domain; > > domain = iommu_alloc_ext_domain(bus); > > /* Set an mm-exit callback to disable PASIDs. More attributes > could be added later to change the capabilities of the ext > domain */ > iommu_domain_set_attr(domain, DOMAIN_ATTR_MM_EXIT_CB, &mm_exit); > > /* Fails if the device doesn't support this domain type */ > iommu_attach_device(domain, dev); > > /* Same as SVA v3, except a domain instead of dev as argument */ > io_mm = iommu_sva_bind_mm(domain, current->mm, ...); > > /* on success, install the PASID in the device */ > pasid = io_mm->pasid; > > /* Do more work */ > > iommu_sva_unbind_mm(io_mm); > iommu_detach_device(domain, dev); Okay, we can still discuss the naming and a few details, but that goes into the right directions. One open questions is, for example, where the pasid-allocator comes into play. As it looks the amdgpu driver needs it even without an iommu enabled. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi Kevin, On Thu, Nov 22, 2018 at 08:39:19AM +, Tian, Kevin wrote: > I agree special action needs to be taken for everything else (other than > DMA-API), but the point that I didn't get is why the action must be based > a new SVA-type domain, instead of extending default domain with SVA > capability (including related paging structures which are accessed through > new SVA APIs). In the latter case domain-wise attribute (e.g. IRQ mapping) > is naturally shared between capabilities (DMA-API and SVA). There is no > need to create cross-domain connections as two options that you listed > below. > > Can you help elaborate more about the motivation behind proposal? Yeah, thinking more about it, there are no real reasons against allowing aux and sva bindings on the default domain. It allows the host to use a device through the DMA-API and assign parts of it to a guest or a user-space process, for example. > |-default domain (DMA-API) > |-sva domain1 (SVA) > |-sva domain2 (SVA) > |-... > |-sva domainN (AUX, guest DMA-API) > | |- sva domainN1 (AUX, guest SVA) > | |- sva domainN2 (AUX, guest SVA) > | |-... > |-sva domainM (AUX, guest DMA-API) > | |- sva domainM1 (AUX, guest SVA) > | |- sva domainM2 (AUX, guest SVA) > | |-... In my proposal the sva-domains are no sub-domains of the default-domain, they exist on the same level. A device is detached from its default domain and attached to an sva-domain. > means same domain can be default on deviceA while being aux on deviceB > (when assigning pci and mdev to same VM). As already written in another mail, this raises a couple of questions, like what happens when the aux-domain itself has other aux-domains bound to it? Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
On Wed, Nov 21, 2018 at 12:40:44PM +0800, Lu Baolu wrote: > Can you please elaborate a bit more about the concept of subdomains? > From my point of view, an aux-domain is a normal un-managed domain which > has a PASID and could be attached to any ADIs through the aux-domain > specific attach/detach APIs. It could also be attached to a device with > the existing domain_attach/detach_device() APIs at the same time, hence > mdev and pci devices in a vfio container could share a domain. Okay, let's think a bit about having aux-specific attach/detach functions, in the end I don't insist on my proposal as long as the IOMMU-API extensions are clean, consistent, and have well defined semantics. If we have aux-domain specific attach/detach functions like iommu_aux_domain_attach/detach(), what happens when the primary domain of the device is changed with iommu_attach/detach()? 1) Will the aux-domains stay in place? If yes, how does this work in setups where the pasid-bound page-tables are guest-owned and translated through the primary domain page-tables? 2) Will the aux-domains be unbound too? In that case, if the primary domain is re-used, will the aux-domains be implicitly bound too when iommu_device_attach() is called? 3) Do we just disallow changing the primary domain through that interface as long as there are aux-domains or mm_structs bound to the device? Using option 2) or 3) would mean that the aux-domains are still bound to the primary domain, but that is not reflected in the API. Further, if an aux-domain is just a normal domain (struct iommu_domain), what happens when a domain that was used as a primary domain and has bound aux-domains to it, is bound with iommu_aux_domain_attach() to another domain? As you can see, this design decission raises a lot of questions, but maybe we can work it out and find a solution we all agree on. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
> From: j...@8bytes.org [mailto:j...@8bytes.org] > Sent: Monday, November 12, 2018 10:56 PM > > Hi Jean-Philippe, > > On Thu, Nov 08, 2018 at 06:29:42PM +, Jean-Philippe Brucker wrote: > > (1) My initial approach would have been to use the same page tables for > > the default_domain and this new domain, but it might be precisely what > > you were trying to avoid with this new proposal: a device attached to > > two domains at the same time. > > My request was about the initial assumptions a device driver can make > about a device. This assumptions is that DMA-API is set up and > initialized for it, for everything else (like SVA) the device driver > needs to take special action, like allocating an SVA domain and attaching > the device to it. Hi, Joerg, I agree special action needs to be taken for everything else (other than DMA-API), but the point that I didn't get is why the action must be based a new SVA-type domain, instead of extending default domain with SVA capability (including related paging structures which are accessed through new SVA APIs). In the latter case domain-wise attribute (e.g. IRQ mapping) is naturally shared between capabilities (DMA-API and SVA). There is no need to create cross-domain connections as two options that you listed below. Can you help elaborate more about the motivation behind proposal? P.S. as you may see from other threads, we support same IOMMU capabilities (both IOVA and SVA) on vfio-mdev (i.e. aux domain) as ones on vfio-pci. which leads to below situation following your proposal, if 'AUX' is treated as a separate capability as SVA: |-default domain (DMA-API) |-sva domain1 (SVA) |-sva domain2 (SVA) |-... |-sva domainN (AUX, guest DMA-API) | |- sva domainN1 (AUX, guest SVA) | |- sva domainN2 (AUX, guest SVA) | |-... |-sva domainM (AUX, guest DMA-API) | |- sva domainM1 (AUX, guest SVA) | |- sva domainM2 (AUX, guest SVA) | |-... Is that what is in your mind? would it cause unnecessary complexities on handling those layers? The hierarchy could be simplified if we allow aux domain to carry both DMA-API and SVA capability: |-default domain (DMA-API) |-sva domain1 (SVA) |-sva domain2 (SVA) |-... |-sva domainN (AUX, guest DMA-API, guest SVA) |-sva domainM (AUX, guest DMA-API, guest SVA) However doing so cause inconsistency on domain definition. why cannot default domain have full capability but aux domain can? Further moving along that road, we could have below (as proposed in this series): |-default domain (DMA-API, SVA) |-sva domainN (AUX, guest DMA-API, guest SVA) |-sva domainM (AUX, guest DMA-API, guest SVA) (whether putting aux domains as sub-domain or same level is not big deal) There each domain could support full DMA translation capabilities (IOVA, SVA, GPA, GIOVA, etc) with supporting structures together (1st level, 2nd level, irq mapping, etc.), and new APIs are invented to utilized new capabilities beyond existing DMA-API in each domain. 'AUX' here becomes a orthogonal way to those DMA translation capabilities, and is chosen at domain attach time based on device configuration. means same domain can be default on deviceA while being aux on deviceB (when assigning pci and mdev to same VM). Thoughts? :-) > > This should of course not break any IRQ setups for the device, and also > enforcing an ordering is not a good and maintainable solution. > > So what you could do here is to either: > > 1) Save the needed IRQ mappings in some extra datastructure and > duplicate it in the SVA domain. This also makes it easier to > use the same SVA domain with multiple devices. > > 2) Just re-use the 'page-tables' from the default domain in the > sva-domain. This needs to happen at attach-time, because at > allocation time you don't know the device yet. > > I think 1) is the best solution, what do you think? > > Btw, things would be different if we could expose SVA through the > DMA-API to drivers. In this case we could just make the default domain > of type SVA and be done, but we are not there yet. > > Regards, Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
On 12/11/2018 14:55, j...@8bytes.org wrote: > Hi Jean-Philippe, > > On Thu, Nov 08, 2018 at 06:29:42PM +, Jean-Philippe Brucker wrote: >> (1) My initial approach would have been to use the same page tables for >> the default_domain and this new domain, but it might be precisely what >> you were trying to avoid with this new proposal: a device attached to >> two domains at the same time. > > My request was about the initial assumptions a device driver can make > about a device. This assumptions is that DMA-API is set up and > initialized for it, for everything else (like SVA) the device driver > needs to take special action, like allocating an SVA domain and attaching > the device to it. > > This should of course not break any IRQ setups for the device, and also > enforcing an ordering is not a good and maintainable solution. > > So what you could do here is to either: > > 1) Save the needed IRQ mappings in some extra datastructure and > duplicate it in the SVA domain. This also makes it easier to > use the same SVA domain with multiple devices. > > 2) Just re-use the 'page-tables' from the default domain in the > sva-domain. This needs to happen at attach-time, because at > allocation time you don't know the device yet. > > I think 1) is the best solution, what do you think? Right now 2) seems more feasible, because changes will be limited to the IOMMU driver and don't spill in the dma-iommu layer. Maybe it will be clearer to me once I write a prototype. > Btw, things would be different if we could expose SVA through the > DMA-API to drivers. In this case we could just make the default domain > of type SVA and be done, but we are not there yet. For the moment though, I think we should allow device drivers to use the DMA-API at the same time as SVA. If a device driver has to map a management ring buffer for example, it's much nicer to use dma_alloc as usual rather than have them write their own DMA allocation routines. Which is another reason to implement 2) above: the DMA-API would still act on the default_domain, and attaching an "extended" domain augments it with SVA features. Detaching from the device doesn't require copying mappings back to the default domain. Does that sound OK? Then if vfio-pci wants to use SVA, maybe we can make the UNMANAGED domain support SVA by default? Otherwise introduce a UNMANAGED+ext domain. Anyway, that's for later. This is what I currently plan to add: struct io_mm *io_mm; struct iommu_domain *domain; domain = iommu_alloc_ext_domain(bus); /* Set an mm-exit callback to disable PASIDs. More attributes could be added later to change the capabilities of the ext domain */ iommu_domain_set_attr(domain, DOMAIN_ATTR_MM_EXIT_CB, &mm_exit); /* Fails if the device doesn't support this domain type */ iommu_attach_device(domain, dev); /* Same as SVA v3, except a domain instead of dev as argument */ io_mm = iommu_sva_bind_mm(domain, current->mm, ...); /* on success, install the PASID in the device */ pasid = io_mm->pasid; /* Do more work */ iommu_sva_unbind_mm(io_mm); iommu_detach_device(domain, dev); Thanks, Jean ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi Joerg, On 11/8/18 12:43 AM, j...@8bytes.org wrote: Hi, On Wed, Nov 07, 2018 at 11:40:30AM +0800, Lu Baolu wrote: Hi Joerg, On 11/7/18 12:25 AM, j...@8bytes.org wrote: Nowadays, we can find PASID granular DMA translation on both ARM and x86 platforms. With PASID granular DMA translation supported in system iommu, if a device divides itself into multiple subsets and tag the DMA transfers of each subset with a unique PASID, each subset become assignable. We call the assignable subset as an ADI (Assignable Device Interface). As the result, each ADI could be attached with a domain. Yeah, I know the background. The point is, the IOMMU-API as of today implements a strict one-to-one relationship between domains and devices, every device can only have one domain assigned at a given time. If we assign a new domain to a device, the old gets unassigned. If we allow to attach multiple domains to a single device we fundamentally break that semantic. Yes. In the latest v4 submission, we use the aux-domain specific APIs to attach/detach the domain to a device. Do you see other APIs that will possibly break this semantic? Therefore I think it is better is support the ADI devices with subdomains and a new set of functions in the API to handle only these sub-domains. Can you please elaborate a bit more about the concept of subdomains? From my point of view, an aux-domain is a normal un-managed domain which has a PASID and could be attached to any ADIs through the aux-domain specific attach/detach APIs. It could also be attached to a device with the existing domain_attach/detach_device() APIs at the same time, hence mdev and pci devices in a vfio container could share a domain. Best regards, Lu Baolu Further more, a single domain might be attached to an ADI of device A, while attached to another legacy device B which doesn't support PASID features. In this case, we say "Domain 4" is attached to ADI(PASID#x) in aux mode and attached to device B in primary mode. This will of course not work with subdomains, but is that really necessary? VFIO already allocates a separate domain for each device anyway (iirc), so it is unlikely that is uses the same domain for a legacy and an ADI device. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi Jean-Philippe, On Thu, Nov 08, 2018 at 06:29:42PM +, Jean-Philippe Brucker wrote: > (1) My initial approach would have been to use the same page tables for > the default_domain and this new domain, but it might be precisely what > you were trying to avoid with this new proposal: a device attached to > two domains at the same time. My request was about the initial assumptions a device driver can make about a device. This assumptions is that DMA-API is set up and initialized for it, for everything else (like SVA) the device driver needs to take special action, like allocating an SVA domain and attaching the device to it. This should of course not break any IRQ setups for the device, and also enforcing an ordering is not a good and maintainable solution. So what you could do here is to either: 1) Save the needed IRQ mappings in some extra datastructure and duplicate it in the SVA domain. This also makes it easier to use the same SVA domain with multiple devices. 2) Just re-use the 'page-tables' from the default domain in the sva-domain. This needs to happen at attach-time, because at allocation time you don't know the device yet. I think 1) is the best solution, what do you think? Btw, things would be different if we could expose SVA through the DMA-API to drivers. In this case we could just make the default domain of type SVA and be done, but we are not there yet. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi, On 06/11/2018 16:25, j...@8bytes.org wrote: > On Mon, Oct 22, 2018 at 12:50:56PM +0100, Robin Murphy wrote: >> To me, that sounds like a very good argument for having separate "attach as >> primary domain" and "attach as aux domain" APIs. > > I agree with that, overloading iommu_attach_device() to support > aux-domains is not going to be a maintainable solution. I'd like this > to be a two-level approach, where the aux-domains are sub-domains of a > primary domain (naming is still to-be-improved): > > > struct iommu_domain *domain; > > domain = iommu_alloc_aux_domain(); > > iommu_attach_device(pdev, domain); /* Fails if device has not > support for this > domain-type */ I'm still not sure how this should interact with MSI on Arm. On x86 the MSI doorbell is a reserved IOVA region in the IOMMU, but for Arm it's a MMIO register in a separate IRQ chip, upstream of the SMMU. From the SMMU point of view, MSI is a normal DMA without PASID, and the corresponding IOMMU mapping must be present. SMMU mappings for the MSI doorbell are created by the IRQ driver, when the device driver requests an IRQ. Roughly speaking it goes like: request_irq() irq_chip_compose_msi_msg() iommu_dma_map_msi_msg() iommu_map(iommu_get_domain_for_dev(dev), ...) Currently for "normal" device drivers, this will create the MSI mapping on the default_domain (which is created before the drivers' probe()). Device drivers that attach an unmanaged domain *and* use MSIs - only VFIO so far - check if the MSI needs to be mapped, and then call iommu_get_msi_cookie() before requesting IRQs. With my previous SVA proposals the default_domain was used for SVA, so we didn't need any change. But now, depending on the order between the IRQ request and this attach(), iommu_get_domain_for_dev() will return either default_domain or the new domain. (1) My initial approach would have been to use the same page tables for the default_domain and this new domain, but it might be precisely what you were trying to avoid with this new proposal: a device attached to two domains at the same time. (2) Another solution is to enforce an order between request_irq() and iommu_attach_device() for all device drivers that intend to use SVA, and have them do the MSI mapping themselves between attaching the domain and requesting IRQs. I don't really like either but to make things easier for device drivers, I would go with option (1). Thanks, Jean > > /* Bind a process address space to the aux-domain */ > sva_handle = iommu_sva_bind_mm(domain, current, ...); > pasid = iommu_sva_get_pasid(sva_handle); > > mdev_handle = iommu_mdev_alloc_domain(domain); > iommu_mdev_map(mdev_handle, ...); > iommu_mdev_unmap(mdev_handle, ...); > > /* Do more work */ > > iommu_sva_unbind_mm(sva_handle); > > iommu_mdev_free_domain(mdev_handle); > > iommu_detach_device(domain, pdev); > > > Regards, > > Joerg > ___ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
On Wed, 7 Nov 2018 17:43:23 +0100 "j...@8bytes.org" wrote: > Hi, > > On Wed, Nov 07, 2018 at 11:40:30AM +0800, Lu Baolu wrote: > > Hi Joerg, > > > > On 11/7/18 12:25 AM, j...@8bytes.org wrote: > > Nowadays, we can find PASID granular DMA translation on both ARM and x86 > > platforms. With PASID granular DMA translation supported in system iommu, if > > a device divides itself into multiple subsets and tag the DMA > > transfers of each subset with a unique PASID, each subset become > > assignable. We call the assignable subset as an ADI (Assignable Device > > Interface). As the result, each ADI could be attached with a domain. > > Yeah, I know the background. The point is, the IOMMU-API as of today > implements a strict one-to-one relationship between domains and devices, > every device can only have one domain assigned at a given time. If we > assign a new domain to a device, the old gets unassigned. > > If we allow to attach multiple domains to a single device we > fundamentally break that semantic. > > Therefore I think it is better is support the ADI devices with > subdomains and a new set of functions in the API to handle only these > sub-domains. > > > Further more, a single domain might be attached to an ADI of device A, > > while attached to another legacy device B which doesn't support PASID > > features. In this case, we say "Domain 4" is attached to ADI(PASID#x) in > > aux mode and attached to device B in primary mode. > > This will of course not work with subdomains, but is that really > necessary? VFIO already allocates a separate domain for each device > anyway (iirc), so it is unlikely that is uses the same domain for a > legacy and an ADI device. VFIO will attempt to use the same domain for all groups within a container, only falling back to separate domains if there's an incompatibility. Multiple domains means that we need to mirror mapping and unmapping across each domain, so there's more work to do and theoretically more overhead in the IOMMU implementation assuming those domains land on the same IOMMU driver. Thanks, Alex ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi, On Wed, Nov 07, 2018 at 11:40:30AM +0800, Lu Baolu wrote: > Hi Joerg, > > On 11/7/18 12:25 AM, j...@8bytes.org wrote: > Nowadays, we can find PASID granular DMA translation on both ARM and x86 > platforms. With PASID granular DMA translation supported in system iommu, if > a device divides itself into multiple subsets and tag the DMA > transfers of each subset with a unique PASID, each subset become > assignable. We call the assignable subset as an ADI (Assignable Device > Interface). As the result, each ADI could be attached with a domain. Yeah, I know the background. The point is, the IOMMU-API as of today implements a strict one-to-one relationship between domains and devices, every device can only have one domain assigned at a given time. If we assign a new domain to a device, the old gets unassigned. If we allow to attach multiple domains to a single device we fundamentally break that semantic. Therefore I think it is better is support the ADI devices with subdomains and a new set of functions in the API to handle only these sub-domains. > Further more, a single domain might be attached to an ADI of device A, > while attached to another legacy device B which doesn't support PASID > features. In this case, we say "Domain 4" is attached to ADI(PASID#x) in > aux mode and attached to device B in primary mode. This will of course not work with subdomains, but is that really necessary? VFIO already allocates a separate domain for each device anyway (iirc), so it is unlikely that is uses the same domain for a legacy and an ADI device. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi Joerg, On 11/7/18 12:25 AM, j...@8bytes.org wrote: On Mon, Oct 22, 2018 at 12:50:56PM +0100, Robin Murphy wrote: To me, that sounds like a very good argument for having separate "attach as primary domain" and "attach as aux domain" APIs. I agree with that, overloading iommu_attach_device() to support aux-domains is not going to be a maintainable solution. I'd like this to be a two-level approach, where the aux-domains are sub-domains of a primary domain (naming is still to-be-improved): struct iommu_domain *domain; domain = iommu_alloc_aux_domain(); iommu_attach_device(pdev, domain); /* Fails if device has not support for this domain-type */ /* Bind a process address space to the aux-domain */ sva_handle = iommu_sva_bind_mm(domain, current, ...); pasid = iommu_sva_get_pasid(sva_handle); mdev_handle = iommu_mdev_alloc_domain(domain); iommu_mdev_map(mdev_handle, ...); iommu_mdev_unmap(mdev_handle, ...); /* Do more work */ iommu_sva_unbind_mm(sva_handle); iommu_mdev_free_domain(mdev_handle); iommu_detach_device(domain, pdev); Aux-domain is not used for sva case. Please allow me to introduce some backgrounds of the aux-domain. Previously we have RID (Request ID) granular DMA translation, hence only a single domain might be attached to a device. The same domain could be attached to multiple devices if they are configured to be assigned to a same application (or VM). .--..--. | PCI device A |.--.| PCI device B | | |<---| Domain |--->| | | |'--'| | | || | | || | | || | | || | '--''--' Nowadays, we can find PASID granular DMA translation on both ARM and x86 platforms. With PASID granular DMA translation supported in system iommu, if a device divides itself into multiple subsets and tag the DMA transfers of each subset with a unique PASID, each subset become assignable. We call the assignable subset as an ADI (Assignable Device Interface). As the result, each ADI could be attached with a domain. So for a device described above, a primary domain will be attached to it for DMA transfers without PASID, and an aux-domain might be attached to each ADI for transfers tagged with a PASID#xyz. .--. .--. | PCI device A |<| Primary Domain | | | '--' .--. .--. | ADI(PASID#x) |<| Aux Domain1 | .--. .--. | ADI(PASID#y) |<| Aux Domain2 | .--. .--. | ADI(PASID#z) |<| Aux Domain3 | '--' '--' | | '--' Conceptually, if we look a domain as an address boundary of each assignable element, there is no difference between a primary domain and an aux domain. The difference exists only in the vendor specific iommu driver, say primary domains and aux domains use different translation tables. Further more, a single domain might be attached to an ADI of device A, while attached to another legacy device B which doesn't support PASID features. In this case, we say "Domain 4" is attached to ADI(PASID#x) in aux mode and attached to device B in primary mode. .--. .--. .---. | PCI device A |<| Primary Domain | | PCI device B | | | '--' | | .--. .--. | | | ADI(PASID#x) |<| Domain 4 |>| | .--. .--. | | | ADI(PASID#y) |<| Aux Domain2 | | | .--. .--. | | | ADI(PASID#z) |<| Aux Domain3 | | | '--' '--' | | | | | | '--' '---' Based on this, actually we can reuse all existing domain APIs for aux domain. iommu_domain_alloc() iommu_domain_free() iommu_attach_device() iommu_detach_device() iommu_map() iommu_unmap() During discussion, we found that reusing iommu_at(de)tach_device() will cause some changes i
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
On Mon, Oct 22, 2018 at 12:50:56PM +0100, Robin Murphy wrote: > To me, that sounds like a very good argument for having separate "attach as > primary domain" and "attach as aux domain" APIs. I agree with that, overloading iommu_attach_device() to support aux-domains is not going to be a maintainable solution. I'd like this to be a two-level approach, where the aux-domains are sub-domains of a primary domain (naming is still to-be-improved): struct iommu_domain *domain; domain = iommu_alloc_aux_domain(); iommu_attach_device(pdev, domain); /* Fails if device has not support for this domain-type */ /* Bind a process address space to the aux-domain */ sva_handle = iommu_sva_bind_mm(domain, current, ...); pasid = iommu_sva_get_pasid(sva_handle); mdev_handle = iommu_mdev_alloc_domain(domain); iommu_mdev_map(mdev_handle, ...); iommu_mdev_unmap(mdev_handle, ...); /* Do more work */ iommu_sva_unbind_mm(sva_handle); iommu_mdev_free_domain(mdev_handle); iommu_detach_device(domain, pdev); Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi, On 10/23/18 12:48 AM, Jordan Crouse wrote: On Fri, Oct 19, 2018 at 07:11:52PM +0100, Jean-Philippe Brucker wrote: (2) Allocate a domain and attach it to the device. dom = iommu_domain_alloc() iommu_attach_device(dom, dev) I still have concerns about this part, which are highlighted by the messy changes of patch 1. I think it would make more sense to introduce new attach/detach_dev_aux() functions instead of reusing attach/detach_dev() Can we reconsider this and avoid unnecessary complications in IOMMU core and drivers? Does the VFIO code greatly benefit from using the same attach() function? It could as well use a different one for devices in AUXD mode, which the mediating driver could tell by adding a flag in mdev_set_iommu_device(), for example. And I don't think other users of AUXD would benefit from using the same attach() function, since they will know whether they want to be using main or auxiliary domain when doing attach(). I second this. For the arm-smmu-v2 multiple-pagetable model we would either need new API functions or do something along the lines of dom = iommu_domain_alloc() iommu_set_attr(dom, DOMAIN_ATTR_I_WANT_TO_BE_AUX) iommu_attach_device(dom,dev) Either that or do some some dummy device magic that I haven't started to work out in my head. Having aux specific functions just for this step would make the arm-smmu-v2 version work out much cleaner. Okay! If there are no more comments, I will prepare a new version with aux specific functions. Best regards, Lu Baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi, On 10/23/18 12:03 AM, Jean-Philippe Brucker wrote: #1: Given that PASID is a system wide resource, during boot iommu driver needs to scan and set the max PASID to be no more than min(iommu_supported_max_pasid) of all available IOMMU's. #2: In addition if any device supports less than the choosen system max PASID we should simply disable PASID on that device. The reason is if the process were to bind to 2 or more accelerators and the second device has a limit smaller than the first that the application already attached, the second attemt to bind would fail. (Because we would use the same PASID for a single process) But that's not reason enough to completely disable PASID for the device, it could be the only one bound to that process, or PASID could be only used privately by the host device driver. In addition for Intel IOMMU's PASID is a system wide resource. We have a virtual capability for vIOMMU's to allocate PASID's. At the time of allocation we don't know what device this PASID is even being allocated for. Ah, I had missed that part, I thought the PV allocation had the device ID as well. That's a significant change. I was still hoping that we could go back to per-device PASID spaces in the host, since I still haven't seen any convincing argument in favor of system-wide PASID. Get rid of #1 in order to solve #2, and as a bonus support more PASIDs in total. Until now PASID was a per-device resource except for choices made when writing the IOMMU driver. System wide PASID makes things simple and workable in many use cases although it looks a bit coarse grained. For an example, in a use case of one application manipulating multiple devices with a single PASID, it will be complicated to find a PASID suitable for all the per-device PASID spaces. It will become more complicated when it comes to PV allocation since PV context might have no device information when it requires a PASID. Best regards, Lu Baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
> From: Raj, Ashok > Sent: Wednesday, October 24, 2018 1:17 AM > > > > > But that's not reason enough to completely disable PASID for the > > device, > > it could be the only one bound to that process, or PASID could be > > only > > used privately by the host device driver. > > Agree, there could be other use cases. > > If the device was already attached during boot the driver comes early > to get the low PASID space. If the device was hot-added and the PASID > supported by device wasn't available its going to fail. > > Enforcing something that will always work will be more reliable. But i > agree it maybe too strict for some cases. > > Maybe its a IOMMU enforced limit for the platform on the minimum > requirement for consistency. > what about keeping the flexibility in common logic (e.g. allowing pasid_min and pasid_max in PASID allocator, detecting PASID width limitation at binding time, etc.), while letting vendor driver to vote disabling PASID on device based on its own need (e.g. if not supporting 20bit width). Thanks kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
On Mon, 2018-10-22 at 17:03 +0100, Jean-Philippe Brucker wrote: > On 22/10/2018 11:07, Raj, Ashok wrote: > > > For my own convenience I've been using the SVA infrastructure > > > since > > > I already had the locking and IOMMU ops in place. The > > > proposed > > > interface is also missing min_pasid and max_pasid parameters, > > > which > > > could be needed by device drivers to enforce PASID limits. > > > > Variable PASID limits per-device is hard to manage. I suppose we > > can play > > some games to get this right, but I suspect that wont be very > > useful in > > the long run. > > Devices may have PASID limits that are not discoverable with the PCI > PASID capability (https://patchwork.kernel.org/patch/9989307/#2138957 > 1). > Even if we simply reject devices that don't support enough PASIDs, I > think it's still better to return an error on bind or init_device > than > to return a valid PASID that they can't use That sounds reasonable. What I meant was that trying to do similar things like what we do for IOVA (Reserving from high etc) may not work when a process might need to share the same PASID between 2 accelerators. > > > #1: Given that PASID is a system wide resource, during boot iommu > > driver needs > > to scan and set the max PASID to be no more than > > min(iommu_supported_max_pasid) > > of all available IOMMU's. > > > > #2: In addition if any device supports less than the choosen system > > max PASID > > we should simply disable PASID on that device. > > > > The reason is if the process were to bind to 2 or more accelerators > > and > > the second device has a limit smaller than the first that the > > application > > already attached, the second attemt to bind would fail. (Because > > we would use the same PASID for a single process) > > But that's not reason enough to completely disable PASID for the > device, > it could be the only one bound to that process, or PASID could be > only > used privately by the host device driver. Agree, there could be other use cases. If the device was already attached during boot the driver comes early to get the low PASID space. If the device was hot-added and the PASID supported by device wasn't available its going to fail. Enforcing something that will always work will be more reliable. But i agree it maybe too strict for some cases. Maybe its a IOMMU enforced limit for the platform on the minimum requirement for consistency. > > > In addition for Intel IOMMU's PASID is a system wide resource. We > > have > > a virtual capability for vIOMMU's to allocate PASID's. At the time > > of > > allocation we don't know what device this PASID is even being > > allocated > > for. > > Ah, I had missed that part, I thought the PV allocation had the > device > ID as well. That's a significant change. > > I was still hoping that we could go back to per-device PASID spaces > in > the host, since I still haven't seen any convincing argument in favor > of > system-wide PASID. Get rid of #1 in order to solve #2, and as a bonus > support more PASIDs in total. Until now PASID was a per-device > resource > except for choices made when writing the IOMMU driver. A global shared PASID table is designed with a future ISA extension for PASID isolation. > > > Certainly we could add other intelligence to pass a hint for > > max_pasid in the virtiual interface. > > > > I suppose when this becomes real, most serious accelerators will > > support the full width. > > I don't know if that's obvious from the device perspective. If I had > to > implement one, I would simply size my PASIDs to the number of > contexts > supported by my device (which might be especially small in the > embedded > space), given that technically nothing prevents software from > supporting > that and the specification allows me to choose a width. That's very reasonable. But supporting a smaller number of contexts is different from supporting smaller number of PASID's. A device could support the full PASID width, but limit the number of contexts to a smaller number. Certainly if the device vendors don't know upfront that could be an issue. > > This was the intended model for PCI, but thankfully version 4.0 added > an > implementation note (6.20.2.1 - PASID Field) advising against this > approach, and to instead support 20 bits for interoperability. Maybe > it > will set a trend for non-PCI devices as well. That's a great suggestion. In fact we have already made this suggestion to our PCI WG rep. For Scalable IOV devices the white-paper we published should hint that we expect devices to support the full 20 bit PASID width. > > Thanks, > Jean ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
On 22/10/2018 12:50, Robin Murphy wrote: > To me, that sounds like a very good argument for having separate "attach > as primary domain" and "attach as aux domain" APIs. Say a driver wants > to use multiple aux domains itself to isolate logically-separate > transaction streams, but still also needs to control the main domain for > transactions without PASID (interrupts?) Yes, MSIs don't have a PASID (well, shouldn't). So in Arm systems, where the address of the IRQ chip is translated by the SMMU, they are mapped in the main domain. Thanks, Jean ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
On Fri, Oct 19, 2018 at 07:11:52PM +0100, Jean-Philippe Brucker wrote: > (2) Allocate a domain and attach it to the device. > > dom = iommu_domain_alloc() > iommu_attach_device(dom, dev) > > I still have concerns about this part, which are highlighted by the > messy changes of patch 1. I think it would make more sense to > introduce new attach/detach_dev_aux() functions instead of reusing > attach/detach_dev() > > Can we reconsider this and avoid unnecessary complications in IOMMU > core and drivers? Does the VFIO code greatly benefit from using the > same attach() function? It could as well use a different one for > devices in AUXD mode, which the mediating driver could tell by > adding a flag in mdev_set_iommu_device(), for example. > > And I don't think other users of AUXD would benefit from using the > same attach() function, since they will know whether they want to be > using main or auxiliary domain when doing attach(). I second this. For the arm-smmu-v2 multiple-pagetable model we would either need new API functions or do something along the lines of dom = iommu_domain_alloc() iommu_set_attr(dom, DOMAIN_ATTR_I_WANT_TO_BE_AUX) iommu_attach_device(dom,dev) Either that or do some some dummy device magic that I haven't started to work out in my head. Having aux specific functions just for this step would make the arm-smmu-v2 version work out much cleaner. Jordan -- The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
On 22/10/2018 11:07, Raj, Ashok wrote: >> For my own convenience I've been using the SVA infrastructure since >> I already had the locking and IOMMU ops in place. The proposed >> interface is also missing min_pasid and max_pasid parameters, which >> could be needed by device drivers to enforce PASID limits. > > Variable PASID limits per-device is hard to manage. I suppose we can play > some games to get this right, but I suspect that wont be very useful in > the long run. Devices may have PASID limits that are not discoverable with the PCI PASID capability (https://patchwork.kernel.org/patch/9989307/#21389571). Even if we simply reject devices that don't support enough PASIDs, I think it's still better to return an error on bind or init_device than to return a valid PASID that they can't use > #1: Given that PASID is a system wide resource, during boot iommu driver > needs > to scan and set the max PASID to be no more than > min(iommu_supported_max_pasid) > of all available IOMMU's. > > #2: In addition if any device supports less than the choosen system max PASID > we should simply disable PASID on that device. > > The reason is if the process were to bind to 2 or more accelerators and > the second device has a limit smaller than the first that the application > already attached, the second attemt to bind would fail. (Because > we would use the same PASID for a single process) But that's not reason enough to completely disable PASID for the device, it could be the only one bound to that process, or PASID could be only used privately by the host device driver. > In addition for Intel IOMMU's PASID is a system wide resource. We have > a virtual capability for vIOMMU's to allocate PASID's. At the time of > allocation we don't know what device this PASID is even being allocated > for. Ah, I had missed that part, I thought the PV allocation had the device ID as well. That's a significant change. I was still hoping that we could go back to per-device PASID spaces in the host, since I still haven't seen any convincing argument in favor of system-wide PASID. Get rid of #1 in order to solve #2, and as a bonus support more PASIDs in total. Until now PASID was a per-device resource except for choices made when writing the IOMMU driver. > Certainly we could add other intelligence to pass a hint for > max_pasid in the virtiual interface. > > I suppose when this becomes real, most serious accelerators will > support the full width. I don't know if that's obvious from the device perspective. If I had to implement one, I would simply size my PASIDs to the number of contexts supported by my device (which might be especially small in the embedded space), given that technically nothing prevents software from supporting that and the specification allows me to choose a width. This was the intended model for PCI, but thankfully version 4.0 added an implementation note (6.20.2.1 - PASID Field) advising against this approach, and to instead support 20 bits for interoperability. Maybe it will set a trend for non-PCI devices as well. Thanks, Jean ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
On Mon, Oct 22, 2018 at 12:50:56PM +0100, Robin Murphy wrote: > On 22/10/2018 07:53, Tian, Kevin wrote: > >>From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com] > >>Sent: Saturday, October 20, 2018 2:12 AM > >> > >>This is a first prototype adding auxiliary domain support to Arm SMMUv3, > >>following Lu Baolu's latest proposal for IOMMU aware mediated devices > >>[1]. It works, but the attach() API still doesn't feel right. See (2) > >>below. > >> > >>Patch 1 adapts iommu.c to the current proposal for auxiliary domains. > >>Patches 2-4 rework the PASID allocator to make it usable for SVA and > >>AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3. > >> > >> > >>When a device can have multiple address space, for instance with PCI > >>PASID, an auxiliary domain (AUXD) is the IOMMU representation of one > >>address space. I distinguish auxiliary from "main" domain, which > >>represents the non-PASID address space but also (at least for SMMUv3) > >>the whole device context, PASID tables etc. > > > >I'd like to clarify a bit. :-) > > > >a domain can always represent one or more address spaces, where an > >address space could be for IOVA or GPA and/or other address spaces for > >SVA. Address spaces may be chained together in so-called nested mode. > > > >a domain can be associated with a full device (in BDF granular), and/or > >with a partition of a device (say in PASID granular). In the former case, > >the domain becomes the master (or primary) domain of the device. In > >the latter case, the domain becomes the auxiliary domain of the device. > > > >vendor domain structure may include vendor-specific states which > >are applied to device context at attach time, e.g. PASID table (SMMUv3) > >if representing as master domain, or PASID value (VT-d scalable mode) > >if representing as auxiliary domain. > > > >so the domain definition is never changed with the introduction of > >AUXD. 'auxiliary' is a per-device attribute which takes effect when at > >domain attach time. A domain is perfectly sane to represent as a > >master domain to deviceA and an auxiliary domain to deviceB at the > >same time (say when device A and a mdev on deviceB are assigned to > >same VM), while supporting one or more address spaces. > > To me, that sounds like a very good argument for having separate > "attach as primary domain" and "attach as aux domain" APIs. Say a > driver wants to use multiple aux domains itself to isolate > logically-separate transaction streams, but still also needs to > control the main domain for transactions without PASID (interrupts?) > - having to juggle some invisible device state around multiple > iommu_{attach,detach}_group() calls which look identical but are > expected to behave differently at runtime sounds like a recipe for > unmaintainable code. > > I don't think it's necessarily safe to assume that > one-struct-device-per-PASID will be the only usage model. This feels exactly like the QCOM GPU model - multiple aux domains for individual process pagetables but the main domain for TTBR1 (global) allocations. Jordan -- The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
On 22/10/2018 07:53, Tian, Kevin wrote: From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com] Sent: Saturday, October 20, 2018 2:12 AM This is a first prototype adding auxiliary domain support to Arm SMMUv3, following Lu Baolu's latest proposal for IOMMU aware mediated devices [1]. It works, but the attach() API still doesn't feel right. See (2) below. Patch 1 adapts iommu.c to the current proposal for auxiliary domains. Patches 2-4 rework the PASID allocator to make it usable for SVA and AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3. When a device can have multiple address space, for instance with PCI PASID, an auxiliary domain (AUXD) is the IOMMU representation of one address space. I distinguish auxiliary from "main" domain, which represents the non-PASID address space but also (at least for SMMUv3) the whole device context, PASID tables etc. I'd like to clarify a bit. :-) a domain can always represent one or more address spaces, where an address space could be for IOVA or GPA and/or other address spaces for SVA. Address spaces may be chained together in so-called nested mode. a domain can be associated with a full device (in BDF granular), and/or with a partition of a device (say in PASID granular). In the former case, the domain becomes the master (or primary) domain of the device. In the latter case, the domain becomes the auxiliary domain of the device. vendor domain structure may include vendor-specific states which are applied to device context at attach time, e.g. PASID table (SMMUv3) if representing as master domain, or PASID value (VT-d scalable mode) if representing as auxiliary domain. so the domain definition is never changed with the introduction of AUXD. 'auxiliary' is a per-device attribute which takes effect when at domain attach time. A domain is perfectly sane to represent as a master domain to deviceA and an auxiliary domain to deviceB at the same time (say when device A and a mdev on deviceB are assigned to same VM), while supporting one or more address spaces. To me, that sounds like a very good argument for having separate "attach as primary domain" and "attach as aux domain" APIs. Say a driver wants to use multiple aux domains itself to isolate logically-separate transaction streams, but still also needs to control the main domain for transactions without PASID (interrupts?) - having to juggle some invisible device state around multiple iommu_{attach,detach}_group() calls which look identical but are expected to behave differently at runtime sounds like a recipe for unmaintainable code. I don't think it's necessarily safe to assume that one-struct-device-per-PASID will be the only usage model. Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
On Fri, Oct 19, 2018 at 07:11:52PM +0100, Jean-Philippe Brucker wrote: > This is a first prototype adding auxiliary domain support to Arm SMMUv3, > following Lu Baolu's latest proposal for IOMMU aware mediated devices > [1]. It works, but the attach() API still doesn't feel right. See (2) > below. > > Patch 1 adapts iommu.c to the current proposal for auxiliary domains. > Patches 2-4 rework the PASID allocator to make it usable for SVA and > AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3. > > > When a device can have multiple address space, for instance with PCI > PASID, an auxiliary domain (AUXD) is the IOMMU representation of one > address space. I distinguish auxiliary from "main" domain, which > represents the non-PASID address space but also (at least for SMMUv3) > the whole device context, PASID tables etc. > > Auxiliary domains will be used by VFIO for IOMMU-aware mdev, and by any > other device driver that wants to use PASID for private address spaces > (as opposed to SVA [2]). The following API is available to device > drivers: > > (1) Enable AUXD for a device. Enable PASID if necessary and set an AUXD > flag on the IOMMU data associated to a device. > > For my own convenience I've been using the SVA infrastructure since > I already had the locking and IOMMU ops in place. The proposed > interface is also missing min_pasid and max_pasid parameters, which > could be needed by device drivers to enforce PASID limits. Variable PASID limits per-device is hard to manage. I suppose we can play some games to get this right, but I suspect that wont be very useful in the long run. #1: Given that PASID is a system wide resource, during boot iommu driver needs to scan and set the max PASID to be no more than min(iommu_supported_max_pasid) of all available IOMMU's. #2: In addition if any device supports less than the choosen system max PASID we should simply disable PASID on that device. The reason is if the process were to bind to 2 or more accelerators and the second device has a limit smaller than the first that the application already attached, the second attemt to bind would fail. (Because we would use the same PASID for a single process) In addition for Intel IOMMU's PASID is a system wide resource. We have a virtual capability for vIOMMU's to allocate PASID's. At the time of allocation we don't know what device this PASID is even being allocated for. Certainly we could add other intelligence to pass a hint for max_pasid in the virtiual interface. I suppose when this becomes real, most serious accelerators will support the full width. Hence doing #1 and #2 above should be good for most cases. Cheers, Ashok ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
> From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com] > Sent: Saturday, October 20, 2018 2:12 AM > > This is a first prototype adding auxiliary domain support to Arm SMMUv3, > following Lu Baolu's latest proposal for IOMMU aware mediated devices > [1]. It works, but the attach() API still doesn't feel right. See (2) > below. > > Patch 1 adapts iommu.c to the current proposal for auxiliary domains. > Patches 2-4 rework the PASID allocator to make it usable for SVA and > AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3. > > > When a device can have multiple address space, for instance with PCI > PASID, an auxiliary domain (AUXD) is the IOMMU representation of one > address space. I distinguish auxiliary from "main" domain, which > represents the non-PASID address space but also (at least for SMMUv3) > the whole device context, PASID tables etc. I'd like to clarify a bit. :-) a domain can always represent one or more address spaces, where an address space could be for IOVA or GPA and/or other address spaces for SVA. Address spaces may be chained together in so-called nested mode. a domain can be associated with a full device (in BDF granular), and/or with a partition of a device (say in PASID granular). In the former case, the domain becomes the master (or primary) domain of the device. In the latter case, the domain becomes the auxiliary domain of the device. vendor domain structure may include vendor-specific states which are applied to device context at attach time, e.g. PASID table (SMMUv3) if representing as master domain, or PASID value (VT-d scalable mode) if representing as auxiliary domain. so the domain definition is never changed with the introduction of AUXD. 'auxiliary' is a per-device attribute which takes effect when at domain attach time. A domain is perfectly sane to represent as a master domain to deviceA and an auxiliary domain to deviceB at the same time (say when device A and a mdev on deviceB are assigned to same VM), while supporting one or more address spaces. I explained above concept in my KVM forum session: https://events.linuxfoundation.org/wp-content/uploads/2017/12/Hardware-Assisted-Mediated-Pass-Through-with-VFIO-Kevin-Tian-Intel.pdf (slide 16/17) > > Auxiliary domains will be used by VFIO for IOMMU-aware mdev, and by > any > other device driver that wants to use PASID for private address spaces > (as opposed to SVA [2]). The following API is available to device > drivers: > > (1) Enable AUXD for a device. Enable PASID if necessary and set an AUXD > flag on the IOMMU data associated to a device. > > For my own convenience I've been using the SVA infrastructure since > I already had the locking and IOMMU ops in place. The proposed > interface is also missing min_pasid and max_pasid parameters, which > could be needed by device drivers to enforce PASID limits. > iommu_sva_init_device() without arguments already enables PASID, so > I just added an AUXD flag to SVA features: > > iommu_sva_init_device(dev, IOMMU_SVA_FEAT_AUXD, > min_pasid, max_pasid, NULL) > iommu_sva_shutdown_device(dev) > > Or as proposed in [1]: > > iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE, NULL) > iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE, NULL) > > Finding a compromise for this interface should be easy. > > (2) Allocate a domain and attach it to the device. > > dom = iommu_domain_alloc() > iommu_attach_device(dom, dev) > > I still have concerns about this part, which are highlighted by the > messy changes of patch 1. I think it would make more sense to > introduce new attach/detach_dev_aux() functions instead of reusing > attach/detach_dev() > > Can we reconsider this and avoid unnecessary complications in IOMMU > core and drivers? Does the VFIO code greatly benefit from using the > same attach() function? It could as well use a different one for > devices in AUXD mode, which the mediating driver could tell by > adding a flag in mdev_set_iommu_device(), for example. Baolu gave some recommendations to patch 1. Please check whether it can help reduce the mess. IMO using same API is conceptually clearer to VFIO... let's gather in KVM forum to have a conclusion, with Alex being there. Thanks Kevin > > And I don't think other users of AUXD would benefit from using the > same attach() function, since they will know whether they want to be > using main or auxiliary domain when doing attach(). > > (3) Get the PASID, and program it in the device > > iommu_domain_get_attr(dom, DOMAIN_ATTR_AUXD_ID, &pasid) > > (4) Create DMA mappings > > iommu_map(dom, ...) > iommu_unmap(dom, ...) > > Ultimately it would be nice to add PASID support to the DMA API as > well. For now drivers allocate IOVAs and pages themselves. > > > For vfio-mdev, a driver that wants to create mdevs o
Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
Hi Jean, On 2018/10/20 2:11, Jean-Philippe Brucker wrote: This is a first prototype adding auxiliary domain support to Arm SMMUv3, following Lu Baolu's latest proposal for IOMMU aware mediated devices [1]. It works, but the attach() API still doesn't feel right. See (2) below. Patch 1 adapts iommu.c to the current proposal for auxiliary domains. Patches 2-4 rework the PASID allocator to make it usable for SVA and AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3. When a device can have multiple address space, for instance with PCI PASID, an auxiliary domain (AUXD) is the IOMMU representation of one address space. I distinguish auxiliary from "main" domain, which represents the non-PASID address space but also (at least for SMMUv3) the whole device context, PASID tables etc. Auxiliary domains will be used by VFIO for IOMMU-aware mdev, and by any other device driver that wants to use PASID for private address spaces (as opposed to SVA [2]). The following API is available to device drivers: (1) Enable AUXD for a device. Enable PASID if necessary and set an AUXD flag on the IOMMU data associated to a device. For my own convenience I've been using the SVA infrastructure since I already had the locking and IOMMU ops in place. The proposed interface is also missing min_pasid and max_pasid parameters, which could be needed by device drivers to enforce PASID limits. iommu_sva_init_device() without arguments already enables PASID, so I just added an AUXD flag to SVA features: iommu_sva_init_device(dev, IOMMU_SVA_FEAT_AUXD, min_pasid, max_pasid, NULL) iommu_sva_shutdown_device(dev) Or as proposed in [1]: iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE, NULL) iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE, NULL) Finding a compromise for this interface should be easy. Personal idea, from the respective vendors, PASID limit should be set as enabling. (2) Allocate a domain and attach it to the device. dom = iommu_domain_alloc() iommu_attach_device(dom, dev) I still have concerns about this part, which are highlighted by the messy changes of patch 1. I think it would make more sense to introduce new attach/detach_dev_aux() functions instead of reusing attach/detach_dev() Can we reconsider this and avoid unnecessary complications in IOMMU core and drivers? Does the VFIO code greatly benefit from using the same attach() function? It could as well use a different one for devices in AUXD mode, which the mediating driver could tell by adding a flag in mdev_set_iommu_device(), for example. And I don't think other users of AUXD would benefit from using the same attach() function, since they will know whether they want to be using main or auxiliary domain when doing attach(). It seems that SVA/AUXD have no better sulotion for users applcations except VFIO. But VFIO seems caring for VMs' scenario more than general user space processes at present. :) (3) Get the PASID, and program it in the device iommu_domain_get_attr(dom, DOMAIN_ATTR_AUXD_ID, &pasid) (4) Create DMA mappings iommu_map(dom, ...) iommu_unmap(dom, ...) Ultimately it would be nice to add PASID support to the DMA API as well. For now drivers allocate IOVAs and pages themselves. For vfio-mdev, a driver that wants to create mdevs only performs steps (1) and (3): * When initializing the parent device, enable AUXD (1) * In mdev_parent_ops::create(), call mdev_set_iommu_device(mdev_dev(mdev), mdev_parent_dev(mdev)). * In mdev_parent_ops::open(), get the PASID (3) and install it in the parent device. * In mdev_parent_ops::close(), clear the PASID This may be a good solution to avoid exposing PASID to user space. Personally, VFIO-PCI is not in the range of this using scenario for partial device while PASID/SVA/AUXD are enabled. Cheers, Zaibo . This code and the many patches it depends on can be found on my iommu/auxd branch: git://linux-arm.org/linux-jpb.git iommu/auxd [1] [PATCH v3 0/8] vfio/mdev: IOMMU aware mediated device https://lwn.net/ml/linux-kernel/20181012051632.26064-1-baolu...@linux.intel.com/ [2] [PATCH v3 00/10] Shared Virtual Addressing for the IOMMU https://www.spinics.net/lists/iommu/msg30076.html Jean-Philippe Brucker (6): iommu: Adapt attach/detach_dev() for auxiliary domains drivers core: Add I/O ASID allocator iommu/sva: Use external PASID allocator iommu/sva: Support AUXD feature iommu/arm-smmu-v3: Implement detach_dev op iommu/arm-smmu-v3: Add support for auxiliary domains drivers/base/Kconfig| 7 ++ drivers/base/Makefile | 1 + drivers/base/ioasid.c | 140 ++ drivers/iommu/Kconfig | 1 + drivers/iommu/arm-smmu-v3.c | 164 ++-- drivers/iommu/
[RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
This is a first prototype adding auxiliary domain support to Arm SMMUv3, following Lu Baolu's latest proposal for IOMMU aware mediated devices [1]. It works, but the attach() API still doesn't feel right. See (2) below. Patch 1 adapts iommu.c to the current proposal for auxiliary domains. Patches 2-4 rework the PASID allocator to make it usable for SVA and AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3. When a device can have multiple address space, for instance with PCI PASID, an auxiliary domain (AUXD) is the IOMMU representation of one address space. I distinguish auxiliary from "main" domain, which represents the non-PASID address space but also (at least for SMMUv3) the whole device context, PASID tables etc. Auxiliary domains will be used by VFIO for IOMMU-aware mdev, and by any other device driver that wants to use PASID for private address spaces (as opposed to SVA [2]). The following API is available to device drivers: (1) Enable AUXD for a device. Enable PASID if necessary and set an AUXD flag on the IOMMU data associated to a device. For my own convenience I've been using the SVA infrastructure since I already had the locking and IOMMU ops in place. The proposed interface is also missing min_pasid and max_pasid parameters, which could be needed by device drivers to enforce PASID limits. iommu_sva_init_device() without arguments already enables PASID, so I just added an AUXD flag to SVA features: iommu_sva_init_device(dev, IOMMU_SVA_FEAT_AUXD, min_pasid, max_pasid, NULL) iommu_sva_shutdown_device(dev) Or as proposed in [1]: iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE, NULL) iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE, NULL) Finding a compromise for this interface should be easy. (2) Allocate a domain and attach it to the device. dom = iommu_domain_alloc() iommu_attach_device(dom, dev) I still have concerns about this part, which are highlighted by the messy changes of patch 1. I think it would make more sense to introduce new attach/detach_dev_aux() functions instead of reusing attach/detach_dev() Can we reconsider this and avoid unnecessary complications in IOMMU core and drivers? Does the VFIO code greatly benefit from using the same attach() function? It could as well use a different one for devices in AUXD mode, which the mediating driver could tell by adding a flag in mdev_set_iommu_device(), for example. And I don't think other users of AUXD would benefit from using the same attach() function, since they will know whether they want to be using main or auxiliary domain when doing attach(). (3) Get the PASID, and program it in the device iommu_domain_get_attr(dom, DOMAIN_ATTR_AUXD_ID, &pasid) (4) Create DMA mappings iommu_map(dom, ...) iommu_unmap(dom, ...) Ultimately it would be nice to add PASID support to the DMA API as well. For now drivers allocate IOVAs and pages themselves. For vfio-mdev, a driver that wants to create mdevs only performs steps (1) and (3): * When initializing the parent device, enable AUXD (1) * In mdev_parent_ops::create(), call mdev_set_iommu_device(mdev_dev(mdev), mdev_parent_dev(mdev)). * In mdev_parent_ops::open(), get the PASID (3) and install it in the parent device. * In mdev_parent_ops::close(), clear the PASID This code and the many patches it depends on can be found on my iommu/auxd branch: git://linux-arm.org/linux-jpb.git iommu/auxd [1] [PATCH v3 0/8] vfio/mdev: IOMMU aware mediated device https://lwn.net/ml/linux-kernel/20181012051632.26064-1-baolu...@linux.intel.com/ [2] [PATCH v3 00/10] Shared Virtual Addressing for the IOMMU https://www.spinics.net/lists/iommu/msg30076.html Jean-Philippe Brucker (6): iommu: Adapt attach/detach_dev() for auxiliary domains drivers core: Add I/O ASID allocator iommu/sva: Use external PASID allocator iommu/sva: Support AUXD feature iommu/arm-smmu-v3: Implement detach_dev op iommu/arm-smmu-v3: Add support for auxiliary domains drivers/base/Kconfig| 7 ++ drivers/base/Makefile | 1 + drivers/base/ioasid.c | 140 ++ drivers/iommu/Kconfig | 1 + drivers/iommu/arm-smmu-v3.c | 164 ++-- drivers/iommu/iommu-sva.c | 113 + drivers/iommu/iommu.c | 58 + include/linux/ioasid.h | 45 ++ include/linux/iommu.h | 12 +++ 9 files changed, 479 insertions(+), 62 deletions(-) create mode 100644 drivers/base/ioasid.c create mode 100644 include/linux/ioasid.h -- 2.19.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu