RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-12-12 Thread Tian, Kevin
> From: 'j...@8bytes.org' [mailto:j...@8bytes.org]
> Sent: Wednesday, December 12, 2018 5:54 PM
> 
> Hi Kevin,
> 
> On Wed, Dec 12, 2018 at 09:31:27AM +, Tian, Kevin wrote:
> > > From: 'j...@8bytes.org'
> > > Sent: Monday, December 10, 2018 4:58 PM
> > > These represent whether the device together with the IOMMU support
> > > them,
> > > basically whether these features are usable via the IOMMU-API.
> >
> > "device together with IOMMU" or just "IOMMU itself"?
> 
> No, it should mean device together with IOMMU support it. It is a way
> for users of the IOMMU-API to check whether they can successfully use
> the aux-specific functions.
> 
> > however there is a problem with aux. A device may implement both
> > SR-IOV and Scalable IOV capabilities, but at any time only one of them
> > can be enabled. Driver will provide interfaces for end user to choose.
> > In such case we cannot assume that device-side Scalable-IOV can be
> > always enabled while IOMMU is in scalable mode.
> >
> > It works better if we position those features just representing IOMMU
> > support only. In that case, aux is related only to scalable mode of IOMMU
> > itself, which is orthogonal to whether device side supports SIOV.
> 
> Yeah, but we don't make that decision in the IOMMU code. Whether the
> device exposes SR-IOV or PASID based isolation is decided in PCI code
> based on user input (SR-IOV is also enabled in PCI code and IOMMU just
> uses the new devices that appear).
> 
> Only if the user enabled scalable mode on the device and the IOMMU
> supports it too, the feature-check function returns true for the aux
> feature.
> 

Then this is another proof for an enable/disable part in API. :-)

Thanks
kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-12-12 Thread Tian, Kevin
> From: 'j...@8bytes.org'
> Sent: Monday, December 10, 2018 4:58 PM
> 
> Hi Kevin,
> 
> On Mon, Dec 10, 2018 at 02:06:44AM +, Tian, Kevin wrote:
> > Can I interpret above as that you agree with the aux domain concept (i.e.
> one
> > device can be linked to multiple domains) in general, and now we're just
> trying
> > to address the remaining open in API level?
> 
> Yes, I thouht about alternatives, but in the end they are all harder to
> use than the aux-domain concept. We just need to make sure that we have
> a clear definition of what the API extension do and how they impact the
> behavior of existing functions.

sounds great!

> 
> > >   enum iommu_dev_features {
> > >   /* ... */
> > >   IOMMU_DEV_FEAT_AUX,
> > >   IOMMU_DEV_FEAT_SVA,
> > >   /* ... */
> > >   };
> > >
> >
> > Does above represent whether a device implements aux/sva features,
> > or whether a device has been enabled by driver to support aux/sva
> > features?
> 
> These represent whether the device together with the IOMMU support
> them,
> basically whether these features are usable via the IOMMU-API.

"device together with IOMMU" or just "IOMMU itself"? 

the former might be OK for sva. As Jean replied in another mail, enabling
it in both IOMMU and device side just consumes some resources, while 
not impacting other usages on that device.

however there is a problem with aux. A device may implement both
SR-IOV and Scalable IOV capabilities, but at any time only one of them
can be enabled. Driver will provide interfaces for end user to choose.
In such case we cannot assume that device-side Scalable-IOV can be
always enabled while IOMMU is in scalable mode.

It works better if we position those features just representing IOMMU 
support only. In that case, aux is related only to scalable mode of IOMMU
itself, which is orthogonal to whether device side supports SIOV.

> 
> 
> >
> > >   /* Check if a device supports a given feature of the IOMMU-API */
> > >   bool iommu_dev_has_feature(struct device *dev, enum
> > > iommu_dev_features *feat);
> >
> > If the latter we also need iommu_dev_set_feature so driver can poke
> > it based on its own configuration.
> 
> Do you mean we need an explict enable-API for the features? I thought
> about that too, but some features clearly don't need it and I didn't
> want to complicate the usage. My assumption was that when the feature is
> available, it is also enabled.
> 
> > >   /* So we need a iommu_aux_detach_all()? */
> >
> > for what scenario?
> 
> Maybe for detaching any PASID users from a device so that it can be
> assigned to a VM as a whole. But I am not sure it is needed.
> 

yes, possibly useful in reset path as Jean pointed out.

Thanks
Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-12-12 Thread 'j...@8bytes.org'
On Tue, Dec 11, 2018 at 01:35:23PM +, Jean-Philippe Brucker wrote:
> > /* So we need a iommu_aux_detach_all()? */
> 
> This could be useful for device drivers that want to do bulk cleanup on
> device removal. If they rely on Function Level Reset to disable PASID
> states for example, they could also cleanup with a detach_all(). But
> most will probably need to clean up individual PASID states (for example
> free all mdevs) and therefore don't need detach_all()

Yeah, so the more I think about it, the more dangerous it seems to have
this function. It creates subtle error cases for the users of SVA and
aux-domains because their mapping goes away without notice. It is
certainly better to force an orderly shutdown of all users before the
device can be re-assigned.

> > This concept can also be easily extended for supporting SVA in parallel
> > on the same device, with the same constraints regarding the behavior of
> > iommu_attach_device()/iommu_detach_device().
> 
> So this would be without the initial step of attaching an "ext" domain
> to the device in order to enable SVA/PASID?

No, I don't think anymore we should introduce a special domain type to
enable these features. Separate enable/disable functions per feature
seem to be a better choice.

> If I understood it correctly, I agree with the
> iommu_attach/detach_device() behavior for SVA as well. If a device is
> bound to an mm, then the device cannot be attached to another domain
> with iommu_attach_device(). This prevents leaks and forces the device
> driver to clean up PASID states when switching to a different driver
> (e.g. vfio-pci)

Right, the API should ensure we are safe on that side.

Thanks,

Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-12-12 Thread 'j...@8bytes.org'
On Tue, Dec 11, 2018 at 06:34:23PM +, Jean-Philippe Brucker wrote:
> The cost of enabling those features for a device does seem negligible.
> For the SMMU we need to allocate about 70k of additional memory for the
> initial PASID table, but enabling the PASID cap shouldn't add any
> overhead otherwise. Same for PRI, shouldn't add any overhead.

Okay, the memory requirements are a pretty good case for an
enable/disable part in the API.

> As for ATS, it supposedly makes things faster, although it does mean
> more invalidation requests. There currently is a global pci=noats
> parameter, but maybe we need something with finer granularity to
> enable/disable it per device?

Yeah, I don't object against this, but I think this is a matter of PCI
code. The IOMMUs should just enable ATS when it is available. PCI core
can fail to enable it if requested by the user.

Regards,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-12-11 Thread Jean-Philippe Brucker
On 10/12/2018 08:57, 'j...@8bytes.org' wrote:
> Hi Kevin,
> 
> On Mon, Dec 10, 2018 at 02:06:44AM +, Tian, Kevin wrote:
>> Can I interpret above as that you agree with the aux domain concept (i.e. one
>> device can be linked to multiple domains) in general, and now we're just 
>> trying
>> to address the remaining open in API level?
> 
> Yes, I thouht about alternatives, but in the end they are all harder to
> use than the aux-domain concept. We just need to make sure that we have
> a clear definition of what the API extension do and how they impact the
> behavior of existing functions.
> 
>>> enum iommu_dev_features {
>>> /* ... */
>>> IOMMU_DEV_FEAT_AUX,
>>> IOMMU_DEV_FEAT_SVA,
>>> /* ... */
>>> };
>>>
>>
>> Does above represent whether a device implements aux/sva features,
>> or whether a device has been enabled by driver to support aux/sva 
>> features?
> 
> These represent whether the device together with the IOMMU support them,
> basically whether these features are usable via the IOMMU-API.
> 
> 
>>
>>> /* Check if a device supports a given feature of the IOMMU-API */
>>> bool iommu_dev_has_feature(struct device *dev, enum
>>> iommu_dev_features *feat);
>>
>> If the latter we also need iommu_dev_set_feature so driver can poke
>> it based on its own configuration.
> 
> Do you mean we need an explict enable-API for the features? I thought
> about that too, but some features clearly don't need it and I didn't
> want to complicate the usage. My assumption was that when the feature is
> available, it is also enabled.

The cost of enabling those features for a device does seem negligible.
For the SMMU we need to allocate about 70k of additional memory for the
initial PASID table, but enabling the PASID cap shouldn't add any
overhead otherwise. Same for PRI, shouldn't add any overhead.

As for ATS, it supposedly makes things faster, although it does mean
more invalidation requests. There currently is a global pci=noats
parameter, but maybe we need something with finer granularity to
enable/disable it per device?

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-12-11 Thread Jean-Philippe Brucker
On 07/12/2018 10:29, 'j...@8bytes.org' wrote:
> Hi,
> 
> On Mon, Nov 26, 2018 at 07:29:45AM +, Tian, Kevin wrote:
>> btw Baolu just reminded me one thing which is worthy of noting here.
>> 'primary' vs. 'aux' concept makes sense only when we look from a device
>> p.o.v. That binding relationship is not (*should not be*) carry-and-forwarded
>> cross devices. every domain must be explicitly attached to other devices
>> (instead of implicitly attached being above example), and new primary/aux
>> attribute on another device will be decided at attach time.
> 
> Okay, so after all the discussions we had I learned a few more things
> about the scalable mode feature and thought a bit longer about how to
> best support it in the IOMMU-API.
> 
> The concept of sub-domains I initially proposed certainly makes no
> sense, but scalable-mode specific attach/detach functions do. So instead
> of a sub-domain mode, I'd like to propose device-feature sets.
> 
> The posted patch-set already includes this as device-attributes, but I
> don't like this naming as we are really talking about additional
> feature sets of a device. So how about we introduce this:
> 
>   enum iommu_dev_features {
>   /* ... */
>   IOMMU_DEV_FEAT_AUX,
>   IOMMU_DEV_FEAT_SVA,
>   /* ... */
>   };
> 
>   /* Check if a device supports a given feature of the IOMMU-API */
>   bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features 
> *feat);
> 
>   /*
>* Only works if iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)
>* returns true
>*
>* Also, as long as domains are attached to a device through
>* this interface, any trys to call iommu_attach_device() should
>* fail (iommu_detach_device() can't fail, so we fail on the
>* tryint to re-attach). This should make us safe against a
>* device being attached to a guest as a whole while there are
>* still pasid users on it (aux and sva).
>*/
>   int iommu_aux_attach_device(struct iommu_domain *domain,
>   struct device *dev);
> 
>   int iommu_aux_detach_device(struct iommu_domain *domain,
>   struct device *dev);
>   /*
>* I know we are targeting a system-wide pasid-space, so that
>* the pasid would be the same for one domain on all devices,
>* let's just keep the option open to have different
>* pasid-spaces in one system. Also this way we can use it to
>* check whether the domain is attached to this device at all.
>*
>* returns pasid or <0 if domain has no pasid on that device.
>*/
>   int iommu_aux_get_pasid(struct iommu_domain *domain,
>   struct device *dev);>
>   /* So we need a iommu_aux_detach_all()? */

This could be useful for device drivers that want to do bulk cleanup on
device removal. If they rely on Function Level Reset to disable PASID
states for example, they could also cleanup with a detach_all(). But
most will probably need to clean up individual PASID states (for example
free all mdevs) and therefore don't need detach_all()

> This concept can also be easily extended for supporting SVA in parallel
> on the same device, with the same constraints regarding the behavior of
> iommu_attach_device()/iommu_detach_device().

So this would be without the initial step of attaching an "ext" domain
to the device in order to enable SVA/PASID?

If I understood it correctly, I agree with the
iommu_attach/detach_device() behavior for SVA as well. If a device is
bound to an mm, then the device cannot be attached to another domain
with iommu_attach_device(). This prevents leaks and forces the device
driver to clean up PASID states when switching to a different driver
(e.g. vfio-pci)

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-12-10 Thread 'j...@8bytes.org'
Hi Lu Baolu,

On Mon, Dec 10, 2018 at 10:57:22AM +0800, Lu Baolu wrote:
> > /* Check if a device supports a given feature of the IOMMU-API */
> > bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features 
> > *feat);
> 
> Here we pass in a pointer of "enum iommu_dev_features", do we want
> to return anything here?

Yeah, this is a mistake on my side, The enum parameter should be passed
by-value, so not pointer there. It doesn't need to return anything in
this parameter.

Regards,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-12-10 Thread 'j...@8bytes.org'
Hi Kevin,

On Mon, Dec 10, 2018 at 02:06:44AM +, Tian, Kevin wrote:
> Can I interpret above as that you agree with the aux domain concept (i.e. one
> device can be linked to multiple domains) in general, and now we're just 
> trying
> to address the remaining open in API level?

Yes, I thouht about alternatives, but in the end they are all harder to
use than the aux-domain concept. We just need to make sure that we have
a clear definition of what the API extension do and how they impact the
behavior of existing functions.

> > enum iommu_dev_features {
> > /* ... */
> > IOMMU_DEV_FEAT_AUX,
> > IOMMU_DEV_FEAT_SVA,
> > /* ... */
> > };
> > 
> 
> Does above represent whether a device implements aux/sva features,
> or whether a device has been enabled by driver to support aux/sva 
> features?

These represent whether the device together with the IOMMU support them,
basically whether these features are usable via the IOMMU-API.


> 
> > /* Check if a device supports a given feature of the IOMMU-API */
> > bool iommu_dev_has_feature(struct device *dev, enum
> > iommu_dev_features *feat);
> 
> If the latter we also need iommu_dev_set_feature so driver can poke
> it based on its own configuration.

Do you mean we need an explict enable-API for the features? I thought
about that too, but some features clearly don't need it and I didn't
want to complicate the usage. My assumption was that when the feature is
available, it is also enabled.

> > /* So we need a iommu_aux_detach_all()? */
> 
> for what scenario?

Maybe for detaching any PASID users from a device so that it can be
assigned to a VM as a whole. But I am not sure it is needed.

Regards,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-12-09 Thread Lu Baolu

Hi Joerg,

On 12/7/18 6:29 PM, 'j...@8bytes.org' wrote:

Hi,

On Mon, Nov 26, 2018 at 07:29:45AM +, Tian, Kevin wrote:

btw Baolu just reminded me one thing which is worthy of noting here.
'primary' vs. 'aux' concept makes sense only when we look from a device
p.o.v. That binding relationship is not (*should not be*) carry-and-forwarded
cross devices. every domain must be explicitly attached to other devices
(instead of implicitly attached being above example), and new primary/aux
attribute on another device will be decided at attach time.


Okay, so after all the discussions we had I learned a few more things
about the scalable mode feature and thought a bit longer about how to
best support it in the IOMMU-API.


Thanks for thinking about this.



The concept of sub-domains I initially proposed certainly makes no
sense, but scalable-mode specific attach/detach functions do. So instead
of a sub-domain mode, I'd like to propose device-feature sets.

The posted patch-set already includes this as device-attributes, but I
don't like this naming as we are really talking about additional
feature sets of a device. So how about we introduce this:

enum iommu_dev_features {
/* ... */
IOMMU_DEV_FEAT_AUX,
IOMMU_DEV_FEAT_SVA,
/* ... */
};

/* Check if a device supports a given feature of the IOMMU-API */
bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features 
*feat);


Here we pass in a pointer of "enum iommu_dev_features", do we want
to return anything here?

Best regards,
Lu Baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-12-09 Thread Tian, Kevin
> From: 'j...@8bytes.org' [mailto:j...@8bytes.org]
> Sent: Friday, December 7, 2018 6:29 PM
> 
> Hi,
> 
> On Mon, Nov 26, 2018 at 07:29:45AM +, Tian, Kevin wrote:
> > btw Baolu just reminded me one thing which is worthy of noting here.
> > 'primary' vs. 'aux' concept makes sense only when we look from a device
> > p.o.v. That binding relationship is not (*should not be*) carry-and-
> forwarded
> > cross devices. every domain must be explicitly attached to other devices
> > (instead of implicitly attached being above example), and new
> primary/aux
> > attribute on another device will be decided at attach time.
> 
> Okay, so after all the discussions we had I learned a few more things
> about the scalable mode feature and thought a bit longer about how to
> best support it in the IOMMU-API.

Thanks for thinking through this.

> 
> The concept of sub-domains I initially proposed certainly makes no
> sense, but scalable-mode specific attach/detach functions do. So instead
> of a sub-domain mode, I'd like to propose device-feature sets.

Can I interpret above as that you agree with the aux domain concept (i.e. one
device can be linked to multiple domains) in general, and now we're just trying
to address the remaining open in API level?

> 
> The posted patch-set already includes this as device-attributes, but I
> don't like this naming as we are really talking about additional
> feature sets of a device. So how about we introduce this:
> 
>   enum iommu_dev_features {
>   /* ... */
>   IOMMU_DEV_FEAT_AUX,
>   IOMMU_DEV_FEAT_SVA,
>   /* ... */
>   };
> 

Does above represent whether a device implements aux/sva features,
or whether a device has been enabled by driver to support aux/sva 
features?

>   /* Check if a device supports a given feature of the IOMMU-API */
>   bool iommu_dev_has_feature(struct device *dev, enum
> iommu_dev_features *feat);

If the latter we also need iommu_dev_set_feature so driver can poke
it based on its own configuration. 

> 
>   /*
>* Only works if iommu_dev_has_feature(dev,
> IOMMU_DEV_FEAT_AUX)
>* returns true
>*
>* Also, as long as domains are attached to a device through
>* this interface, any trys to call iommu_attach_device() should
>* fail (iommu_detach_device() can't fail, so we fail on the
>* tryint to re-attach). This should make us safe against a
>* device being attached to a guest as a whole while there are
>* still pasid users on it (aux and sva).

yes, it makes sense.

>*/
>   int iommu_aux_attach_device(struct iommu_domain *domain,
>   struct device *dev);
> 
>   int iommu_aux_detach_device(struct iommu_domain *domain,
>   struct device *dev);
>   /*
>* I know we are targeting a system-wide pasid-space, so that
>* the pasid would be the same for one domain on all devices,
>* let's just keep the option open to have different
>* pasid-spaces in one system. Also this way we can use it to
>* check whether the domain is attached to this device at all.
>*
>* returns pasid or <0 if domain has no pasid on that device.
>*/
>   int iommu_aux_get_pasid(struct iommu_domain *domain,
>   struct device *dev);
> 
>   /* So we need a iommu_aux_detach_all()? */

for what scenario?

> 
> This concept can also be easily extended for supporting SVA in parallel
> on the same device, with the same constraints regarding the behavior of
> iommu_attach_device()/iommu_detach_device().
> 
> So what do you think about that approach?
> 
> Regards,
> 
>   Joerg
> 
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-12-07 Thread 'j...@8bytes.org'
Hi,

On Mon, Nov 26, 2018 at 07:29:45AM +, Tian, Kevin wrote:
> btw Baolu just reminded me one thing which is worthy of noting here.
> 'primary' vs. 'aux' concept makes sense only when we look from a device
> p.o.v. That binding relationship is not (*should not be*) carry-and-forwarded
> cross devices. every domain must be explicitly attached to other devices
> (instead of implicitly attached being above example), and new primary/aux
> attribute on another device will be decided at attach time.

Okay, so after all the discussions we had I learned a few more things
about the scalable mode feature and thought a bit longer about how to
best support it in the IOMMU-API.

The concept of sub-domains I initially proposed certainly makes no
sense, but scalable-mode specific attach/detach functions do. So instead
of a sub-domain mode, I'd like to propose device-feature sets.

The posted patch-set already includes this as device-attributes, but I
don't like this naming as we are really talking about additional
feature sets of a device. So how about we introduce this:

enum iommu_dev_features {
/* ... */
IOMMU_DEV_FEAT_AUX,
IOMMU_DEV_FEAT_SVA,
/* ... */
};

/* Check if a device supports a given feature of the IOMMU-API */
bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features 
*feat);

/*
 * Only works if iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)
 * returns true
 *
 * Also, as long as domains are attached to a device through
 * this interface, any trys to call iommu_attach_device() should
 * fail (iommu_detach_device() can't fail, so we fail on the
 * tryint to re-attach). This should make us safe against a
 * device being attached to a guest as a whole while there are
 * still pasid users on it (aux and sva).
 */
int iommu_aux_attach_device(struct iommu_domain *domain,
struct device *dev);

int iommu_aux_detach_device(struct iommu_domain *domain,
struct device *dev);
/*
 * I know we are targeting a system-wide pasid-space, so that
 * the pasid would be the same for one domain on all devices,
 * let's just keep the option open to have different
 * pasid-spaces in one system. Also this way we can use it to
 * check whether the domain is attached to this device at all.
 *
 * returns pasid or <0 if domain has no pasid on that device.
 */
int iommu_aux_get_pasid(struct iommu_domain *domain,
struct device *dev);

/* So we need a iommu_aux_detach_all()? */

This concept can also be easily extended for supporting SVA in parallel
on the same device, with the same constraints regarding the behavior of
iommu_attach_device()/iommu_detach_device().

So what do you think about that approach?

Regards,

Joerg


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-11-25 Thread Tian, Kevin
> From: Tian, Kevin
> Sent: Monday, November 26, 2018 11:01 AM
[...]
> > aux-domain is just a normal domain (struct iommu_domain), what
> > happens
> > when a domain that was used as a primary domain and has bound
> > aux-domains to it, is bound with iommu_aux_domain_attach() to another
> > domain?
> 
> aux domain is essentially a way to lend partial DMA capability to less-
> privileged entity (process or VM), which implies that part of DMA
> unmanaged by kernel driver. If we can make such assumption that aux
> only applies to UNMANAGED domain (thus disallow BLOCKED/IDENTITY/
> DMA from using aux), then above open doesn't exist, because if primary
> domain is UNMANAGED type it essentially means the whole device fully
> owned by user space or guest thus no host driver entity to ever create
> aux domain. If primary domain is not UNMANAGED type, it will fail in
> aux API. with this design, each device will have at most one primary
> and multiple aux, i.e. just two layers. no aux-bind-to-aux thing.
> 
> this assumption should be true in discussed usages around aux domain.
> but I may overlook something... Jean?
> 

btw Baolu just reminded me one thing which is worthy of noting here.
'primary' vs. 'aux' concept makes sense only when we look from a device
p.o.v. That binding relationship is not (*should not be*) carry-and-forwarded
cross devices. every domain must be explicitly attached to other devices
(instead of implicitly attached being above example), and new primary/aux
attribute on another device will be decided at attach time.

Thanks
Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-11-25 Thread Tian, Kevin
> From: j...@8bytes.org [mailto:j...@8bytes.org]
> Sent: Friday, November 23, 2018 7:21 PM
> 
> On Wed, Nov 21, 2018 at 12:40:44PM +0800, Lu Baolu wrote:
> > Can you please elaborate a bit more about the concept of subdomains?
> > From my point of view, an aux-domain is a normal un-managed domain
> which
> > has a PASID and could be attached to any ADIs through the aux-domain
> > specific attach/detach APIs. It could also be attached to a device with
> > the existing domain_attach/detach_device() APIs at the same time, hence
> > mdev and pci devices in a vfio container could share a domain.
> 
> Okay, let's think a bit about having aux-specific attach/detach
> functions, in the end I don't insist on my proposal as long as the
> IOMMU-API extensions are clean, consistent, and have well defined
> semantics.
> 
> If we have aux-domain specific attach/detach functions like
> iommu_aux_domain_attach/detach(), what happens when the primary
> domain
> of the device is changed with iommu_attach/detach()?
> 
>   1) Will the aux-domains stay in place? If yes, how does this
>  work in setups where the pasid-bound page-tables are
>  guest-owned and translated through the primary domain
>  page-tables?
> 
>   2) Will the aux-domains be unbound too? In that case, if the
>  primary domain is re-used, will the aux-domains be implicitly
>  bound too when iommu_device_attach() is called?
> 
>   3) Do we just disallow changing the primary domain through that
>  interface as long as there are aux-domains or mm_structs
>  bound to the device?

3) sounds like a right option. 

IMO Primary domain represents the primary ownership of DMA capability 
of the whole device. The owner (say host driver) may lend partial capability 
(e.g. allocating some queues) to a sub-owner (through aux-domain to a VM 
or mm_struct to a process), when ownership is still claimed by this owner.

Primary domain switch means ownership change (e.g. from host driver to 
guest driver if type is changed from DMA-API to UNMANAGED), which usually
implies driver load/unload thus all resource usages from previous owner must 
be cleaned up before actual switch happens, which include things allocated 
from DMA-API, aux domain API, and SVA APIs. 

> 
> Using option 2) or 3) would mean that the aux-domains are still bound to
> the primary domain, but that is not reflected in the API. Further, if an

Do you suggest some reflection in API interface definition, or specific API 
implementation? If the former I didn't get how to do it. If the latter, yes 
it's missed in current RFC. We should add check in necessary code path.

> aux-domain is just a normal domain (struct iommu_domain), what
> happens
> when a domain that was used as a primary domain and has bound
> aux-domains to it, is bound with iommu_aux_domain_attach() to another
> domain?

aux domain is essentially a way to lend partial DMA capability to less-
privileged entity (process or VM), which implies that part of DMA 
unmanaged by kernel driver. If we can make such assumption that aux 
only applies to UNMANAGED domain (thus disallow BLOCKED/IDENTITY/
DMA from using aux), then above open doesn't exist, because if primary 
domain is UNMANAGED type it essentially means the whole device fully 
owned by user space or guest thus no host driver entity to ever create 
aux domain. If primary domain is not UNMANAGED type, it will fail in 
aux API. with this design, each device will have at most one primary
and multiple aux, i.e. just two layers. no aux-bind-to-aux thing.

this assumption should be true in discussed usages around aux domain. 
but I may overlook something... Jean?

> 
> As you can see, this design decission raises a lot of questions, but
> maybe we can work it out and find a solution we all agree on.
> 

Thanks for raising those good opens!

Thanks
Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-11-24 Thread Lu Baolu

Hi Joerg,

On 11/23/18 7:21 PM, j...@8bytes.org wrote:

On Wed, Nov 21, 2018 at 12:40:44PM +0800, Lu Baolu wrote:

Can you please elaborate a bit more about the concept of subdomains?
 From my point of view, an aux-domain is a normal un-managed domain which
has a PASID and could be attached to any ADIs through the aux-domain
specific attach/detach APIs. It could also be attached to a device with
the existing domain_attach/detach_device() APIs at the same time, hence
mdev and pci devices in a vfio container could share a domain.


Okay, let's think a bit about having aux-specific attach/detach
functions, in the end I don't insist on my proposal as long as the
IOMMU-API extensions are clean, consistent, and have well defined
semantics.

If we have aux-domain specific attach/detach functions like
iommu_aux_domain_attach/detach(), what happens when the primary domain
of the device is changed with iommu_attach/detach()?

1) Will the aux-domains stay in place? If yes, how does this
   work in setups where the pasid-bound page-tables are
   guest-owned and translated through the primary domain
   page-tables?


Are you thinking about guest SVA? In this case, guest SVA will be
translated through the aux domain page-tables. So, we can safely change
the primary domain with aux-domains staying there.

Conceptually, the primary domain means the existing domain which is for
all Request ID based (a.k.a. transfers without PASID) translations. Any
transfer with PASID will not go through the primary domain.



2) Will the aux-domains be unbound too? In that case, if the
   primary domain is re-used, will the aux-domains be implicitly
   bound too when iommu_device_attach() is called?

3) Do we just disallow changing the primary domain through that
   interface as long as there are aux-domains or mm_structs
   bound to the device?

Using option 2) or 3) would mean that the aux-domains are still bound to
the primary domain, but that is not reflected in the API. Further, if an
aux-domain is just a normal domain (struct iommu_domain), what happens
when a domain that was used as a primary domain and has bound
aux-domains to it, is bound with iommu_aux_domain_attach() to another
domain?

As you can see, this design decission raises a lot of questions, but
maybe we can work it out and find a solution we all agree on.


Regards,

Joerg


Best regards,
Lu Baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-11-23 Thread j...@8bytes.org
Hi Jean-Philippe,

On Wed, Nov 21, 2018 at 07:05:13PM +, Jean-Philippe Brucker wrote:
> For the moment though, I think we should allow device drivers to use the
> DMA-API at the same time as SVA.

Yeah, that makes sense.

> If a device driver has to map a management ring buffer for example,
> it's much nicer to use dma_alloc as usual rather than have them write
> their own DMA allocation routines.  Which is another reason to
> implement 2) above: the DMA-API would still act on the default_domain,
> and attaching an "extended" domain augments it with SVA features.
> Detaching from the device doesn't require copying mappings back to the
> default domain. Does that sound OK?

Yes, as I just wrote to Kevin, this sounds good.

>   struct io_mm *io_mm;
>   struct iommu_domain *domain;
> 
>   domain = iommu_alloc_ext_domain(bus);
> 
>   /* Set an mm-exit callback to disable PASIDs. More attributes
>  could be added later to change the capabilities of the ext
>  domain */
>   iommu_domain_set_attr(domain, DOMAIN_ATTR_MM_EXIT_CB, _exit);
> 
>   /* Fails if the device doesn't support this domain type */
>   iommu_attach_device(domain, dev);
> 
>   /* Same as SVA v3, except a domain instead of dev as argument */
>   io_mm = iommu_sva_bind_mm(domain, current->mm, ...);
> 
>   /* on success, install the PASID in the device */
>   pasid = io_mm->pasid;
> 
>   /* Do more work */
> 
>   iommu_sva_unbind_mm(io_mm);
>   iommu_detach_device(domain, dev);

Okay, we can still discuss the naming and a few details, but that goes
into the right directions. One open questions is, for example, where the
pasid-allocator comes into play. As it looks the amdgpu driver needs it
even without an iommu enabled.


Regards,

Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-11-23 Thread j...@8bytes.org
Hi Kevin,

On Thu, Nov 22, 2018 at 08:39:19AM +, Tian, Kevin wrote:
> I agree special action needs to be taken for everything else (other than
> DMA-API), but the point that I didn't get is why the action must be based
> a new SVA-type domain, instead of extending default domain with SVA
> capability (including related paging structures which are accessed through 
> new SVA APIs). In the latter case domain-wise attribute (e.g. IRQ mapping) 
> is naturally shared between capabilities (DMA-API and SVA). There is no 
> need to create cross-domain connections as two options that you listed 
> below.
> 
> Can you help elaborate more about the motivation behind proposal?

Yeah, thinking more about it, there are no real reasons against allowing
aux and sva bindings on the default domain. It allows the host to use a
device through the DMA-API and assign parts of it to a guest or a
user-space process, for example.

> |-default domain (DMA-API)
>   |-sva domain1 (SVA)
>   |-sva domain2 (SVA)
>   |-...
>   |-sva domainN (AUX, guest DMA-API)
>   |   |- sva domainN1 (AUX, guest SVA)
>   |   |- sva domainN2 (AUX, guest SVA)
>   |   |-...
>   |-sva domainM (AUX, guest DMA-API)
>   |   |- sva domainM1 (AUX, guest SVA)
>   |   |- sva domainM2 (AUX, guest SVA)
>   |   |-...

In my proposal the sva-domains are no sub-domains of the default-domain,
they exist on the same level. A device is detached from its default
domain and attached to an sva-domain.

> means same domain can be default on deviceA while being aux on deviceB
> (when assigning pci and mdev to same VM).

As already written in another mail, this raises a couple of questions,
like what happens when the aux-domain itself has other aux-domains bound
to it?


Regards,

Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-11-23 Thread j...@8bytes.org
On Wed, Nov 21, 2018 at 12:40:44PM +0800, Lu Baolu wrote:
> Can you please elaborate a bit more about the concept of subdomains?
> From my point of view, an aux-domain is a normal un-managed domain which
> has a PASID and could be attached to any ADIs through the aux-domain
> specific attach/detach APIs. It could also be attached to a device with
> the existing domain_attach/detach_device() APIs at the same time, hence
> mdev and pci devices in a vfio container could share a domain.

Okay, let's think a bit about having aux-specific attach/detach
functions, in the end I don't insist on my proposal as long as the
IOMMU-API extensions are clean, consistent, and have well defined
semantics.

If we have aux-domain specific attach/detach functions like
iommu_aux_domain_attach/detach(), what happens when the primary domain
of the device is changed with iommu_attach/detach()?

1) Will the aux-domains stay in place? If yes, how does this
   work in setups where the pasid-bound page-tables are
   guest-owned and translated through the primary domain
   page-tables?

2) Will the aux-domains be unbound too? In that case, if the
   primary domain is re-used, will the aux-domains be implicitly
   bound too when iommu_device_attach() is called?

3) Do we just disallow changing the primary domain through that
   interface as long as there are aux-domains or mm_structs
   bound to the device?

Using option 2) or 3) would mean that the aux-domains are still bound to
the primary domain, but that is not reflected in the API. Further, if an
aux-domain is just a normal domain (struct iommu_domain), what happens
when a domain that was used as a primary domain and has bound
aux-domains to it, is bound with iommu_aux_domain_attach() to another
domain?

As you can see, this design decission raises a lot of questions, but
maybe we can work it out and find a solution we all agree on.


Regards,

Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-11-22 Thread Tian, Kevin
> From: j...@8bytes.org [mailto:j...@8bytes.org]
> Sent: Monday, November 12, 2018 10:56 PM
> 
> Hi Jean-Philippe,
> 
> On Thu, Nov 08, 2018 at 06:29:42PM +, Jean-Philippe Brucker wrote:
> > (1) My initial approach would have been to use the same page tables for
> > the default_domain and this new domain, but it might be precisely what
> > you were trying to avoid with this new proposal: a device attached to
> > two domains at the same time.
> 
> My request was about the initial assumptions a device driver can make
> about a device. This assumptions is that DMA-API is set up and
> initialized for it, for everything else (like SVA) the device driver
> needs to take special action, like allocating an SVA domain and attaching
> the device to it.

Hi, Joerg,

I agree special action needs to be taken for everything else (other than
DMA-API), but the point that I didn't get is why the action must be based
a new SVA-type domain, instead of extending default domain with SVA
capability (including related paging structures which are accessed through 
new SVA APIs). In the latter case domain-wise attribute (e.g. IRQ mapping) 
is naturally shared between capabilities (DMA-API and SVA). There is no 
need to create cross-domain connections as two options that you listed 
below.

Can you help elaborate more about the motivation behind proposal?

P.S. as you may see from other threads, we support same IOMMU 
capabilities (both IOVA and SVA) on vfio-mdev (i.e. aux domain) as ones 
on vfio-pci. which leads to below situation following your proposal, if
'AUX' is treated as a separate capability as SVA:

|-default domain (DMA-API)
|-sva domain1 (SVA)
|-sva domain2 (SVA)
|-...
|-sva domainN (AUX, guest DMA-API)
|   |- sva domainN1 (AUX, guest SVA)
|   |- sva domainN2 (AUX, guest SVA)
|   |-...
|-sva domainM (AUX, guest DMA-API)
|   |- sva domainM1 (AUX, guest SVA)
|   |- sva domainM2 (AUX, guest SVA)
|   |-...

Is that what is in your mind? would it cause unnecessary complexities
on handling those layers?

The hierarchy could be simplified if we allow aux domain to carry both 
DMA-API and SVA capability:

|-default domain (DMA-API)
|-sva domain1 (SVA)
|-sva domain2 (SVA)
|-...
|-sva domainN (AUX, guest DMA-API, guest SVA)
|-sva domainM (AUX, guest DMA-API, guest SVA)

However doing so cause inconsistency on domain definition. why cannot
default domain have full capability but aux domain can?

Further moving along that road, we could have below (as proposed in
this series):

|-default domain (DMA-API, SVA)
|-sva domainN (AUX, guest DMA-API, guest SVA)
|-sva domainM (AUX, guest DMA-API, guest SVA)
(whether putting aux domains as sub-domain or same level is not big deal)

There each domain could support full DMA translation capabilities (IOVA, SVA, 
GPA, GIOVA, etc) with supporting structures together (1st level, 2nd level, irq 
mapping, etc.), and new APIs are invented to utilized new capabilities beyond 
existing DMA-API in each domain. 'AUX' here becomes a orthogonal way to 
those DMA translation  capabilities, and is chosen at domain attach time based 
on device configuration. means same domain can be default on deviceA while
being aux on deviceB (when assigning pci and mdev to same VM).

Thoughts? :-)

> 
> This should of course not break any IRQ setups for the device, and also
> enforcing an ordering is not a good and maintainable solution.
> 
> So what you could do here is to either:
> 
>   1) Save the needed IRQ mappings in some extra datastructure and
>  duplicate it in the SVA domain. This also makes it easier to
>  use the same SVA domain with multiple devices.
> 
>   2) Just re-use the 'page-tables' from the default domain in the
>  sva-domain. This needs to happen at attach-time, because at
>  allocation time you don't know the device yet.
> 
> I think 1) is the best solution, what do you think?
> 
> Btw, things would be different if we could expose SVA through the
> DMA-API to drivers. In this case we could just make the default domain
> of type SVA and be done, but we are not there yet.
> 
> Regards,

Thanks
Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-11-21 Thread Jean-Philippe Brucker
On 12/11/2018 14:55, j...@8bytes.org wrote:
> Hi Jean-Philippe,
> 
> On Thu, Nov 08, 2018 at 06:29:42PM +, Jean-Philippe Brucker wrote:
>> (1) My initial approach would have been to use the same page tables for
>> the default_domain and this new domain, but it might be precisely what
>> you were trying to avoid with this new proposal: a device attached to
>> two domains at the same time.
> 
> My request was about the initial assumptions a device driver can make
> about a device. This assumptions is that DMA-API is set up and
> initialized for it, for everything else (like SVA) the device driver
> needs to take special action, like allocating an SVA domain and attaching
> the device to it.
> 
> This should of course not break any IRQ setups for the device, and also
> enforcing an ordering is not a good and maintainable solution.
> 
> So what you could do here is to either:
> 
>   1) Save the needed IRQ mappings in some extra datastructure and
>  duplicate it in the SVA domain. This also makes it easier to
>  use the same SVA domain with multiple devices.
> 
>   2) Just re-use the 'page-tables' from the default domain in the
>  sva-domain. This needs to happen at attach-time, because at
>  allocation time you don't know the device yet.
> 
> I think 1) is the best solution, what do you think?

Right now 2) seems more feasible, because changes will be limited to the
IOMMU driver and don't spill in the dma-iommu layer. Maybe it will be
clearer to me once I write a prototype.

> Btw, things would be different if we could expose SVA through the
> DMA-API to drivers. In this case we could just make the default domain
> of type SVA and be done, but we are not there yet.

For the moment though, I think we should allow device drivers to use the
DMA-API at the same time as SVA. If a device driver has to map a
management ring buffer for example, it's much nicer to use dma_alloc as
usual rather than have them write their own DMA allocation routines.
Which is another reason to implement 2) above: the DMA-API would still
act on the default_domain, and attaching an "extended" domain augments
it with SVA features. Detaching from the device doesn't require copying
mappings back to the default domain. Does that sound OK?

Then if vfio-pci wants to use SVA, maybe we can make the UNMANAGED
domain support SVA by default? Otherwise introduce a UNMANAGED+ext
domain. Anyway, that's for later.

This is what I currently plan to add:

struct io_mm *io_mm;
struct iommu_domain *domain;

domain = iommu_alloc_ext_domain(bus);

/* Set an mm-exit callback to disable PASIDs. More attributes
   could be added later to change the capabilities of the ext
   domain */
iommu_domain_set_attr(domain, DOMAIN_ATTR_MM_EXIT_CB, _exit);

/* Fails if the device doesn't support this domain type */
iommu_attach_device(domain, dev);

/* Same as SVA v3, except a domain instead of dev as argument */
io_mm = iommu_sva_bind_mm(domain, current->mm, ...);

/* on success, install the PASID in the device */
pasid = io_mm->pasid;

/* Do more work */

iommu_sva_unbind_mm(io_mm);
iommu_detach_device(domain, dev);


Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-11-20 Thread Lu Baolu

Hi Joerg,

On 11/8/18 12:43 AM, j...@8bytes.org wrote:

Hi,

On Wed, Nov 07, 2018 at 11:40:30AM +0800, Lu Baolu wrote:

Hi Joerg,

On 11/7/18 12:25 AM, j...@8bytes.org wrote:
Nowadays, we can find PASID granular DMA translation on both ARM and x86
platforms. With PASID granular DMA translation supported in system iommu, if
a device divides itself into multiple subsets and tag the DMA
transfers of each subset with a unique PASID, each subset become
assignable. We call the assignable subset as an ADI (Assignable Device
Interface). As the result, each ADI could be attached with a domain.


Yeah, I know the background. The point is, the IOMMU-API as of today
implements a strict one-to-one relationship between domains and devices,
every device can only have one domain assigned at a given time. If we
assign a new domain to a device, the old gets unassigned.

If we allow to attach multiple domains to a single device we
fundamentally break that semantic.


Yes. In the latest v4 submission, we use the aux-domain specific APIs to
attach/detach the domain to a device. Do you see other APIs that will
possibly break this semantic?



Therefore I think it is better is support the ADI devices with
subdomains and a new set of functions in the API to handle only these
sub-domains.


Can you please elaborate a bit more about the concept of subdomains?
From my point of view, an aux-domain is a normal un-managed domain which
has a PASID and could be attached to any ADIs through the aux-domain
specific attach/detach APIs. It could also be attached to a device with
the existing domain_attach/detach_device() APIs at the same time, hence
mdev and pci devices in a vfio container could share a domain.

Best regards,
Lu Baolu




Further more, a single domain might be attached to an ADI of device A,
while attached to another legacy device B which doesn't support PASID
features. In this case, we say "Domain 4" is attached to ADI(PASID#x) in
aux mode and attached to device B in primary mode.


This will of course not work with subdomains, but is that really
necessary? VFIO already allocates a separate domain for each device
anyway (iirc), so it is unlikely that is uses the same domain for a
legacy and an ADI device.


Regards,

Joerg


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-11-12 Thread j...@8bytes.org
Hi Jean-Philippe,

On Thu, Nov 08, 2018 at 06:29:42PM +, Jean-Philippe Brucker wrote:
> (1) My initial approach would have been to use the same page tables for
> the default_domain and this new domain, but it might be precisely what
> you were trying to avoid with this new proposal: a device attached to
> two domains at the same time.

My request was about the initial assumptions a device driver can make
about a device. This assumptions is that DMA-API is set up and
initialized for it, for everything else (like SVA) the device driver
needs to take special action, like allocating an SVA domain and attaching
the device to it.

This should of course not break any IRQ setups for the device, and also
enforcing an ordering is not a good and maintainable solution.

So what you could do here is to either:

1) Save the needed IRQ mappings in some extra datastructure and
   duplicate it in the SVA domain. This also makes it easier to
   use the same SVA domain with multiple devices.

2) Just re-use the 'page-tables' from the default domain in the
   sva-domain. This needs to happen at attach-time, because at
   allocation time you don't know the device yet.

I think 1) is the best solution, what do you think?

Btw, things would be different if we could expose SVA through the
DMA-API to drivers. In this case we could just make the default domain
of type SVA and be done, but we are not there yet.

Regards,

Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-11-08 Thread Jean-Philippe Brucker
Hi,

On 06/11/2018 16:25, j...@8bytes.org wrote:
> On Mon, Oct 22, 2018 at 12:50:56PM +0100, Robin Murphy wrote:
>> To me, that sounds like a very good argument for having separate "attach as
>> primary domain" and "attach as aux domain" APIs.
> 
> I agree with that, overloading iommu_attach_device() to support
> aux-domains is not going to be a maintainable solution. I'd like this
> to be a two-level approach, where the aux-domains are sub-domains of a
> primary domain (naming is still to-be-improved):
> 
> 
>   struct iommu_domain *domain;
> 
>   domain = iommu_alloc_aux_domain();
> 
>   iommu_attach_device(pdev, domain); /* Fails if device has not
> support for this
> domain-type */

I'm still not sure how this should interact with MSI on Arm. On x86 the
MSI doorbell is a reserved IOVA region in the IOMMU, but for Arm it's a
MMIO register in a separate IRQ chip, upstream of the SMMU. From the
SMMU point of view, MSI is a normal DMA without PASID, and the
corresponding IOMMU mapping must be present.

SMMU mappings for the MSI doorbell are created by the IRQ driver, when
the device driver requests an IRQ. Roughly speaking it goes like:

 request_irq()
  irq_chip_compose_msi_msg()
   iommu_dma_map_msi_msg()
iommu_map(iommu_get_domain_for_dev(dev), ...)

Currently for "normal" device drivers, this will create the MSI mapping
on the default_domain (which is created before the drivers' probe()).
Device drivers that attach an unmanaged domain *and* use MSIs - only
VFIO so far - check if the MSI needs to be mapped, and then call
iommu_get_msi_cookie() before requesting IRQs.

With my previous SVA proposals the default_domain was used for SVA, so
we didn't need any change. But now, depending on the order between the
IRQ request and this attach(), iommu_get_domain_for_dev() will return
either default_domain or the new domain.

(1) My initial approach would have been to use the same page tables for
the default_domain and this new domain, but it might be precisely what
you were trying to avoid with this new proposal: a device attached to
two domains at the same time.

(2) Another solution is to enforce an order between request_irq() and
iommu_attach_device() for all device drivers that intend to use SVA, and
have them do the MSI mapping themselves between attaching the domain and
requesting IRQs.

I don't really like either but to make things easier for device drivers,
I would go with option (1).

Thanks,
Jean

> 
>   /* Bind a process address space to the aux-domain */
>   sva_handle = iommu_sva_bind_mm(domain, current, ...);
>   pasid = iommu_sva_get_pasid(sva_handle);
> 
>   mdev_handle = iommu_mdev_alloc_domain(domain);
>   iommu_mdev_map(mdev_handle, ...);
>   iommu_mdev_unmap(mdev_handle, ...);
> 
>   /* Do more work */
> 
>   iommu_sva_unbind_mm(sva_handle);
> 
>   iommu_mdev_free_domain(mdev_handle);
> 
>   iommu_detach_device(domain, pdev);
> 
> 
> Regards,
> 
>   Joerg
> ___
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-11-07 Thread Alex Williamson
On Wed, 7 Nov 2018 17:43:23 +0100
"j...@8bytes.org"  wrote:

> Hi,
> 
> On Wed, Nov 07, 2018 at 11:40:30AM +0800, Lu Baolu wrote:
> > Hi Joerg,
> > 
> > On 11/7/18 12:25 AM, j...@8bytes.org wrote:
> > Nowadays, we can find PASID granular DMA translation on both ARM and x86
> > platforms. With PASID granular DMA translation supported in system iommu, if
> > a device divides itself into multiple subsets and tag the DMA
> > transfers of each subset with a unique PASID, each subset become
> > assignable. We call the assignable subset as an ADI (Assignable Device
> > Interface). As the result, each ADI could be attached with a domain.  
> 
> Yeah, I know the background. The point is, the IOMMU-API as of today
> implements a strict one-to-one relationship between domains and devices,
> every device can only have one domain assigned at a given time. If we
> assign a new domain to a device, the old gets unassigned.
> 
> If we allow to attach multiple domains to a single device we
> fundamentally break that semantic.
> 
> Therefore I think it is better is support the ADI devices with
> subdomains and a new set of functions in the API to handle only these
> sub-domains.
> 
> > Further more, a single domain might be attached to an ADI of device A,
> > while attached to another legacy device B which doesn't support PASID
> > features. In this case, we say "Domain 4" is attached to ADI(PASID#x) in
> > aux mode and attached to device B in primary mode.  
> 
> This will of course not work with subdomains, but is that really
> necessary? VFIO already allocates a separate domain for each device
> anyway (iirc), so it is unlikely that is uses the same domain for a
> legacy and an ADI device.

VFIO will attempt to use the same domain for all groups within a
container, only falling back to separate domains if there's an
incompatibility.  Multiple domains means that we need to mirror mapping
and unmapping across each domain, so there's more work to do and
theoretically more overhead in the IOMMU implementation assuming those
domains land on the same IOMMU driver.  Thanks,

Alex
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-11-07 Thread j...@8bytes.org
Hi,

On Wed, Nov 07, 2018 at 11:40:30AM +0800, Lu Baolu wrote:
> Hi Joerg,
> 
> On 11/7/18 12:25 AM, j...@8bytes.org wrote:
> Nowadays, we can find PASID granular DMA translation on both ARM and x86
> platforms. With PASID granular DMA translation supported in system iommu, if
> a device divides itself into multiple subsets and tag the DMA
> transfers of each subset with a unique PASID, each subset become
> assignable. We call the assignable subset as an ADI (Assignable Device
> Interface). As the result, each ADI could be attached with a domain.

Yeah, I know the background. The point is, the IOMMU-API as of today
implements a strict one-to-one relationship between domains and devices,
every device can only have one domain assigned at a given time. If we
assign a new domain to a device, the old gets unassigned.

If we allow to attach multiple domains to a single device we
fundamentally break that semantic.

Therefore I think it is better is support the ADI devices with
subdomains and a new set of functions in the API to handle only these
sub-domains.

> Further more, a single domain might be attached to an ADI of device A,
> while attached to another legacy device B which doesn't support PASID
> features. In this case, we say "Domain 4" is attached to ADI(PASID#x) in
> aux mode and attached to device B in primary mode.

This will of course not work with subdomains, but is that really
necessary? VFIO already allocates a separate domain for each device
anyway (iirc), so it is unlikely that is uses the same domain for a
legacy and an ADI device.


Regards,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-11-06 Thread j...@8bytes.org
On Mon, Oct 22, 2018 at 12:50:56PM +0100, Robin Murphy wrote:
> To me, that sounds like a very good argument for having separate "attach as
> primary domain" and "attach as aux domain" APIs.

I agree with that, overloading iommu_attach_device() to support
aux-domains is not going to be a maintainable solution. I'd like this
to be a two-level approach, where the aux-domains are sub-domains of a
primary domain (naming is still to-be-improved):


struct iommu_domain *domain;

domain = iommu_alloc_aux_domain();

iommu_attach_device(pdev, domain); /* Fails if device has not
  support for this
  domain-type */

/* Bind a process address space to the aux-domain */
sva_handle = iommu_sva_bind_mm(domain, current, ...);
pasid = iommu_sva_get_pasid(sva_handle);

mdev_handle = iommu_mdev_alloc_domain(domain);
iommu_mdev_map(mdev_handle, ...);
iommu_mdev_unmap(mdev_handle, ...);

/* Do more work */

iommu_sva_unbind_mm(sva_handle);

iommu_mdev_free_domain(mdev_handle);

iommu_detach_device(domain, pdev);


Regards,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-11-01 Thread Lu Baolu

Hi,

On 10/23/18 12:48 AM, Jordan Crouse wrote:

On Fri, Oct 19, 2018 at 07:11:52PM +0100, Jean-Philippe Brucker wrote:


(2) Allocate a domain and attach it to the device.

   dom = iommu_domain_alloc()
   iommu_attach_device(dom, dev)

 I still have concerns about this part, which are highlighted by the
 messy changes of patch 1. I think it would make more sense to
 introduce new attach/detach_dev_aux() functions instead of reusing
 attach/detach_dev()

 Can we reconsider this and avoid unnecessary complications in IOMMU
 core and drivers? Does the VFIO code greatly benefit from using the
 same attach() function? It could as well use a different one for
 devices in AUXD mode, which the mediating driver could tell by
 adding a flag in mdev_set_iommu_device(), for example.

 And I don't think other users of AUXD would benefit from using the
 same attach() function, since they will know whether they want to be
 using main or auxiliary domain when doing attach().


I second this. For the arm-smmu-v2 multiple-pagetable model we would either need
new API functions or do something along the lines of

dom = iommu_domain_alloc()
iommu_set_attr(dom, DOMAIN_ATTR_I_WANT_TO_BE_AUX)
iommu_attach_device(dom,dev)

Either that or do some some dummy device magic that I haven't started to work
out in my head. Having aux specific functions just for this step would make the
arm-smmu-v2 version work out much cleaner.


Okay! If there are no more comments, I will prepare a new version with
aux specific functions.

Best regards,
Lu Baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-10-25 Thread Lu Baolu

Hi,

On 10/23/18 12:03 AM, Jean-Philippe Brucker wrote:

#1: Given that PASID is a system wide resource, during boot iommu driver needs
to scan and set the max PASID to be no more than min(iommu_supported_max_pasid)
of all available IOMMU's.

#2: In addition if any device supports less than the choosen system max PASID
we should simply disable PASID on that device.

The reason is if the process were to bind to 2 or more accelerators and
the second device has a limit smaller than the first that the application
already attached, the second attemt to bind would fail. (Because
we would use the same PASID for a single process)

But that's not reason enough to completely disable PASID for the device,
it could be the only one bound to that process, or PASID could be only
used privately by the host device driver.


In addition for Intel IOMMU's PASID is a system wide resource. We have
a virtual capability for vIOMMU's to allocate PASID's. At the time of
allocation we don't know what device this PASID is even being allocated
for.

Ah, I had missed that part, I thought the PV allocation had the device
ID as well. That's a significant change.

I was still hoping that we could go back to per-device PASID spaces in
the host, since I still haven't seen any convincing argument in favor of
system-wide PASID. Get rid of #1 in order to solve #2, and as a bonus
support more PASIDs in total. Until now PASID was a per-device resource
except for choices made when writing the IOMMU driver.



System wide PASID makes things simple and workable in many use cases
although it looks a bit coarse grained.

For an example, in a use case of one application manipulating multiple
devices with a single PASID, it will be complicated to find a PASID
suitable for all the per-device PASID spaces. It will become more
complicated when it comes to PV allocation since PV context might have
no device information when it requires a PASID.

Best regards,
Lu Baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-10-23 Thread Tian, Kevin
> From: Raj, Ashok
> Sent: Wednesday, October 24, 2018 1:17 AM
> 
> >
> > But that's not reason enough to completely disable PASID for the
> > device,
> > it could be the only one bound to that process, or PASID could be
> > only
> > used privately by the host device driver.
> 
> Agree, there could be other use cases.
> 
> If the device was already attached during boot the driver comes early
> to get the low PASID space. If the device was hot-added and the PASID
> supported by device wasn't available its going to fail.
> 
> Enforcing something that will always work will be more reliable. But i
> agree it maybe too strict for some cases.
> 
> Maybe its a IOMMU enforced limit for the platform on the minimum
> requirement for consistency.
> 

what about keeping the flexibility in common logic (e.g. allowing 
pasid_min and pasid_max in PASID allocator, detecting PASID
width limitation at binding time, etc.), while letting vendor driver
to vote disabling PASID on device based on its own need (e.g. if
not supporting 20bit width).

Thanks
kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-10-23 Thread Raj, Ashok
On Mon, 2018-10-22 at 17:03 +0100, Jean-Philippe Brucker wrote:
> On 22/10/2018 11:07, Raj, Ashok wrote:
> > > For my own convenience I've been using the SVA infrastructure
> > > since
> > > I already had the locking and IOMMU ops in place. The
> > > proposed
> > > interface is also missing min_pasid and max_pasid parameters,
> > > which
> > > could be needed by device drivers to enforce PASID limits.
> > 
> > Variable PASID limits per-device is hard to manage. I suppose we
> > can play
> > some games to get this right, but I suspect that wont be very
> > useful in 
> > the long run.
> 
> Devices may have PASID limits that are not discoverable with the PCI
> PASID capability (https://patchwork.kernel.org/patch/9989307/#2138957
> 1).
> Even if we simply reject devices that don't support enough PASIDs, I
> think it's still better to return an error on bind or init_device
> than
> to return a valid PASID that they can't use

That sounds reasonable. What I meant was that trying to do similar
things like what we do for IOVA (Reserving from high etc) may not work
when a process might need to share the same PASID between 2
accelerators.


> 
> > #1: Given that PASID is a system wide resource, during boot iommu
> > driver needs 
> > to scan and set the max PASID to be no more than
> > min(iommu_supported_max_pasid) 
> > of all available IOMMU's. 
> > 
> > #2: In addition if any device supports less than the choosen system
> > max PASID
> > we should simply disable PASID on that device.
> > 
> > The reason is if the process were to bind to 2 or more accelerators
> > and
> > the second device has a limit smaller than the first that the
> > application
> > already attached, the second attemt to bind would fail. (Because
> > we would use the same PASID for a single process)
> 
> But that's not reason enough to completely disable PASID for the
> device,
> it could be the only one bound to that process, or PASID could be
> only
> used privately by the host device driver.

Agree, there could be other use cases. 

If the device was already attached during boot the driver comes early
to get the low PASID space. If the device was hot-added and the PASID
supported by device wasn't available its going to fail.

Enforcing something that will always work will be more reliable. But i
agree it maybe too strict for some cases. 

Maybe its a IOMMU enforced limit for the platform on the minimum
requirement for consistency. 

> 
> > In addition for Intel IOMMU's PASID is a system wide resource. We
> > have
> > a virtual capability for vIOMMU's to allocate PASID's. At the time
> > of
> > allocation we don't know what device this PASID is even being
> > allocated 
> > for. 
> 
> Ah, I had missed that part, I thought the PV allocation had the
> device
> ID as well. That's a significant change.
> 
> I was still hoping that we could go back to per-device PASID spaces
> in
> the host, since I still haven't seen any convincing argument in favor
> of
> system-wide PASID. Get rid of #1 in order to solve #2, and as a bonus
> support more PASIDs in total. Until now PASID was a per-device
> resource
> except for choices made when writing the IOMMU driver.

A global shared PASID table is designed with a future ISA extension for
PASID isolation. 
> 
> > Certainly we could add other intelligence to pass a hint for 
> > max_pasid in the virtiual interface. 
> > 
> > I suppose when this becomes real, most serious accelerators will 
> > support the full width.
> 
> I don't know if that's obvious from the device perspective. If I had
> to
> implement one, I would simply size my PASIDs to the number of
> contexts
> supported by my device (which might be especially small in the
> embedded
> space), given that technically nothing prevents software from
> supporting
> that and the specification allows me to choose a width.

That's very reasonable. But supporting a smaller number of contexts
is different from supporting smaller number of PASID's. A device could
support the full PASID width, but limit the number of contexts to a
smaller number. Certainly if the device vendors don't know upfront that
could be an issue. 

> 
> This was the intended model for PCI, but thankfully version 4.0 added
> an
> implementation note (6.20.2.1 - PASID Field) advising against this
> approach, and to instead support 20 bits for interoperability. Maybe
> it
> will set a trend for non-PCI devices as well.

That's a great suggestion. In fact we have already made this suggestion
 to our PCI WG rep. For Scalable IOV devices the white-paper we
published should hint that we expect devices to support the full 20 bit
PASID width.

> 
> Thanks,
> Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-10-22 Thread Jean-Philippe Brucker
On 22/10/2018 12:50, Robin Murphy wrote:
> To me, that sounds like a very good argument for having separate "attach
> as primary domain" and "attach as aux domain" APIs. Say a driver wants
> to use multiple aux domains itself to isolate logically-separate
> transaction streams, but still also needs to control the main domain for
> transactions without PASID (interrupts?)

Yes, MSIs don't have a PASID (well, shouldn't). So in Arm systems, where
the address of the IRQ chip is translated by the SMMU, they are mapped
in the main domain.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-10-22 Thread Jordan Crouse
On Fri, Oct 19, 2018 at 07:11:52PM +0100, Jean-Philippe Brucker wrote:

> (2) Allocate a domain and attach it to the device.
> 
>   dom = iommu_domain_alloc()
>   iommu_attach_device(dom, dev)
> 
> I still have concerns about this part, which are highlighted by the
> messy changes of patch 1. I think it would make more sense to
> introduce new attach/detach_dev_aux() functions instead of reusing
> attach/detach_dev()
> 
> Can we reconsider this and avoid unnecessary complications in IOMMU
> core and drivers? Does the VFIO code greatly benefit from using the
> same attach() function? It could as well use a different one for
> devices in AUXD mode, which the mediating driver could tell by
> adding a flag in mdev_set_iommu_device(), for example.
> 
> And I don't think other users of AUXD would benefit from using the
> same attach() function, since they will know whether they want to be
> using main or auxiliary domain when doing attach().

I second this. For the arm-smmu-v2 multiple-pagetable model we would either need
new API functions or do something along the lines of 

dom = iommu_domain_alloc()
iommu_set_attr(dom, DOMAIN_ATTR_I_WANT_TO_BE_AUX)
iommu_attach_device(dom,dev)

Either that or do some some dummy device magic that I haven't started to work
out in my head. Having aux specific functions just for this step would make the
arm-smmu-v2 version work out much cleaner.

Jordan

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-10-22 Thread Jean-Philippe Brucker
On 22/10/2018 11:07, Raj, Ashok wrote:
>> For my own convenience I've been using the SVA infrastructure since
>> I already had the locking and IOMMU ops in place. The proposed
>> interface is also missing min_pasid and max_pasid parameters, which
>> could be needed by device drivers to enforce PASID limits.
> 
> Variable PASID limits per-device is hard to manage. I suppose we can play
> some games to get this right, but I suspect that wont be very useful in 
> the long run.

Devices may have PASID limits that are not discoverable with the PCI
PASID capability (https://patchwork.kernel.org/patch/9989307/#21389571).
Even if we simply reject devices that don't support enough PASIDs, I
think it's still better to return an error on bind or init_device than
to return a valid PASID that they can't use

> #1: Given that PASID is a system wide resource, during boot iommu driver 
> needs 
> to scan and set the max PASID to be no more than 
> min(iommu_supported_max_pasid) 
> of all available IOMMU's. 
> 
> #2: In addition if any device supports less than the choosen system max PASID
> we should simply disable PASID on that device.
> 
> The reason is if the process were to bind to 2 or more accelerators and
> the second device has a limit smaller than the first that the application
> already attached, the second attemt to bind would fail. (Because
> we would use the same PASID for a single process)

But that's not reason enough to completely disable PASID for the device,
it could be the only one bound to that process, or PASID could be only
used privately by the host device driver.

> In addition for Intel IOMMU's PASID is a system wide resource. We have
> a virtual capability for vIOMMU's to allocate PASID's. At the time of
> allocation we don't know what device this PASID is even being allocated 
> for. 

Ah, I had missed that part, I thought the PV allocation had the device
ID as well. That's a significant change.

I was still hoping that we could go back to per-device PASID spaces in
the host, since I still haven't seen any convincing argument in favor of
system-wide PASID. Get rid of #1 in order to solve #2, and as a bonus
support more PASIDs in total. Until now PASID was a per-device resource
except for choices made when writing the IOMMU driver.

> Certainly we could add other intelligence to pass a hint for 
> max_pasid in the virtiual interface. 
> 
> I suppose when this becomes real, most serious accelerators will 
> support the full width.

I don't know if that's obvious from the device perspective. If I had to
implement one, I would simply size my PASIDs to the number of contexts
supported by my device (which might be especially small in the embedded
space), given that technically nothing prevents software from supporting
that and the specification allows me to choose a width.

This was the intended model for PCI, but thankfully version 4.0 added an
implementation note (6.20.2.1 - PASID Field) advising against this
approach, and to instead support 20 bits for interoperability. Maybe it
will set a trend for non-PCI devices as well.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-10-22 Thread Jordan Crouse
On Mon, Oct 22, 2018 at 12:50:56PM +0100, Robin Murphy wrote:
> On 22/10/2018 07:53, Tian, Kevin wrote:
> >>From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com]
> >>Sent: Saturday, October 20, 2018 2:12 AM
> >>
> >>This is a first prototype adding auxiliary domain support to Arm SMMUv3,
> >>following Lu Baolu's latest proposal for IOMMU aware mediated devices
> >>[1]. It works, but the attach() API still doesn't feel right. See (2)
> >>below.
> >>
> >>Patch 1 adapts iommu.c to the current proposal for auxiliary domains.
> >>Patches 2-4 rework the PASID allocator to make it usable for SVA and
> >>AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3.
> >>
> >>
> >>When a device can have multiple address space, for instance with PCI
> >>PASID, an auxiliary domain (AUXD) is the IOMMU representation of one
> >>address space. I distinguish auxiliary from "main" domain, which
> >>represents the non-PASID address space but also (at least for SMMUv3)
> >>the whole device context, PASID tables etc.
> >
> >I'd like to clarify a bit. :-)
> >
> >a domain can always represent one or more address spaces, where an
> >address space could be for IOVA or GPA and/or other address spaces for
> >SVA. Address spaces may be chained together in so-called nested mode.
> >
> >a domain can be associated with a full device (in BDF granular), and/or
> >with a partition of a device (say in PASID granular). In the former case,
> >the domain becomes the master (or primary) domain of the device. In
> >the latter case, the domain becomes the auxiliary domain of the device.
> >
> >vendor domain structure may include vendor-specific states which
> >are applied to device context at attach time, e.g. PASID table (SMMUv3)
> >if representing as master domain, or PASID value (VT-d scalable mode)
> >if representing as auxiliary domain.
> >
> >so the domain definition is never changed with the introduction of
> >AUXD. 'auxiliary' is a per-device attribute which takes effect when at
> >domain attach time. A domain is perfectly sane to represent as a
> >master domain to deviceA and an auxiliary domain to deviceB at the
> >same time (say when device A and a mdev on deviceB are assigned to
> >same VM), while supporting one or more address spaces.
> 
> To me, that sounds like a very good argument for having separate
> "attach as primary domain" and "attach as aux domain" APIs. Say a
> driver wants to use multiple aux domains itself to isolate
> logically-separate transaction streams, but still also needs to
> control the main domain for transactions without PASID (interrupts?)
> - having to juggle some invisible device state around multiple
> iommu_{attach,detach}_group() calls which look identical but are
> expected to behave differently at runtime sounds like a recipe for
> unmaintainable code.
> 
> I don't think it's necessarily safe to assume that
> one-struct-device-per-PASID will be the only usage model.

This feels exactly like the QCOM GPU model - multiple aux domains
for individual process pagetables but the main domain for TTBR1
(global) allocations.

Jordan

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-10-22 Thread Robin Murphy

On 22/10/2018 07:53, Tian, Kevin wrote:

From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com]
Sent: Saturday, October 20, 2018 2:12 AM

This is a first prototype adding auxiliary domain support to Arm SMMUv3,
following Lu Baolu's latest proposal for IOMMU aware mediated devices
[1]. It works, but the attach() API still doesn't feel right. See (2)
below.

Patch 1 adapts iommu.c to the current proposal for auxiliary domains.
Patches 2-4 rework the PASID allocator to make it usable for SVA and
AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3.


When a device can have multiple address space, for instance with PCI
PASID, an auxiliary domain (AUXD) is the IOMMU representation of one
address space. I distinguish auxiliary from "main" domain, which
represents the non-PASID address space but also (at least for SMMUv3)
the whole device context, PASID tables etc.


I'd like to clarify a bit. :-)

a domain can always represent one or more address spaces, where an
address space could be for IOVA or GPA and/or other address spaces for
SVA. Address spaces may be chained together in so-called nested mode.

a domain can be associated with a full device (in BDF granular), and/or
with a partition of a device (say in PASID granular). In the former case,
the domain becomes the master (or primary) domain of the device. In
the latter case, the domain becomes the auxiliary domain of the device.

vendor domain structure may include vendor-specific states which
are applied to device context at attach time, e.g. PASID table (SMMUv3)
if representing as master domain, or PASID value (VT-d scalable mode)
if representing as auxiliary domain.

so the domain definition is never changed with the introduction of
AUXD. 'auxiliary' is a per-device attribute which takes effect when at
domain attach time. A domain is perfectly sane to represent as a
master domain to deviceA and an auxiliary domain to deviceB at the
same time (say when device A and a mdev on deviceB are assigned to
same VM), while supporting one or more address spaces.


To me, that sounds like a very good argument for having separate "attach 
as primary domain" and "attach as aux domain" APIs. Say a driver wants 
to use multiple aux domains itself to isolate logically-separate 
transaction streams, but still also needs to control the main domain for 
transactions without PASID (interrupts?) - having to juggle some 
invisible device state around multiple iommu_{attach,detach}_group() 
calls which look identical but are expected to behave differently at 
runtime sounds like a recipe for unmaintainable code.


I don't think it's necessarily safe to assume that 
one-struct-device-per-PASID will be the only usage model.


Robin.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-10-22 Thread Raj, Ashok
On Fri, Oct 19, 2018 at 07:11:52PM +0100, Jean-Philippe Brucker wrote:
> This is a first prototype adding auxiliary domain support to Arm SMMUv3,
> following Lu Baolu's latest proposal for IOMMU aware mediated devices
> [1]. It works, but the attach() API still doesn't feel right. See (2)
> below.
> 
> Patch 1 adapts iommu.c to the current proposal for auxiliary domains.
> Patches 2-4 rework the PASID allocator to make it usable for SVA and
> AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3.
> 
> 
> When a device can have multiple address space, for instance with PCI
> PASID, an auxiliary domain (AUXD) is the IOMMU representation of one
> address space. I distinguish auxiliary from "main" domain, which
> represents the non-PASID address space but also (at least for SMMUv3)
> the whole device context, PASID tables etc.
> 
> Auxiliary domains will be used by VFIO for IOMMU-aware mdev, and by any
> other device driver that wants to use PASID for private address spaces
> (as opposed to SVA [2]). The following API is available to device
> drivers:
> 
> (1) Enable AUXD for a device. Enable PASID if necessary and set an AUXD
> flag on the IOMMU data associated to a device.
> 
> For my own convenience I've been using the SVA infrastructure since
> I already had the locking and IOMMU ops in place. The proposed
> interface is also missing min_pasid and max_pasid parameters, which
> could be needed by device drivers to enforce PASID limits.

Variable PASID limits per-device is hard to manage. I suppose we can play
some games to get this right, but I suspect that wont be very useful in 
the long run.

#1: Given that PASID is a system wide resource, during boot iommu driver needs 
to scan and set the max PASID to be no more than min(iommu_supported_max_pasid) 
of all available IOMMU's. 

#2: In addition if any device supports less than the choosen system max PASID
we should simply disable PASID on that device.

The reason is if the process were to bind to 2 or more accelerators and
the second device has a limit smaller than the first that the application
already attached, the second attemt to bind would fail. (Because
we would use the same PASID for a single process)

In addition for Intel IOMMU's PASID is a system wide resource. We have
a virtual capability for vIOMMU's to allocate PASID's. At the time of
allocation we don't know what device this PASID is even being allocated 
for. Certainly we could add other intelligence to pass a hint for 
max_pasid in the virtiual interface. 

I suppose when this becomes real, most serious accelerators will 
support the full width. Hence doing #1 and #2 above should be good
for most cases.

Cheers,
Ashok
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-10-22 Thread Tian, Kevin
> From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com]
> Sent: Saturday, October 20, 2018 2:12 AM
> 
> This is a first prototype adding auxiliary domain support to Arm SMMUv3,
> following Lu Baolu's latest proposal for IOMMU aware mediated devices
> [1]. It works, but the attach() API still doesn't feel right. See (2)
> below.
> 
> Patch 1 adapts iommu.c to the current proposal for auxiliary domains.
> Patches 2-4 rework the PASID allocator to make it usable for SVA and
> AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3.
> 
> 
> When a device can have multiple address space, for instance with PCI
> PASID, an auxiliary domain (AUXD) is the IOMMU representation of one
> address space. I distinguish auxiliary from "main" domain, which
> represents the non-PASID address space but also (at least for SMMUv3)
> the whole device context, PASID tables etc.

I'd like to clarify a bit. :-)

a domain can always represent one or more address spaces, where an 
address space could be for IOVA or GPA and/or other address spaces for 
SVA. Address spaces may be chained together in so-called nested mode.

a domain can be associated with a full device (in BDF granular), and/or 
with a partition of a device (say in PASID granular). In the former case,
the domain becomes the master (or primary) domain of the device. In
the latter case, the domain becomes the auxiliary domain of the device.

vendor domain structure may include vendor-specific states which
are applied to device context at attach time, e.g. PASID table (SMMUv3) 
if representing as master domain, or PASID value (VT-d scalable mode)
if representing as auxiliary domain.

so the domain definition is never changed with the introduction of
AUXD. 'auxiliary' is a per-device attribute which takes effect when at
domain attach time. A domain is perfectly sane to represent as a
master domain to deviceA and an auxiliary domain to deviceB at the
same time (say when device A and a mdev on deviceB are assigned to
same VM), while supporting one or more address spaces.

I explained above concept in my KVM forum session:

https://events.linuxfoundation.org/wp-content/uploads/2017/12/Hardware-Assisted-Mediated-Pass-Through-with-VFIO-Kevin-Tian-Intel.pdf
 (slide 16/17)

> 
> Auxiliary domains will be used by VFIO for IOMMU-aware mdev, and by
> any
> other device driver that wants to use PASID for private address spaces
> (as opposed to SVA [2]). The following API is available to device
> drivers:
> 
> (1) Enable AUXD for a device. Enable PASID if necessary and set an AUXD
> flag on the IOMMU data associated to a device.
> 
> For my own convenience I've been using the SVA infrastructure since
> I already had the locking and IOMMU ops in place. The proposed
> interface is also missing min_pasid and max_pasid parameters, which
> could be needed by device drivers to enforce PASID limits.
> iommu_sva_init_device() without arguments already enables PASID, so
> I just added an AUXD flag to SVA features:
> 
>   iommu_sva_init_device(dev, IOMMU_SVA_FEAT_AUXD,
> min_pasid, max_pasid, NULL)
>   iommu_sva_shutdown_device(dev)
> 
> Or as proposed in [1]:
> 
>   iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE, NULL)
>   iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE, NULL)
> 
> Finding a compromise for this interface should be easy.
> 
> (2) Allocate a domain and attach it to the device.
> 
>   dom = iommu_domain_alloc()
>   iommu_attach_device(dom, dev)
> 
> I still have concerns about this part, which are highlighted by the
> messy changes of patch 1. I think it would make more sense to
> introduce new attach/detach_dev_aux() functions instead of reusing
> attach/detach_dev()
> 
> Can we reconsider this and avoid unnecessary complications in IOMMU
> core and drivers? Does the VFIO code greatly benefit from using the
> same attach() function? It could as well use a different one for
> devices in AUXD mode, which the mediating driver could tell by
> adding a flag in mdev_set_iommu_device(), for example.

Baolu gave some recommendations to patch 1. Please check whether it can
help reduce the mess.

IMO using same API is conceptually clearer to VFIO... let's gather in KVM
forum to have a conclusion, with Alex being there.

Thanks
Kevin

> 
> And I don't think other users of AUXD would benefit from using the
> same attach() function, since they will know whether they want to be
> using main or auxiliary domain when doing attach().
> 
> (3) Get the PASID, and program it in the device
> 
>   iommu_domain_get_attr(dom, DOMAIN_ATTR_AUXD_ID, )
> 
> (4) Create DMA mappings
> 
>   iommu_map(dom, ...)
>   iommu_unmap(dom, ...)
> 
> Ultimately it would be nice to add PASID support to the DMA API as
> well. For now drivers allocate IOVAs and pages themselves.
> 
> 
> For vfio-mdev, a driver that wants to create mdevs only 

Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-10-19 Thread Xu Zaibo

Hi Jean,

On 2018/10/20 2:11, Jean-Philippe Brucker wrote:

This is a first prototype adding auxiliary domain support to Arm SMMUv3,
following Lu Baolu's latest proposal for IOMMU aware mediated devices
[1]. It works, but the attach() API still doesn't feel right. See (2)
below.

Patch 1 adapts iommu.c to the current proposal for auxiliary domains.
Patches 2-4 rework the PASID allocator to make it usable for SVA and
AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3.


When a device can have multiple address space, for instance with PCI
PASID, an auxiliary domain (AUXD) is the IOMMU representation of one
address space. I distinguish auxiliary from "main" domain, which
represents the non-PASID address space but also (at least for SMMUv3)
the whole device context, PASID tables etc.

Auxiliary domains will be used by VFIO for IOMMU-aware mdev, and by any
other device driver that wants to use PASID for private address spaces
(as opposed to SVA [2]). The following API is available to device
drivers:

(1) Enable AUXD for a device. Enable PASID if necessary and set an AUXD
 flag on the IOMMU data associated to a device.

 For my own convenience I've been using the SVA infrastructure since
 I already had the locking and IOMMU ops in place. The proposed
 interface is also missing min_pasid and max_pasid parameters, which
 could be needed by device drivers to enforce PASID limits.
 iommu_sva_init_device() without arguments already enables PASID, so
 I just added an AUXD flag to SVA features:

   iommu_sva_init_device(dev, IOMMU_SVA_FEAT_AUXD,
 min_pasid, max_pasid, NULL)
   iommu_sva_shutdown_device(dev)

 Or as proposed in [1]:

   iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE, NULL)
   iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE, NULL)

 Finding a compromise for this interface should be easy.


Personal idea, from the respective vendors, PASID limit should be set as 
enabling.




(2) Allocate a domain and attach it to the device.

   dom = iommu_domain_alloc()
   iommu_attach_device(dom, dev)

 I still have concerns about this part, which are highlighted by the
 messy changes of patch 1. I think it would make more sense to
 introduce new attach/detach_dev_aux() functions instead of reusing
 attach/detach_dev()

 Can we reconsider this and avoid unnecessary complications in IOMMU
 core and drivers? Does the VFIO code greatly benefit from using the
 same attach() function? It could as well use a different one for
 devices in AUXD mode, which the mediating driver could tell by
 adding a flag in mdev_set_iommu_device(), for example.

 And I don't think other users of AUXD would benefit from using the
 same attach() function, since they will know whether they want to be
 using main or auxiliary domain when doing attach().


It seems that SVA/AUXD have no better sulotion for users applcations 
except VFIO.
But VFIO seems caring for VMs' scenario more than general user space 
processes at present. :)




(3) Get the PASID, and program it in the device

   iommu_domain_get_attr(dom, DOMAIN_ATTR_AUXD_ID, )

(4) Create DMA mappings

   iommu_map(dom, ...)
   iommu_unmap(dom, ...)

 Ultimately it would be nice to add PASID support to the DMA API as
 well. For now drivers allocate IOVAs and pages themselves.


For vfio-mdev, a driver that wants to create mdevs only performs steps (1)
and (3):

* When initializing the parent device, enable AUXD (1)
* In mdev_parent_ops::create(), call mdev_set_iommu_device(mdev_dev(mdev),
   mdev_parent_dev(mdev)).
* In mdev_parent_ops::open(), get the PASID (3) and install it in the
   parent device.
* In mdev_parent_ops::close(), clear the PASID


This may be a good solution to avoid exposing PASID to user space. 
Personally, VFIO-PCI is
not in the range of this using scenario for partial device while 
PASID/SVA/AUXD are enabled.


Cheers,
Zaibo

.



This code and the many patches it depends on can be found on my
iommu/auxd branch:
git://linux-arm.org/linux-jpb.git iommu/auxd

[1] [PATCH v3 0/8] vfio/mdev: IOMMU aware mediated device
 
https://lwn.net/ml/linux-kernel/20181012051632.26064-1-baolu...@linux.intel.com/
[2] [PATCH v3 00/10] Shared Virtual Addressing for the IOMMU
 https://www.spinics.net/lists/iommu/msg30076.html

Jean-Philippe Brucker (6):
   iommu: Adapt attach/detach_dev() for auxiliary domains
   drivers core: Add I/O ASID allocator
   iommu/sva: Use external PASID allocator
   iommu/sva: Support AUXD feature
   iommu/arm-smmu-v3: Implement detach_dev op
   iommu/arm-smmu-v3: Add support for auxiliary domains

  drivers/base/Kconfig|   7 ++
  drivers/base/Makefile   |   1 +
  drivers/base/ioasid.c   | 140 ++
  drivers/iommu/Kconfig   |   1 +
  drivers/iommu/arm-smmu-v3.c | 164 ++--