Re: [RFC 11/20] iommu/iommufd: Add IOMMU_IOASID_ALLOC/FREE

2021-10-17 Thread da...@gibson.dropbear.id.au
On Thu, Oct 14, 2021 at 12:06:10PM -0300, Jason Gunthorpe wrote:
> On Thu, Oct 14, 2021 at 03:33:21PM +1100, da...@gibson.dropbear.id.au wrote:
> 
> > > If the HW can attach multiple non-overlapping IOAS's to the same
> > > device then the HW is routing to the correct IOAS by using the address
> > > bits. This is not much different from the prior discussion we had
> > > where we were thinking of the PASID as an 80 bit address
> > 
> > Ah... that might be a workable approach.  And it even helps me get my
> > head around multiple attachment which I was struggling with before.
> > 
> > So, the rule would be that you can attach multiple IOASes to a device,
> > as long as none of them overlap.  The non-overlapping could be because
> > each IOAS covers a disjoint address range, or it could be because
> > there's some attached information - such as a PASID - to disambiguate.
> 
> Right exactly - it is very parallel to PASID
> 
> And obviously HW support is required to have multiple page table
> pointers per RID - which sounds like PPC does (high/low pointer?)

Hardware support is require *in the IOMMU*.  Nothing (beyond regular
64-bit DMA support) is required in the endpoint devices.  That's not
true of PASID.

> > What remains a question is where the disambiguating information comes
> > from in each case: does it come from properties of the IOAS,
> > propertues of the device, or from extra parameters supplied at attach
> > time.  IIUC, the current draft suggests it always comes at attach time
> > for the PASID information.  Obviously the more consistency we can have
> > here the better.
> 
> From a generic view point I'd say all are fair game. It is up to the
> IOMMU driver to take the requested set of IOAS's, the "at attachment"
> information (like PASID) and decide what to do, or fail.

Ok, that's a model that makes sense to me.

> > I can also see an additional problem in implementation, once we start
> > looking at hot-adding devices to existing address spaces.  
> 
> I won't pretend to guess how to implement this :) Just from a modeling
> perspective is something that works logically. If the kernel
> implementation is too hard then PPC should do one of the other ideas.
> 
> Personally I'd probably try for a nice multi-domain attachment model
> like PASID and not try to create/destroy domains.

I don't really follow what you mean by that.

> As I said in my last email I think it is up to each IOMMU HW driver to
> make these decisions, the iommufd framework just provides a
> standardized API toward the attaching driver that the IOMMU HW must
> fit into.
> 
> Jason
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 11/20] iommu/iommufd: Add IOMMU_IOASID_ALLOC/FREE

2021-10-17 Thread David Gibson
On Thu, Oct 14, 2021 at 11:52:08AM -0300, Jason Gunthorpe wrote:
> On Thu, Oct 14, 2021 at 03:53:33PM +1100, David Gibson wrote:
> 
> > > My feeling is that qemu should be dealing with the host != target
> > > case, not the kernel.
> > > 
> > > The kernel's job should be to expose the IOMMU HW it has, with all
> > > features accessible, to userspace.
> > 
> > See... to me this is contrary to the point we agreed on above.
> 
> I'm not thinking of these as exclusive ideas.
> 
> The IOCTL interface in iommu can quite happily expose:
>  Create IOAS generically
>  Manipulate IOAS generically
>  Create IOAS with IOMMU driver specific attributes
>  HW specific Manipulate IOAS
> 
> IOCTL commands all together.
> 
> So long as everything is focused on a generic in-kernel IOAS object it
> is fine to have multiple ways in the uAPI to create and manipulate the
> objects.
> 
> When I speak about a generic interface I mean "Create IOAS
> generically" - ie a set of IOCTLs that work on most IOMMU HW and can
> be relied upon by things like DPDK/etc to always work and be portable.
> This is why I like "hints" to provide some limited widely applicable
> micro-optimization.
> 
> When I said "expose the IOMMU HW it has with all features accessible"
> I mean also providing "Create IOAS with IOMMU driver specific
> attributes".
> 
> These other IOCTLs would allow the IOMMU driver to expose every
> configuration knob its HW has, in a natural HW centric language.
> There is no pretense of genericness here, no crazy foo=A, foo=B hidden
> device specific interface.
> 
> Think of it as a high level/low level interface to the same thing.

Ok, I see what you mean.

> > Those are certainly wrong, but they came about explicitly by *not*
> > being generic rather than by being too generic.  So I'm really
> > confused aso to what you're arguing for / against.
> 
> IMHO it is not having a PPC specific interface that was the problem,
> it was making the PPC specific interface exclusive to the type 1
> interface. If type 1 continued to work on PPC then DPDK/etc would
> never learned PPC specific code.

Ok, but the reason this happened is that the initial version of type 1
*could not* be used on PPC.  The original Type 1 implicitly promised a
"large" IOVA range beginning at IOVA 0 without any real way of
specifying or discovering how large that range was.  Since ppc could
typically only give a 2GiB range at IOVA 0, that wasn't usable.

That's why I say the problem was not making type1 generic enough.  I
believe the current version of Type1 has addressed this - at least
enough to be usable in common cases.  But by this time the ppc backend
is already out there, so no-one's had the capacity to go back and make
ppc work with Type1.

> For iommufd with the high/low interface each IOMMU HW should ask basic
> questions:
> 
>  - What should the generic high level interface do on this HW?
>For instance what should 'Create IOAS generically' do for PPC?
>It should not fail, it should create *something*
>What is the best thing for DPDK?
>I guess the 64 bit window is most broadly useful.

Right, which means the kernel must (at least in the common case) have
the capcity to choose and report a non-zero base-IOVA.

Hrm... which makes me think... if we allow this for the common
kernel-managed case, do we even need to have capcity in the high-level
interface for reporting IO holes?  If the kernel can choose a non-zero
base, it could just choose on x86 to place it's advertised window
above the IO hole.

>  - How to accurately describe the HW in terms of standard IOAS objects
>and where to put HW specific structs to support this.
> 
>This is where PPC would decide how best to expose a control over
>its low/high window (eg 1,2,3 IOAS). Whatever the IOMMU driver
>wants, so long as it fits into the kernel IOAS model facing the
>connected device driver.
> 
> QEMU would have IOMMU userspace drivers. One would be the "generic
> driver" using only the high level generic interface. It should work as
> best it can on all HW devices. This is the fallback path you talked
> of.
> 
> QEMU would also have HW specific IOMMU userspace drivers that know how
> to operate the exact HW. eg these drivers would know how to use
> userspace page tables, how to form IOPTEs and how to access the
> special features.
> 
> This is how QEMU could use an optimzed path with nested page tables,
> for instance.

The concept makes sense in general.  The devil's in the details, as usual.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 13/20] iommu: Extend iommu_at[de]tach_device() for multiple devices group

2021-10-17 Thread David Gibson
On Thu, Oct 14, 2021 at 07:06:07AM +, Tian, Kevin wrote:
> > From: David Gibson 
> > Sent: Thursday, October 14, 2021 1:24 PM
> > 
> > On Sun, Sep 19, 2021 at 02:38:41PM +0800, Liu Yi L wrote:
> > > From: Lu Baolu 
> > >
> > > These two helpers could be used when 1) the iommu group is singleton,
> > > or 2) the upper layer has put the iommu group into the secure state by
> > > calling iommu_device_init_user_dma().
> > >
> > > As we want the iommufd design to be a device-centric model, we want to
> > > remove any group knowledge in iommufd. Given that we already have
> > > iommu_at[de]tach_device() interface, we could extend it for iommufd
> > > simply by doing below:
> > >
> > >  - first device in a group does group attach;
> > >  - last device in a group does group detach.
> > >
> > > as long as the group has been put into the secure context.
> > >
> > > The commit <426a273834eae> ("iommu: Limit
> > iommu_attach/detach_device to
> > > device with their own group") deliberately restricts the two interfaces
> > > to single-device group. To avoid the conflict with existing usages, we
> > > keep this policy and put the new extension only when the group has been
> > > marked for user_dma.
> > 
> > I still kind of hate this interface because it means an operation that
> > appears to be explicitly on a single device has an implicit effect on
> > other devices.
> > 
> 
> I still didn't get your concern why it's such a big deal. With this proposal
> the group restriction will be 'explicitly' documented in the attach uAPI
> comment and sample flow in iommufd.rst. A sane user should read all
> those information to understand how this new sub-system works and
> follow whatever constraints claimed there. In the end the user should
> maintain the same group knowledge regardless of whether to use an
> explicit group uAPI or a device uAPI which has group constraint...

The first user might read this.  Subsequent users are likely to just
copy paste examples from earlier things without fully understanding
them.  In general documenting restrictions somewhere is never as
effective as making those restrictions part of the interface signature
itself.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [RFC 06/20] iommu: Add iommu_device_init[exit]_user_dma interfaces

2021-10-17 Thread Tian, Kevin
> From: Jason Gunthorpe 
> Sent: Friday, October 15, 2021 7:10 PM
> 
> On Fri, Oct 15, 2021 at 01:29:16AM +, Tian, Kevin wrote:
> > Hi, Jason,
> >
> > > From: Jason Gunthorpe 
> > > Sent: Wednesday, September 29, 2021 8:59 PM
> > >
> > > On Wed, Sep 29, 2021 at 12:38:35AM +, Tian, Kevin wrote:
> > >
> > > > /* If set the driver must call iommu_XX as the first action in probe() 
> > > > or
> > > >   * before it attempts to do DMA
> > > >   */
> > > >  bool suppress_dma_owner:1;
> > >
> > > It is not "attempts to do DMA" but more "operates the physical device
> > > in any away"
> > >
> > > Not having ownership means another entity could be using user space
> > > DMA to manipulate the device state and attack the integrity of the
> > > kernel's programming of the device.
> > >
> >
> > Does suppress_kernel_dma sounds better than suppress_dma_owner?
> > We found the latter causing some confusion when doing internal
> > code review. Somehow this flag represents "don't claim the kernel dma
> > ownership during driver binding". suppress_dma_owner sounds the
> > entire ownership is disabled...
> 
> If in doubt make it
> 
> suppress_iommu_whatever_the_api_is_that_isn't_called

ok

> 
> > Another thing is about DMA_OWNER_SHARED, which is set to indicate
> > no dma at all. Thinking more we feel that this flag is meaningless. Its
> > sole purpose is to show compatibility to any USER/KERNEL ownership,
> > and essentially the same semantics as a device which is not bound to
> > any driver. So we plan to remove it then pci-stub just needs one line
> > change to set the suppress flag. But want to check with you first in case
> > any oversight.
> 
> It sounds reasonable, but also makes it much harder to find the few
> places that have this special relationship - ie we can't grep for
> DMA_OWNER_SHARED anymore.
> 

It's probably fine. People can just search the suppress flag and filter out
drivers with DMA_OWNER_USER. Then the remaining set is about 
drivers in SHARED category. Less straightforward but should be fine
for a relatively small set of drivers.

Thanks
Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RFC] virtio: wrap config->reset calls

2021-10-17 Thread Wolfram Sang
On Wed, Oct 13, 2021 at 06:55:31AM -0400, Michael S. Tsirkin wrote:
> This will enable cleanups down the road.
> The idea is to disable cbs, then add "flush_queued_cbs" callback
> as a parameter, this way drivers can flush any work
> queued after callbacks have been disabled.
> 
> Signed-off-by: Michael S. Tsirkin 

Acked-by: Wolfram Sang  # for I2C changes



signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu