Re: [Qemu-devel] [PATCH v4 RESEND 0/3] IOMMU: intel_iommu support map and unmap notifications

2016-10-31 Thread Peter Xu
On Fri, Oct 21, 2016 at 08:43:05AM -0600, Alex Williamson wrote:

[...]

> > > Well, I think that would work.  But I think it would be better to fix
> > > it from the other side:
> > > 
> > > We add the range to be notified into the IOMMUNotifier structure and
> > > filter based on that range in memory_region_notify_iommu.
> > > 
> > > It means a little more list searching and filtering on notify, but it
> > > avoids having to have another list and search on the VFIO side.  I
> > > think it will also better deal with cases where part of an IOMMU
> > > mapped region is inaccessible due to an intermediate bridge.  
> > 
> > IIUC, this will still need to keep several VFIOGuestIOMMUs which
> > contain exactly the same content?
> 
> They contain different offsets since this is based on the section rather
> than the MemoryRegion.  Thanks,

Though vfio_listener_region_add() is called per memory region section,
VFIOGuestIOMMU looks like it is per memory region, no? See in
vfio_listener_region_add():

giommu->iommu_offset = section->offset_within_address_space -
   section->offset_within_region;

Here VFIOGuestIOMMU.iommu_offset should be the offset of the memory
region (rather than the memory region section offset), right?
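To make that concrete, here is a tiny worked example with hypothetical
numbers, assuming one IOMMU MemoryRegion mapped at IOVA 0 and split into
two sections by a one-page hole; if the reading above is right, both
sections yield the same iommu_offset:

    /* Hypothetical numbers only, mirroring the formula quoted above. */
    #include <inttypes.h>
    #include <stdio.h>

    int main(void)
    {
        /* Section A covers region offsets [0x0, 0x1000), mapped at 0x0. */
        uint64_t a_ows = 0x0,    a_owr = 0x0;
        /* Section B covers region offsets [0x2000, ...), mapped at 0x2000. */
        uint64_t b_ows = 0x2000, b_owr = 0x2000;

        printf("A: iommu_offset = 0x%" PRIx64 "\n", a_ows - a_owr);
        printf("B: iommu_offset = 0x%" PRIx64 "\n", b_ows - b_owr);
        /* Both print 0x0: the offset depends on where the MemoryRegion
         * sits in the address space, not on the individual section. */
        return 0;
    }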

Thanks,

-- peterx



Re: [Qemu-devel] [PATCH v4 RESEND 0/3] IOMMU: intel_iommu support map and unmap notifications

2016-10-24 Thread David Gibson
On Fri, Oct 21, 2016 at 01:17:08PM +0800, Peter Xu wrote:
> On Fri, Oct 21, 2016 at 11:50:05AM +1100, David Gibson wrote:
> 
> [...]
> 
> > > > In my setup, VFIO registered two memory areas with one page of
> > > > unregistered memory between them.
> > > > 
> > > > When I call memory_region_notify_iommu, it invokes the VFIO notifier
> > > > function twice; the second call fails with a warning on the console
> > > > because the new mapping is already present.
> > > > 
> > > > The VFIO notifier function should ignore IOMMUTLBEntries that are not
> > > > in the correct range.
> > > 
> > > Hmm, right, vfio_listener_region_add() is called for a
> > > MemoryRegionSection, but then we add an iommu notifier to the
> > > MemoryRegion, so we end up with a notifier per MemoryRegionSection
> > > regardless of whether they're backed by the same MemoryRegion.  Seems
> > > like we need a MemoryRegion-based list of VFIOGuestIOMMUs so we only
> > > register once per MemoryRegion and then sort through the list of
> > > VFIOGuestIOMMUs for a given MemoryRegion to find the one affected.
> > > David, does that sound right to you?
> 
> I think we already have such a list (VFIOContainer.giommu_list)?  Can
> we use that to do it?  When we try to add a new IOMMU notifier for a
> specific MemoryRegion, we can first traverse VFIOContainer.giommu_list
> to see whether that MemoryRegion is already registered, and only
> register a notifier if it's the first one.

Yes, I think that would work, but I still prefer the alternate
approach I describe below.

> > Well, I think that would work.  But I think it would be better to fix
> > it from the other side:
> > 
> > We add the range to be notified into the IOMMUNotifier structure and
> > filter based on that range in memory_region_notify_iommu.
> > 
> > It means a little more list searching and filtering on notify, but it
> > avoids having to have another list and search on the VFIO side.  I
> > think it will also better deal with cases where part of an IOMMU
> > mapped region is inaccessible due to an intermediate bridge.
> 
> IIUC, this will still need to keep several VFIOGuestIOMMUs which
> contain exactly the same content?

Well, it would no longer be exactly the same content since the
different ranges would be stored within the IOMMUNotifier
substructure.  But, yes, it would be very similar.

I think it's still worth it for improved clarity, and for the
possibility of handling, in future, cases where part of the IOMMU
region simply isn't accessible at all in the CPU address space.
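A rough sketch of the idea, with stand-in types for illustration only
(these are not the actual QEMU declarations; the start/end fields and
the overlap test are assumptions of this sketch):

    #include <stdint.h>

    typedef uint64_t hwaddr;

    typedef struct IOMMUTLBEntry {
        hwaddr iova;            /* guest IOVA the mapping starts at */
        hwaddr translated_addr;
        hwaddr addr_mask;       /* mapping covers [iova, iova + addr_mask] */
        int perm;
    } IOMMUTLBEntry;

    typedef struct IOMMUNotifier IOMMUNotifier;
    struct IOMMUNotifier {
        void (*notify)(IOMMUNotifier *n, IOMMUTLBEntry *entry);
        hwaddr start, end;      /* range the registering section covers */
        IOMMUNotifier *next;    /* stand-in for QEMU's list linkage */
    };

    /* Filtering as memory_region_notify_iommu() could do it: only call
     * notifiers whose registered range overlaps the TLB entry. */
    static void notify_iommu_filtered(IOMMUNotifier *list, IOMMUTLBEntry e)
    {
        for (IOMMUNotifier *n = list; n; n = n->next) {
            if (e.iova > n->end || e.iova + e.addr_mask < n->start) {
                continue;       /* entirely outside this notifier's section */
            }
            n->notify(n, &e);
        }
    }

With something along those lines, VFIO could keep one VFIOGuestIOMMU per
section and rely on the core to drop entries outside that section's range.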

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH v4 RESEND 0/3] IOMMU: intel_iommu support map and unmap notifications

2016-10-21 Thread Alex Williamson
On Fri, 21 Oct 2016 13:17:08 +0800
Peter Xu  wrote:

> On Fri, Oct 21, 2016 at 11:50:05AM +1100, David Gibson wrote:
> 
> [...]
> 
> > > > In my setup, VFIO registered two memory areas with one page of
> > > > unregistered memory between them.
> > > > 
> > > > When I call memory_region_notify_iommu, it invokes the VFIO notifier
> > > > function twice; the second call fails with a warning on the console
> > > > because the new mapping is already present.
> > > > 
> > > > The VFIO notifier function should ignore IOMMUTLBEntries that are not
> > > > in the correct range.
> > > 
> > > Hmm, right, vfio_listener_region_add() is called for a
> > > MemoryRegionSection, but then we add an iommu notifier to the
> > > MemoryRegion, so we end up with a notifier per MemoryRegionSection
> > > regardless of whether they're backed by the same MemoryRegion.  Seems
> > > like we need a MemoryRegion-based list of VFIOGuestIOMMUs so we only
> > > register once per MemoryRegion and then sort through the list of
> > > VFIOGuestIOMMUs for a given MemoryRegion to find the one affected.
> > > David, does that sound right to you?
> 
> I think we already have such a list (VFIOContainer.giommu_list)?  Can
> we use that to do it?  When we try to add a new IOMMU notifier for a
> specific MemoryRegion, we can first traverse VFIOContainer.giommu_list
> to see whether that MemoryRegion is already registered, and only
> register a notifier if it's the first one.

That's effectively what I'm suggesting, except I was trying to add some
efficiency by tracking both the MemoryRegion and the MemoryRegionSection
separately, so we don't need to traverse a list to see if a MemoryRegion
is already registered.
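One illustrative way to shape that two-level tracking (the names below
are invented for this sketch and are not existing VFIO structures):

    /* One entry per IOMMU MemoryRegion, registered exactly once; each
     * entry carries the MemoryRegionSections that are backed by it. */
    typedef struct MemoryRegion MemoryRegion;   /* opaque here */

    typedef struct VFIOGuestIOMMUSection {
        unsigned long long offset_within_address_space;
        unsigned long long size;
        struct VFIOGuestIOMMUSection *next;
    } VFIOGuestIOMMUSection;

    typedef struct VFIOGuestIOMMURegion {
        MemoryRegion *iommu;                    /* notifier registered once */
        VFIOGuestIOMMUSection *sections;        /* sections mapping it */
        struct VFIOGuestIOMMURegion *next;
    } VFIOGuestIOMMURegion;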

> > 
> > Well, I think that would work.  But I think it would be better to fix
> > it from the other side:
> > 
> > We add the range to be notified into the IOMMUNotifier structure and
> > filter based on that range in memory_region_notify_iommu.
> > 
> > It means a little more list searching and filtering on notify, but it
> > avoids having to have another list and search on the VFIO side.  I
> > think it will also better deal with cases where part of an IOMMU
> > mapped region is inaccessible due to an intermediate bridge.  
> 
> IIUC, this will still need to keep several VFIOGuestIOMMUs which
> contain exactly the same content?

They contain different offsets since this is based on the section rather
than the MemoryRegion.  Thanks,

Alex



Re: [Qemu-devel] [PATCH v4 RESEND 0/3] IOMMU: intel_iommu support map and unmap notifications

2016-10-20 Thread Peter Xu
On Fri, Oct 21, 2016 at 11:50:05AM +1100, David Gibson wrote:

[...]

> > > In my setup, VFIO registered two memory areas with one page of
> > > unregistered memory between them.
> > > 
> > > When I call memory_region_notify_iommu, it invokes the VFIO notifier
> > > function twice; the second call fails with a warning on the console
> > > because the new mapping is already present.
> > > 
> > > The VFIO notifier function should ignore IOMMUTLBEntries that are not
> > > in the correct range.
> > 
> > Hmm, right, vfio_listener_region_add() is called for a
> > MemoryRegionSection, but then we add an iommu notifier to the
> > MemoryRegion, so we end up with a notifier per MemoryRegionSection
> > regardless of whether they're backed by the same MemoryRegion.  Seems
> > like we need a MemoryRegion-based list of VFIOGuestIOMMUs so we only
> > register once per MemoryRegion and then sort through the list of
> > VFIOGuestIOMMUs for a given MemoryRegion to find the one affected.
> > David, does that sound right to you?

I think we already have such a list (VFIOContainer.giommu_list)?  Can
we use that to do it?  When we try to add a new IOMMU notifier for a
specific MemoryRegion, we can first traverse VFIOContainer.giommu_list
to see whether that MemoryRegion is already registered, and only
register a notifier if it's the first one.
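A sketch of that check, using stand-in types rather than the real
hw/vfio definitions (the field and list names here are assumptions for
illustration):

    #include <stdbool.h>

    typedef struct MemoryRegion MemoryRegion;            /* opaque here */

    typedef struct VFIOGuestIOMMU {
        MemoryRegion *iommu;                             /* the IOMMU MR */
        struct VFIOGuestIOMMU *giommu_next;              /* list linkage */
    } VFIOGuestIOMMU;

    typedef struct VFIOContainer {
        VFIOGuestIOMMU *giommu_list;                     /* per-container */
    } VFIOContainer;

    /* Return true if a VFIOGuestIOMMU for this MemoryRegion already
     * exists, i.e. an earlier section registered the notifier. */
    static bool vfio_giommu_registered(VFIOContainer *container,
                                       MemoryRegion *mr)
    {
        VFIOGuestIOMMU *giommu;

        for (giommu = container->giommu_list; giommu;
             giommu = giommu->giommu_next) {
            if (giommu->iommu == mr) {
                return true;
            }
        }
        return false;
    }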

> 
> Well, I think that would work.  But I think it would be better to fix
> it from the other side:
> 
> We add the range to be notified into the IOMMUNotifier structure and
> filter based on that range in memory_region_notify_iommu.
> 
> It means a little more list searching and filtering on notify, but it
> avoids having to have another list and search on the VFIO side.  I
> think it will also better deal with cases where part of an IOMMU
> mapped region is inaccessible due to an intermediate bridge.

IIUC, this will still need to keep several VFIOGuestIOMMUs which
contain exactly the same content?

Thanks,

-- peterx



Re: [Qemu-devel] [PATCH v4 RESEND 0/3] IOMMU: intel_iommu support map and unmap notifications

2016-10-20 Thread David Gibson
On Thu, Oct 20, 2016 at 02:06:08PM -0600, Alex Williamson wrote:
> [cc +david]
> 
> On Thu, 20 Oct 2016 22:17:18 +0300
> "Aviv B.D."  wrote:
> 
> > On Mon, Oct 17, 2016 at 7:07 PM, Alex Williamson wrote:
> > 
> > > On Mon, 17 Oct 2016 18:44:21 +0300
> > > "Aviv B.D"  wrote:
> > >  
> > > > From: "Aviv Ben-David" 
> > > >
> > > > * Advertise Cache Mode capability in the iommu cap register.
> > > >   This capability is controlled by the "cache-mode" property of the
> > > >   intel-iommu device.  To enable this option call QEMU with
> > > >   "-device intel-iommu,cache-mode=true".
> > > >
> > > > * On page cache invalidation in the Intel vIOMMU, check whether the
> > > >   domain belongs to a registered notifier, and notify accordingly.
> > > >
> > > > Currently this patch still doesn't enable VFIO device support with a
> > > > vIOMMU present. Current problems:
> > > > * vfio_iommu_map_notify is not aware of the memory range that belongs
> > > >   to a specific VFIOGuestIOMMU.
> > >
> > > Could you elaborate on why this is an issue?
> > >  
> > 
> > In my setup, VFIO registered two memory areas with one page of
> > unregistered memory between them.
> > 
> > When I call memory_region_notify_iommu, it invokes the VFIO notifier
> > function twice; the second call fails with a warning on the console
> > because the new mapping is already present.
> > 
> > The VFIO notifier function should ignore IOMMUTLBEntries that are not
> > in the correct range.
> 
> Hmm, right, vfio_listener_region_add() is called for a
> MemoryRegionSection, but then we add an iommu notifier to the
> MemoryRegion, so we end up with a notifier per MemoryRegionSection
> regardless of whether they're backed by the same MemoryRegion.  Seems
> like we need a MemoryRegion-based list of VFIOGuestIOMMUs so we only
> register once per MemoryRegion and then sort through the list of
> VFIOGuestIOMMUs for a given MemoryRegion to find the one affected.
> David, does that sound right to you?

Well, I think that would work.  But I think it would be better to fix
it from the other side:

We add the range to be notified into the IOMMUNotifier structure and
filter based on that range in memory_region_notify_iommu.

It means a little more list searching and filtering on notify, but it
avoids having to have another list and search on the VFIO side.  I
think it will also better deal with cases where part of an IOMMU
mapped region is inaccessible due to an intermediate bridge.

> I am curious why you get two regions separated by one page, can you
> give an example of the ranges for each?  Thanks,
> 
> Alex
> 

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH v4 RESEND 0/3] IOMMU: intel_iommu support map and unmap notifications

2016-10-20 Thread Alex Williamson
[cc +david]

On Thu, 20 Oct 2016 22:17:18 +0300
"Aviv B.D."  wrote:

> On Mon, Oct 17, 2016 at 7:07 PM, Alex Williamson wrote:
> 
> > On Mon, 17 Oct 2016 18:44:21 +0300
> > "Aviv B.D"  wrote:
> >  
> > > From: "Aviv Ben-David" 
> > >
> > > * Advertise Cache Mode capability in the iommu cap register.
> > >   This capability is controlled by the "cache-mode" property of the
> > >   intel-iommu device.  To enable this option call QEMU with
> > >   "-device intel-iommu,cache-mode=true".
> > >
> > > * On page cache invalidation in the Intel vIOMMU, check whether the
> > >   domain belongs to a registered notifier, and notify accordingly.
> > >
> > > Currently this patch still doesn't enable VFIO device support with a
> > > vIOMMU present. Current problems:
> > > * vfio_iommu_map_notify is not aware of the memory range that belongs
> > >   to a specific VFIOGuestIOMMU.
> >
> > Could you elaborate on why this is an issue?
> >  
> 
> In my setup, VFIO registered two memory areas with one page of
> unregistered memory between them.
> 
> When I call memory_region_notify_iommu, it invokes the VFIO notifier
> function twice; the second call fails with a warning on the console
> because the new mapping is already present.
> 
> The VFIO notifier function should ignore IOMMUTLBEntries that are not
> in the correct range.

Hmm, right, vfio_listener_region_add() is called for a
MemoryRegionSection, but then we add an iommu notifier to the
MemoryRegion, so we end up with a notifier per MemoryRegionSection
regardless of whether they're backed by the same MemoryRegion.  Seems
like we need a MemoryRegion-based list of VFIOGuestIOMMUs so we only
register once per MemoryRegion and then sort through the list of
VFIOGuestIOMMUs for a given MemoryRegion to find the one affected.
David, does that sound right to you?

I am curious why you get two regions separated by one page, can you
give an example of the ranges for each?  Thanks,

Alex



Re: [Qemu-devel] [PATCH v4 RESEND 0/3] IOMMU: intel_iommu support map and unmap notifications

2016-10-20 Thread Aviv B.D.
On Mon, Oct 17, 2016 at 7:07 PM, Alex Williamson  wrote:

> On Mon, 17 Oct 2016 18:44:21 +0300
> "Aviv B.D"  wrote:
>
> > From: "Aviv Ben-David" 
> >
> > * Advertise Cache Mode capability in the iommu cap register.
> >   This capability is controlled by the "cache-mode" property of the
> >   intel-iommu device.  To enable this option call QEMU with
> >   "-device intel-iommu,cache-mode=true".
> >
> > * On page cache invalidation in the Intel vIOMMU, check whether the
> >   domain belongs to a registered notifier, and notify accordingly.
> >
> > Currently this patch still doesn't enable VFIO device support with a
> > vIOMMU present. Current problems:
> > * vfio_iommu_map_notify is not aware of the memory range that belongs
> >   to a specific VFIOGuestIOMMU.
>
> Could you elaborate on why this is an issue?
>

In my setup, VFIO registered two memory areas with one page of
unregistered memory between them.

When I call memory_region_notify_iommu, it invokes the VFIO notifier
function twice; the second call fails with a warning on the console
because the new mapping is already present.

The VFIO notifier function should ignore IOMMUTLBEntries that are not
in the correct range.


>
> > * memory_region_iommu_replay hangs QEMU on startup while it iterates
> >   over the 64-bit address space. Commenting out the call to this
> >   function enables a workable VFIO device while a vIOMMU is present.
>
> This has been discussed previously, it would be incorrect for vfio not
> to call the replay function.  The solution is to add an iommu driver
> callback to efficiently walk the mappings within a MemoryRegion.
> Thanks,
>

I'm aware of those discussions; I put this info here as a helpful
reminder in case someone wants to test my code.


> Alex
>
> > Changes from v1 to v2:
> > * remove the assumption that the cache does not clear
> > * fix lockup on high load.
> >
> > Changes from v2 to v3:
> > * remove debug leftovers
> > * split into separate commits
> > * change is_write to flags in vtd_do_iommu_translate, add IOMMU_NO_FAIL
> >   to suppress error propagating to guest.
> >
> > Changes from v3 to v4:
> > * Add property to intel_iommu device to control the CM capability,
> >   default to False.
> > * Use s->iommu_ops.notify_flag_changed to register notifiers.
> >
> > Changes from v4 to v4 RESEND:
> > * Fix coding style issues pointed out by the checkpatch.pl script.
> >
> > Aviv Ben-David (3):
> >   IOMMU: add option to enable VTD_CAP_CM to vIOMMU capability exposed to
> >     guest
> >   IOMMU: change iommu_op->translate's is_write to flags, add support to
> >     NO_FAIL flag mode
> >   IOMMU: enable intel_iommu map and unmap notifiers
> >
> >  exec.c |   3 +-
> >  hw/i386/amd_iommu.c|   4 +-
> > >  hw/i386/intel_iommu.c  | 175 +++++
> >  hw/i386/intel_iommu_internal.h |   3 +
> >  include/exec/memory.h  |   6 +-
> >  include/hw/i386/intel_iommu.h  |  11 +++
> >  memory.c   |   3 +-
> >  7 files changed, 168 insertions(+), 37 deletions(-)
> >
>
>
Aviv.


Re: [Qemu-devel] [PATCH v4 RESEND 0/3] IOMMU: intel_iommu support map and unmap notifications

2016-10-18 Thread Alex Williamson
On Tue, 18 Oct 2016 16:52:04 +1100
David Gibson  wrote:

> On Mon, Oct 17, 2016 at 10:47:02PM -0600, Alex Williamson wrote:
> > On Tue, 18 Oct 2016 15:06:55 +1100
> > David Gibson  wrote:
> >   
> > > On Mon, Oct 17, 2016 at 10:07:36AM -0600, Alex Williamson wrote:  
> > > > On Mon, 17 Oct 2016 18:44:21 +0300
> > > > "Aviv B.D"  wrote:
> > > > 
> > > > > From: "Aviv Ben-David" 
> > > > > 
> > > > > * Advertise Cache Mode capability in the iommu cap register.
> > > > >   This capability is controlled by the "cache-mode" property of the
> > > > >   intel-iommu device.  To enable this option call QEMU with
> > > > >   "-device intel-iommu,cache-mode=true".
> > > > > 
> > > > > * On page cache invalidation in the Intel vIOMMU, check whether the
> > > > >   domain belongs to a registered notifier, and notify accordingly.
> > > > > 
> > > > > Currently this patch still doesn't enable VFIO device support with a
> > > > > vIOMMU present. Current problems:
> > > > > * vfio_iommu_map_notify is not aware of the memory range that belongs
> > > > >   to a specific VFIOGuestIOMMU.
> > > > 
> > > > Could you elaborate on why this is an issue?
> > > > 
> > > > > * memory_region_iommu_replay hangs QEMU on startup while it iterates
> > > > >   over the 64-bit address space. Commenting out the call to this
> > > > >   function enables a workable VFIO device while a vIOMMU is present.
> > > > 
> > > > This has been discussed previously, it would be incorrect for vfio not
> > > > to call the replay function.  The solution is to add an iommu driver
> > > > callback to efficiently walk the mappings within a MemoryRegion.
> > > 
> > > Right, replay is a bit of a hack.  There are a couple of other
> > > approaches that might be adequate without a new callback:
> > >- Make the VFIOGuestIOMMU aware of the guest address range mapped
> > >  by the vIOMMU.  Intel currently advertises that as a full 64-bit
> > >  address space, but I bet that's not actually true in practice.
> > >- Have the IOMMU MR advertise a (minimum) page size for vIOMMU
> > >  mappings.  That may let you step through the range with greater
> > >  strides  
> > 
> > Hmm, VT-d supports at least a 39-bit address width and always supports
> > a minimum 4k page size, so yes that does reduce us from 2^52 steps down
> > to 2^27,  
> 
> Right, which is probably doable, if not ideal
> 
> > but it's still absurd to walk through the raw address space.  
> 
> Well.. it depends on the internal structure of the IOMMU.  For Power,
> it's traditionally just a 1-level page table, so we can't actually do
> any better than stepping through each IOMMU page.

Intel always has at least a 3-level page table AIUI.
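That structure is exactly what a driver-side walk could exploit: whole
subtrees with no present entries get skipped in one step instead of
probing every page-sized IOVA. A purely illustrative sketch, not the
VT-d code (the 9-bits-per-level layout and all names are assumptions):

    #include <stdbool.h>
    #include <stdint.h>

    #define ENTRIES_PER_LEVEL 512          /* 9 bits per level, 4 KiB pages */

    typedef struct PTE {
        bool present;
        uint64_t next;                     /* next-level table or page addr */
    } PTE;

    typedef void (*notify_fn)(uint64_t iova, uint64_t addr);

    /* Recursively visit only the present entries; an absent entry at a
     * high level skips its entire IOVA span at once. */
    static void walk_level(const PTE *table, int level, uint64_t iova_base,
                           uint64_t entry_span, notify_fn notify)
    {
        for (int i = 0; i < ENTRIES_PER_LEVEL; i++) {
            if (!table[i].present) {
                continue;                  /* skip a whole entry_span chunk */
            }
            uint64_t iova = iova_base + (uint64_t)i * entry_span;
            if (level == 0) {
                notify(iova, table[i].next);
            } else {
                walk_level((const PTE *)(uintptr_t)table[i].next, level - 1,
                           iova, entry_span / ENTRIES_PER_LEVEL, notify);
            }
        }
    }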
 
> > It does however seem correct to create the MemoryRegion with a width
> > that actually matches the IOMMU capability, but I don't think that's a
> > sufficient fix by itself.  Thanks,  
> 
> I suspect it would actually make it workable in the short term.
> 
> But I don't disagree that a "traverse" or "replay" callback of some
> sort in the iommu_ops is a better idea long term.  Having a fallback
> to the current replay implementation if the callback isn't supplied
> seems pretty reasonable though.

Exactly, the callback could be optional, where IOMMUs that supply a
relatively small IOVA window could fall back to the code we have today.
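For illustration, one possible shape of such an optional hook; this is
only a sketch under the assumption that a new member is added alongside
the existing iommu_ops callbacks, and none of these names are existing
QEMU API:

    typedef struct MemoryRegion MemoryRegion;    /* opaque here */
    typedef struct IOMMUNotifier IOMMUNotifier;  /* opaque here */

    typedef struct IOMMUOpsSketch {
        /* ... translate, get_min_page_size, notify_flag_changed ... */
        void (*replay)(MemoryRegion *iommu_mr, IOMMUNotifier *notifier);
    } IOMMUOpsSketch;

    static void iommu_replay_sketch(const IOMMUOpsSketch *ops,
                                    MemoryRegion *iommu_mr, IOMMUNotifier *n)
    {
        if (ops->replay) {
            /* The IOMMU model walks its own page tables and notifies
             * only the mappings that actually exist. */
            ops->replay(iommu_mr, n);
        } else {
            /* Fall back to today's behaviour: step through the IOVA
             * window at the IOMMU's minimum page size. */
        }
    }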
Thanks,

Alex



Re: [Qemu-devel] [PATCH v4 RESEND 0/3] IOMMU: intel_iommu support map and unmap notifications

2016-10-18 Thread David Gibson
On Mon, Oct 17, 2016 at 10:47:02PM -0600, Alex Williamson wrote:
> On Tue, 18 Oct 2016 15:06:55 +1100
> David Gibson  wrote:
> 
> > On Mon, Oct 17, 2016 at 10:07:36AM -0600, Alex Williamson wrote:
> > > On Mon, 17 Oct 2016 18:44:21 +0300
> > > "Aviv B.D"  wrote:
> > >   
> > > > From: "Aviv Ben-David" 
> > > > 
> > > > * Advertise Cache Mode capability in the iommu cap register.
> > > >   This capability is controlled by the "cache-mode" property of the
> > > >   intel-iommu device.  To enable this option call QEMU with
> > > >   "-device intel-iommu,cache-mode=true".
> > > > 
> > > > * On page cache invalidation in the Intel vIOMMU, check whether the
> > > >   domain belongs to a registered notifier, and notify accordingly.
> > > > 
> > > > Currently this patch still doesn't enable VFIO device support with a
> > > > vIOMMU present. Current problems:
> > > > * vfio_iommu_map_notify is not aware of the memory range that belongs
> > > >   to a specific VFIOGuestIOMMU.
> > > 
> > > Could you elaborate on why this is an issue?
> > >   
> > > > * memory_region_iommu_replay hangs QEMU on startup while it iterates
> > > >   over the 64-bit address space. Commenting out the call to this
> > > >   function enables a workable VFIO device while a vIOMMU is present.
> > > 
> > > This has been discussed previously, it would be incorrect for vfio not
> > > to call the replay function.  The solution is to add an iommu driver
> > > callback to efficiently walk the mappings within a MemoryRegion.  
> > 
> > Right, replay is a bit of a hack.  There are a couple of other
> > approaches that might be adequate without a new callback:
> >- Make the VFIOGuestIOMMU aware of the guest address range mapped
> >  by the vIOMMU.  Intel currently advertises that as a full 64-bit
> >  address space, but I bet that's not actually true in practice.
> >- Have the IOMMU MR advertise a (minimum) page size for vIOMMU
> >  mappings.  That may let you step through the range with greater
> >  strides
> 
> Hmm, VT-d supports at least a 39-bit address width and always supports
> a minimum 4k page size, so yes that does reduce us from 2^52 steps down
> to 2^27,

Right, which is probably doable, if not ideal

> but it's still absurd to walk through the raw address space.

Well.. it depends on the internal structure of the IOMMU.  For Power,
it's traditionally just a 1-level page table, so we can't actually do
any better than stepping through each IOMMU page.

> It does however seem correct to create the MemoryRegion with a width
> that actually matches the IOMMU capability, but I don't think that's a
> sufficient fix by itself.  Thanks,

I suspect it would actually make it workable in the short term.

But I don't disagree that a "traverse" or "replay" callback of some
sort in the iommu_ops is a better idea long term.  Having a fallback
to the current replay implementation if the callback isn't supplied
seems pretty reasonable though.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH v4 RESEND 0/3] IOMMU: intel_iommu support map and unmap notifications

2016-10-17 Thread Alex Williamson
On Tue, 18 Oct 2016 15:06:55 +1100
David Gibson  wrote:

> On Mon, Oct 17, 2016 at 10:07:36AM -0600, Alex Williamson wrote:
> > On Mon, 17 Oct 2016 18:44:21 +0300
> > "Aviv B.D"  wrote:
> >   
> > > From: "Aviv Ben-David" 
> > > 
> > > * Advertise Cache Mode capability in the iommu cap register.
> > >   This capability is controlled by the "cache-mode" property of the
> > >   intel-iommu device.  To enable this option call QEMU with
> > >   "-device intel-iommu,cache-mode=true".
> > > 
> > > * On page cache invalidation in the Intel vIOMMU, check whether the
> > >   domain belongs to a registered notifier, and notify accordingly.
> > > 
> > > Currently this patch still doesn't enable VFIO device support with a
> > > vIOMMU present. Current problems:
> > > * vfio_iommu_map_notify is not aware of the memory range that belongs
> > >   to a specific VFIOGuestIOMMU.
> > 
> > Could you elaborate on why this is an issue?
> >   
> > > * memory_region_iommu_replay hangs QEMU on startup while it iterates
> > >   over the 64-bit address space. Commenting out the call to this
> > >   function enables a workable VFIO device while a vIOMMU is present.
> > 
> > This has been discussed previously, it would be incorrect for vfio not
> > to call the replay function.  The solution is to add an iommu driver
> > callback to efficiently walk the mappings within a MemoryRegion.  
> 
> Right, replay is a bit of a hack.  There are a couple of other
> approaches that might be adequate without a new callback:
>- Make the VFIOGuestIOMMU aware of the guest address range mapped
>  by the vIOMMU.  Intel currently advertises that as a full 64-bit
>  address space, but I bet that's not actually true in practice.
>- Have the IOMMU MR advertise a (minimum) page size for vIOMMU
>  mappings.  That may let you step through the range with greater
>  strides

Hmm, VT-d supports at least a 39-bit address width and always supports
a minimum 4k page size, so yes that does reduce us from 2^52 steps down
to 2^27, but it's still absurd to walk through the raw address space.
It does however seem correct to create the MemoryRegion with a width
that actually matches the IOMMU capability, but I don't think that's a
sufficient fix by itself.  Thanks,

Alex



Re: [Qemu-devel] [PATCH v4 RESEND 0/3] IOMMU: intel_iommu support map and unmap notifications

2016-10-17 Thread David Gibson
On Mon, Oct 17, 2016 at 10:07:36AM -0600, Alex Williamson wrote:
> On Mon, 17 Oct 2016 18:44:21 +0300
> "Aviv B.D"  wrote:
> 
> > From: "Aviv Ben-David" 
> > 
> > * Advertise Cache Mode capability in the iommu cap register.
> >   This capability is controlled by the "cache-mode" property of the
> >   intel-iommu device.  To enable this option call QEMU with
> >   "-device intel-iommu,cache-mode=true".
> > 
> > * On page cache invalidation in the Intel vIOMMU, check whether the
> >   domain belongs to a registered notifier, and notify accordingly.
> > 
> > Currently this patch still doesn't enable VFIO device support with a
> > vIOMMU present. Current problems:
> > * vfio_iommu_map_notify is not aware of the memory range that belongs
> >   to a specific VFIOGuestIOMMU.
> 
> Could you elaborate on why this is an issue?
> 
> > * memory_region_iommu_replay hangs QEMU on startup while it iterates
> >   over the 64-bit address space. Commenting out the call to this
> >   function enables a workable VFIO device while a vIOMMU is present.
> 
> This has been discussed previously, it would be incorrect for vfio not
> to call the replay function.  The solution is to add an iommu driver
> callback to efficiently walk the mappings within a MemoryRegion.

Right, replay is a bit of a hack.  There are a couple of other
approaches that might be adequate without a new callback:
   - Make the VFIOGuestIOMMU aware of the guest address range mapped
 by the vIOMMU.  Intel currently advertises that as a full 64-bit
 address space, but I bet that's not actually true in practice.
   - Have the IOMMU MR advertise a (minimum) page size for vIOMMU
 mappings.  That may let you step through the range with greater
 strides

> Thanks,
> 
> Alex
> 
> > Changes from v1 to v2:
> > * remove the assumption that the cache does not clear
> > * fix lockup on high load.
> > 
> > Changes from v2 to v3:
> > * remove debug leftovers
> > * split into separate commits
> > * change is_write to flags in vtd_do_iommu_translate, add IOMMU_NO_FAIL 
> >   to suppress error propagating to guest.
> > 
> > Changes from v3 to v4:
> > * Add property to intel_iommu device to control the CM capability, 
> >   default to False.
> > * Use s->iommu_ops.notify_flag_changed to register notifiers.
> > 
> > Changes from v4 to v4 RESEND:
> > * Fix coding style issues pointed out by the checkpatch.pl script.
> > 
> > Aviv Ben-David (3):
> >   IOMMU: add option to enable VTD_CAP_CM to vIOMMU capability exposed to
> >     guest
> >   IOMMU: change iommu_op->translate's is_write to flags, add support to
> >     NO_FAIL flag mode
> >   IOMMU: enable intel_iommu map and unmap notifiers
> > 
> >  exec.c |   3 +-
> >  hw/i386/amd_iommu.c|   4 +-
> >  hw/i386/intel_iommu.c  | 175 +
> >  hw/i386/intel_iommu_internal.h |   3 +
> >  include/exec/memory.h  |   6 +-
> >  include/hw/i386/intel_iommu.h  |  11 +++
> >  memory.c   |   3 +-
> >  7 files changed, 168 insertions(+), 37 deletions(-)
> > 
> 
> 

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH v4 RESEND 0/3] IOMMU: intel_iommu support map and unmap notifications

2016-10-17 Thread Alex Williamson
On Mon, 17 Oct 2016 18:44:21 +0300
"Aviv B.D"  wrote:

> From: "Aviv Ben-David" 
> 
> * Advertise Cache Mode capability in the iommu cap register.
>   This capability is controlled by the "cache-mode" property of the
>   intel-iommu device.  To enable this option call QEMU with
>   "-device intel-iommu,cache-mode=true".
> 
> * On page cache invalidation in the Intel vIOMMU, check whether the
>   domain belongs to a registered notifier, and notify accordingly.
> 
> Currently this patch still doesn't enable VFIO device support with a
> vIOMMU present. Current problems:
> * vfio_iommu_map_notify is not aware of the memory range that belongs
>   to a specific VFIOGuestIOMMU.

Could you elaborate on why this is an issue?

> * memory_region_iommu_replay hangs QEMU on startup while it iterates
>   over the 64-bit address space. Commenting out the call to this
>   function enables a workable VFIO device while a vIOMMU is present.

This has been discussed previously, it would be incorrect for vfio not
to call the replay function.  The solution is to add an iommu driver
callback to efficiently walk the mappings within a MemoryRegion.
Thanks,

Alex

> Changes from v1 to v2:
> * remove the assumption that the cache does not clear
> * fix lockup on high load.
> 
> Changes from v2 to v3:
> * remove debug leftovers
> * split into separate commits
> * change is_write to flags in vtd_do_iommu_translate, add IOMMU_NO_FAIL 
>   to suppress error propagating to guest.
> 
> Changes from v3 to v4:
> * Add property to intel_iommu device to control the CM capability, 
>   default to False.
> * Use s->iommu_ops.notify_flag_changed to register notifiers.
> 
> Changes from v4 to v4 RESEND:
> * Fix coding style issues pointed out by the checkpatch.pl script.
> 
> Aviv Ben-David (3):
>   IOMMU: add option to enable VTD_CAP_CM to vIOMMU capability exposed to
>     guest
>   IOMMU: change iommu_op->translate's is_write to flags, add support to
>     NO_FAIL flag mode
>   IOMMU: enable intel_iommu map and unmap notifiers
> 
>  exec.c |   3 +-
>  hw/i386/amd_iommu.c|   4 +-
>  hw/i386/intel_iommu.c  | 175 +
>  hw/i386/intel_iommu_internal.h |   3 +
>  include/exec/memory.h  |   6 +-
>  include/hw/i386/intel_iommu.h  |  11 +++
>  memory.c   |   3 +-
>  7 files changed, 168 insertions(+), 37 deletions(-)
> 




[Qemu-devel] [PATCH v4 RESEND 0/3] IOMMU: intel_iommu support map and unmap notifications

2016-10-17 Thread Aviv B.D
From: "Aviv Ben-David" 

* Advertise Cache Mode capability in the iommu cap register.
  This capability is controlled by the "cache-mode" property of the intel-iommu
  device.  To enable this option call QEMU with "-device intel-iommu,cache-mode=true".

* On page cache invalidation in the Intel vIOMMU, check whether the domain
  belongs to a registered notifier, and notify accordingly.

Currently this patch still doesn't enable VFIO device support with a vIOMMU
present. Current problems:
* vfio_iommu_map_notify is not aware of the memory range that belongs to a
  specific VFIOGuestIOMMU.
* memory_region_iommu_replay hangs QEMU on startup while it iterates over the
  64-bit address space. Commenting out the call to this function enables a
  workable VFIO device while a vIOMMU is present.

Changes from v1 to v2:
* remove the assumption that the cache does not clear
* fix lockup on high load.

Changes from v2 to v3:
* remove debug leftovers
* split into separate commits
* change is_write to flags in vtd_do_iommu_translate, add IOMMU_NO_FAIL 
  to suppress error propagating to guest.

Changes from v3 to v4:
* Add property to intel_iommu device to control the CM capability, 
  default to False.
* Use s->iommu_ops.notify_flag_changed to register notifiers.

Changes from v4 to v4 RESEND:
* Fix coding style issues pointed out by the checkpatch.pl script.

Aviv Ben-David (3):
  IOMMU: add option to enable VTD_CAP_CM to vIOMMU capability exposed to
    guest
  IOMMU: change iommu_op->translate's is_write to flags, add support to
    NO_FAIL flag mode
  IOMMU: enable intel_iommu map and unmap notifiers

 exec.c |   3 +-
 hw/i386/amd_iommu.c|   4 +-
 hw/i386/intel_iommu.c  | 175 +
 hw/i386/intel_iommu_internal.h |   3 +
 include/exec/memory.h  |   6 +-
 include/hw/i386/intel_iommu.h  |  11 +++
 memory.c   |   3 +-
 7 files changed, 168 insertions(+), 37 deletions(-)

-- 
1.9.1