Re: [PATCH] [RESEND] VT-d: Support multiple device assignment to one guest

2008-10-29 Thread Joerg Roedel
(sorry for joining this late; I returned from a 3-week vacation this
 Monday)

On Tue, Oct 07, 2008 at 03:29:21PM +0200, Avi Kivity wrote:
 Oh, I see it now.  Different devices may need to go under different iommus.
 
 This really feels like it should be handled by the iommu API.  Users
 shouldn't need to bother with it.
 
 Joerg, can your dma api handle this?

Yes. The API works in terms of devices and domains. The fact that a
domain may contain devices behind different IOMMUs is hidden from the
user of that API.
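
As a minimal sketch, this is roughly what the caller's side could look
like (every function name below is an assumption for illustration, not
necessarily the real interface):

/*
 * Illustrative only: the caller deals in devices and domains; which
 * hardware IOMMU ends up programmed is the API's problem.
 */
static int assign_two_devices(struct pci_dev *a, struct pci_dev *b,
                              unsigned long iova, phys_addr_t paddr,
                              size_t size)
{
        struct iommu_domain *dom = iommu_domain_alloc();

        if (!dom)
                return -ENOMEM;

        /* Map guest memory once into the domain's io page table. */
        iommu_map(dom, iova, paddr, size, IOMMU_READ | IOMMU_WRITE);

        /* The devices may sit behind different IOMMUs; the API hides it. */
        iommu_attach_device(dom, &a->dev);
        iommu_attach_device(dom, &b->dev);
        return 0;
}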

Joerg

-- 
           | AMD Saxony Limited Liability Company & Co. KG
 Operating | Wilschdorfer Landstr. 101, 01109 Dresden, Germany
 System    | Register Court Dresden: HRA 4896
 Research  | General Partner authorized to represent:
 Center    | AMD Saxony LLC (Wilmington, Delaware, US)
           | General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy



Re: [PATCH] [RESEND] VT-d: Support multiple device assignment to one guest

2008-10-10 Thread Avi Kivity

Han, Weidong wrote:

It's hard to move kvm_vtd_domain inside the current iommu API. It's kvm
specific. It's not elegant to include kvm_vtd_domain stuff in native
VT-d code.


It's cleaner than adding knowledge of how the iommu works to kvm.


 I think leaving it on the kvm side is cleaner at this point.
Moreover, it's very simple. I read Joerg's iommu API foils just now; I
think it's good. Native AMD iommu code will be in 2.6.28, so it would be
suitable to implement a generic iommu API for kvm, based on both the
Intel and AMD iommus, after 2.6.28. What's your opinion?
  


2.6.27 is out, so anything we do will be for 2.6.29.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



RE: [PATCH] [RESEND] VT-d: Support multiple device assignment to one guest

2008-10-10 Thread Han, Weidong
Avi Kivity wrote:
 Han, Weidong wrote:
 It's hard to move kvm_vtd_domain inside the current iommu API. It's kvm
 specific. It's not elegant to include kvm_vtd_domain stuff in native
 VT-d code.
 
 It's cleaner than adding knowledge of how the iommu works to kvm.

I will try to move kvm_vtd_domain inside the iommu API. I suspect it
would need many changes to the VT-d code.

 
  I think leaving it on the kvm side is cleaner at this point.
 Moreover, it's very simple. I read Joerg's iommu API foils just now; I
 think it's good. Native AMD iommu code will be in 2.6.28, so it would
 be suitable to implement a generic iommu API for kvm, based on both the
 Intel and AMD iommus, after 2.6.28. What's your opinion?
 
 
 2.6.27 is out, so anything we do will be for 2.6.29.

Do you mean the VT-d patches which haven't been checked in won't be
pushed into 2.6.28? 

Regards,
Weidong


Re: [PATCH] [RESEND] VT-d: Support multiple device assignment to one guest

2008-10-10 Thread Avi Kivity

Han, Weidong wrote:


2.6.27 is out, so anything we do will be for 2.6.29.



Do you mean the VT-d patches which haven't been checked in won't be
pushed into 2.6.28? 
  


Yes.  The only exception is the guest interrupt sharing patch, which is 
awaiting testing.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



RE: [PATCH] [RESEND] VT-d: Support multiple device assignment to one guest

2008-10-10 Thread Han, Weidong
Avi Kivity wrote:
 Han, Weidong wrote:
 
 2.6.27 is out, so anything we do will be for 2.6.29.
 
 
 Do you mean the VT-d patches which haven't been checked in won't be
 pushed into 2.6.28? 
 
 
 Yes.  The only exception is the guest interrupt sharing patch, which
 is awaiting testing.
 

OK, I see. So big changes to the VT-d code are not a problem. I will
redesign the multi-device assignment patch.

Regards,
Weidong

 --
 I have a truly marvellous patch that fixes the bug which this
 signature is too narrow to contain.



RE: [PATCH] [RESEND] VT-d: Support multiple device assignment to one guest

2008-10-09 Thread Han, Weidong
Avi Kivity wrote:
 Han, Weidong wrote:
 If we devolve this to the iommu API, the same io page table can be
 shared by all iommus, so long as they all use the same page table
 format. 
 
 
 I don't understand how to handle this with the iommu API. Let me
 explain my thoughts more clearly:
 
 VT-d spec says:
  Context-entries programmed with the same domain identifier must
 always reference the same address translation structure (through the
 ASR field). Similarly, context-entries referencing the same address
 translation structure must be programmed with the same domain id.
 
 In the native VT-d driver, dmar_domain is per device and has its own
 VT-d page table, which is dynamically set up before each DMA. So it is
 impossible for the same VT-d page table to be shared by all iommus.
 Moreover, different iommus in a system may have different page table
 levels.
 
 Right.  This use case is in essence to prevent unintended sharing.  It
 is also likely to have low page table height, since dma sizes are
 relatively small.
 
 I think it's enough that the iommu API tells us the iommu of a
 device.
 
 
 While this is tangential to our conversation, why?  Even for the
 device driver use case, this only makes the API more complex.  If the
 API hides the existence of multiple iommus, it's easier to use and
 harder to make a mistake.
 
 Whereas on the KVM side, the same VT-d page table can be shared by the
 devices which are under the same iommu and assigned to the same guest,
 because all of the guest's memory is statically mapped in the VT-d page
 table. But this requires wrapping dmar_domain; this patch wraps it with
 a reference count so that multiple devices can relate to the same
 dmar_domain.
 
 This patch already adds an API (intel_iommu_device_get_iommu()) in
 intel-iommu.c, which returns the iommu of a device.
 
 There is a missed optimization here.  Suppose we have two devices each
 under a different iommu.  With the patch, each will be in a different
 dmar_domain and so will have a different page table.  The amount of
 memory used is doubled.

You cannot let two devices each under a different iommu share one
dmar_domain, because dmar_domain has a pointer to its iommu.

 
 Suppose the iommu API hides the existence of multiple iommus.  You
 allocate a translation and add devices to it.  When you add a device,
 the iommu API checks which iommu is needed and programs it
 accordingly, but only one io page table is used.
 
 The other benefit is that iommu developers understand these issues
 while kvm developers don't, so it's best managed by the iommu API.
 This way, if things change (as usual, becoming more complicated), the
 iommu developers can make the changes in their code and hide the
 complexity from kvm or other users.
 
 I'm probably (badly) duplicating Joerg's iommu API here, but this is
 how it could go:
 
 iommu_translation_create() - creates an iommu translation object; this
 allocates the page tables but doesn't do anything with them
 iommu_translation_map() - adds pages to the translation
 iommu_translation_attach() - attach a device to the translation; this
 locates the iommu and programs it
 _detach(), _unmap(), and _free() undo these operations.

In fact, the exported APIs added for KVM VT-d also provide
create/map/attach/detach/free functions, though these iommu APIs are
more readable.

Because kvm's VT-d usage is different from native usage, it's inevitable
that we extend the native VT-d code to support KVM VT-d (such as
wrapping dmar_domain). Devices under different iommus cannot share the
same dmar_domain, and thus cannot share a VT-d page table. If we want to
handle this through iommu APIs, I suspect we need to change lots of
native VT-d driver code.

David/Jesse, what's your opinion?





Re: [PATCH] [RESEND] VT-d: Support multiple device assignment to one guest

2008-10-09 Thread Avi Kivity
Han, Weidong wrote:

 There is a missed optimization here.  Suppose we have two devices each
 under a different iommu.  With the patch, each will be in a different
 dmar_domain and so will have a different page table.  The amount of
 memory used is doubled.
 

 You cannot let two devices each under a different iommu share one
 dmar_domain, because dmar_domain has a pointer to its iommu.

   

I don't want them to share dmar_domains (these are implementation
details anyway), just io page tables.


kvm --- something (owns io page table) --- dmar_domain (uses shared io
page table) --- device

Even if we don't implement io page table sharing right away,
implementing the 'something' in the iommu api means we can later
implement sharing without changing the iommu/kvm interface.
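
A rough sketch of that layering, with hypothetical names (just the
shape, not a proposed interface):

/* Hypothetical: the 'something' owns the io page table; the per-iommu
 * dmar_domains merely reference it. */
struct iommu_translation {
        struct dma_pte *pgtable;        /* single shared io page table */
        struct list_head domains;       /* dmar_domains, one per iommu */
        int device_count;               /* devices attached via any domain */
};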

 In fact, the exported APIs added for KVM VT-d also provide
 create/map/attach/detach/free functions, though these iommu APIs are
 more readable.

   


No; the existing iommu API talks about dmar domains and exposes the
existence of multiple iommus, so it is more complex.

 Because kvm's VT-d usage is different from native usage, it's
 inevitable that we extend the native VT-d code to support KVM VT-d
 (such as wrapping dmar_domain). Devices under different iommus cannot
 share the same dmar_domain, and thus cannot share a VT-d page table. If
 we want to handle this through iommu APIs, I suspect we need to change
 lots of native VT-d driver code.
   

As mentioned above, we can start with implementing the API without
actual sharing (basically, your patch, but as an addition to the API
rather than a change to kvm); we can add io pagetable sharing later.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCH] [RESEND] VT-d: Support multiple device assignment to one guest

2008-10-09 Thread Avi Kivity
Han, Weidong wrote:

 I don't want them to share dmar_domains (these are implementation
 details anyway), just io page tables.


 kvm --- something (owns io page table) --- dmar_domain (uses shared
 io page table) --- device

 

 Letting dmar_domains share an io page table is not allowed. The VT-d
 spec requires that one domain correspond to one page table, and vice
 versa.

Since the io pagetables are read only for the iommu (right?), I don't
see what prevents several iommus from accessing the same pagetable. 
It's just a bunch of memory.

 If we want
 something that owns the io page table, shared by all devices assigned
 to one guest, we need to redefine dmar_domain so that it covers all
 devices assigned to a guest. Then we need to rewrite most of the native
 VT-d code for kvm. Xen doesn't use dmar_domain; instead it implements
 'something' as a domain structure (with a domain id) that owns the page
 table.

I imagine Xen shares the io pagetables with the EPT pagetables as
well.  So io pagetable sharing is allowed.

 One guest has
 only one 'something' instance, and thus only one page table. It looks
 like: xen --- something (owns io page table) --- device. But on the KVM
 side, I think we can reuse the native VT-d code; we needn't duplicate
 it.
   

I agree that at this stage, we don't want to do optimization, we need
something working first.  But let's at least ensure the API allows the
optimization later on (and also, that iommu implementation details are
hidden from kvm).

What I'm proposing is moving the list of kvm_vtd_domains inside the
iommu API.  The only missing piece is populating a new dmar_domain when
a new device is added.  We already have intel_iommu_iova_to_pfn(); we
need to add a way to read the protection bits and the highest mapped
iova (oh, and intel_iommu_iova_to_pfn() has a bug: it shifts right
instead of left).
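
For illustration, populating could be a replay of the existing static
mappings; apart from intel_iommu_iova_to_pfn() (whose signature is
assumed here), every helper below is hypothetical:

/* Sketch: copy src's mappings into a freshly allocated dmar_domain
 * (holes and error handling omitted). */
static void populate_dmar_domain(struct dmar_domain *dst,
                                 struct dmar_domain *src)
{
        u64 iova;
        u64 end = intel_iommu_highest_iova(src);        /* to be added */

        for (iova = 0; iova <= end; iova += PAGE_SIZE) {
                u64 pfn = intel_iommu_iova_to_pfn(src, iova);
                int prot = intel_iommu_iova_prot(src, iova); /* to be added */

                intel_iommu_map_page(dst, iova, pfn, prot);  /* hypothetical */
        }
}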

Later we can make the something (that already contains the list) also
own the io page table; and non-kvm users can still use the same code
(the list will always be of length 1 for these users).

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



RE: [PATCH] [RESEND] VT-d: Support multiple device assignment to one guest

2008-10-09 Thread Han, Weidong
Avi Kivity wrote:
 Han, Weidong wrote:
 
 I don't want them to share dmar_domains (these are implementation
 details anyway), just io page tables.
 
 
 kvm --- something (owns io page table) --- dmar_domain (uses
 shared io page table) --- device 
 
 
 
 Letting dmar_domains share an io page table is not allowed. The VT-d
 spec requires that one domain correspond to one page table, and vice
 versa.
 
 Since the io pagetables are read only for the iommu (right?), I don't
 see what prevents several iommus from accessing the same pagetable.
 It's just a bunch of memory.

I think the reason is that hardware may use the domain identifier to tag
its internal caches. 

 
 If we want
 something that owns the io page table, shared by all devices assigned
 to one guest, we need to redefine dmar_domain so that it covers all
 devices assigned to a guest. Then we need to rewrite most of the
 native VT-d code for kvm. Xen doesn't use dmar_domain; instead it
 implements 'something' as a domain structure (with a domain id) that
 owns the page table.
 
 I imagine Xen shares the io pagetables with the EPT pagetables as
 well.  So io pagetable sharing is allowed.

In Xen, the VT-d page table is not shared with the EPT or P2M
pagetables. But they could be shared if the format were the same.

 
 One guest has
 only one 'something' instance, and thus only one page table. It looks
 like: xen --- something (owns io page table) --- device. But on the
 KVM side, I think we can reuse the native VT-d code; we needn't
 duplicate it.
 
 
 I agree that at this stage, we don't want to do optimization, we need
 something working first.  But let's at least ensure the API allows the
 optimization later on (and also, that iommu implementation details are
 hidden from kvm).
 
 What I'm proposing is moving the list of kvm_vtd_domains inside the
 iommu API.  The only missing piece is populating a new dmar_domain
 when a new device is added.  We already have

I will move kvm_vtd_domain inside the iommu API, and also hide
get_kvm_vtd_domain() and release_kvm_vtd_domain() implementation details
from kvm.

 intel_iommu_iova_to_pfn(); we need to add a way to read the
 protection bits and the highest mapped iova (oh, and
 intel_iommu_iova_to_pfn() has a bug: it shifts right instead of left).
 

Why do we need the protection bits and the highest mapped iova? 

Shifting right instead of left in intel_iommu_iova_to_pfn() is not a
bug, because it returns a pfn, not an address.
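
For illustration, the two directions (VTD_PAGE_SHIFT stands in for the
4K page-size shift; the names are illustrative):

u64 pfn  = phys >> VTD_PAGE_SHIFT;      /* address to pfn: shift right */
u64 addr = pfn  << VTD_PAGE_SHIFT;      /* pfn to address: shift left  */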

Regards,
Weidong

 Later we can make the something (that already contains the list)
 also own the io page table; and non-kvm users can still use the same
 code (the list will always be of length 1 for these users).



RE: [PATCH] [RESEND] VT-d: Support multiple device assignment to one guest

2008-10-09 Thread Han, Weidong
Avi Kivity wrote:
 Han, Weidong wrote:
 
 There is a missed optimization here.  Suppose we have two devices
 each under a different iommu.  With the patch, each will be in a
 different dmar_domain and so will have a different page table.  The
 amount of memory used is doubled. 
 
 
 You cannot let two devices each under a different iommu share one
 dmar_domain, because dmar_domain has a pointer to its iommu.
 
 
 
 I don't want them to share dmar_domains (these are implementation
 details anyway), just io page tables.
 
 
 kvm --- something (owns io page table) --- dmar_domain (uses shared
 io page table) --- device
 

Letting dmar_domains share an io page table is not allowed. The VT-d
spec requires that one domain correspond to one page table, and vice
versa. If we want something that owns the io page table, shared by all
devices assigned to one guest, we need to redefine dmar_domain so that
it covers all devices assigned to a guest. Then we need to rewrite most
of the native VT-d code for kvm. Xen doesn't use dmar_domain; instead it
implements 'something' as a domain structure (with a domain id) that
owns the page table. One guest has only one 'something' instance, and
thus only one page table. It looks like: xen --- something (owns io page
table) --- device. But on the KVM side, I think we can reuse the native
VT-d code; we needn't duplicate it.

Regards,
Weidong

 Even if we don't implement io page table sharing right away,
 implementing the 'something' in the iommu api means we can later
 implement sharing without changing the iommu/kvm interface.
 
 In fact, the exported APIs added for KVM VT-d also provide
 create/map/attach/detach/free functions, though these iommu APIs
 are more readable.
 
 
 
 
 No; the existing iommu API talks about dmar domains and exposes the
 existence of multiple iommus, so it is more complex.
 
 Because kvm's VT-d usage is different from native usage, it's
 inevitable that we extend the native VT-d code to support KVM VT-d
 (such as wrapping dmar_domain). Devices under different iommus cannot
 share the same dmar_domain, and thus cannot share a VT-d page table.
 If we want to handle this through iommu APIs, I suspect we need to
 change lots of native VT-d driver code.
 
 
 As mentioned above, we can start with implementing the API without
 actual sharing (basically, your patch, but as an addition to the API
 rather than a change to kvm); we can add io pagetable sharing later.



RE: [PATCH] [RESEND] VT-d: Support multiple device assignment to one guest

2008-10-09 Thread Han, Weidong
Han, Weidong wrote:
 Avi Kivity wrote:
 Han, Weidong wrote:
 
 I don't want them to share dmar_domains (these are implementation
 details anyway), just io page tables.
 
 
 kvm --- something (owns io page table) --- dmar_domain (uses
 shared io page table) --- device
 
 
 
 Letting dmar_domains share an io page table is not allowed. The VT-d
 spec requires that one domain correspond to one page table, and vice
 versa.
 
 Since the io pagetables are read only for the iommu (right?), I don't
 see what prevents several iommus from accessing the same pagetable.
 It's just a bunch of memory.
 
 I think the reason is that hardware may use the domain identifier to
 tag its internal caches. 
 
 
 If we want
 something that owns the io page table, shared by all devices assigned
 to one guest, we need to redefine dmar_domain so that it covers all
 devices assigned to a guest. Then we need to rewrite most of the
 native VT-d code for kvm. Xen doesn't use dmar_domain; instead it
 implements 'something' as a domain structure (with a domain id) that
 owns the page table.
 
 I imagine, Xen shares the io pagetables with the EPT pagetables as
 well.  So io pagetable sharing is allowed.
 
 In Xen, the VT-d page table is not shared with the EPT or P2M
 pagetables. But they could be shared if the format were the same.
 
 
 One guest has
 only one 'something' instance, and thus only one page table. It
 looks like: xen --- something (owns io page table) --- device.
 But on the KVM side, I think we can reuse the native VT-d code; we
 needn't duplicate it.
 
 
 I agree that at this stage, we don't want to do optimization, we need
 something working first.  But let's at least ensure the API allows
 the optimization later on (and also, that iommu implementation
 details are hidden from kvm). 
 
 What I'm proposing is moving the list of kvm_vtd_domains inside the
 iommu API.  The only missing piece is populating a new dmar_domain
 when a new device is added.  We already have
 
 I will move kvm_vtd_domain inside the iommu API, and also hide
 get_kvm_vtd_domain() and release_kvm_vtd_domain() implementation
 details from kvm.  

It's hard to move kvm_vtd_domain inside the current iommu API. It's kvm
specific. It's not elegant to include kvm_vtd_domain stuff in native
VT-d code. I think leaving it on the kvm side is cleaner at this point.
Moreover, it's very simple. I read Joerg's iommu API foils just now; I
think it's good. Native AMD iommu code will be in 2.6.28, so it would be
suitable to implement a generic iommu API for kvm, based on both the
Intel and AMD iommus, after 2.6.28. What's your opinion?

Regards,
Weidong

 
 intel_iommu_iova_to_pfn(); we need to add a way to read the
 protection bits and the highest mapped iova (oh, and
 intel_iommu_iova_to_pfn() has a bug: it shifts right instead of
 left).
 
 
 Why do we need the protection bits and the highest mapped iova?
 
 Shifting right instead of left in intel_iommu_iova_to_pfn() is not a
 bug, because it returns a pfn, not an address.
 
 Regards,
 Weidong
 
 Later we can make the something (that already contains the list)
 also own the io page table; and non-kvm users can still use the same
 code (the list will always be of length 1 for these users).



Re: [PATCH] [RESEND] VT-d: Support multiple device assignment to one guest

2008-10-08 Thread Avi Kivity

Han, Weidong wrote:

Oh, I see it now.  Different devices may need to go under different
iommus. 


This really feels like it should be handled by the iommu API.  Users
shouldn't need to bother with it.



Why do you say it should be handled by the iommu API?


Because the logic of which iommu controls which device is only 
understood by iommu developers.  Also, because this logic would be 
duplicated by anyone attempting to do the same thing.


So it seems reasonable that it should be implemented by the iommu API,
not its users.



The direct way to
support multiple device assignment is to keep a dmar_domain list for
each guest, where each device corresponds to one dmar_domain. But this
will cost more memory because each dmar_domain has its own VT-d page
table. Our method lets the devices which are under the same iommu and
assigned to the same guest share the same VT-d page table.
  


If we devolve this to the iommu API, the same io page table can be 
shared by all iommus, so long as they all use the same page table format.



--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



RE: [PATCH] [RESEND] VT-d: Support multiple device assignment to one guest

2008-10-08 Thread Han, Weidong
Avi Kivity wrote:
 Han, Weidong wrote:
 Oh, I see it now.  Different devices may need to go under different
 iommus. 
 
 This really feels like it should be handled by the iommu API.  Users
 shouldn't need to bother with it.
 
 
 Why do you say it should be handled by the iommu API?
 
 Because the logic of which iommu controls which device is only
 understood by iommu developers.  Also, because this logic would be
 duplicated by anyone attempting to do the same thing.
 
 So it seems reasonable that it should be implemented by the iommu API,
 not its users.
 
 The direct way to
 support multiple device assignment is to keep a dmar_domain list for
 each guest, where each device corresponds to one dmar_domain. But this
 will cost more memory because each dmar_domain has its own VT-d page
 table. Our method lets the devices which are under the same iommu and
 assigned to the same guest share the same VT-d page table.
 
 
 If we devolve this to the iommu API, the same io page table can be
 shared by all iommus, so long as they all use the same page table
 format. 

I don't understand how to handle this with the iommu API. Let me explain
my thoughts more clearly:

VT-d spec says: 
Context-entries programmed with the same domain identifier must
always reference the same address translation structure (through the ASR
field). Similarly, context-entries referencing the same address
translation structure must be programmed with the same domain id. 

In the native VT-d driver, dmar_domain is per device and has its own
VT-d page table, which is dynamically set up before each DMA. So it is
impossible for the same VT-d page table to be shared by all iommus.
Moreover, different iommus in a system may have different page table
levels. I think it's enough that the iommu API tells us the iommu of a
device.

Whereas on the KVM side, the same VT-d page table can be shared by the
devices which are under the same iommu and assigned to the same guest,
because all of the guest's memory is statically mapped in the VT-d page
table. But this requires wrapping dmar_domain; this patch wraps it with
a reference count so that multiple devices can relate to the same
dmar_domain.

This patch already adds an API (intel_iommu_device_get_iommu()) in
intel-iommu.c, which returns the iommu of a device.

Regards,
Weidong


Re: [PATCH] [RESEND] VT-d: Support multiple device assignment to one guest

2008-10-08 Thread Avi Kivity
Han, Weidong wrote:
 If we devolve this to the iommu API, the same io page table can be
 shared by all iommus, so long as they all use the same page table
 format. 
 

 I don't understand how to handle this with the iommu API. Let me
 explain my thoughts more clearly:

 VT-d spec says: 
   Context-entries programmed with the same domain identifier must
 always reference the same address translation structure (through the ASR
 field). Similarly, context-entries referencing the same address
 translation structure must be programmed with the same domain id. 

 In the native VT-d driver, dmar_domain is per device and has its own
 VT-d page table, which is dynamically set up before each DMA. So it is
 impossible for the same VT-d page table to be shared by all iommus.
 Moreover, different iommus in a system may have different page table
 levels.

Right.  This use case is in essence to prevent unintended sharing.  It
is also likely to have low page table height, since dma sizes are
relatively small.

 I think it's enough that the iommu API tells us the iommu of a
 device.
   

While this is tangential to our conversation, why?  Even for the device
driver use case, this only makes the API more complex.  If the API hides
the existence of multiple iommus, it's easier to use and harder to make
a mistake.

 Whereas on the KVM side, the same VT-d page table can be shared by the
 devices which are under the same iommu and assigned to the same guest,
 because all of the guest's memory is statically mapped in the VT-d page
 table. But this requires wrapping dmar_domain; this patch wraps it with
 a reference count so that multiple devices can relate to the same
 dmar_domain.

 This patch already adds an API (intel_iommu_device_get_iommu()) in
 intel-iommu.c, which returns the iommu of a device.

There is a missed optimization here.  Suppose we have two devices each
under a different iommu.  With the patch, each will be in a different
dmar_domain and so will have a different page table.  The amount of
memory used is doubled.

Suppose the iommu API hides the existence of multiple iommus.  You
allocate a translation and add devices to it.  When you add a device,
the iommu API checks which iommu is needed and programs it accordingly,
but only one io page table is used.

The other benefit is that iommu developers understand these issues while
kvm developers don't, so it's best managed by the iommu API.  This way,
if things change (as usual, becoming more complicated), the iommu
developers can make the changes in their code and hide the complexity
from kvm or other users.

I'm probably (badly) duplicating Joerg's iommu API here, but this is how
it could go:

iommu_translation_create() - creates an iommu translation object; this
allocates the page tables but doesn't do anything with them
iommu_translation_map() - adds pages to the translation
iommu_translation_attach() - attach a device to the translation; this
locates the iommu and programs it
_detach(), _unmap(), and _free() undo these operations.
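
A hypothetical caller (signatures guessed) for two devices behind
different iommus would then look like:

/* Hypothetical usage of the API sketched above: one shared io page
 * table, each attach programs whichever iommu the device sits behind. */
static struct iommu_translation *assign_pair(struct pci_dev *a,
                                             struct pci_dev *b,
                                             u64 iova, u64 pfn, long npages)
{
        struct iommu_translation *t = iommu_translation_create();

        /* Map all guest memory once into the already-allocated tables. */
        iommu_translation_map(t, iova, pfn, npages);

        iommu_translation_attach(t, a);
        iommu_translation_attach(t, b);
        return t;
}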

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCH] [RESEND] VT-d: Support multiple device assignment to one guest

2008-10-07 Thread Avi Kivity

Han, Weidong wrote:

[Rebased the patch because my mmio patch (commit: 0d679782) was checked
in]

From 9e68fc762358cc44cfec3968ac5ec65324ce04d7 Mon Sep 17 00:00:00 2001
From: Weidong Han [EMAIL PROTECTED]
Date: Mon, 6 Oct 2008 14:02:18 +0800
Subject: [PATCH] Support multiple device assignment to one guest

The current VT-d patches in kvm only support assigning one device to one
guest, because dmar_domain is per device.

In order to support multiple device assignment, this patch wraps
dmar_domain with a reference count (kvm_vtd_domain), and also adds a
pointer in kvm_assigned_dev_kernel to link to a kvm_vtd_domain.

Each dmar_domain owns one VT-d page table; in order to reduce the number
of page tables and improve IOTLB utilization, the devices assigned to
the same guest and under the same IOMMU share the same kvm_vtd_domain.
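
In outline, the wrapper looks like the sketch below (dev_count is quoted
from the patch in the next message; the other field is an assumption):

struct kvm_vtd_domain {
        int dev_count;                  /* number of assigned devices */
        struct dmar_domain *domain;     /* shared per-guest, per-iommu */
};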

  


I don't understand this.  If we have one dmar domain per guest, why do
we need reference counting at all?


We can create the dmar domain when we assign the first device, and 
destroy it when we deassign the last device, but otherwise I don't see a 
need for changes.  Particularly I don't understand this:



@@ -351,7 +351,6 @@ struct kvm_arch{
 */
struct list_head active_mmu_pages;
struct list_head assigned_dev_head;
-   struct dmar_domain *intel_iommu_domain;
struct kvm_pic *vpic;
struct kvm_ioapic *vioapic;
struct kvm_pit *vpit;
@@ -305,6 +310,7 @@ struct kvm_assigned_dev_kernel {
int irq_requested;
struct pci_dev *dev;
struct kvm *kvm;
+   struct kvm_vtd_domain *vtd_domain;
 };


Oh, I see it now.  Different devices may need to go under different iommus.

This really feels like it should be handled by the iommu API.  Users 
shouldn't need to bother with it.


Joerg, can your dma api handle this?

--
error compiling committee.c: too many arguments to function



Re: [PATCH] [RESEND] VT-d: Support multiple device assignment to one guest

2008-10-07 Thread Avi Kivity

Zhang, Xiantao wrote:

Han, Weidong wrote:
  

 #ifdef CONFIG_DMAR
 int intel_iommu_found(void);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 73b7c52..7a3e1b6 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -293,6 +293,11 @@ struct kvm_irq_ack_notifier {
void (*irq_acked)(struct kvm_irq_ack_notifier *kian);
 };

+struct kvm_vtd_domain {
+   int dev_count;  /* number of assigned devices */



Are atomic operations needed for this field?
  


Probably not, since it is protected by the kvm lock.
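
A minimal sketch of the reasoning, assuming every attach/detach path
already runs under kvm->lock:

mutex_lock(&kvm->lock);
vtd_domain->dev_count++;        /* plain int: serialized by kvm->lock */
mutex_unlock(&kvm->lock);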

--
error compiling committee.c: too many arguments to function
