Avi Kivity wrote:
> Han, Weidong wrote:
>>> If we devolve this to the iommu API, the same io page table can be
>>> shared by all iommus, so long as they all use the same page table
>>> format. 
>>> 
>> 
>> I don't understand how to handle this by iommu API. Let me explain
>> my thoughts more clearly: 
>> 
>> VT-d spec says:
>>      Context-entries programmed with the same domain identifier must
>> always reference the same address translation structure (through the
>> ASR field). Similarly, context-entries referencing the same address
>> translation structure must be programmed with the same domain id.
>> 
>> In the native VT-d driver, dmar_domain is per device and has its own
>> VT-d page table, which is set up dynamically before each DMA. So it
>> is impossible for the same VT-d page table to be shared by all
>> iommus. Moreover, different iommus in the system may have different
>> page table levels.
> 
> Right.  This use case is in essence to prevent unintended sharing.  It
> is also likely to have low page table height, since dma sizes are
> relatively small.
> 
>> I think it's enough that the iommu API tells us which iommu a
>> device belongs to.
>> 
> 
> While this is tangential to our conversation, why?  Even for the
> device driver use case, this only makes the API more complex.  If the
> API hides the existence of multiple iommus, it's easier to use and
> harder to make a mistake.
> 
>> Whereas on the KVM side, the same VT-d page table can be shared by
>> devices that are under the same iommu and assigned to the same guest,
>> because all of the guest's memory is statically mapped in the VT-d
>> page table. But this requires wrapping dmar_domain; this patch wraps
>> it with a reference count so that multiple devices can refer to the
>> same dmar_domain.
>> This patch already adds an API (intel_iommu_device_get_iommu()) in
>> intel-iommu.c, which returns the iommu of a device.
> 
> There is a missed optimization here.  Suppose we have two devices each
> under a different iommu.  With the patch, each will be in a different
> dmar_domain and so will have a different page table.  The amount of
> memory used is doubled.

You cannot let two devices under different iommus share one
dmar_domain, because dmar_domain holds a pointer to a single iommu.

> 
> Suppose the iommu API hides the existence of multiple iommus.  You
> allocate a translation and add devices to it.  When you add a device,
> the iommu API checks which iommu is needed and programs it
> accordingly, but only one io page table is used.
> 
> The other benefit is that iommu developers understand these issues
> while kvm developers don't, so it's best managed by the iommu API. 
> This way if things change (as usual, becoming more complicated), the
> iommu can make the changes in their code and hide the complexity from
> kvm or other users.
> 
> I'm probably (badly) duplicating Joerg's iommu API here, but this is
> how it could go:
> 
> iommu_translation_create() - creates an iommu translation object; this
> allocates the page tables but doesn't do anything with them
> iommu_translation_map() - adds pages to the translation
> iommu_translation_attach() - attach a device to the translation; this
> locates the iommu and programs it
> _detach(), _unmap(), and _free() undo these operations.

In fact, the APIs exported for KVM VT-d already provide
create/map/attach/detach/free operations, though the iommu API names
you propose are more readable.

Because KVM's VT-d usage differs from native usage, it's inevitable
that we extend the native VT-d code to support KVM VT-d (such as
wrapping dmar_domain). Devices under different iommus cannot share the
same dmar_domain, and thus cannot share a VT-d page table. If we wanted
to handle this through the iommu APIs, I suspect we would need to
change a lot of the native VT-d driver code.

David/Jesse, what's your opinion?



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html