On 2019/9/19 10:06 PM, Michael S. Tsirkin wrote:
On Thu, Sep 19, 2019 at 05:37:48PM +0800, Jason Wang wrote:
On 2019/9/19 3:16 PM, Tian, Kevin wrote:
+Paolo to help clarify here.

From: Jason Wang [mailto:jasow...@redhat.com]
Sent: Thursday, September 19, 2019 2:32 PM


On 2019/9/19 2:17 PM, Yan Zhao wrote:
On Thu, Sep 19, 2019 at 02:09:53PM +0800, Jason Wang wrote:
On 2019/9/19 1:28 PM, Yan Zhao wrote:
On Thu, Sep 19, 2019 at 09:05:12AM +0800, Jason Wang wrote:
On 2019/9/18 4:37 PM, Tian, Kevin wrote:
From: Jason Wang [mailto:jasow...@redhat.com]
Sent: Wednesday, September 18, 2019 2:10 PM

Note that the HVA to GPA mapping is not a 1:1 mapping. One HVA range
could be mapped to several GPA ranges.
This is fine. Currently vfio_dma maintains the IOVA->HVA mapping.
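
For reference, the IOVA->HVA mapping mentioned here is kept per range by
the type1 IOMMU backend. A trimmed sketch of the relevant fields (the real
struct in drivers/vfio/vfio_iommu_type1.c has more members):

    /* Trimmed sketch of struct vfio_dma, not the full definition. */
    struct vfio_dma {
            struct rb_node  node;   /* kept in an rbtree keyed by iova */
            dma_addr_t      iova;   /* device (IO virtual) address */
            unsigned long   vaddr;  /* process virtual address (HVA) */
            size_t          size;   /* size of the mapping in bytes */
            int             prot;   /* IOMMU_READ / IOMMU_WRITE */
    };

Note there is no GPA recorded here, which is what the extra field proposed
later in this thread would add.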

Btw, under what condition is HVA->GPA not a 1:1 mapping? I didn't
realize it.
I don't remember the details, e.g. memory region alias? And neither KVM
nor the KVM API forbids this, if my memory is correct.

I checked https://qemu.weilnetz.de/doc/devel/memory.html, which
provides an example of an aliased layout. However, its aliasing is all
1:1, not N:1. From the guest's p.o.v. every writable GPA implies a
unique location. Why would we hit a situation where multiple
writable GPAs are mapped to the same HVA (i.e. the same physical
memory location)?
I don't know, I just want to say the current API does not forbid this. So we
probably need to take care of it.
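
To make the corner case concrete: nothing in KVM_SET_USER_MEMORY_REGION stops
userspace from registering two slots with the same userspace_addr but
different guest_phys_addr. A hypothetical sketch (slot numbers and addresses
made up; vm_fd is assumed to be a VM fd obtained via KVM_CREATE_VM):

    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <linux/kvm.h>

    /* Hypothetical example: the same 1MB host buffer exposed at two GPAs. */
    static void alias_one_hva_to_two_gpas(int vm_fd)
    {
            void *hva = mmap(NULL, 0x100000, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

            struct kvm_userspace_memory_region r1 = {
                    .slot            = 0,
                    .guest_phys_addr = 0x40000000,          /* GPA1 */
                    .memory_size     = 0x100000,
                    .userspace_addr  = (unsigned long)hva,
            };
            struct kvm_userspace_memory_region r2 = r1;

            r2.slot            = 1;
            r2.guest_phys_addr = 0x80000000;                /* GPA2, same HVA */

            /* KVM only rejects overlapping GPA ranges, not overlapping HVAs,
             * so both calls succeed and HVA->GPA becomes 1:N. */
            ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &r1);
            ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &r2);
    }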

Yes, at the KVM API level it does not forbid two slots from having the same
HVA (slot->userspace_addr).
But
(1) there's only one kvm instance for each VM for each qemu process;
(2) all ramblock->host values (corresponding to HVA and slot->userspace_addr)
in one qemu process are non-overlapping, as they are obtained from mmap();
(3) qemu ensures two kvm slots will not point to the same section of
one ramblock.
So, as long as the kvm instance is not shared between two processes, and
there's no bug in qemu, we can assure that HVA to GPA is 1:1.
Well, you leave this API open to userspace, so you can't assume qemu is the
only user or rely on its behavior. If you had intended that, you should have
limited it at the API level instead of leaving the window open for them.


But even if there are two processes operating on the same kvm instance
and manipulating memory slots, adding an extra GPA alongside the current
IOVA & HVA in the VFIO_IOMMU_MAP_DMA ioctl can still let the driver know
the right IOVA->GPA mapping, right?
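
For reference, what is being proposed is a GPA next to the existing
vaddr/iova in the map ioctl. A hypothetical sketch of such an extended uapi
struct (the gpa field and the VFIO_DMA_MAP_FLAG_GPA flag are made up for
illustration; the remaining fields match include/uapi/linux/vfio.h):

    struct vfio_iommu_type1_dma_map {
            __u32   argsz;
            __u32   flags;
    #define VFIO_DMA_MAP_FLAG_READ   (1 << 0)
    #define VFIO_DMA_MAP_FLAG_WRITE  (1 << 1)
    /* hypothetical: tells the kernel the gpa field below is valid */
    #define VFIO_DMA_MAP_FLAG_GPA    (1 << 2)
            __u64   vaddr;  /* process virtual address (HVA) */
            __u64   iova;   /* IO virtual address */
            __u64   size;   /* size of mapping in bytes */
            __u64   gpa;    /* hypothetical: GPA backing vaddr */
    };

With this, the driver could record IOVA->GPA directly, but it would only ever
see the single GPA that userspace passed in, which is the concern below.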
It looks fragile. Consider an HVA that was mapped to both GPA1 and GPA2. The
guest maps IOVA to GPA2, so we have IOVA/GPA2/HVA in the new ioctl and then
log through GPA2. If userspace is trying to sync through GPA1, it will
miss the dirty page. So for safety we need to log both GPA1 and GPA2 (see
what has been done in log_write_hva() in vhost.c). The only way to do
that is to maintain an independent HVA to GPA mapping like what KVM or
vhost does.
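
For reference, the idea behind log_write_hva() in vhost.c is roughly the
following (a simplified sketch, not the real vhost code; gpa_range and
log_dirty_gpa() are illustrative names, not actual vhost structures or
functions): when a write is only known by its HVA, walk every GPA range that
covers that HVA and log the overlapping part in each, so no alias is missed.

    #include <stdint.h>

    struct gpa_range {
            uint64_t gpa;   /* start of the guest-physical range */
            uint64_t hva;   /* host virtual address it maps to   */
            uint64_t size;
    };

    void log_dirty_gpa(uint64_t gpa, uint64_t len);  /* provided elsewhere */

    void log_dirty_hva(const struct gpa_range *ranges, int n,
                       uint64_t hva, uint64_t len)
    {
            for (int i = 0; i < n; i++) {
                    uint64_t start, end;

                    if (hva + len <= ranges[i].hva ||
                        ranges[i].hva + ranges[i].size <= hva)
                            continue;       /* no overlap with this range */

                    start = hva > ranges[i].hva ? hva : ranges[i].hva;
                    end = (hva + len < ranges[i].hva + ranges[i].size) ?
                           hva + len : ranges[i].hva + ranges[i].size;

                    /* log [start, end) at its GPA inside this range */
                    log_dirty_gpa(ranges[i].gpa + (start - ranges[i].hva),
                                  end - start);
            }
    }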

Why should GPA1 and GPA2 both be dirty?
Even if they have the same HVA due to overlapping virtual address space in
two processes, they still correspond to two physical pages.
I don't get your meaning :)
The point is to not leave any corner case that is hard to debug or fix in
the future.

Let's just start with a single process: the API allows userspace to map an
HVA to both GPA1 and GPA2. Since it knows GPA1 and GPA2 are equivalent,
it's ok to sync just through GPA1. That means if you only log GPA2, it
won't work.

I noted KVM itself doesn't consider such a situation (one HVA mapped
to multiple GPAs) when doing its dirty page tracking. If you look at
kvm_vcpu_mark_page_dirty, it simply finds the unique memslot which
contains the dirty gfn and then sets the dirty bit within that slot. It
doesn't attempt to walk all memslots to find any other GPA which
may be mapped to the same HVA.
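
For reference, that KVM path looks roughly like this (a condensed, trimmed
paraphrase of virt/kvm/kvm_main.c around the time of this thread, not a
verbatim copy): one gfn resolves to one memslot, and only that slot's bitmap
is touched.

    static void mark_page_dirty_in_slot(struct kvm_memory_slot *memslot,
                                        gfn_t gfn)
    {
            if (memslot && memslot->dirty_bitmap) {
                    unsigned long rel_gfn = gfn - memslot->base_gfn;

                    set_bit_le(rel_gfn, memslot->dirty_bitmap);
            }
    }

    void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn)
    {
            /* finds the single slot whose GPA range contains gfn */
            struct kvm_memory_slot *memslot =
                    kvm_vcpu_gfn_to_memslot(vcpu, gfn);

            mark_page_dirty_in_slot(memslot, gfn);
    }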

So there must be some disconnect here. Let's hear from Paolo first and
understand the rationale behind such a situation.

Neither did vhost when IOTLB is disabled. And cc Michael, who pointed out
this issue at the beginning.

Thanks


Thanks
Kevin
Yes, we fixed it with a kind of workaround; at the time I proposed
a new interface to fix it fully. I don't think we ever got around
to implementing it - right?


Paolo said userspace just needs to sync through all GPAs, so my understanding is that the workaround is ok but redundant, as is the API you proposed. Anything I missed?

Thanks
