On 08/08/2017 08:06 PM, Dr. David Alan Gilbert wrote:
* Alexey Perevalov (a.pereva...@samsung.com) wrote:
On 06/28/2017 10:00 PM, Dr. David Alan Gilbert (git) wrote:
From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>

Clear the area and turn off THP.

Signed-off-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
   contrib/libvhost-user/libvhost-user.c | 32 ++++++++++++++++++++++++++++++--
   1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c
index 0658b6e847..ceddeac74f 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -451,11 +451,39 @@ vu_set_mem_table_exec(VuDev *dev, VhostUserMsg *vmsg)
           if (dev->postcopy_listening) {
+            int ret;
               /* We should already have an open ufd need to mark each memory
                * range as ufd.
-             * Note: Do we need any madvises? Well it's not been accessed
-             * yet, still probably need no THP to be safe, discard to be safe?
              */
+            /* Discard any mapping we have here; note I can't use MADV_REMOVE
+             * or fallocate to make the hole since I don't want to lose
+             * data that's already arrived in the shared process.
+             * TODO: How to do hugepage
+             */
Hi David, frankly speaking I'll stick with my solution (and I have another as
well), but here I can suggest a solution for hugepages. I think we could
transmit a received-pages bitmap in VHOST_USER_SET_MEM_TABLE
(VhostUserMemoryRegion), but that would raise a compatibility issue;
alternatively, we could introduce a special message type for it and send it
beforehand.
It would then be possible to do the fallocate based on the received-pages
bitmap, just skipping the already-copied pages.
If you wish, I could send patches, rebased on yours, that do this.
What we found works is that we don't actually need to do a discard:
since we've only just done the mmap of the arena, nothing will be
occupying it on the shared client, so there's nothing to discard.
Looks like yes; I checked on a kernel from Andrea's git, and
there is no more EEXIST error in the case when the client doesn't

We've had a postcopy migrate work now, with a few hacks we're still
cleaning up, both on vhost-user-bridge and dpdk; so I'll get this
updated and reposted.
In your patch series the vring is disabled on VHOST_USER_GET_VRING_BASE,
which is called when the vhost-user server wants to stop a vring.
QEMU enables the vrings as soon as the virtual machine is started, so I didn't
see explicit vring disabling for a migrating vring.
So a migrating vring is protected just by uffd_register, isn't it? And the PMD thread (or any vhost-user thread accessing the migrating vring) will wait for the page copy in this case,


+            ret = madvise((void *)dev_region->mmap_addr,
+                          dev_region->size + dev_region->mmap_offset,
+                          MADV_DONTNEED);
+            if (ret) {
+                fprintf(stderr,
+                        "%s: Failed to madvise(DONTNEED) region %d: %s\n",
+                        __func__, i, strerror(errno));
+            }
+            /* Turn off transparent hugepages so we don't lose wakeups
+             * in neighbouring pages.
+             * TODO: Turn this back on later.
+             */
+            ret = madvise((void *)dev_region->mmap_addr,
+                          dev_region->size + dev_region->mmap_offset,
+                          MADV_NOHUGEPAGE);
+            if (ret) {
+                /* Note: This can happen legally on kernels that are configured
+                 * without madvise'able hugepages
+                 */
+                fprintf(stderr,
+                        "%s: Failed to madvise(NOHUGEPAGE) region %d: %s\n",
+                        __func__, i, strerror(errno));
+            }
               struct uffdio_register reg_struct;
            /* Note: We might need to go back to using mmap_addr and
             * len + mmap_offset for huge pages, but then we do hope not to

Best regards,
Alexey Perevalov
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
