On 08/08/2017 08:06 PM, Dr. David Alan Gilbert wrote:
* Alexey Perevalov (a.pereva...@samsung.com) wrote:
On 06/28/2017 10:00 PM, Dr. David Alan Gilbert (git) wrote:
From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
Clear the area and turn off THP.
Signed-off-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
---
contrib/libvhost-user/libvhost-user.c | 32 ++++++++++++++++++++++++++++++--
1 file changed, 30 insertions(+), 2 deletions(-)
diff --git a/contrib/libvhost-user/libvhost-user.c
b/contrib/libvhost-user/libvhost-user.c
index 0658b6e847..ceddeac74f 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -451,11 +451,39 @@ vu_set_mem_table_exec(VuDev *dev, VhostUserMsg *vmsg)
}
if (dev->postcopy_listening) {
+ int ret;
/* We should already have an open ufd need to mark each memory
* range as ufd.
- * Note: Do we need any madvises? Well it's not been accessed
- * yet, still probably need no THP to be safe, discard to be safe?
*/
+
+ /* Discard any mapping we have here; note I can't use MADV_REMOVE
+ * or fallocate to make the hole since I don't want to lose
+ * data that's already arrived in the shared process.
+ * TODO: How to do hugepage
+ */
Hi, David, frankly saying, I stuck with my solution, and I have also another
issues,
but here I could suggest solution for hugepages. I think we could transmit a
received pages
bitmap in VHOST_USER_SET_MEM_TABLE (VhostUserMemoryRegion), but it will
raise a compatibility issue,
or introduce special message type for that and send it before
VHOST_USER_SET_MEM_TABLE.
So it will be possible to do fallocate on received bitmap basis, just skip
already copied pages.
If you wish, I could send patches, rebased on yours, for doing it.
What we found works is that actually we don't need to do a discard -
since we've only just done the mmap of the arena, nothing will be
occupying it on the shared client, so we don't need to discard.
Looks like yes, I checked on kernel from Andrea's git,
there is any more EEXIST error in case when client doesn't
fallocate.
We've had a postcopy migrate work now, with a few hacks we're still
cleaning up, both on vhost-user-bridge and dpdk; so I'll get this
updated and reposted.
In you patch series vring is disabling in case of VHOST_USER_GET_VRING_BASE.
It's being called when vhost-user server want's to stop vring.
QEMU is enabling vring as soon as virtual machine is started, so I
didn't see
explicit vring disabling for migrating VRING.
So migrating VRING is protected just by uffd_register, isn't it? And PMD
thread (any
vhost-user thread which accessing migrating VRING) will wait page
copying in this case,
right?
Dave
+ ret = madvise((void *)dev_region->mmap_addr,
+ dev_region->size + dev_region->mmap_offset,
+ MADV_DONTNEED);
+ if (ret) {
+ fprintf(stderr,
+ "%s: Failed to madvise(DONTNEED) region %d: %s\n",
+ __func__, i, strerror(errno));
+ }
+ /* Turn off transparent hugepages so we dont get lose wakeups
+ * in neighbouring pages.
+ * TODO: Turn this backon later.
+ */
+ ret = madvise((void *)dev_region->mmap_addr,
+ dev_region->size + dev_region->mmap_offset,
+ MADV_NOHUGEPAGE);
+ if (ret) {
+ /* Note: This can happen legally on kernels that are configured
+ * without madvise'able hugepages
+ */
+ fprintf(stderr,
+ "%s: Failed to madvise(NOHUGEPAGE) region %d: %s\n",
+ __func__, i, strerror(errno));
+ }
struct uffdio_register reg_struct;
/* Note: We might need to go back to using mmap_addr and
* len + mmap_offset for * huge pages, but then we do hope not to
--
Best regards,
Alexey Perevalov
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
--
Best regards,
Alexey Perevalov