On Mon, May 12, 2025 at 03:16:34PM +0000, Chaney, Ben wrote:
> Hello,
> 
> When live migrating to a destination host with pmem there is a very
> long downtime where the guest is paused. In some cases, this can be as
> high as 5 minutes, compared to less than one second in the good case.
> 
> Profiling suggests very high activity in this code path:
> 
> ffffffffa2956de6 clean_cache_range+0x26 ([kernel.kallsyms])
> ffffffffa2359b0f dax_writeback_mapping_range+0x1ef ([kernel.kallsyms])
> ffffffffc0c6336d ext4_dax_writepages+0x7d ([kernel.kallsyms])
> ffffffffa2242dac do_writepages+0xbc ([kernel.kallsyms])
> ffffffffa2235ea6 filemap_fdatawrite_wbc+0x66 ([kernel.kallsyms])
> ffffffffa223a896 __filemap_fdatawrite_range+0x46 ([kernel.kallsyms])
> ffffffffa223af73 file_write_and_wait_range+0x43 ([kernel.kallsyms])
> ffffffffc0c57ecb ext4_sync_file+0xfb ([kernel.kallsyms])
> ffffffffa228a331 __do_sys_msync+0x1c1 ([kernel.kallsyms])
> ffffffffa2997fe6 do_syscall_64+0x56 ([kernel.kallsyms])
> ffffffffa2a00126 entry_SYSCALL_64_after_hwframe+0x6e ([kernel.kallsyms])
> 11ec5f msync+0x4f (/usr/lib/x86_64-linux-gnu/libc.so.6)
> 675ada qemu_ram_msync+0x8a (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> 6873c7 xbzrle_load_cleanup+0x37 (inlined)
> 6873c7 ram_load_cleanup+0x37 (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> 4ff375 qemu_loadvm_state_cleanup+0x55 (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> 500f0b qemu_loadvm_state+0x15b (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> 4ecf85 process_incoming_migration_co+0x95 (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> 8b6412 qemu_coroutine_self+0x2 (/usr/local/akamai/qemu/bin/qemu-system-x86_64)
> ffffffffffffffff [unknown] ([unknown])
> 
> I was able to resolve the performance issue by removing the call to
> qemu_ram_block_writeback in ram_load_cleanup. This causes the performance
> to return to normal. It looks like this code path was initially added to
> ensure the memory was synchronized if the persistent memory region is
> backed by an NVDIMM device. Does it serve any purpose if pmem is instead
> backed by standard DRAM?
> 
> I'm also curious about the intended use of this code path in the NVDIMM
> case. It seems like it would run into a few issues. This on its own seems
> insufficient to restore the VM state if the host crashes after a live
> migration. The memory region being synced is only the guest memory. It
> doesn't save the driver state on the host side. Also, once the migration
> completes, the guest can redirty the pages. If the host crashes after that
> point, the guest memory will still be in an inconsistent state unless the
> crash is exceptionally well timed. Does anyone have any insight into why
> this sync operation was introduced?
> 
> Thank you,
> Ben Chaney
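For context, that cleanup path boils down to one msync(MS_SYNC) over the
entire pmem-backed RAM block, which the trace above shows ending up in
dax_writeback_mapping_range()/clean_cache_range(), i.e. a cache flush of
every page the incoming migration just dirtied. The cost is easy to
reproduce outside QEMU with a standalone sketch like the one below
(illustration only, not QEMU code; the /mnt/pmem/testfile path and the
1 GiB size are placeholders for a file on a DAX mount):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    /* Placeholder path: any file on a DAX (fsdax) mount will do. */
    const char *path = argc > 1 ? argv[1] : "/mnt/pmem/testfile";
    size_t len = 1UL << 30;                     /* 1 GiB, adjust to taste */

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0 || ftruncate(fd, len) < 0) {
        perror("open/ftruncate");
        return 1;
    }

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* Dirty every page, the way an incoming migration dirties all guest RAM. */
    memset(p, 0xab, len);

    /* Time the msync(MS_SYNC) that shows up under qemu_ram_msync() above. */
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (msync(p, len, MS_SYNC) < 0)
        perror("msync");
    clock_gettime(CLOCK_MONOTONIC, &t1);

    printf("msync of %zu bytes took %.3f s\n", len,
           (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);

    munmap(p, len);
    close(fd);
    return 0;
}

The sync time grows with the amount of dirtied memory, which is consistent
with the downtime scaling up to minutes for large guests.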
Was added here:

commit 56eb90af39abf66c0e80588a9f50c31e7df7320b
Author: Junyan He <junyan...@intel.com>
Date:   Wed Jul 18 15:48:03 2018 +0800

    migration/ram: ensure write persistence on loading all data to PMEM.

    Because we need to make sure the pmem kind memory data is synced
    after migration, we choose to call pmem_persist() when the migration
    finish. This will make sure the data of pmem is safe and will not
    lose if power is off.

    Signed-off-by: Junyan He <junyan...@intel.com>
    Reviewed-by: Stefan Hajnoczi <stefa...@redhat.com>
    Reviewed-by: Igor Mammedov <imamm...@redhat.com>
    Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
    Signed-off-by: Michael S. Tsirkin <m...@redhat.com>

it kind of sounded reasonable ... but I don't remember.
Also CC Haozhong Zhang who worked in this area.