I've repeated the experiment without any shared storage, so that
eliminates GlusterFS as a suspect.
server-a# virsh migrate --live --persistent --undefinesource --copy-storage-inc guest qemu+tls://server-b/system
Result: the guest, which had about a week of uptime, froze solid for 27
seconds after the migration. I know the freeze comes after the migration
because, while frozen, the guest is already running on the destination
server, using up a full core, and is no longer present on the originating
server. CPU usage goes back to normal once the guest becomes responsive
again.
Just before the migration, NTP was perfectly locked to well within
100us. Right after the machine became responsive again, the NTP status
(offset column in milliseconds) shows the machine simply lost more than
27 seconds:
root@guest:~# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*cl0             xx.xx.xx.xx      3 u   15   16  377    0.457  27388.3   0.100
 cl1             xx.xx.xx.xx      3 u   13   16  377    0.429  27388.4   0.178
root@guest:~# uptime
16:03:30 up 8 days, 23:45, 1 user, load average: 0.02, 0.02, 0.05
During these 27 seconds, the guest did not respond to any network
activity or on the (virtual) console. There is no mention of clock jumps
or anything else in dmesg this time.
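
For anyone trying to reproduce this, the lost time can be read straight
from the offset column of ntpq -p, which is in milliseconds (27388.3 ms
is roughly 27.4 s). Below is a minimal sketch of a helper that does the
conversion. The function name ntp_offset_seconds is made up for
illustration, and it assumes the usual ten-column ntpq -p layout with
two header lines:

```shell
# Hypothetical helper: read `ntpq -p` output on stdin and print each
# peer name with its offset converted from milliseconds to seconds.
# Assumes the standard layout: two header lines, offset in column 9.
ntp_offset_seconds() {
    awk 'NR > 2 && NF >= 10 {
        peer = $1
        # Drop the leading tally character (*, +, -, ...) if present.
        sub(/^[^A-Za-z0-9]/, "", peer)
        printf "%s %.3f\n", peer, $9 / 1000
    }'
}
```

Inside the guest it would be used as: ntpq -p | ntp_offset_seconds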
Note that I have now reproduced this on two different pairs of machines:
our original KVM cluster, and two compute nodes (different hardware) to
test this with a supported Ubuntu release.
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1297218
Title:
guest hangs after live migration due to tsc jump