On 03/31/2017 11:30 AM, Chris Friesen wrote:
On 03/31/2017 11:21 AM, Chris Friesen wrote:

I ran tcpdump looking for TCP traffic between the two libvirtd processes, and
was unable to see any after several minutes.  So it doesn't look like there is
any regular keepalive messaging going on (/etc/libvirt/libvirtd.conf doesn't
specify any keepalive settings so we'd be using the defaults I think).  And yet
the TCP connection is stuck open.

Turns out I ran tcpdump in the wrong window... oops.  There is what appears to
be a keepalive exchange every 5 seconds.

I still don't understand why the connection wasn't taken down when qemu exited
on the destination host.
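
For reference on the keepalive defaults mentioned above: the comments in the stock libvirtd.conf list keepalive_interval = 5 and keepalive_count = 5, and describe the connection as being closed approximately keepalive_interval * (keepalive_count + 1) seconds after the last message received from the peer. The following is only a sketch of that arithmetic in Python; the function name is hypothetical and is not part of any libvirt API.

```python
# Sketch of the keepalive timeout arithmetic described in libvirtd.conf.
# keepalive_timeout() is a hypothetical helper, not a libvirt function.

DEFAULT_INTERVAL_S = 5  # keepalive_interval default, per libvirtd.conf comments
DEFAULT_COUNT = 5       # keepalive_count default, per libvirtd.conf comments


def keepalive_timeout(interval_s: int, count: int) -> int:
    """Approximate seconds of silence before the connection is closed."""
    return interval_s * (count + 1)


if __name__ == "__main__":
    timeout_s = keepalive_timeout(DEFAULT_INTERVAL_S, DEFAULT_COUNT)
    print(f"connection closed after about {timeout_s} s without a reply")
```

This timer only runs out when replies stop arriving. A peer libvirtd that keeps answering keepalive requests would not trip it, which would be consistent with the connection staying open after qemu exited while libvirtd on the destination was still running.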

One final update for now: I attached gdb to libvirtd on the source host and then killed libvirtd on the destination host. I saw the TCP connection get closed, and gdb showed this:

[Thread 0x7f8948ab3700 (LWP 4514) exited]

At this point "virsh" commands on the source host work as expected; it is no longer hung.

So it appears we have a number of factors contributing to the hang:
1) failure of migration in qemu
2) connection between hosts not getting torn down when migration fails
3) the libvirtd thread managing the migration on the source side appears to sleep indefinitely while holding some resource, which causes the apparent hang when we try to do other operations
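
Factor 3 can be illustrated with a toy model. This is a minimal Python sketch, not libvirt code: the lock stands in for the unidentified "resource of some sort", the event stands in for the TCP connection being closed, and all names are hypothetical. It shows that a worker waiting with no timeout while holding a lock blocks every other operation that needs the lock, and that things recover as soon as the wait is released.

```python
import threading


def simulate_hang(probe_timeout_s: float = 0.2):
    """Toy model: a worker blocks forever while holding a shared lock."""
    job_lock = threading.Lock()        # stand-in for the held resource
    holding = threading.Event()        # set once the worker owns the lock
    peer_closed = threading.Event()    # stand-in for the connection closing

    def migration_worker():
        with job_lock:
            holding.set()
            # Wait with no timeout: nothing wakes this thread until the
            # "connection" is closed from the outside.
            peer_closed.wait()

    def other_operation_succeeds() -> bool:
        # Stand-in for a "virsh" command that needs the same resource.
        if job_lock.acquire(timeout=probe_timeout_s):
            job_lock.release()
            return True
        return False

    worker = threading.Thread(target=migration_worker, daemon=True)
    worker.start()
    holding.wait()

    while_stuck = other_operation_succeeds()   # worker still holds the lock
    peer_closed.set()                          # like killing the remote libvirtd
    worker.join()
    after_close = other_operation_succeeds()   # lock has been released

    return while_stuck, after_close


if __name__ == "__main__":
    print(simulate_hang())  # prints (False, True)
```

In this model the fix would be either to bound the wait with a timeout or to make sure the failure of the migration wakes the waiting thread. Which of those applies to libvirtd is not established by the observations above.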

Chris

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list