On 10/08/2013 10:31 AM, Michal Privoznik wrote:
> I've noticed a SIGSEGV-ing libvirtd on the destination when the qemu
> died too quickly = in Prepare phase. What is happening here is:
>
> 1) [Thread 3493] We are in qemuMigrationPrepareAny() and calling
> qemuProcessStart() which subsequently calls qemuProcessWaitForMonitor()
> and qemuConnectMonitor(). So far so good. The qemuMonitorOpen()
> succeeds, however switching monitor to QMP mode fails as qemu died
> meanwhile. That is qemuMonitorSetCapabilities() returns -1.
>
> 2013-10-08 15:54:10.629+0000: 3493: debug : qemuMonitorSetCapabilities:1356 :
> mon=0x14a53da0
> 2013-10-08 15:54:10.630+0000: 3493: debug : qemuMonitorJSONCommandWithFd:262
> : Send command '{"execute":"qmp_capabilities","id":"libvirt-1"}' for write
> with FD -1
> 2013-10-08 15:54:10.630+0000: 3493: debug : virEventPollUpdateHandle:147 :
> EVENT_POLL_UPDATE_HANDLE: watch=17 events=13
> ...
> 2013-10-08 15:54:10.631+0000: 3493: debug : qemuMonitorSend:956 :
> QEMU_MONITOR_SEND_MSG: mon=0x14a53da0
> msg={"execute":"qmp_capabilities","id":"libvirt-1"}
> fd=-1
> 2013-10-08 15:54:10.631+0000: 3262: debug : virEventPollRunOnce:641 : Poll
> got 1 event(s)
>
> 2) [Thread 3262] The event loop is trying to do the talking to monitor.
> However, qemu is dead already, remember?
>
> 2013-10-08 15:54:13.436+0000: 3262: error : qemuMonitorIORead:551 : Unable to
> read from monitor: Connection reset by peer
> 2013-10-08 15:54:13.516+0000: 3262: debug : virFileClose:90 : Closed fd 25
> ...
> 2013-10-08 15:54:13.533+0000: 3493: debug : qemuMonitorSend:968 : Send
> command resulted in error internal error: early end of file from monitor:
> possible problem:
>
> 3) [Thread 3493] qemuProcessStart() failed. No big deal. Go to the
> 'endjob' label and subsequently to the 'cleanup'. Since the domain is
> not persistent and ret is -1, the qemuDomainRemoveInactive() is called.
> This has an (unpleasant) effect of virObjectUnref()-in the @vm object.
> Unpleasant because the event loop which is about to trigger EOF callback
> still holds a pointer to the @vm (not the reference). See the valgrind
> output below.
>
> 4) [Thread 3262] So the even loop starts triggering EOF:s/even/event/ > > 2013-10-08 15:54:13.542+0000: 3262: debug : qemuMonitorIO:729 : Triggering > EOF callback > 2013-10-08 15:54:13.543+0000: 3262: debug : qemuProcessHandleMonitorEOF:294 : > Received EOF on 0x14549110 'migt10' > > And the monitor is cleaned up. This results in calling > qemuProcessHandleMonitorEOF with the @vm pointer passed. The pointer is > kept in qemuMonitor struct. > > ==3262== Thread 1: > ==3262== Invalid read of size 4 > > The mon->vm is set in qemuMonitorOpenInternal() which is the correct > place to increase @vm ref counter. The correct place to decrease the ref > counter is then qemuMonitorDispose(). However, in order to avoid cyclic > calling of dispose functions, we must use hack, and set priv->mon to > NULL. The qemuDomainObjPrivateFree won't call us again then. This > describes the last two chunks of the commit. The others are there for > handling virObjectRef() and virObjectUnref() correctly. So far it > doesn't work well with NULL virClassPtr :) > > Signed-off-by: Michal Privoznik <[email protected]> > --- > > Yet another one of those patches, where chasing it down took a half of a day > and description what went wrong is longer than a few lines of fix. Yeah, we've had a few of those lately. > > src/qemu/qemu_capabilities.c | 14 ++++++++++---- > src/qemu/qemu_monitor.c | 12 ++++++++++++ > 2 files changed, 22 insertions(+), 4 deletions(-) > @@ -781,6 +792,7 @@ qemuMonitorOpenInternal(virDomainObjPtr vm, > } > mon->fd = fd; > mon->hasSendFD = hasSendFD; > + virObjectRef(vm); > mon->vm = vm; Could be written: mon->vm = virObjectRef(vm); Overall, looks fine to me, but I'd feel more comfortable if you also got Dan to review it before pushing. ACK. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list [email protected] https://www.redhat.com/mailman/listinfo/libvir-list
