On Wed, Dec 17, 2014 at 01:58:28PM +0100, Ján Tomko wrote:
> On 12/17/2014 01:21 PM, Daniel P. Berrange wrote:
> > On Wed, Dec 17, 2014 at 01:18:41PM +0100, Michal Privoznik wrote:
> >> On 16.12.2014 17:41, Ján Tomko wrote:
> >>> The domain might disappear during the time in monitor when
> >>> the virDomainObjPtr is unlocked, so the caller needs to check
> >>> if it's still alive.
> >>>
> >>> Introduce a helper function for the callers that want to report
> >>> an error in this case.
> >>> ---
> >>> src/qemu/qemu_domain.c | 17 +++++++++++++++++
> >>> src/qemu/qemu_domain.h | 3 +++
> >>> 2 files changed, 20 insertions(+)
> >>>
> >>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> >>> index 8dd427a..d074429 100644
> >>> --- a/src/qemu/qemu_domain.c
> >>> +++ b/src/qemu/qemu_domain.c
> >>> @@ -1606,6 +1606,23 @@ void qemuDomainObjExitMonitor(virQEMUDriverPtr
> >>> driver,
> >>> qemuDomainObjExitMonitorInternal(driver, obj);
> >>> }
> >>>
> >>> +/* obj must NOT be locked before calling
> >>> + *
> >>> + * Same as qemuDomainObjExitMonitor, but reports an error and
> >>> + * returns -1 if the domain is no longer alive after exiting the monitor
> >>
> >> Can you state explicitly that this function needs to be called whenever the
> >> domain object or vm->def, or privdata (any runtime info held within vm in
> >> general) is used after ExitMonitor()?
> >
>
> Right, I can add this to qemu/THREADS.txt as well.
>
> > Are there any cases where that is not true ? eg could we just avoid
> > introducing this new function, by making the existing funtion check
> > liveliness every time. That would avoid the class of bugs you outline
> > above entirely.
>
> I was worried about backports, but new code checking the return value would
> fail to compile with the void function in older libvirt.
We should add ATTRIBUTE_RETURNCHECK to force all callers to check
the return status too
> There are functions that only call EndJob and return after exiting the
> monitor. But if the domain crashed before the API even returned, can we really
> claim success?
There are some APIs where we do
if (running)
do x
else
do y
even with those though, I think we're justified in reporting an error
if the domain stopped at any point in the 'do x' code path - after all
we're already doing to be doing that if it happens while we're issuing
a mointor API call. So this endjob check is only extending that window
of opportunity for errors by a couple more instructions.
Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
--
libvir-list mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/libvir-list