On Thu, Jul 20, 2017 at 5:09 PM, Markus Armbruster <arm...@redhat.com> wrote:
> Amador Pahim <apa...@redhat.com> writes:
>
>> On Thu, Jul 20, 2017 at 1:49 PM, Markus Armbruster <arm...@redhat.com> wrote:
>>> Amador Pahim <apa...@redhat.com> writes:
>>>
>>>> Current implementation is broken. It does not really test if the child
>>>> process is running.
>>>
>>> What usage exactly is broken by this? Got a reproducer for me?
>>
>> Problem is that 'returncode' is not set without a calling
>> poll()/wait()/communicate(), so it's only useful to test if the
>> process is running after such calls. But if we use 'poll()' instead,
>> it will, according to the docs, "Check if child process has
>> terminated. Set and return returncode attribute."
>>
>> Reproducer is:
>>
>> >>> import subprocess
>> >>> devnull = open('/dev/null', 'rb')
>> >>> p = subprocess.Popen(['qemu-system-x86_64', '-broken'],
>> stdin=devnull, stdout=devnull, stderr=devnull, shell=False)
>> >>> print p.returncode
>> None
>> >>> print p.poll()
>> 1
>> >>> print p.returncode
>> 1
>>
>>>> The Popen.returncode will only be set after by a poll(), wait() or
>>>> communicate(). If the Popen fails to launch a VM, the Popen.returncode
>>>> will not turn to None by itself.
>>>
>>> Hmm. What is the value of .returncode then?
>>
>> returncode starts with None and becomes the process exit code when the
>> process is over and one of that three methods is called (poll(),
>> wait() or communicate()).
>>
>> There's an error in my description though. The correct would be: "The
>> Popen.returncode will only be set after a call to poll(), wait() or
>> communicate(). If the Popen fails to launch a VM, the Popen.returncode
>> will not turn from None to the actual return code by itself."
>
> Suggest to add ", and is_running() continues to report True".
>
>>>> Instead of using Popen.returncode, let's use Popen.poll(), which
>>>> actually checks if child process has terminated.
>>>>
>>>> Signed-off-by: Amador Pahim <apa...@redhat.com>
>>>> Reviewed-by: Eduardo Habkost <ehabk...@redhat.com>
>>>> Reviewed-by: Fam Zheng <f...@redhat.com>
>>>> ---
>>>>  scripts/qemu.py | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/scripts/qemu.py b/scripts/qemu.py
>>>> index 880e3e8219..f0fade32bd 100644
>>>> --- a/scripts/qemu.py
>>>> +++ b/scripts/qemu.py
>>>> @@ -86,7 +86,7 @@ class QEMUMachine(object):
>>>>              raise
>>>>
>>>>      def is_running(self):
>>>> -        return self._popen and (self._popen.returncode is None)
>>>> +        return self._popen and (self._popen.poll() is None)
>>>>
>>>>      def exitcode(self):
>>>>          if self._popen is None:
>>>               return None
>>>           return self._popen.returncode
>>>
>>> Why is this one safe?
>>
>> Here it's used just to retrieve the value from the Popen.returncode.
>> It's not being used to check whether the process is running or not.
>
> If self._popen is not None, we return self._popen.returncode. It's None
> if .poll() etc. haven't been called. Can this happen? If not, why not?
> If yes, why is returning None then okay?

Yes, that can happen. This method does not return an up-to-date
returncode; it is just a wrapper around the attribute, whether or not
the attribute has been updated. I lack the background on why it was
coded that way, but from the API user's perspective I agree with you:
we should indeed return self._popen.poll() here. Fixing that.
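
For illustration, here is a rough standalone sketch of how the two
methods would behave with poll() in both places. It is not the actual
patch: the Machine class and its launch() helper are hypothetical
stand-ins for QEMUMachine, and it is written for Python 3
(subprocess.DEVNULL), unlike the Python 2 reproducer above.

```python
import subprocess
import sys


class Machine(object):
    """Hypothetical stand-in for QEMUMachine; only two methods matter."""

    def __init__(self):
        self._popen = None

    def launch(self, args):
        # Illustrative helper, not the real QEMUMachine.launch().
        self._popen = subprocess.Popen(args,
                                       stdin=subprocess.DEVNULL,
                                       stdout=subprocess.DEVNULL,
                                       stderr=subprocess.DEVNULL)

    def is_running(self):
        # poll() checks whether the child has terminated and, if so,
        # sets and returns returncode; None means still running.
        return self._popen and (self._popen.poll() is None)

    def exitcode(self):
        if self._popen is None:
            return None
        # poll() rather than .returncode, so the caller gets an
        # up-to-date value even if nothing else has polled yet.
        return self._popen.poll()
```

With this, exitcode() returns None only when nothing was launched or
the child is still alive, so a caller no longer has to remember to call
is_running() (or poll()) first to get a meaningful answer.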