Hi Laurent, Floris,

On Mon, Mar 19, 2012 at 16:47 -0700, Brack, Laurent P. wrote:
> I don't want to make this issue overly complicated but here are some 
> scenarios (that we have).

Proper termination and handling of hanging tests is involved, so
discussing it and hearing use cases makes sense to me.

> Our tests, in some instances, interact with python extensions written in C or 
> C++ (or in some case COM inproc server - yuk) which can be multi-threaded.
>  
> If the hang occurs in a critical section in the C/C++ domain I am not sure 
> what the timeout effect would be, but 
> assuming that the test is pre-empted in a critical section (of the 
> extension), we would most likely be doomed anyway (mutexes would not be 
> released). Next test would hang, be aborted and so on. 
> 
> In our case, the safest thing to do is to run with xdist -n 1 so then the OS 
> takes care of doing most if not all the cleanup.
> I am not so intimate with xdist but I was hoping that upon detecting that the 
> "remote process" had died, 
> xdist would restart a new one. 

It's reasonable to expect this from the xdist plugin, even when you run with
"-n N" with N>1.

> In this case pytest ended with:
> 
> [gw0] win32 -- Python 2.7.2 C:\Python27\python.exe
> Slave 'gw0' crashed while running 
> 'src/timeout/test_sample.py::TestTimeOut::()::test_hang'
> 
> Which signaled the end of everything (other tests were not run). This is 
> something we will have to solve internally 
> (and when we have the solution will be more than happy to contribute it 
> back). We might be able to restart a remote 
> process on pytest_testnodedown(). We would be however too late to run any 
> restore() and chances are that those might hang as well.

Indeed, xdist makes no attempt to restart a crashed node. The node remains
offline, and when run with "-n1" everything dies because no node is left to
continue. I am not sure how hard it would be to fix this.
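
For illustration, here is a minimal sketch of the restart idea. It lives
outside of xdist and does not use its API. The WORKER script, the item names
and run_all() are all hypothetical, and the "hang" item simply simulates a
worker that dies. It assumes Python 3.7 or later.

```python
import subprocess
import sys

# Hypothetical worker: handles each item in order and prints its name when
# done. The item named "hang" simulates a crash of the worker process.
WORKER = r"""
import os, sys
for item in sys.argv[1:]:
    if item == "hang":
        os._exit(3)
    print(item, flush=True)
"""


def run_all(items):
    """Run items in a worker process, starting a new worker after a crash."""
    results = {}
    pending = list(items)
    while pending:
        proc = subprocess.run(
            [sys.executable, "-c", WORKER, *pending],
            capture_output=True,
            text=True,
        )
        finished = proc.stdout.split()
        for item in finished:
            results[item] = "passed"
        pending = pending[len(finished):]
        if proc.returncode != 0 and pending:
            # The first unfinished item took the worker down: record it and
            # let a fresh worker continue with the remaining items.
            results[pending.pop(0)] = "crashed"
        elif not finished:
            # Clean exit without progress: stop instead of looping forever.
            break
    return results
```

The point is only that the supervising side knows which item was in flight
when the worker died, so it can report that item and hand the rest to a new
process. A real fix would have to do the equivalent inside xdist.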

> More food for thought than anything at this point. 
> 
> Floris, thanks for your work. We are not yet in a position where we can help 
> much but we will get to it.

Come to think of it, the xdist plugin might also do "timeout" handling
and send a keyboard interrupt from the outside to a running node.
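
As a rough sketch of interrupting from the outside (plain subprocess
handling, not xdist code): the parent waits for a child process, sends it
SIGINT when a deadline passes, and kills it if it still does not exit. The
name run_with_timeout() and the "grace" parameter are made up for this
example. It assumes Python 3 on a POSIX system; SIGINT cannot be sent this
way on Windows.

```python
import signal
import subprocess


def run_with_timeout(cmd, timeout, grace=5.0):
    """Run cmd and interrupt it if it takes longer than timeout seconds.

    Returns a (returncode, timed_out) tuple.
    """
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout), False
    except subprocess.TimeoutExpired:
        pass
    # Same effect as Ctrl-C: a Python child normally sees KeyboardInterrupt
    # and gets a chance to clean up.
    proc.send_signal(signal.SIGINT)
    try:
        return proc.wait(timeout=grace), True
    except subprocess.TimeoutExpired:
        # The child did not react (for example, stuck in C code): kill it.
        proc.kill()
        return proc.wait(), True
```

A child that hangs inside an extension module may never see the interrupt,
which is why the hard kill remains as the last step.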

> /Laurent
> 
> P.S.: Speaking of xdist, further down the road we are interested in extending 
> xdist (via its hooks) to interface it with OpenStack (dynamic provisioning of 
> VM). Anyone has any use for this or thinking about the same thing?

I am interested in OpenStack, but can you detail a bit more what
you want to achieve?

best,
holger

> 
> -----Original Message-----
> From: floris.bruynoo...@gmail.com [mailto:floris.bruynoo...@gmail.com] On 
> Behalf Of Floris Bruynooghe
> Sent: Monday, March 19, 2012 3:21 PM
> To: Brack, Laurent P.
> Cc: holger krekel; py-dev@codespeak.net; Brard, Valentin
> Subject: Re: [py-dev] pytest-timeout 0.2
> 
> On 19 March 2012 16:38, Brack, Laurent P. <lpb...@dolby.com> wrote:
> > It would also be nice to have a consistent behavior between sigalarm 
> > on/off. For instance, on Windows, pytest exits on first hang as opposed to 
> > *nix where the test is pre-empted and pytest moves on to the next one.
> 
> I was writing up a response saying I didn't know how to do so, but while 
> doing so I thought it might be possible to emulate sigalrm with a timer 
> thread which fires another signal.  I might investigate the feasibility of 
> such an approach.  I do worry about gratuitously using signals, however; the 
> simple timer thread with os._exit() may always have to stay as the fail-safe 
> option.
> 
> If anyone has any other ideas about this please let me know.
> 
> Regards,
> Floris
> 
> 
> --
> Debian GNU/Linux -- The Power of Freedom www.debian.org | www.gnu.org | 
> www.kernel.org
_______________________________________________
py-dev mailing list
py-dev@codespeak.net
http://codespeak.net/mailman/listinfo/py-dev
