Martin Panter added the comment:

This recently hung AMD64 FreeBSD 9.x 3.5. The stack trace was different, and 
there is only one thread:

http://buildbot.python.org/all/builders/AMD64%20FreeBSD%209.x%203.5/builds/828/steps/test/logs/stdio
[398/398] test_io
Timeout (0:15:00)!
Thread 0x0000000801807400 (most recent call first):
  File "/usr/home/buildbot/python/3.5.koobs-freebsd9/build/Lib/unittest/case.py", line 176 in handle
  File "/usr/home/buildbot/python/3.5.koobs-freebsd9/build/Lib/unittest/case.py", line 727 in assertRaises
  File "/usr/home/buildbot/python/3.5.koobs-freebsd9/build/Lib/test/test_io.py", line 3714 in check_interrupted_write
  File "/usr/home/buildbot/python/3.5.koobs-freebsd9/build/Lib/test/test_io.py", line 3743 in test_interrupted_write_text

Also, x86 Ubuntu Shared 2.7 hung, but the only information I have is it was 
running test_io.

In my FreeBSD case, the write() call is stuck, but in Victor’s original case, 
the background read() call is stuck. I could explain both cases as a race 
condition with the alarm signal being delivered:

Victor’s case: SIGALRM delivered somewhere inside assertRaises(), but before 
the write() system call, and Python has a chance to call the Python signal 
handler. No data is ever written, so the background read() hangs.

My case: SIGALRM delivered just before or as write() starts, so Python has no 
chance to interrupt the system call and call its own Python handler.

I wonder if we can change the test to only deliver a signal after first reading 
one byte, i.e. send the signal directly from the background thread. I think 
that should cover 99% of cases, though in theory it is possible for something 
else to interrupt write(), and our signal to be delivered while write() was 
restarting, causing it to hang. But the combination of those two events would 
be so unlikely it may not be worth worrying about.

Also, if you close the read end of the pipe before the write end, flush() 
should raise EPIPE instead of hanging, which makes the EBADF hack unnecessary.
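
As a minimal sketch of that EPIPE behaviour (not taken from test_io.py; the function name is made up for illustration, and it assumes a Unix platform where Python leaves SIGPIPE ignored, which is the default):

```python
import os


def flush_after_reader_closed():
    """Return True if closing the writer raises BrokenPipeError (EPIPE)."""
    r, w = os.pipe()
    wio = open(w, "wb")        # buffered writer wrapping the write end
    wio.write(b"pending")      # small write, stays in the buffer for now
    os.close(r)                # close the read end first
    try:
        wio.close()            # implicit flush() hits a pipe with no reader
    except BrokenPipeError:
        return True
    return False
```

Because no reader is left, the flush fails immediately rather than blocking on a full pipe.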

I suggest rewriting the background thread like:

def _read():
    s = os.read(r, 1)
    read_results.append(s)
    # The main thread is very likely to be inside the write() syscall now,
    # so interrupt it
    os.kill(os.getpid(), signal.SIGALRM)

and cleaning up the pipe like:

finally:
    os.close(r)
    try:
        wio.close()
    except BrokenPipeError:
        pass
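
Put together, the idea might look roughly like the standalone sketch below. It is not the actual test_io.py code: AlarmInterrupt, on_alarm and demo_interrupted_write are hypothetical names, the 4 MiB payload is just assumed to exceed the pipe capacity, and the signal.alarm(5) call is only a safety net so the sketch cannot hang forever. It must run in the main thread on a Unix platform.

```python
import os
import signal
import threading


class AlarmInterrupt(Exception):
    """Hypothetical exception raised by the SIGALRM handler in this sketch."""


def on_alarm(signum, frame):
    raise AlarmInterrupt


def demo_interrupted_write():
    r, w = os.pipe()
    read_results = []

    def _read():
        # Block until the main thread has written at least one byte.
        s = os.read(r, 1)
        read_results.append(s)
        # The main thread is very likely inside write() now, so interrupt it.
        os.kill(os.getpid(), signal.SIGALRM)

    old_handler = signal.signal(signal.SIGALRM, on_alarm)
    wio = open(w, "wb")
    t = threading.Thread(target=_read, daemon=True)
    interrupted = False
    try:
        t.start()
        signal.alarm(5)  # safety net for this sketch only
        try:
            # Assumed to be larger than the pipe capacity, so write() blocks.
            wio.write(b"x" * (4 * 1024 * 1024))
            wio.flush()
        except AlarmInterrupt:
            interrupted = True
        finally:
            signal.alarm(0)  # cancel the safety-net alarm
        t.join(timeout=5)
    finally:
        signal.signal(signal.SIGALRM, old_handler)
        os.close(r)  # read end first, so a pending flush gets EPIPE
        try:
            wio.close()
        except BrokenPipeError:
            pass
    return interrupted, read_results
```

Calling demo_interrupted_write() should return a true flag plus the single byte the background thread read, provided the signal reaches the main thread while it is blocked in write().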

----------
nosy: +martin.panter
status: closed -> open
versions: +Python 2.7, Python 3.6 -Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue22331>
_______________________________________