[issue29988] (async) with blocks and try/finally are not as KeyboardInterrupt-safe as one might like

2017-09-07 Thread Nathaniel Smith

Nathaniel Smith added the comment:

To make sure I understand correctly: your concern is that the event loop is not 
implemented in C. So if you have this patch + an async CM that's implemented in 
C + the async CM never *actually* yields to the event loop, then that will be 
signal safe... but that last restriction is pretty silly, because what's the 
point of an async CM that never yields to the event loop? Right?

I see what you mean, but I think this patch is still useful even if it doesn't 
protect the event loop itself. To get the full benefit, you also need some way 
to protect your event loop from unexpected KeyboardInterrupt, but that's 
something Trio already solves and asyncio potentially could. And we knew from 
the start that to get the *most* benefit from this patch we would also need 
some way to protect arbitrary chunks of Python code from KeyboardInterrupt so 
you could protect the __(a)exit__ method itself; needing to protect the loop 
too isn't a huge stretch beyond that.

(It's possible that enough of uvloop is implemented in C to benefit from this?)

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29988>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29988] (async) with blocks and try/finally are not as KeyboardInterrupt-safe as one might like

2017-09-06 Thread Nathaniel Smith

Nathaniel Smith added the comment:

For purposes of writing a test case, can you install a custom Python-level 
signal handler, and make some assertion about where it runs? I.e., the calling 
frame should be inside the __aexit__ body, not anywhere earlier?
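A minimal sketch of that test idea (the handler records the name of the frame it interrupted; `fake_aexit` is a hypothetical stand-in for a real `__aexit__` body):

```python
import signal

observed = []

def handler(signum, frame):
    # `frame` is the frame that was executing when the Python-level
    # handler finally ran -- this is what the test would assert on.
    observed.append(frame.f_code.co_name)

signal.signal(signal.SIGINT, handler)

def fake_aexit():
    # Hypothetical stand-in for an __aexit__ body.
    signal.raise_signal(signal.SIGINT)  # C-level handler trips the flag...
    # ...and the Python-level handler runs at the next eval-loop check,
    # i.e. still inside this frame.

fake_aexit()
print(observed)
```

A real test would drive an async CM and assert the handler frame is the `__aexit__` body rather than any earlier frame.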

--




[issue30437] SSL_shutdown can return meaningless SSL_ERROR_SYSCALL

2017-09-06 Thread Nathaniel Smith

Nathaniel Smith added the comment:

My reading of the man page is that if SSL_shutdown returns 0, this means that 
it succeeded at doing the first phase of shutdown. If there are errors then 
they should be ignored, because it actually succeeded.

If you want to then complete the second phase of shutdown, of course, you have 
to call it again, but that's no different than any other use of SSL_shutdown.

If two calls to SSL_shutdown both return zero, then that's definitely a bug in 
OpenSSL. A return value of zero means that previously the SSL_SENT_SHUTDOWN 
flag was not set, and now it is set, so that can only happen once per 
connection. But that's orthogonal to the SSL_ERROR_SYSCALL issue.

--




[issue31344] f_trace_opcodes frame attribute to switch to per-opcode tracing

2017-09-05 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Adding Ned to CC in case he wants to comment on the utility of per-opcode 
tracing from the perspective of coverage.py.
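For reference, a minimal sketch of what per-opcode tracing looks like with the proposed `f_trace_opcodes` frame attribute (assuming it behaves as described in this issue: opt-in per frame, emitting an 'opcode' trace event per instruction):

```python
import sys

opcode_events = 0

def tracer(frame, event, arg):
    global opcode_events
    if event == "call":
        # Per-opcode events are opt-in, per frame.
        frame.f_trace_opcodes = True
    elif event == "opcode":
        opcode_events += 1
    return tracer

def add(a, b):
    return a + b

sys.settrace(tracer)
add(1, 2)
sys.settrace(None)
print(opcode_events)  # one event per bytecode instruction executed
```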

--
nosy: +nedbat, njs




[issue29302] add contextlib.AsyncExitStack

2017-09-05 Thread Nathaniel Smith

Changes by Nathaniel Smith <n...@pobox.com>:


--
nosy: +njs




[issue29988] (async) with blocks and try/finally are not as KeyboardInterrupt-safe as one might like

2017-09-04 Thread Nathaniel Smith

Nathaniel Smith added the comment:

In bpo-30703 Antoine fixed signal handling so it doesn't use Py_AddPendingCall 
anymore. At this point the only time the interpreter itself uses 
Py_AddPendingCall is when there's an error writing to the signal_fd, which 
should never happen in normal usage.

If there are third-party libraries making heavy use of Py_AddPendingCall then 
it could be an issue, but any such libraries should already be making sure they 
never have more than one Py_AddPendingCall pending at a time, because you 
already have to assume that the processing might be arbitrarily delayed.

--




[issue29988] (async) with blocks and try/finally are not as KeyboardInterrupt-safe as one might like

2017-09-04 Thread Nathaniel Smith

Nathaniel Smith added the comment:

This would still provide value even if you have to do a GET_AWAITABLE in the 
protected region: the most common case is that __aenter__ is a coroutine 
function, which means that its __await__ is implemented in C and already 
protected against interrupts.

Also, to make this fully useful we need some way to protect arbitrary callables 
against interrupts anyway (to handle the case where the signal arrives during 
the actual __enter__ or __exit__ body), so saying that we also need to be able 
to protect __await__ isn't a big deal.
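A quick way to see the first point (a CPython-specific sketch, not from the original comment): a coroutine object's `__await__` is exposed as a C-level slot wrapper, so entering the await protocol runs no Python code of its own.

```python
async def coro_fn():
    return 42

coro = coro_fn()
# In CPython, __await__ on a coroutine object is a C "method-wrapper",
# so no Python frame is entered when the await protocol begins.
print(type(coro.__await__).__name__)
coro.close()  # suppress the "never awaited" warning
```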

--




[issue11410] Use GCC visibility attrs in PyAPI_*

2017-09-01 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Was this actually fixed, or did everyone just get tired and give up on the 
original patch?

--
nosy: +njs




[issue31198] getaddrinfo: inconsistent handling of port=None

2017-08-14 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Ugh, apparently this weird behavior is actually mandated by the RFC :-(.

RFC 3493:

   The nodename and servname arguments are either null pointers or
   pointers to null-terminated strings.  One or both of these two
   arguments must be a non-null pointer.

So... never mind!
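A small demonstration of the mandated behavior (this sketch assumes a getaddrinfo(3) implementation that follows the RFC, e.g. glibc):

```python
import socket

# Per RFC 3493, at least one of nodename/servname must be non-null, so
# each argument may be None on its own but not both at once.
infos = socket.getaddrinfo(None, 0)  # host=NULL alone: resolves to loopback
assert infos, "expected at least one address tuple"

try:
    socket.getaddrinfo(None, None)   # both NULL: mandated to fail
    raised = False
except socket.gaierror:
    raised = True
print(raised)
```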

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed




[issue31198] getaddrinfo: inconsistent handling of port=None

2017-08-14 Thread Nathaniel Smith

Changes by Nathaniel Smith <n...@pobox.com>:


--
nosy: +giampaolo.rodola




[issue31198] getaddrinfo: inconsistent handling of port=None

2017-08-14 Thread Nathaniel Smith

New submission from Nathaniel Smith:

socket.getaddrinfo accepts None as a port argument, and translates it into 0. 
This is handy, because bind() understands 0 to mean "pick a port for me", and 
port=None is a slightly more obvious way to say that than port=0.

For example:

>>> socket.getaddrinfo("127.0.0.1", None)
[(<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('127.0.0.1', 0)),
 (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_DGRAM: 2>, 17, '', ('127.0.0.1', 0)),
 (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_RAW: 3>, 0, '', ('127.0.0.1', 0))]
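The port=0 convenience can be seen directly (a small sketch, not from the original report):

```python
import socket

# port=0 asks the OS for an ephemeral port, so port=None behaves
# identically to port=0 after getaddrinfo translates it.
with socket.socket() as s:
    s.bind(("127.0.0.1", 0))
    port = s.getsockname()[1]
print(port)  # an OS-chosen ephemeral port, never 0

assert socket.getaddrinfo("127.0.0.1", None) == socket.getaddrinfo("127.0.0.1", 0)
```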

socket.getaddrinfo also accepts None as a host name; this is necessary because 
the underlying getaddrinfo(3) call has special handling for host=NULL, and we 
need some way to access it:

>>> socket.getaddrinfo(None, 0)
[(<AddressFamily.AF_INET6: 10>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('::1', 0, 0, 0)),
 (<AddressFamily.AF_INET6: 10>, <SocketKind.SOCK_DGRAM: 2>, 17, '', ('::1', 0, 0, 0)),
 (<AddressFamily.AF_INET6: 10>, <SocketKind.SOCK_RAW: 3>, 0, '', ('::1', 0, 0, 0)),
 (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('127.0.0.1', 0)),
 (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_DGRAM: 2>, 17, '', ('127.0.0.1', 0)),
 (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_RAW: 3>, 0, '', ('127.0.0.1', 0))]

However, even though both of these features are supported separately... if you 
try to use them *together*, then socket.getaddrinfo errors out:

>>> socket.getaddrinfo(None, None)
socket.gaierror: [Errno -2] Name or service not known

I expected that last call to be equivalent to socket.getaddrinfo(None, 0).

--
components: Extension Modules
messages: 300236
nosy: njs
priority: normal
severity: normal
status: open
title: getaddrinfo: inconsistent handling of port=None
versions: Python 3.7




[issue28414] SSL match_hostname fails for internationalized domain names

2017-08-06 Thread Nathaniel Smith

Changes by Nathaniel Smith <n...@pobox.com>:


--
pull_requests: +3043




[issue28414] SSL match_hostname fails for internationalized domain names

2017-08-06 Thread Nathaniel Smith

Nathaniel Smith added the comment:

> I haven't dug in deeply, but it sounds like we handle IDNs in CNs and SANs 
> differently?

No -- Python's ssl module uses exactly the same hostname checking logic in both 
cases, and it's equally broken regardless. But, since CAs do all kinds of weird 
stuff with CNs, there's some chance that our brokenness and their brokenness 
will align and make things work by chance. Specifically, this will happen if 
the CA puts the U-encoded name in the CN field. Nick Lamb's concern is that CAs 
may be using this as justification for continuing to issue certs that are 
broken in this way. I don't know if that's true, but it's possible.

> one solution would be to simply drop support for CNs in match_hostname

That would indeed fix Nick Lamb's concern, but I'm dubious about this word 
"simply" :-). Obviously we should do this eventually, but it's going to break a 
bunch of people, you'll have to have a big fight about Python 2 and Redhat will 
probably refuse to take the patch and etc etc. OTOH fixing match_hostname to 
use A-labels would provide immediate benefits to Python's users (right now 
Python just... can't do SSL connections to IDNs) with minimal breakage, so you 
can call it a bug fix, and then worry about deprecating the CN field on its own 
schedule.

--




[issue31119] Signal tripped flags need memory barriers

2017-08-04 Thread Nathaniel Smith

Nathaniel Smith added the comment:

@arigo: Technically we also need that the writes to memory are observed to 
happen-before the write() call on the wakeup fd, which is not something that 
Intel is going to make any guarantees about. But *probably* this is also safe 
because the kernel has to use some kind of cross-CPU synchronization to wake up 
the read()ing thread, and that imposes a memory barrier.

--




[issue31119] Signal tripped flags need memory barriers

2017-08-04 Thread Nathaniel Smith

Nathaniel Smith added the comment:

On further investigation (= a few hours staring at the ceiling last night), it 
looks like there's another explanation for my particular bug... which is good, 
because on further investigation (= a few hours squinting at google results) it 
looks like this probably can't happen on x86/x86-64 (I think? maybe?).

It's still true though that you can't just throw in a 'volatile' and expect 
cross-thread synchronization to work -- that's not C's semantics, and it's not 
true on popular architectures like ARM.

--




[issue31119] Signal tripped flags need memory barriers

2017-08-04 Thread Nathaniel Smith

New submission from Nathaniel Smith:

Sometimes, CPython's signal handler (signalmodule.c:signal_handler) runs in a 
different thread from the main thread. On Unix this is rare but it does happen 
(esp. for SIGCHLD), and on Windows it happens 100% of the time. It turns out 
that there is a subtle race condition that means such signals can be lost.

To notify the main thread that a signal has been received, and which signal it 
is, CPython's signal handler sets two global variables:

  Handlers[sig_num].tripped = 1;
  is_tripped = 1;

And then PyErr_CheckSignals does:

  if (is_tripped) {
      is_tripped = 0;
      for (i = 1; i < NSIG; i++) {
          if (Handlers[i].tripped) {
              ... run signal handler ...
          }
      }
  }

As some comments in the source note, this logic depends strongly on the 
assumption that when the signal handler sets the two flags, they are 
immediately visible to PyErr_CheckSignals, and that the two assignments happen 
in the given order. For example, suppose that is_tripped were set before 
Handlers[i].tripped. Then we could have the following sequence:

1. Thread A sets is_tripped = 1
2. Thread B runs PyErr_CheckSignals
3. Thread B notices is_tripped == 1, and sets it to 0
4. Thread B scans through all the Handlers, sees that none are set, and carries 
on
5. Thread A sets Handlers[i].tripped = 1

In this case the signal is lost until another signal arrives, which may not 
happen until an arbitrary amount of time has passed.

Also, the fix for bpo-30038 (making sure that we set the flags before writing 
to the wakeup fd) assumes that setting the flags before writing to the wakeup 
fd means that, well, the flags will be written before the fd.

However, when you have code running in separate threads, none of this is 
actually guaranteed. In general, the compiler is free to re-order writes, and 
more perniciously, the hardware memory hierarchy is also free to arbitrarily 
delay when writes to RAM made by one CPU become visible to another CPU, which 
can also create arbitrary reorderings.

In an attempt to avoid these issues, signalmodule.c declares the signal flags 
as 'volatile sig_atomic_t'. However, the best one can hope for from this is 
that it'll stop the *compiler* from reordering writes. It does nothing to stop 
the *hardware* from reordering writes. Using volatile like this might be 
sufficient if the signal handler runs in the main thread, but not if it runs in 
another thread.

The correct way to handle this is to use the primitives in pyatomic.h to 
get/set the tripped flags.



I noticed this because of the same test case that was failing due to 
bpo-30038... this thing has been frustrating me for months because it's 
intermittent and never seems to happen on my local VM. But specifically what 
I'm seeing at this point is:

Thread A (the thread where the signal handler runs):
A1. takes a lock.
A2. with the lock held, calls the C function raise(SIGINT), which directly 
calls the CPython C-level signal handler. (Verified by consulting the Windows C 
runtime source.)
A3. writes to the wakeup fd & sets both tripped flags, in whatever order
A4. releases the lock

Simultaneously, in thread B (the main thread):
B1. observes that the wakeup fd has been written to
B2. takes the lock
B3. releases the lock
B4. calls repr(), which unconditionally calls PyObject_Repr, which 
unconditionally calls PyErr_CheckSignals

I expect the call in B4 to run the Python-level signal handler, but this does 
not happen. Instead it runs some time later, causing an unexpected 
KeyboardInterrupt.

My reasoning is: the lock guarantees that no matter what the internal details 
of the C-level signal handler actually are, they have to all be complete before 
my code checks for signals. In excruciating detail:

- the write to the wakeup fd must happen-after the lock is acquired (A1)
- therefore, step B1 happens-after step A1
- B2 happens-after B1, so B2 happens-after A1.
- B2 takes the same lock as A1, so if B2 happens-after A1, it must also 
happen-after A4, when the lock is released.
- step B4 happens-after B2, therefore it happens-after A4 and A3. Therefore the 
tripped flags should be visible and the Python-level signal handler should run.

Yet this does not happen, so *something* really weird is going on. The 
hypothesis that it's the lack of memory barriers both explains how this could 
happen -- specifically I think thread B may be running PyErr_CheckSignals as 
part of the wakeup-fd reading logic, and it's clearing is_tripped before 
Handlers[i].tripped becomes visible, like in my code above. And it also 
explains why it happens in like 1/50 runs in Appveyor, but never on my local 
Windows 10 VM that only has 1 virtual CPU.

But regardless of whether this is the cause of my particular weird bug, it 
clearly *is* a bug and should be fixed.

--
messages: 299736
nosy: haypo, njs
priority: normal
severity: normal
status: open
title: Signal tripped flags need memory barriers

[issue14243] tempfile.NamedTemporaryFile not particularly useful on Windows

2017-08-01 Thread Nathaniel Smith

Changes by Nathaniel Smith <n...@pobox.com>:


--
nosy: +njs




[issue16487] Allow ssl certificates to be specified from memory rather than files.

2017-07-13 Thread Nathaniel Smith

Changes by Nathaniel Smith <n...@pobox.com>:


--
nosy: +njs




[issue29926] IDLE: in shell, time.sleep ignores _thread.interrupt_main()

2017-06-29 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Then maybe the simplest solution is to scale back the claim :-).

The important semantic change would be that right now, interrupt_main() is 
documented to cause KeyboardInterrupt to be raised in the main thread. So if 
you register a custom SIGINT handler that *doesn't* raise KeyboardInterrupt, 
interrupt_main() doesn't care: it doesn't invoke your custom handler, it just 
raises KeyboardInterrupt. My proposal would make it so that interrupt_main() 
does cause custom SIGINT handlers to be called, as well as triggering all of 
the interpreter's existing machinery for prompt signal delivery (waking up 
time.sleep etc.), so in that respect it would be much more similar to a "real" 
control-C than it is now. This is a non-trivial semantic change, so it would 
need to be documented, but I'm not attached to the "equivalent to control-C" 
phrasing.

Even on UNIX, a "real" control-C is delivered to all processes in the 
foreground process group (or something like that, ttys are weird), but we would 
want interrupt_main() to only be delivered to the current process, so the 
equivalence wouldn't be exact there either.

> That operation isn't buggy on its own to my knowledge

I spent quite some time trying to make this work in order to test control-C 
handling in trio, and eventually gave up because I just could not make it work 
reliably. One notable effect is that if you call 
GenerateConsoleCtrlEvent(CTRL_C_EVENT, 0) in an appveyor test run, then it 
breaks appveyor -- at the end of the run it freezes with "Terminate batch job 
(Y/N)?" and then sits until the test run times out. It's possible to work 
around this with some extremely obscure code (make sure you always spawn a 
subprocess with CREATE_NEW_CONSOLE or maybe CREATE_NEW_CONSOLE_GROUP etc.), but 
(a) we can't tell people that they need to do that before running code that 
uses interrupt_main(), and (b) even when I did that, then I still had bizarre 
issues I couldn't figure out, where sometimes the event just wouldn't be 
delivered unless I ran the interpreter with the -u option for unbuffered stdio 
(?!?!) (see [1]).

I just don't think this is something that CPython can use.

I eventually switched to simulating control-C events for testing by calling 
raise(SIGINT), and that's worked much much better.
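The same raise(SIGINT)-style simulation is available from pure Python via `signal.raise_signal()` (added in 3.8); a small sketch:

```python
import signal

calls = []

def handler(signum, frame):
    calls.append(signum)

old = signal.signal(signal.SIGINT, handler)
try:
    # In-process, synchronous delivery through the normal signal
    # machinery -- the custom handler runs reliably, unlike
    # GenerateConsoleCtrlEvent-based approaches.
    signal.raise_signal(signal.SIGINT)
finally:
    signal.signal(signal.SIGINT, old)

print(calls)
```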

> I mentioned the possibility of calling CancelSynchronousIo in order to cancel 
> a console read like Ctrl+C does (but clunkier) -- again, because of the 
> equivalence claim. ERROR_OPERATION_ABORTED would need to be handled like 
> EINTR on Unix. This would entail small changes to a lot of code, so it does 
> need a separate issue if there's any support for this idea.

It still wouldn't be equivalent because control-C only cancels console reads, 
but CancelSynchronousIo would cancel *any* blocking I/O operation that the main 
thread was doing, right? So e.g. if the main thread was blocked doing 
socket.recv, then control-C wouldn't interrupt that, but interrupt_main() / 
PyErr_SetInterrupt() would. I can certainly see an argument that Python's 
C-level signal handler *should* call CancelSynchronousIo in general, and then 
raise(SIGINT) and control-C would be equivalent because they both called 
CancelSynchronousIo, but yeah, that's way beyond the scope of this issue.

[1] 
https://github.com/python-trio/trio/commit/95843654173e3e826c34d70a90b369ba6edf2c23#diff-345cfb6c136028f9514b67ee7bb8e035R11

--




[issue30038] Race condition in how trip_signal writes to wakeup fd

2017-06-29 Thread Nathaniel Smith

Changes by Nathaniel Smith <n...@pobox.com>:


--
resolution:  -> fixed
stage:  -> resolved
status: open -> closed




[issue29926] IDLE: in shell, time.sleep ignores _thread.interrupt_main()

2017-06-29 Thread Nathaniel Smith

Nathaniel Smith added the comment:

> A real Ctrl+C executes the registered control handlers for the process.

Right, but it's *extremely* unusual for a Python 3 program to have a control 
handler registered directly via SetConsoleCtrlHandler. This isn't an API that 
the interpreter uses or exposes. The vast majority of programs receive 
control-C notification by letting the C runtime convert the low level console 
event into a SIGINT, and then receiving this via the signal module.

> To emulate this, PyErr_SetInterrupt could try calling 
> GenerateConsoleCtrlEvent(CTRL_C_EVENT, 0) to broadcast a Ctrl+C event. 

But as mentioned up thread, this is really flaky - and even when it works it 
tends to kill random other processes, which is *certainly* not what anyone 
expects from calling interrupt_main. You can't really call it experimentally – 
even trying to call it can do stuff like cause appveyor tests to lock up.

Given these two issues, I think that emulating control-C at the signal level 
makes the best tradeoffs. Something that works 100% reliably for 99.99% of 
Python programs is way better than something that is 80% reliable for 100% of 
programs.

> One problem is that GenerateConsoleCtrlEvent(CTRL_C_EVENT, 0) doesn't cancel 
> a blocking console read like Ctrl+C does. Python's C handler could call 
> CancelSynchronousIo(hMainThread) to address this problem in general. 
> Unfortunately, when a console read is canceled in the client, it isn't 
> canceled in the console itself. The contents of the read will be discarded, 
> but it's a bit clunky that the user has to press enter.

This might be something to address as a separate issue? I'm guessing this 
doesn't affect idle, and it doesn't affect time.sleep, which seem to be the 
main topics of this thread, plus it sounds like any solution would be mostly 
orthogonal.

--




[issue30744] Local variable assignment is broken when combined with threads + tracing + closures

2017-06-28 Thread Nathaniel Smith

Nathaniel Smith added the comment:

> I like it because it categorically eliminates the "tracing or not?" global 
> state dependence when it comes to manipulation of the return value of 
> locals() - manipulating that will either always affect the original execution 
> namespace immediately (modules, classes, exec, eval), or always be a 
> completely independent snapshot that can never lead to changes in the 
> original name bindings (functions, generators, coroutines).

Maybe I was unclear...? my question is why you prefer locals-making-a-copy over 
locals-and-f_locals-for-function-frames-return-proxy-objects. It seems to me 
that both of these proposals eliminate the "tracing or not?" global state 
dependence (right?), so this can't be a reason to prefer one over the other. 
And the latter additionally eliminates the distinction between 
modules/classes/exec/eval and functions/generators/coroutines, while also 
avoiding introducing a distinction between locals() and f_locals.

--




[issue29926] IDLE: in shell, time.sleep ignores _thread.interrupt_main()

2017-06-28 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Sorry, I meant bpo-21895.

--




[issue29926] IDLE: in shell, time.sleep ignores _thread.interrupt_main()

2017-06-28 Thread Nathaniel Smith

Nathaniel Smith added the comment:

In terms of general design cleanliness, I think it would be better to make 
`interrupt_main` work reliably than to have IDLE add workarounds for its 
unreliability.

AFAICT the ideal, minimal redundancy solution would be:

- interrupt_main just calls raise(SIGINT)
- bpo-21895 is fixed by adding a few lines to Python's C-level signal handler 
on Unix-likes to check if it's running in the main thread, and if not then use 
pthread_kill to redirect the signal to the main thread.

If trying to fix bpo-28195 is too controversial, then a minimal change would be:

- in interrupt_main, if we're on Windows, call raise(SIGINT); otherwise, call 
pthread_kill(main_thread, SIGINT)

And in either case we'd also have to document that interrupt_main is now 
equivalent to the user having hit control-C. But AFAICT that's what all its 
users really want anyway.

--




[issue30744] Local variable assignment is broken when combined with threads + tracing + closures

2017-06-27 Thread Nathaniel Smith

Nathaniel Smith added the comment:

> Folks that actually *wanted* the old behaviour would then need to do either 
> "sys._getframe().f_locals" or "inspect.currentframe().f_locals".

So by making locals() and f_locals have different semantics, we'd be adding yet 
another user-visible special-case? That seems unfortunate to me.

> if you want to write access to a function namespace from outside the 
> function, you need to either implement an eval hook (not just a tracing hook)
[...]
> or else a decision to disallow write-backs to frame locals even from tracing 
> functions in 3.7+.

Disallowing writeback from tracing functions would completely break bdb/pdb, so 
unless you're planning to rewrite bdb in C as an eval hook, then I don't think 
this is going to happen :-). Writing back to locals is a useful and important 
feature!

I think I'm missing some rationale here for why you prefer this approach – it 
seems much more complicated in terms of user-visible semantics, and possibly 
implementation-wise as well.

--




[issue30744] Local variable assignment is broken when combined with threads + tracing + closures

2017-06-25 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Interesting idea! I'm not sure I fully understand how it would work though.

What would you do for the frames that don't use the fast array, and where 
locals() currently returns the "real" namespace?

How are you imagining that the trace function writeback would be implemented? 
Some sort of thread-local flag saying "we're inside a trace function for frame 
XX" that causes locals() and f_locals to switch to returning a "real" namespace 
object?

--




[issue30744] Local variable assignment is broken when combined with threads + tracing + closures

2017-06-25 Thread Nathaniel Smith

Nathaniel Smith added the comment:

It isn't obvious to me whether the write-through proxy idea is a good one on 
net, but here's the rationale for why it might be.

Currently, the user-visible semantics of locals() and f_locals are a bit 
complicated. AFAIK they aren't documented anywhere outside the CPython and PyPy 
source (PyPy is careful to match all these details), but let me take a stab at 
it:

The mapping object you get from locals()/f_locals represents the relevant 
frame's local namespace. For many frames (e.g. module level, class level, 
anything at the REPL), it literally *is* the local namespace: changes made by 
executing bytecode are immediately represented in it, and changes made to it 
are immediately visible to executing bytecode. Except, for function frames, it 
acts more like a snapshot copy of the local namespace: it shows you the 
namespace at the moment you call locals(), but then future changes to either 
the code's namespace or the object don't affect each other. Except except, the 
snapshot might be automatically updated later to incorporate namespace changes, 
e.g. if I do 'ns = locals()' and then later on someone accesses the frame's 
f_locals attribute, then reading that attribute will cause my 'ns' object to be 
silently updated. But it's still a snapshot; modifications to the mapping 
aren't visible to the executing frame. Except**3, if you happen to modify the 
mapping object while you're inside a trace function callback, then *those* 
modifications are visible to the executing frame. (And also if a function is 
being traced then as a side-effect this means that now our 'ns' object above 
does stay constantly up to date.) Except**4, you don't actually have to be 
inside a trace function callback for your modifications to be visible to the 
executing frame – all that's necessary is that *some* thread somewhere is 
currently inside a trace callback (even if it doesn't modify or even look at 
the locals itself, as e.g. coverage.py doesn't).

This causes a lot of confusion [1].

On top of that, we have this bug here. The writeback-only-if-changed idea would 
make it so that we at least correctly implement the semantics I described in 
the long paragraph above. But I wonder if maybe we should consider this an 
opportunity to fix the underlying problem, which is that allowing skew between 
locals() and the actual execution namespace is this ongoing factory for bugs 
and despair. Specifically, I'm wondering if we could make the semantics be:

"locals() and f_locals return a dict-like object representing the local 
namespace of the given frame. Modifying this object and modifying the 
corresponding local variables are equivalent operations in all cases."

(So I guess this would mean a proxy object that on reads checks the fast array 
first and then falls back to the dict, and on writes updates the fast array as 
well as the dict.)
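A minimal sketch of such a proxy, with the fast array modeled as a plain dict (the real thing would of course live in C and index the fast array by slot):

```python
from collections.abc import MutableMapping

class WriteThroughLocals(MutableMapping):
    """Sketch only: 'fast' stands in for the frame's fast array,
    'backing' for the frame's f_locals dict."""

    def __init__(self, fast, backing):
        self.fast = fast
        self.backing = backing

    def __getitem__(self, key):
        # reads check the fast array first, then fall back to the dict
        if key in self.fast:
            return self.fast[key]
        return self.backing[key]

    def __setitem__(self, key, value):
        # writes update the fast array as well as the dict
        if key in self.fast:
            self.fast[key] = value
        self.backing[key] = value

    def __delitem__(self, key):
        self.fast.pop(key, None)
        del self.backing[key]

    def __iter__(self):
        return iter(self.backing.keys() | self.fast.keys())

    def __len__(self):
        return len(self.backing.keys() | self.fast.keys())

fast = {"x": 1}                 # "x" lives in the fast array
backing = {"x": 0, "y": 2}      # stale "x", plus a dict-only name
ns = WriteThroughLocals(fast, backing)
assert ns["x"] == 1 and ns["y"] == 2
ns["x"] = 5                     # the write lands in both places
assert fast["x"] == 5 and backing["x"] == 5
```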

> you can still have race conditions between "read-update-writeback" operations 
> that affect the cells directly, as well as with those that use the new 
> write-through proxy.

Sure, but that's just a standard concurrent access problem, no different from 
any other case where you have two different threads trying to mutate the same 
local variable or dictionary key at the same time. Everyone who uses threads 
knows that if you want to do that then you need a mutex, and if you don't use 
proper locking then it's widely understood how to recognize and debug the 
resulting failure modes. OTOH, the current situation where modifications to the 
locals object sometimes affect the namespace, and sometimes not, and sometimes 
they get overwritten, and sometimes they don't, and it sometimes depends on 
spooky unrelated things like "is some other thread currently being traced"? 
That's *way* more confusing than figuring out that there might be a race 
condition between 'x = 1' and 'locals()["x"] = 2'.

Plus, pdb depends on write-through working, and there are lots of frames that 
don't even use the fast array and already have my proposed semantics. So 
realistically our choices are either "consistently write-through" or 
"inconsistently write-through".

[1] https://www.google.com/search?q=python+modify+locals&ie=utf-8&oe=utf-8

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30744>
___



[issue30744] Local variable assignment is broken when combined with threads + tracing + closures

2017-06-24 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Some thoughts based on discussion with Armin in #pypy:

It turns out if you simply delete the LocalsToFast and FastToLocals calls in 
call_trampoline, then the test suite still passes. I'm pretty sure that pdb 
relies on this as a way to set local variables, though, so I think this is 
probably more of a bug in the test suite than anything else.

In some ways, it seems like the most attractive fix to this would be to have 
the Python-level locals() and frame.f_locals return a dict proxy object whose 
__setitem__ writes any mutations back to both the fast array and the real 
frame->f_locals object, so that LocalsToFast becomes unnecessary. As Armin 
points out, though, that would definitely be a semantic change: there may be 
code out there that relies on locals() returning a dict – technically it 
doesn't have to return a dict even now, but it almost always does – or that 
assumes it can mutate the locals() return value without that affecting the function. I 
think this would arguably be a *good* thing; it would make the behavior of 
locals() and f_locals much less confusing in general. But he's right that it's 
something we can only consider for 3.7.

The only more-compatible version I can think of is: before calling the trace 
function, do FastToLocals, and also make a "shadow copy" of all the cellvars 
and freevars. Then after the trace function returns, when copying back from 
LocalsToFast, check against these shadow copies, and only write back 
cellvars/freevars that the trace function actually modified (where by modified 
we mean 'old is not new', just a pointer comparison). Since most functions have 
no cellvars or freevars, and since we would only need to do this when there's a 
Python-level trace function active, this shouldn't add much overhead. And it 
should be compatible enough to backport even to 2.7, I think.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30744>
___



[issue30744] Local variable assignment is broken when combined with threads + tracing + closures

2017-06-23 Thread Nathaniel Smith

New submission from Nathaniel Smith:

The attached script looks innocent, but gives wildly incorrect results on all 
versions of CPython I've tested.

It does two things:

- spawns a thread which just loops, doing nothing

- in the main thread, repeatedly increments a variable 'x'

And most of the increments of the variable are lost!

This requires two key things I haven't mentioned, but that you wouldn't expect 
to change anything. First, the thread function closes over the local variable 
'x' (though it doesn't touch this variable in any way). And second, the thread 
function has a Python-level trace function registered (but this trace function 
is a no-op).

Here's what's going on:

When you use sys.settrace() to install a Python-level trace function, it 
installs the C-level trace function "call_trampoline". And then whenever a 
trace event happens, call_trampoline calls PyFrame_FastToLocalsWithError, then 
calls the Python-level trace function, then calls PyFrame_LocalsToFast (see: 
https://github.com/python/cpython/blob/master/Python/sysmodule.c#L384-L395). 
This save/call/restore sequence is safe if all the variables being 
saved/restored are only visible to the current thread, which used to be true 
back in the days when local variables were really local. But it turns out 
nowadays (since, uh... python 2.1), local variables can be closed over, and 
thus can be visible from multiple threads simultaneously.

Which means we get the following sequence of events:

- In thread A, a trace event occurs (e.g., executing a line of code)

- In thread A, call_trampoline does PyFrame_FastToLocalsWithError, which saves 
a snapshot of the current value of 'x'

- In thread A, call_trampoline then starts calling the trace function

- In thread B, we increment 'x'

- In thread A, the trace function returns

- In thread A, call_trampoline then does PyFrame_LocalsToFast, which restores 
the old value of 'x', overwriting thread B's modifications

This means that merely installing a Python-level trace function – for example, 
by running with 'python -m trace' or under pdb – can cause otherwise correct 
code to give wrong answers, in an incredibly obscure fashion.
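The attached script isn't reproduced here, but a rough reconstruction of the ingredients looks like this (on interpreters with the bug, the surviving value of x can come out far below the number of increments performed; fixed interpreters keep the two equal, so the final assertion is deliberately weak):

```python
import sys
import threading
import time

def main():
    x = 0

    def tracer(frame, event, arg):
        return tracer  # a no-op Python-level trace function

    def spin():
        x  # merely closing over x turns it into a shared cell variable
        sys.settrace(tracer)
        sys._getframe().f_trace = tracer  # trace line events in this frame too
        end = time.time() + 0.5
        while time.time() < end:
            pass  # just loop, generating trace events in this thread
        sys.settrace(None)

    t = threading.Thread(target=spin)
    t.start()
    count = 0
    while t.is_alive():
        x += 1  # on buggy interpreters, LocalsToFast can undo these
        count += 1
    t.join()
    return x, count

increments_kept, increments_made = main()
```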

In real life, I originally ran into this in a substantially more complicated 
situation involving PyPy and coverage.py and a nonlocal variable that was 
protected by a mutex, which you can read about here: 
https://bitbucket.org/pypy/pypy/issues/2591/
It turns out that since this only affects *Python*-level trace functions, 
merely recording coverage information using coverage.py doesn't normally 
trigger it, since on CPython coverage.py tries to install an accelerator module 
that uses a *C*-level trace function. Which is lucky for my particular case. 
But probably it should be fixed anyway :-).

CC'ing belopolsky as the interest area for the trace module, Mark Shannon 
because he seems to enjoy really pathological low-level bugs, and Benjamin and 
Yury as the "bytecode" interest area. I'm surprised that there's apparently no 
interest area for, like, "variables should retain the values they are assigned" 
-- apologies if I've CC'ed anyone incorrectly.

--
components: Interpreter Core
files: thread-closure-bug-demo.py
messages: 296751
nosy: Mark.Shannon, belopolsky, benjamin.peterson, njs, yselivanov
priority: normal
severity: normal
status: open
title: Local variable assignment is broken when combined with threads + tracing 
+ closures
versions: Python 2.7, Python 3.5, Python 3.6, Python 3.7
Added file: http://bugs.python.org/file46971/thread-closure-bug-demo.py

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30744>
___



[issue30623] python-nightly import numpy fails since recently

2017-06-10 Thread Nathaniel Smith

Nathaniel Smith added the comment:

@Arek: It's great that you're testing your code against the latest 3.7 
pre-release, because that helps give early warning of issues in CPython as it's 
developed, which helps everyone. BUT, you really cannot use in-development 
versions and expect everything to be perfectly supported and work well together 
all the time -- we need time and space to work things out and coordinate. If you 
can't tolerate failures in your 3.7-prerelease builds, then you should remove 
them from your CI entirely and wait for the final release 
(or at least sometime in the beta/rc period).

Anyway-

Numpy bug here: https://github.com/numpy/numpy/issues/9227

This CPython PR is also relevant: https://github.com/python/cpython/pull/1236

Basically the bug is that numpy just made a release (1.13.0) which assumed that 
if it's running on some version of 3.7, then that PR is surely merged. But it 
turns out that that PR is not yet merged.

We'll probably need to make the usual emergency x.x.1 release in a few days 
anyway, and the simplest fix would be to switch back to the underscore-prefixed 
versions then. But... this needs a bit of coordination, because if we do that 
then the same thing will happen again at some point when that PR is merged; the 
problem here is that numpy was trying to avoid that breakage and thus created 
different breakage :-).

My suggestion would be that NumPy go ahead with releasing a fixed version ASAP, 
and *also* that CPython PR 1236 add back-compat for the underscore-prefixed 
versions. And maybe that it be merged soon if it's easy and going to happen 
anyway, since that will retroactively unbreak 1.13.0.

Bigger picture: I guess there's some irony that I warned that testing against 
prereleases can get complicated at the language summit based on numpy's 
experience, and now it's numpy's fault that cpython is getting grumpy users 
filing bugs at it... sorry about that. It's a useful data point though that we 
should keep in mind, that if we're encouraging people to test prereleases, then 
what used to be a temporary breakage that no-one would notice can actually 
cause more problems now – again, if numpy hadn't broken this now, PR 1236 would 
have at some point, and with the same results. (In the past it's been common 
that numpy just doesn't support a new python release until a month or two after 
it comes out! Apparently those days have passed...)

--
nosy: +brett.cannon, haypo

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30623>
___



[issue30623] python-nightly import numpy fails since recently

2017-06-10 Thread Nathaniel Smith

Nathaniel Smith added the comment:

It's because Julian thinks _PyTraceMalloc_Untrack is going to lose its 
underscore in 3.7, so he made numpy assume that. Numpy issue, I'll see what we 
can do.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30623>
___



[issue30038] Race condition in how trip_signal writes to wakeup fd

2017-06-09 Thread Nathaniel Smith

Nathaniel Smith added the comment:

But, by that definition, like... every change is backwards incompatible.

I'm pretty confident that no-one was relying on this race condition. I can't 
even figure out what that would mean.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30038>
___



[issue30038] Race condition in how trip_signal writes to wakeup fd

2017-06-09 Thread Nathaniel Smith

Nathaniel Smith added the comment:

I think you mean it's backwards *compatible*? There's definitely no change in 
APIs or ABIs or anything, all that changes is the order of some statements 
inside Python's signal handler. (We used to we write to the magic wakeup fd and 
then set a flag saying a signal arrived; now we do the same things but in the 
opposite order. I wouldn't call it zero risk, just on the general principle 
that signal handlers are tricksy beasts, but I think it's about as low risk as 
a signal handler change can be.)

AFAICT the race condition also affects twisted and tornado, though I don't 
think they've noticed. The main symptom is that it makes control-C handling 
flaky on Windows, so it's a difficult thing to test for.

I would prefer to see it backported to 3.6, but it's a weird edge case bug so 
the world is not going to end if the fix has to wait for 3.7.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30038>
___



[issue30038] Race condition in how trip_signal writes to wakeup fd

2017-06-08 Thread Nathaniel Smith

Nathaniel Smith added the comment:

I guess now would be a good time to decide whether this should be backported to 
3.6, with 3.6.2 coming out in a few days :-). (Or if not, then it can probably 
be closed?)

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30038>
___



[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches

2017-06-08 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Looks good to me, thanks Serhiy.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29943>
___



[issue30594] Refcounting mistake in _ssl.c

2017-06-08 Thread Nathaniel Smith

Changes by Nathaniel Smith <n...@pobox.com>:


--
pull_requests: +2065

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30594>
___



[issue30594] Refcounting mistake in _ssl.c

2017-06-08 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Posted backports for 3.5 and 3.6. It looks like 2.7 is actually unaffected, 
because it doesn't have IDNA support, so there's no failure path in between 
when the reference is stored and when it's INCREF'd.

--
versions:  -Python 2.7

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30594>
___



[issue30594] Refcounting mistake in _ssl.c

2017-06-08 Thread Nathaniel Smith

Changes by Nathaniel Smith <n...@pobox.com>:


--
pull_requests: +2057

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30594>
___



[issue30594] Refcounting mistake in _ssl.c

2017-06-08 Thread Nathaniel Smith

Changes by Nathaniel Smith <n...@pobox.com>:


--
pull_requests: +2058

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30594>
___



[issue17305] IDNA2008 encoding missing

2017-06-08 Thread Nathaniel Smith

Changes by Nathaniel Smith <n...@pobox.com>:


--
nosy: +njs

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17305>
___



[issue28414] SSL match_hostname fails for internationalized domain names

2017-06-08 Thread Nathaniel Smith

Nathaniel Smith added the comment:

If the SSL module followed the pattern of encoding all str to bytes at the 
edges while leaving bytes alone, and used exclusively bytes internally (and in 
this case by "bytes" I mean "bytes objects containing A-labels"), then it would 
at least fix this bug and also make it possible for library authors to 
implement their own IDNA handling. Right now if you pass in a pre-encoded 
byte-string, exactly what ssl.py needs to compare to the certificate, then 
ssl.py will convert it *back* to text :-(.

--
nosy: +njs

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue28414>
___



[issue30594] Refcounting mistake in _ssl.c

2017-06-08 Thread Nathaniel Smith

New submission from Nathaniel Smith:

If you pass a server_hostname= that fails IDNA decoding to 
SSLContext.wrap_socket or SSLContext.wrap_bio, then the SSLContext object has a 
spurious Py_DECREF called on it, eventually leading to segfaults.

Demo attached.

--
assignee: christian.heimes
components: SSL
files: demo.py
messages: 295380
nosy: alex, christian.heimes, dstufft, janssen, njs
priority: normal
severity: normal
status: open
title: Refcounting mistake in _ssl.c
type: crash
versions: Python 2.7, Python 3.5, Python 3.6, Python 3.7
Added file: http://bugs.python.org/file46932/demo.py

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30594>
___



[issue30594] Refcounting mistake in _ssl.c

2017-06-08 Thread Nathaniel Smith

Changes by Nathaniel Smith <n...@pobox.com>:


--
pull_requests: +2056

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30594>
___



[issue30359] Annotating a function as returning an (async) context manager?

2017-06-07 Thread Nathaniel Smith

Nathaniel Smith added the comment:

I can think of two downsides to using __annotations__ for this:

1) Obviously contextlib itself isn't going to add any kind of annotation in any 
versions before 3.7, but third-party projects might (like contextlib2, say). 
And these projects have been known to be used on Python versions that don't 
have __annotations__ support. At the moment I personally don't have to care 
about this because sphinxcontrib-trio doesn't support anything before 3.5, but 
I suspect this will eventually change. (E.g. sphinx has expressed some interest 
in having these changes upstreamed.) So this would mean that we'd still need 
some other ad hoc mechanism to use when running on old Pythons, and have to 
check for both that and the __annotations__. But that's probably doable, so eh. 
It'd be helpful if contextlib2 could join in on whatever protocol, though.

2) sphinxcontrib-trio actually interacts very poorly with __annotations__ right 
now [1]. But I mean, I need to figure out how to fix this anyway, so... not 
really your problem :-).

Neither of these downsides seems very compelling :-). So I guess contextlib 
should add some appropriate __annotations__, and *maybe* also add something 
like __returns_contextmanager__ = True if that's useful to maintain consistency 
with contextlib2 or similar?
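That marker-attribute variant could be as simple as the following sketch (the __returns_contextmanager__ name is hypothetical here, not an existing contextlib feature):

```python
import contextlib

def contextmanager(func):
    # wrap as usual, then mark the wrapper so doc tools can detect it
    wrapper = contextlib.contextmanager(func)
    wrapper.__returns_contextmanager__ = True  # hypothetical marker attribute
    return wrapper

@contextmanager
def demo():
    yield 42

assert getattr(demo, "__returns_contextmanager__", False)
with demo() as value:
    assert value == 42
```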

(Wasn't there some discussion about keeping type hints out of the stdlib for 
now? Is this likely to ruffle any feathers on those grounds?)

[1] https://github.com/python-trio/sphinxcontrib-trio/issues/4

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30359>
___



[issue30579] Allow traceback objects to be instantiated/mutated/annotated

2017-06-05 Thread Nathaniel Smith

Nathaniel Smith added the comment:

My understanding is that the major difference between a real traceback object 
and a TracebackException object is that the latter is specialized for printing, 
so it can be lighter weight (no pinning of frame objects in memory), but loses 
some utility (can't do post-mortem debugging).

If that's right, then that's definitely not a solution, because trio and jinja2 
and import errors all need to support post-mortem debugging.

I'm not against the idea of defining a traceback protocol, but it seems like a 
lot of work when the existing traceback objects are already perfectly good 
container objects that are just missing a few simple features.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30579>
___



[issue30579] Allow traceback objects to be instantiated/mutated/annotated

2017-06-05 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Uh, please ignore the random second paste of the jinja2 URL in the middle of 
the second to last paragraph.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30579>
___



[issue30579] Allow traceback objects to be instantiated/mutated/annotated

2017-06-05 Thread Nathaniel Smith

New submission from Nathaniel Smith:

Currently, traceback objects don't expose any public constructor, are 
immutable, and don't have a __dict__ or allow subclassing, which makes it 
impossible to add extra annotations to them.

It would be nice if these limitations were lifted, because there are rare but 
important cases where code needs to manipulate tracebacks directly, and 
currently these have to use awful stuff like ctypes. For example:

Jinja2:
https://github.com/pallets/jinja/blob/bec0065c4e7e89e7d893eda7840ba0219824b23c/jinja2/debug.py#L298-L372

Trio:
https://github.com/python-trio/trio/blob/496493afecc22d7d1a17175b6a2748a9c3510066/trio/_core/_multierror.py#L233-L318

(Notice that on PyPy there are no ctypes, but instead they have special 
extension that's supported for this case only to keep jinja2 working.)

For the above cases, what's needed is the ability to instantiate and assign to 
the fields of traceback objects.

In addition, in trio I'd like to be able to annotate traceback objects so that 
the traceback printing machinery can do things like hide "internal" tracebacks, 
or highlight places where exceptions jumped between tasks. This would be much 
easier if there were some way to attach data to tracebacks. Probably it doesn't 
make sense for traceback objects to have a __dict__ by default for speed/memory 
reasons, but we could add a dedicated metadata slot that is normally empty but 
can have arbitrary data assigned, or if they allowed subclassing then I could 
add a __dict__ in a subclass.
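For the constructor part, the requested API might look something like this (hypothetical as of this writing; the argument order just mirrors the existing tb_* attributes):

```python
import types

try:
    raise ValueError("boom")
except ValueError as exc:
    tb = exc.__traceback__
    # build a fresh traceback object, e.g. to splice or trim frames
    clone = types.TracebackType(tb.tb_next, tb.tb_frame,
                                tb.tb_lasti, tb.tb_lineno)
    assert clone.tb_lineno == tb.tb_lineno
    assert clone.tb_frame is tb.tb_frame
```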

I'm CC'ing the "import machinery" interest list, because my understanding is 
that with the new import system there's a similar desire to hide "internal" 
traceback frames, and while the features requested in this bug report won't 
solve that problem directly they might (are intended to) provide some useful 
machinery for it. Feel free to un-CC if I'm wrong :-).

I think that this should be very straightforward to implement: it's just that 
no-one ever implemented setters etc. for tracebacks.

--
components: Interpreter Core
messages: 295235
nosy: brett.cannon, eric.snow, ncoghlan, njs
priority: normal
severity: normal
status: open
title: Allow traceback objects to be instantiated/mutated/annotated
type: enhancement
versions: Python 3.7

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30579>
___



[issue30482] socket.getservbyname(), socket.getservbyport(), socket.getprotobyname() are not threadsafe

2017-05-27 Thread Nathaniel Smith

Changes by Nathaniel Smith <n...@pobox.com>:


--
nosy: +njs

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30482>
___



[issue30491] Add a lightweight mechanism for detecting un-awaited coroutine objects

2017-05-27 Thread Nathaniel Smith

New submission from Nathaniel Smith:

A common problem when working with async functions is to attempt to call them 
but forget the 'await', which eventually leads to a 'Warning: coroutine ... was 
never awaited' (possibly buried in the middle of a bunch of traceback shrapnel 
caused by follow-on errors). This can be confusing, lead to spuriously passing 
tests, and generally isn't a great experience for the dev who makes the 
mistake. To improve this, I'd like to do things like have a test harness 
reliably detect when this has happened inside a test case, or have trio's main 
loop check occasionally to see if this has happened and immediately raise a 
useful error. (Unfortunately, it *doesn't* work to promote the warning to an 
error using the usual warnings filter machinery, because the warning is raised 
inside a __del__ method, which suppresses exception propagation.)
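On CPython the awkward timing is easy to see: the warning comes from the coroutine's finalizer, so it only appears when the orphaned object happens to be collected (immediately here, thanks to refcounting; on PyPy it could be arbitrarily later):

```python
import warnings

async def task():
    return 1

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    task()  # oops: no await; the orphaned coroutine is collected right away

# the RuntimeWarning was emitted at collection time, from __del__
assert any("never awaited" in str(w.message) for w in caught)
```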

In principle this is possible with sys.setcoroutinewrapper, but that adds 
non-trivial runtime overhead to every async function invocation, which is 
something we'd like to avoid. (It's OK for a "debug mode", but "modes" are also 
a poor dev experience: they certainly have their place, but whenever possible 
it's much nicer to give a proper error in the first place instead of relying on 
the dev to realize they need to enable debug mode and then remember how to do 
it.)

Therefore I propose that CPython keep a thread-local counter of how many 
coroutine objects currently exist in a created-but-not-yet-started state -- so 
corofn.__call__ would increment this counter, and the first call to 
coroobj.__next__/send/throw would decrement it. And there's some way to access 
it e.g. via a magic function in the sys module. Then test harnesses can assert 
that this is zero, trio could occasionally poll this from its run loop and 
assert that it's zero, etc., with very low overhead.

(This is a slight modification of the version I discussed with Yury and others 
at PyCon last week; the previous request was to keep a count of how many times 
the "coroutine '...' was never awaited" warning had been emitted. The problem 
with the original idea is that the warning message doesn't fire until the 
garbage collector has collected the coroutine object, and that might not happen 
at a convenient time if we're using PyPy, or if there are cycles, or in a test 
where the missing 'await' eventually leads to an exception whose traceback pins 
the coroutine object in memory just long enough for the warning to be detected 
on the *next* test and confuse everyone. Thanks to Matthias Bussonnier for 
spurring the new idea, and see discussion here: 
https://github.com/python-trio/trio/issues/79#issuecomment-304364010)

--
components: asyncio
messages: 294584
nosy: giampaolo.rodola, haypo, ncoghlan, njs, yselivanov
priority: normal
severity: normal
status: open
title: Add a lightweight mechanism for detecting un-awaited coroutine objects
versions: Python 3.7

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30491>
___



[issue12857] Expose called function on frame object

2017-05-25 Thread Nathaniel Smith

Nathaniel Smith added the comment:

> Yes, whenever you touch frames you're disabling the JIT for the call site 
> (and maybe for more call sites up the stack, idk).  So it doesn't matter what 
> you use, `f_func` or `f_locals`, the performance will suffer big time.  Is 
> that acceptable for Trio?

Ah, but that's the trick: with f_func the only time I have to touch frames is 
from my SIGINT handler, and its fine if *that* drops us out of the JIT (it 
probably has to in any case). The rest of the time, the interpreter would be 
automatically tracking all the state I need, so it's effectively free, and PyPy 
can do all its normal magic. The problem with using locals() / f_locals for 
this is that I have to constantly touch them during normal execution.

(Probably we shouldn't make a huge deal out of the PyPy case though – for me 
the main advantages of f_func are the simplicity and atomicity. But FWIW I also 
just pinged Armin on #pypy and he said that PyPy wouldn't have a problem 
supporting f_func.)

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12857>
___



[issue29988] (async) with blocks and try/finally are not as KeyboardInterrupt-safe as one might like

2017-05-24 Thread Nathaniel Smith

Nathaniel Smith added the comment:

On further thought, I think the way I'd write a test for this is:

(1) add a testing primitive that waits for N instructions and then injects a 
SIGINT. Probably this would require tweaking the definition of 
Py_MakePendingCalls like I described in my previous comment, and then some code 
like:

int sigint_timebomb(void* count_ptr) {
    intptr_t count = (intptr_t) count_ptr;
    if (count == 0)
        raise(SIGINT);
    else
        Py_AddPendingCall(sigint_timebomb, (void*) (count - 1));
    return 0;
}

(2) write a test like:

reached_crashpad = False
lock = threading.Lock()
i = 0
while not reached_crashpad:
    i += 1
    try:
        sigint_timebomb(i)
        with lock:
            1 + 1
        # this loop keeps executing instructions until any still-armed
        # timebombs go off
        while True:
            reached_crashpad = True
    except KeyboardInterrupt:
        # key invariant: signal can't leave the lock held
        assert not lock.locked()
    else:
        assert False  # can't happen

The idea here is that this sets off SIGINTs at every possible place throughout 
the execution of the 'with lock', while checking that the lock is always 
released, and without needing to hard code any information at all about what 
bytecode the compiler generated.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29988>
___



[issue12857] Expose called function on frame object

2017-05-24 Thread Nathaniel Smith

Nathaniel Smith added the comment:

> I'm not sure I understand how `f_func` would help to better handle Control-C 
> in Trio. Nathaniel, could you please elaborate on that?

Sure. The issue is that I need to mark certain frames as "protected" from 
KeyboardInterrupt, in a way that my signal handler can see when walking the 
frame stack, so it can decide whether to raise a KeyboardInterrupt immediately 
or to wait until a safer moment. Right now, this is done by injecting a magic 
local variable into the frame object, so the code is effectively like:

def a_protected_function():
    _trio_keyboard_interrupt_protection_magic_local_ = True
    ... actual function ...
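
As a concrete illustration, the frame-walking side of this scheme can be sketched in pure Python (the helper name ki_protection_currently_enabled is illustrative; Trio's real implementation in trio/_core/_ki.py handles many more cases):

```python
import sys

# Name of the magic local used to mark a frame as KI-protected.
MAGIC_LOCAL = "_trio_keyboard_interrupt_protection_magic_local_"

def ki_protection_currently_enabled():
    # Walk outward from the caller's frame, the way a signal handler
    # would, looking for the innermost frame carrying the magic local.
    frame = sys._getframe(1)
    while frame is not None:
        if MAGIC_LOCAL in frame.f_locals:
            return frame.f_locals[MAGIC_LOCAL]
        frame = frame.f_back
    # No marker anywhere: deliver KeyboardInterrupt immediately.
    return False

def a_protected_function():
    _trio_keyboard_interrupt_protection_magic_local_ = True
    return ki_protection_currently_enabled()

def an_unprotected_function():
    return ki_protection_currently_enabled()
```

This also makes the race window concrete: the marker only exists once the first bytecode of the protected function's body has already run.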

But this has two problems: (1) it's not atomic: a signal could arrive in 
between when the frame is created and when the magic local gets set. (This is 
especially a problem for __(a)exit__ methods -- if bpo-29988 is fixed then it 
will guarantee that __(a)exit__ gets called, but this isn't much help if 
__(a)exit__ can instantly abort without doing any actual cleanup :-).) (2) The 
decorator code for setting this magic local ends up having to special-case 
functions/generators/async functions/async generators, so it's very complex and 
fragile: 
https://github.com/python-trio/trio/blob/1586818fbdd5c3468e15b64816db9257adae49c1/trio/_core/_ki.py#L109-L149
(This is possibly the function that took the most tries to get right in all of 
trio, because there are serious subtleties and interactions with multiple 
CPython bugs.)

I suspect the way I'm currently munging frames at runtime also annoys PyPy's 
JIT.

OTOH if frames had f_func, then instead of having to munge frame.f_locals, I 
could set a magic attribute on the *functions* that are supposed to transition 
between protected/unprotected mode. Then I could delete all that nasty code and 
replace it with something like:

def enable_ki_protection(func):
    func._trio_keyboard_interrupt_protection_enabled = True
    return func

and in addition to being simpler, this would fix the race condition and improve 
performance too. (Note that now this decorator only runs at function definition 
time; I wouldn't have to do any tricky stuff at all during regular execution.)

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12857>
___



[issue12857] Expose called function on frame object

2017-05-24 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Certainly which frame is being executed is an implementation detail, and I can 
see an argument from that that we shouldn't have a frame introspection API at 
all... but we do have one, and it has some pretty important use cases, like 
traceback printing. Obviously most people shouldn't be touching this API at 
all, but if you do need it then it IMO should be as useful as possible.

I don't see a better way to handle my (admittedly a bit weird) control-C case 
within the interpreter; I'd be interested to hear what you have in mind. But 
even if there is a nice way to fix control-C, then the proposal here would 
still be useful for e.g. giving better tracebacks.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12857>
___



[issue29988] (async) with blocks and try/finally are not as KeyboardInterrupt-safe as one might like

2017-05-24 Thread Nathaniel Smith

Nathaniel Smith added the comment:

> If all you need is that with foo: pass guarantees that either both or neither 
> of __enter__ and __exit__ are called, for C context managers, and only C 
> context managers, then the fix is trivial.

It would be nice to have it for 'async with foo: pass' as well, which is a 
little less trivial because the 'await' dance is inlined into the bytecode (see 
the initial post in this bug), but basically yes.

> Do you have any way to reliably test for this failure mode?

Unfortunately no, I haven't implemented one. Let's see, though...

The test I wrote for issue30039 demonstrates one way to trigger a signal on a 
precise bytecode, by writing some code in C that calls raise(SIGNALNUMBER), and 
then calling it immediately before the bytecode where we want the signal to be 
raised (simulating the situation where the signal happens to arrive while the C 
function is running -- note that raise() is convenient because unlike kill() it 
works cross-platform, even on Windows).

This might be sufficient for testing the 'async with' version; it looks like an 
__await__ method or iterator implemented in C and calling raise() would deliver 
a signal at a point that should be protected but isn't.

The tricky case is plain 'with'. We can write something like:

with lock:
    raise_signal()

and this gives bytecode like:

  1           0 LOAD_NAME                0 (lock)
              2 SETUP_WITH              12 (to 16)
              4 POP_TOP

  2           6 LOAD_NAME                1 (raise_signal)
              8 CALL_FUNCTION            0
             10 POP_TOP
             12 POP_BLOCK
             14 LOAD_CONST               0 (None)
        >>   16 WITH_CLEANUP_START

So the problem is that at offset 8 is where we can run arbitrary code, but the 
race condition is if a signal arrives between offsets 12 and 16.

One possibility would be to set up a chain of Py_AddPendingCall handlers, 
something like:

int chain1(void* _) {
    Py_AddPendingCall(chain2, 0);
    return 0;
}
int chain2(void* _) {
    Py_AddPendingCall(chain3, 0);
    return 0;
}
int chain3(void* _) {
    raise(SIGINT);
    return 0;
}

(or to reduce brittleness, maybe use the void* to hold an int controlling the 
length of the chain, which would make it easy to run tests with chains of 
length 1, 2, 3, ...)

...except that, consulting ceval.c, I see this currently won't work, because
it looks like if you call Py_AddPendingCall from inside a pending call 
callback, then Py_MakePendingCalls will execute the newly added callback 
immediately after the first one returns. It could be made to work by having 
Py_MakePendingCalls do a first pass to check the current length of the queue, 
and then use that as the bound on how many calls it makes.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29988>
___



[issue30437] SSL_shutdown can return meaningless SSL_ERROR_SYSCALL

2017-05-23 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Debian testing, x86-64, with:

Python 3.5.3rc1 (default, Jan  3 2017, 04:40:57) 
[GCC 6.3.0 20161229] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ssl
>>> ssl.OPENSSL_VERSION
'OpenSSL 1.1.0e  16 Feb 2017'

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30437>
___



[issue29988] (async) with blocks and try/finally are not as KeyboardInterrupt-safe as one might like

2017-05-23 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Right, fixing this bug alone can't make programs control-C safe in general. But 
there exist techniques to make __enter__/__exit__ methods safe WRT control-C, 
and if we fix this bug *and* apply those techniques then you can get some 
meaningful guarantees.

For example, threading.Lock's __enter__ and __exit__ methods are written in C, 
so they are guaranteed to happen atomically WRT to control-C. Right now, it's 
possible for a 'with lock: ...' to raise KeyboardInterrupt with the lock still 
held, which is surprising and not what we would like. If this bug were fixed, 
then it would be guaranteed that the lock was always released.

(And I originally discovered this while implementing control-C handling in 
trio, which is pure-Python but also has a way to protect 
__(a)enter__/__(a)exit__ methods against control-C. ...though that won't be 
100% reliable until bpo-12857 or similar is fixed. See 
https://vorpus.org/blog/control-c-handling-in-python-and-trio/ for exhaustive 
details.)

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29988>
___



[issue12857] Expose called function on frame object

2017-05-23 Thread Nathaniel Smith

Nathaniel Smith added the comment:

I'd also like to make use of this in trio, as a way to get safer and less 
complicated control-C handling without having to implement things in C. 
(Exhaustive discussion: 
https://vorpus.org/blog/control-c-handling-in-python-and-trio/)

@Nick: I understand your proposal is to add a field on the frame that for 
regular function calls points to the function object, and for 
generator/coroutines points to the generator/coroutine object. Is that right? 
Two possible concerns there: (a) this creates a reference cycle, because 
generator/coroutine objects own a reference to the frame object, (b) AFAICT you 
also currently can't get from a generator/coroutine object back to the function 
object, so this would break some (all?) of the use cases for this. I guess the 
solution to (a) is to make it a weak reference, and for (b) either keep f_func 
as always pointing to the function and make f_generator a separate field, or 
else make it f_origin and also add gi_func/co_func fields to 
generator/coroutine objects. (Which might be useful in any case.)

Also, I'm not quite following the proposed use case. When a generator/coroutine 
is resumed, then send() steps through the whole yield stack and reconstructs 
the frame stack so it matches the generator/coroutine stack. (throw() currently 
doesn't do this, but that's a bug that breaks profilers/debuggers/etc - 
bpo-29590 - so we should fix it anyway, and I think Yury is planning to do that 
soon.) So if you get a frame from sys._getframe() or similar, then the stack is 
correct. And if a coroutine/generator isn't running, then the only way to get 
to the frame is by starting with the coroutine/generator object, so we don't 
really need a way to get back.

Tracebacks are a different case again because they continue to hold frame 
references after the stack is unwound, but generally the traceback already has 
the correct stack because the exception propagates up the coroutine/generator 
stack, and also aren't the {gi,co}_frame references cleared as the exception 
propagates anyway?

--
nosy: +njs

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12857>
___



[issue29334] ssl.SSLObject method getpeercert() is buggy, do_handshake() is strange

2017-05-23 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Oddly, I expected to run into this with my code using SSLObject in trio [1], 
but if I connect to python.org:443 and then 'await 
trio_ssl_stream.do_handshake(); trio_ssl_stream.getpeercert()' it works just 
fine ... even though when I run the sslbugs.py script I get the same weird 
results Greg reports. As far as I can tell the logic is identical. So I guess 
this might potentially be useful to narrow this down :-).

Test code that works:

import trio

@trio.run
async def main():
    sock = trio.socket.socket()
    addr = await sock.resolve_remote_address(("python.org", 443))
    await sock.connect(addr)
    s = trio.SocketStream(sock)
    client = trio.ssl.SSLStream(
        s, trio.ssl.create_default_context(), server_hostname="python.org")
    await client.do_handshake()
    print(client.getpeercert())

[1] Currently in https://github.com/python-trio/trio/pull/107, eventually will 
be at https://github.com/python-trio/trio/blob/master/trio/ssl.py

--
nosy: +njs

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29334>
___



[issue30437] SSL_shutdown can return meaningless SSL_ERROR_SYSCALL

2017-05-22 Thread Nathaniel Smith

New submission from Nathaniel Smith:

The SSL_shutdown man page says that if it returns 0, and an SSL_ERROR_SYSCALL 
is set, then SSL_ERROR_SYSCALL should be ignored - or at least I think that's 
what it's trying to say. See the RETURN VALUES section. I think this means we 
should only raise this error if the return value is <0? Though I suppose we 
need to clear out the error queue in any case.

I ended up changing the code that was triggering this for other reasons and now 
I'm not hitting it, so it's not exactly urgent for me, but FYI... I was getting 
SSLSyscallError exceptions from code using memory BIOs and where everything was 
fine. The test case had one task continually sending data into an 
SSLObject-based stream while the other end called SSLObject.unwrap() and then 
ended up continually getting SSLWantRead and reading more data -- after a few 
cycles of reading it got SSLSyscallError instead.

--
assignee: christian.heimes
components: SSL
messages: 294216
nosy: alex, christian.heimes, dstufft, janssen, njs
priority: normal
severity: normal
status: open
title: SSL_shutdown can return meaningless SSL_ERROR_SYSCALL
versions: Python 2.7, Python 3.6, Python 3.7

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30437>
___



[issue29137] Fix fpectl-induced ABI breakage

2017-05-17 Thread Nathaniel Smith

Nathaniel Smith added the comment:

@Dima:
> @njs: to point out that usefulness of this module is not just wishful 
> thinking. I just used it to locate, up to the line in a Python extension 
> module written in C, a bug in Sagemath (that has perhaps 20 FPU-using 
> extensions, some of them as large as numpy).  (Without using it we were 
> pulling out our hair for weeks over this)

That's pretty cool :-). But from skimming your link, it sounds like it would 
have been sufficient in your case to add a call to "fesetmask(FP_X_INV)" using 
C or Cython or ctypes (or directly in gdb), and then running the program under 
gdb to get a backtrace where the SIGFPE was delivered? Is that correct? Or did 
your debugging depend on the specific fpectl machinery for responding to that 
signal?

> PS. I would volunteer to fix it and maintain it, assuming I have some modest 
> funding to support such an activity.

I'm not personally aware of any funding sources for this, if that's the 
question.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29137>
___



[issue30038] Race condition in how trip_signal writes to wakeup fd

2017-05-17 Thread Nathaniel Smith

Nathaniel Smith added the comment:

> (BTW do you happen to know any tricks to force CPython to do an immediate 
> PyErr_CheckSignals on Windows?)

Never mind on this... it looks like calling repr() on any object is sufficient.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30038>
___



[issue30038] Race condition in how trip_signal writes to wakeup fd

2017-05-16 Thread Nathaniel Smith

Nathaniel Smith added the comment:

> While I suggest you to *not* use an event loop (wakeup fd pipe/socket handle 
> with select) and signal.signal(), you are true that there is a race condition 
> if you use select() with signal.signal() so I merged your change.

Unfortunately this is the only 100% reliable way to do signal handling with an 
event loop so I guess you are unlikely to convince any of the projects that use 
it this way to change their minds :-)

> Do you consider that it's worth it to backport the change to 3.5 and 3.6?

I guess it'd be nice? At least for 3.6? I just had yet another random PR fail 
its tests due to this bug -- it's turning out to be quite difficult to make my 
tests reliable :-/. (BTW do you happen to know any tricks to force CPython to 
do an immediate PyErr_CheckSignals on Windows?)

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30038>
___



[issue29137] Fix fpectl-induced ABI breakage

2017-05-16 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Another option you might want to consider is proposing to add a proper fpu 
control flag setting/checking API to the math module.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29137>
___



[issue29137] Fix fpectl-induced ABI breakage

2017-05-16 Thread Nathaniel Smith

Nathaniel Smith added the comment:

You'd also need to fix the ABI issues that started this, and probably to make an
argument for why it makes sense for all of CPython's built-in float operations to check
the fpu flags, and to do so using a weird longjmp-based mechanism that only 
some platforms support. The fact that it's disabled by default and has been 
broken for a decade+ without anyone noticing might be working against you 
here...

You might get the impression that I think this is a bad idea. I do :-). But I 
am genuinely trying to helpful; I'm sure people would be willing to listen to 
an argument, and if you want to make one then those are the sorts of issues 
you're likely to need some answer for.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29137>
___



[issue29137] Fix fpectl-induced ABI breakage

2017-05-16 Thread Nathaniel Smith

Nathaniel Smith added the comment:

@Dima: are you volunteering to fix and maintain it? I can see why it's useful 
to have some way to get at the fpu flags, but I don't see how fpectl 
specifically helps with that issue, and fpectl has always been broken on x86-64.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29137>
___



[issue30359] A standard convention for annotating a function as returning an (async) context manager?

2017-05-12 Thread Nathaniel Smith

New submission from Nathaniel Smith:

sphinxcontrib-trio [1] does a few things; one of them is to enhance sphinx's 
autodoc support by trying to sniff out the types of functions so that it can 
automatically determine that something is, say, a generator, or an async 
classmethod.

This runs into problems when it comes to context managers, for two reasons. One 
reason is pretty obvious: the sniffing logic would like to be able to tell by 
looking at a function that it should be used as a context manager, so that it 
can document it as such, but there's no reliable way to do this. The other 
reason is more subtle: the sniffing code has to walk the .__wrapped__ chain in 
order to detect things like async classmethods, but when it walks the chain for 
a @contextmanager function, it finds the underlying generator, and ends up 
thinking that this function should be documented as returning an iterator, 
which is just wrong. If we can detect context managers, we can know to ignore
anything we find further down the __wrapped__ chain.

Obviously this is always going to be a heuristic process; there's no 100% 
reliable way to tell what arbitrary wrappers are doing. But the heuristic works 
pretty well... it's just that right now the method of detecting context manager 
functions is pretty disgusting:

https://github.com/python-trio/sphinxcontrib-trio/blob/2d9e65187dc7a08863b68a78bdee4fb051f0b99e/sphinxcontrib_trio/__init__.py#L80-L90
https://github.com/python-trio/sphinxcontrib-trio/blob/2d9e65187dc7a08863b68a78bdee4fb051f0b99e/sphinxcontrib_trio/__init__.py#L241-L246

So it would be really nice if contextlib.contextmanager somehow marked its 
functions because:

- then I could (eventually) use that marker instead of making assumptions about 
contextmanager's internals and messing with code objects

- then we could (immediately) point to how the standard library does it as a 
standard for other projects to follow, instead of using this weird 
sphinxcontrib-trio-specific __returns_contextmanager__ thing that I just made 
up.

I don't really care what the marker is, though I suppose 
__returns_contextmanager__ = True is as good as anything and has the advantage 
of a massive installed base (2 projects).
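
For reference, the whole convention is tiny. A sketch of the marker decorator plus the check a documentation tool would perform (using the __returns_contextmanager__ attribute name described above; the helper names are illustrative):

```python
from contextlib import contextmanager

def returns_contextmanager(func):
    # The proposed marker: a plain attribute that documentation tools
    # can sniff without poking at code objects.
    func.__returns_contextmanager__ = True
    return func

@returns_contextmanager
@contextmanager
def open_resource():
    yield "resource"

def sniff_is_context_manager_factory(func):
    # What a tool like sphinxcontrib-trio would check.
    return getattr(func, "__returns_contextmanager__", False)
```

Since the decorator only sets an attribute, it composes with @contextmanager (or any custom wrapper) without changing runtime behavior.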

[1] http://sphinxcontrib-trio.readthedocs.io/

--
components: Library (Lib)
messages: 293592
nosy: ncoghlan, njs, yselivanov
priority: normal
severity: normal
status: open
title: A standard convention for annotating a function as returning an (async) 
context manager?
versions: Python 3.7

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30359>
___



[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches

2017-05-10 Thread Nathaniel Smith

Nathaniel Smith added the comment:

@Jonathan: Even 3.6.1 was careful to retain compatibility with code built by 
3.6.0. And your proposed 3.6.1-patched will generate binaries equivalent to the 
ones 3.6.0 generates. So I don't think you need to worry; 3.6.2 is not going to 
add a new and worse compatibility break, which is what it would take for your 
proposal to cause a problem. In fact, your 3.6.1-patched is equivalent to what 
I think 3.6.2 should do. I say go for it; patching 3.6.1 seems like a good 
solution.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29943>
___



[issue30300] asyncio.Controller

2017-05-08 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Looks interesting! What's the advantage over running the server and the test in 
the same loop? The ability to use blocking operations in the tests, and to 
re-use an expensive-to-start server over multiple tests? (I've mostly used 
threads in tests to run blocking code for interoperability testing, and kept 
the async code in the main thread, so this is a bit novel to me.)

--
nosy: +njs

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30300>
___



[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches

2017-05-02 Thread Nathaniel Smith

Nathaniel Smith added the comment:

I don't find it helpful to think of it as declaring 3.6.0 broken vs declaring 
3.6.1 broken. 3.6.0 is definitely good in the sense that if you build a module 
on it then it will import on both 3.6.0 and 3.6.1, and 3.6.1 is definitely good 
in the sense that if you're using it then you can import any module that was 
built on 3.6.0 or 3.6.1.

I think 3.6.2 should combine the best aspects of 3.6.0 and 3.6.1, in the sense 
that modules built on 3.6.2 should be importable on any 3.6.x, and 3.6.2 should 
be able to import modules built on any 3.6.x.

Fortunately this is very easy – it just means that 3.6.2 should continue to 
export the PySlice_AdjustIndices symbol (because existing 3.6.1 builds will 
need it to import on 3.6.2), but should remove the #define of 
PySlice_GetIndicesEx (so code built against 3.6.2 will not *require* 
PySlice_AdjustIndices to be present).

Removing 3.6.0 and 3.6.1 from circulation is simply not possible; the best 
3.6.2 can hope for is to interoperate with them as well as possible.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29943>
___



[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches

2017-05-01 Thread Nathaniel Smith

Nathaniel Smith added the comment:

More collateral damage: apparently the workaround that Pandas used for this bug 
(#undef'ing PySlice_GetIndicesEx) broke PyPy, so now they need a workaround for 
the workaround: https://github.com/pandas-dev/pandas/pull/16192

Recording this here partly as a nudge since IIUC the breaking change is still 
in the 3.6 tree, and partly for the benefit of future projects that are also 
trying to work around this.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29943>
___



[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches

2017-04-28 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Pillow also had broken wheels up on pypi for a while; they've now put out a bug 
fix release that #undef's PySlice_GetIndicesEx, basically monkeypatching out 
the bugfix to get back to the 3.6.0 behavior: 
https://github.com/python-pillow/Pillow/issues/2479

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29943>
___



[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches

2017-04-25 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Apparently this also broke pyqt for multiple users; here's the maintainers at 
conda-forge struggling to figure out the best workaround: 
https://github.com/conda-forge/pyqt-feedstock/pull/25

I really think it would be good to fix this in 3.6 sooner rather than later. 
Downstream projects are accumulating workarounds and pinning 3.6.0 as we speak.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29943>
___



[issue30151] Race condition in use of _PyOS_SigintEvent on windows

2017-04-24 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Oh, I should also say that this isn't actually affecting me, I just figured 
that once I was aware of the bug it was worth making a record here. Might be a 
good starter bug for someone trying to get into CPython's internals :-)

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30151>
___



[issue30151] Race condition in use of _PyOS_SigintEvent on windows

2017-04-24 Thread Nathaniel Smith

New submission from Nathaniel Smith:

As pointed out in this stackoverflow answer: 
http://stackoverflow.com/a/43578450/
and since I seem to be collecting signal-handling bugs these days :-), there's 
a race condition in how the interpreter uses _PyOS_SigintEvent to allow 
control-C to break out of functions like time.sleep on Windows. Suppose we have 
a call to time.sleep(), and the user hits control-C while it's running.

What's supposed to happen is:

- the windows implementation of pysleep in Modules/timemodule.c does 
ResetEvent(hInterruptEvent)
- then it blocks waiting for the interrupt event to be set *or* the timeout to 
expire
- the C-level signal handler runs in a new thread, which sets the "hey a signal 
arrived" flag and then sets the event
- the main thread wakes up because the event is set, and runs 
PyErr_CheckSignals()
- this notices that that the signal has arrived and runs the Python-level 
handler, all is good

But what can happen instead is:

- before doing CALL_FUNCTION, the eval loop checks to see if any signals have 
arrived. They haven't.
- then the C implementation of time.sleep starts executing.
- then a signal arrives; the signal handler sets the flag and sets the event
- then the main thread clears the event again
- then it blocks waiting for the event to be set or the timeout to expire. But 
the C-level signal handler's already done and gone, so we don't realize that 
the flag is set and we should wake up and run the Python-level signal handler.

The solution is that immediately *after* calling 
ResetEvent(_PyOS_SigintEvent()) but *before* sleeping, we should call 
PyErr_CheckSignals(). This catches any signals that arrived before we called 
ResetEvent, and any signals that arrive after that will cause the event to 
become set and wake us up, so that eliminates the race condition.

This same race-y pattern seems to appear in Modules/timemodule.c, 
Modules/_multiprocessing/semaphore.c, and Modules/_winapi.c. _winapi.c also 
handles the event in a weird way that doesn't make sense to me – if the user 
hits control-C it raises an OSError instead of running the signal handler? OTOH 
I *think* Modules/_io/winconsoleio.c already handles the race condition 
correctly, and I don't dare make a guess about whatever Parser/myreadline.c is 
doing.

--
components: Interpreter Core
messages: 292196
nosy: njs
priority: normal
severity: normal
status: open
title: Race condition in use of _PyOS_SigintEvent on windows
versions: Python 3.5, Python 3.6, Python 3.7

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30151>
___



[issue30141] If you forget to call do_handshake, then everything seems to work but hostname checking is disabled

2017-04-23 Thread Nathaniel Smith

Changes by Nathaniel Smith <n...@pobox.com>:


--
title: If you forget to call do_handshake, then everything seems to work but 
hostname is disabled -> If you forget to call do_handshake, then everything 
seems to work but hostname checking is disabled

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30141>
___



[issue30141] If you forget to call do_handshake, then everything seems to work but hostname is disabled

2017-04-23 Thread Nathaniel Smith

New submission from Nathaniel Smith:

Basically what it says in the title... if you create an SSL object via 
wrap_socket with do_handshake_on_connect=False, or via wrap_bio, and then 
forget to call do_handshake and just go straight to sending and receiving data, 
then the encrypted connection is successfully established and everything seems 
to work. However, in this mode the hostname is silently *not* checked, so the 
connection is vulnerable to MITM attacks.

(I guess from reading the SSL_read and SSL_write manpages that openssl is just 
silently doing the handshake automatically – very helpfully! – but it's only 
Python's do_handshake code that knows to check the hostname?)

This doesn't affect correctly written programs that follow the documentation 
and either use do_handshake_on_connect=True (the default for wrap_socket) or 
explicitly call do_handshake, so it's not a super-scary bug. But IMHO it 
definitely shouldn't be this easy to silently fail-open.

The attached test script sets up a TLS echo server that has a certificate for 
the host "trio-test-1.example.org" that's signed by a locally trusted CA, and 
then checks:

- connecting to it with do_handshake and expecting the correct hostname: works, 
as expected
- connecting to it with do_handshake and expecting a different hostname: 
correctly raises an error due to the mismatched hostnames
- connecting to it withOUT do_handshake and expecting a different hostname: 
incorrectly succeeds at connecting, sending data, receiving data, etc., without 
any error

and it checks using both ctx.wrap_socket(..., do_handshake_on_connect=False) 
and a little custom socket wrapper class defined using ctx.wrap_bio(...).
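
To make the easy-to-forget step concrete, here is a minimal sketch (separate from the attached test script, with illustrative names) of explicitly driving the handshake on a wrap_bio object: nothing in the API forces you to call do_handshake() before read()/write().

```python
import ssl

# Client-side context: create_default_context() enables certificate
# verification and hostname checking by default.
ctx = ssl.create_default_context()

# wrap_bio() gives a TLS state machine over in-memory BIOs with no
# network I/O; the caller must shuttle bytes and drive the handshake.
incoming = ssl.MemoryBIO()
outgoing = ssl.MemoryBIO()
tls = ctx.wrap_bio(incoming, outgoing, server_hostname="example.org")

# The step that must not be skipped.  With no server reply available yet,
# OpenSSL writes its ClientHello into `outgoing` and asks for more data.
try:
    tls.do_handshake()
    handshake_complete = True
except ssl.SSLWantReadError:
    handshake_complete = False

client_hello = outgoing.read()  # bytes that would be sent to the server
```

The report's point is that if this do_handshake() loop is omitted, OpenSSL quietly performs the handshake inside the first read()/write(), while only Python's do_handshake path applies the hostname check.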

I've only marked 3.5 and 3.6 as affected because those are the only versions 
I've tested, but I suspect other versions are affected as well.

--
assignee: christian.heimes
components: SSL
files: ssl-handshake.zip
messages: 292158
nosy: christian.heimes, njs
priority: normal
severity: normal
status: open
title: If you forget to call do_handshake, then everything seems to work but 
hostname is disabled
versions: Python 3.5, Python 3.6
Added file: http://bugs.python.org/file46827/ssl-handshake.zip

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30141>
___



[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches

2017-04-20 Thread Nathaniel Smith

Nathaniel Smith added the comment:

FYI: https://github.com/pandas-dev/pandas/pull/16066

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29943>
___



[issue30050] Please provide a way to disable the warning printed if the signal module's wakeup fd overflows

2017-04-12 Thread Nathaniel Smith

Nathaniel Smith added the comment:

I haven't noticed the error in the wild because I don't use set_wakeup_fd on 
Linux/MacOS, because of this issue :-). But on MacOS literally all it would 
take is to receive two signals in quick succession, or to receive one signal at 
a moment when someone has recently scheduled a callback from a thread (which 
also writes to the wakeup fd). In principle it can also happen on Windows, but 
on Windows apparently the smallest buffer you get for a socketpair is like 500 
kilobytes, so it doesn't come up except in rather artificial situations.

And yes, a possible workaround is to use an artificially large buffer on 
Linux/MacOS as well. I'm not saying this is a huge showstopper. But it's silly 
for downstream libraries to carry around workarounds for issues in CPython when 
we could just fix CPython.

> If the pipe becomes full, you lose signals: it's a hard error. I decided to 
> log an error into fd 2 because in a signal handler, the number of allowed 
> functions is very limited.

This makes sense for asyncio, but not for most other async libraries in Python, 
because they *don't* lose signals when the pipe becomes full. That's why I'm 
suggesting that the signal module should have an option to let users decide 
whether they want the warning or not.

(It might also make sense to switch asyncio to stop using the fd as a 
communications channel. Theoretically this would be the ideal solution, but I 
understand why it's not a priority.)

> Related question: does asyncio support signals on Windows? I added support 
> for set_wakeup_fd() to Windows for asyncio, but then I forgot this TODO item 
> :-)

No, asyncio doesn't use set_wakeup_fd on Windows, so code like 
'loop.run_until_complete(asyncio.sleep(9))' can't be interrupted. I'd 
appreciate if you could wait a few days to fix this though, because the blog 
post that I'm almost finished writing about signal handling currently uses this 
as an example of how trio is more awesome than asyncio ;-).

(This is a joke. Well, I mean, mostly. My draft does literally have that sleep 
example in it, but I won't be grumpy if you fix it first. Or at least, not 
*very* grumpy :-).)

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30050>
___



[issue30050] Please provide a way to disable the warning printed if the signal module's wakeup fd overflows

2017-04-12 Thread Nathaniel Smith

New submission from Nathaniel Smith:

When a wakeup fd is registered via signal.set_wakeup_fd, then the C level 
signal handler writes a byte to the wakeup fd on each signal received. If this 
write fails, then it prints an error message to the console.

Some projects use the wakeup fd as a way to see which signals have occurred 
(asyncio, curio). Others use it only as a toggle for "a wake up is needed", and 
transmit the actual data out-of-line (twisted, tornado, trio – I guess it has 
something to do with the letter "t").

One way that writing to the wakeup fd can fail is if the pipe or socket's 
buffer is already full, in which case we get EWOULDBLOCK or WSAEWOULDBLOCK. For 
asyncio/curio, this is a problem: it indicates a lost signal! Printing to the 
console isn't a great solution, but it's better than letting the error pass 
silently.

For twisted/tornado/trio, this is a normal and expected thing – the semantics 
we want are that after a signal is received then the fd will be readable, and 
if its buffer is full then it's certainly readable! So for them, 
EWOULDBLOCK/WSAEWOULDBLOCK are *success* conditions. Yet currently, the signal 
module insists on printing a scary message to the console whenever we succeed 
in this way.
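
The "full buffer is still a success" semantics can be sketched by filling a non-blocking socketpair by hand (a stand-in for the wakeup fd, not any library's actual code):

```python
import select
import socket

reader, writer = socket.socketpair()
writer.setblocking(False)

# Fill the channel, mimicking a wakeup fd whose reader has not drained it.
buffer_filled = False
try:
    while True:
        writer.send(b"\x00" * 4096)
except BlockingIOError:
    # This is the EWOULDBLOCK the C-level signal handler would see.
    buffer_filled = True

# For toggle-style users the condition they care about still holds:
# the read end is certainly readable.
readable, _, _ = select.select([reader], [], [], 0)
```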

It would be nice if there were a way to disable this; perhaps something like: 
signal.set_wakeup_fd(fd, warn_on_full_buffer=False)

This is particularly annoying for trio, because I try to minimize the size of 
the wakeup fd's send buffer to avoid wasting non-swappable kernel memory on 
what's essentially an overgrown bool. This ends up meaning that on Linux the 
buffer is 6 bytes, and on MacOS it's 1 byte. So currently I don't use the 
wakeup fd on Linux/MacOS, which is *mostly* OK but it would be better if we 
could use it. Trio bug with a few more details: 
https://github.com/python-trio/trio/issues/109

--
messages: 291532
nosy: haypo, njs
priority: normal
severity: normal
status: open
title: Please provide a way to disable the warning printed if the signal 
module's wakeup fd overflows
versions: Python 3.7

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30050>
___



[issue30038] Race condition in how trip_signal writes to wakeup fd

2017-04-11 Thread Nathaniel Smith

Nathaniel Smith added the comment:

The attached script wakeup-fd-racer.py fails consistently for me using cpython 
3.6.0 on my windows 10 vm:

> python wakeup-fd-racer.py
Attempt 0: start
Attempt 0: FAILED, took 10.0160076 seconds
select_calls = 2

(It may help that the VM only has 1 CPU? But the same test passes on Linux even 
when I use taskset to restrict it to 1 cpu. Maybe Windows has some scheduling 
heuristic where one thread writing to a socket when another thread is blocked 
on it triggers an immediate context switch.)

--
Added file: http://bugs.python.org/file46797/wakeup-fd-racer.py

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30038>
___



[issue30038] Race condition in how trip_signal writes to wakeup fd

2017-04-11 Thread Nathaniel Smith

Nathaniel Smith added the comment:

If it helps, notice that the SetEvent(sigint_event) call used to wake up the 
main thread on windows is also performed unconditionally and after the call to 
Py_AddPendingEvent. From the point of view of twisted/tornado/trio, this is 
exactly the same as the write to the wakeup fd -- the only reason we use the 
wakeup fd instead of the sigint_event is that it's more convenient to wait on 
an fd than on an event object.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30038>
___



[issue30038] Race condition in how trip_signal writes to wakeup fd

2017-04-11 Thread Nathaniel Smith

Nathaniel Smith added the comment:

I think the idea in c13ef6664998 wasn't so much that we wanted the wakeup fd to 
be written to first, as that the way the code was written back then, the 
presence of 'if (is_tripped) return;' meant that it wasn't getting written to 
*at all* in some cases. Since then the code got restructured and the early 
return was removed, so this isn't an issue anymore.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30038>
___



[issue30038] Race condition in how trip_signal writes to wakeup fd

2017-04-11 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Err, libuv obviously doesn't use a Python-level signal handler. I just meant to 
include them as another example of a library I checked that uses a self-pipe to 
handle signals but relies on out-of-band information to transmit what the 
actual signal is :-).

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30038>
___



[issue30038] Race condition in how trip_signal writes to wakeup fd

2017-04-11 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Right. My claim would be that the PR I just submitted is the correct fix for 
bpo-21645 as well.

The approach asyncio uses is very elegant, but unfortunately it assumes that 
the wakeup fd has infinite buffer, which isn't true. If enough signals or other 
callbacks are scheduled to the event loop quickly enough, then the buffer can 
overflow and signals can get lost. This is a very classic issue and source of 
confusion for everyone learning Unix – the reason that there are traditionally 
32 unix signals, and that signals can be coalesced and SIGCHLD is so annoying, 
is that the kernel treats pending signals as flags, not a queue; this means 
that in the worst case it only takes 32 bits to store all pending signals. 
(Even the posix committee screwed this up when designing the real-time signals 
system, which is almost unusable as a result.) On Unix, if you send a process 
10 SIGHUP and 1 SIGCHLD, then it might get only 1 SIGHUP and 1 SIGCHLD, but 
it's guaranteed to get that. With asyncio's model, it might get 10 SIGHUP and 0 
SIGCHLD, or even 0 SIGHUP and 0 SIGCHLD (since other methods like 
call_soon_threadsafe also write to the wakeup fd).

In any case, this is why other async libraries (at least twisted, libuv, 
tornado, trio) treat the wakeup fd as a boolean empty-or-not, and pass the 
actual information via some other out-of-band mechanism that involves a 
Python-level signal handler. So the patch here is necessary for everyone else 
to safely use set_wakeup_fd.
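
A sketch of that out-of-band pattern (Unix-only, with illustrative names): a Python-level handler records *which* signal fired, and the wakeup fd is drained purely as a "wake up and look" toggle.

```python
import collections
import os
import signal
import socket

pending = collections.deque()  # out-of-band channel: never coalesces

def handler(signum, frame):
    pending.append(signum)     # record which signal, outside the fd

signal.signal(signal.SIGUSR1, handler)

recv_sock, send_sock = socket.socketpair()
send_sock.setblocking(False)   # required by set_wakeup_fd
recv_sock.settimeout(5)        # be robust against delayed delivery
old_fd = signal.set_wakeup_fd(send_sock.fileno())

os.kill(os.getpid(), signal.SIGUSR1)

# Event-loop side: the fd only means "check for work"; drain it fully,
# then dispatch from the out-of-band queue.
recv_sock.recv(512)
delivered = list(pending)

signal.set_wakeup_fd(old_fd)   # restore previous state
```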

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30038>
___



[issue30038] Race condition in how trip_signal writes to wakeup fd

2017-04-11 Thread Nathaniel Smith

Changes by Nathaniel Smith <n...@pobox.com>:


--
pull_requests: +1226

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30038>
___



[issue30039] Resuming a 'yield from' stack is broken if a signal arrives in the middle

2017-04-11 Thread Nathaniel Smith

Changes by Nathaniel Smith <n...@pobox.com>:


--
pull_requests: +1224

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30039>
___



[issue30039] Resuming a 'yield from' stack is broken if a signal arrives in the middle

2017-04-11 Thread Nathaniel Smith

New submission from Nathaniel Smith:

If we have a chain of generators/coroutines that are 'yield from'ing each 
other, then resuming the stack works like:

- call send() on the outermost generator
- this enters _PyEval_EvalFrameDefault, which re-executes the YIELD_FROM opcode
- which calls send() on the next generator
- which enters _PyEval_EvalFrameDefault, which re-executes the YIELD_FROM opcode
- ...etc.

However, every time we enter _PyEval_EvalFrameDefault, the first thing we do is 
to check for pending signals, and if there are any then we run the signal 
handler. And if it raises an exception, then we immediately propagate that 
exception *instead* of starting to execute bytecode. This means that e.g. a 
SIGINT at the wrong moment can "break the chain" – it can be raised in the 
middle of our yield from chain, with the bottom part of the stack abandoned for 
the garbage collector.
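
A toy chain shows the resumption path in question (the bug itself needs a precisely timed signal, so this only illustrates the delegation, not the failure):

```python
def inner():
    yield "a"
    yield "b"

def outer():
    # Every resume of this generator re-executes the YIELD_FROM opcode
    # here, which forwards the send() down into inner().
    yield from inner()

gen = outer()
first = gen.send(None)   # enters outer, which delegates into inner
second = gen.send(None)  # resumes *through* outer's frame into inner
```

It is during that second send(), while control is passing back through outer's frame, that an exception raised by a signal handler would abandon the rest of the chain.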

The fix is pretty simple: there's already a special case in _PyEval_EvalFrameEx 
where it skips running signal handlers if the next opcode is SETUP_FINALLY. (I 
don't see how this accomplishes anything useful, but that's another story.) If 
we extend this check to also skip running signal handlers when the next opcode 
is YIELD_FROM, then that closes the hole – now the exception can only be raised 
at the innermost stack frame.

This shouldn't have any performance implications, because the opcode check 
happens inside the "slow path" after we've already determined that there's a 
pending signal or something similar for us to process; the vast majority of the 
time this isn't true.

I'll post a PR in a few minutes that has a test case that demonstrates the 
problem and fails on current master, plus the fix.

--
components: Interpreter Core
messages: 291469
nosy: njs, yselivanov
priority: normal
severity: normal
status: open
title: Resuming a 'yield from' stack is broken if a signal arrives in the middle
versions: Python 3.5, Python 3.6, Python 3.7

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30039>
___



[issue30038] Race condition in how trip_signal writes to wakeup fd

2017-04-11 Thread Nathaniel Smith

New submission from Nathaniel Smith:

In trip_signal [1], the logic goes:

1) set the flag saying that this particular signal was tripped
2) write to the wakeup fd
3) set the global is_tripped flag saying "at least one signal was tripped", and 
do Py_AddPendingCall (which sets some global flags that the bytecode 
interpreter checks on every pass through the loop)

So the problem here is that it's step (2) that wakes up the main thread to 
check for signals, but it's step (3) that actually arranges for the 
Python-level signal handler to run. (Step (1) turns out to be irrelevant, 
because no-one looks at the per-signal flags unless the global is_tripped flag 
is set. This might be why no-one noticed this bug through code inspection 
though – I certainly missed it, despite explicitly checking for it several 
times!)

The result is that the following sequence of events is possible:

- signal arrives (e.g. SIGINT)
- trip_signal writes to the wakeup fd
- the main thread blocked in IO wait sees this, and wakes up
- the main thread checks for signals, and doesn't find any
- the main thread empties the wakeup fd
- the main thread goes back to sleep
- trip_signal sets the flags to request the Python-level signal handler be run
- the main thread doesn't notice, because it's asleep

It turns out that this is a real thing that can actually happen; it's causing 
an annoying intermittent failure in the trio testsuite on appveyor; and under 
the correct conditions I can reproduce it very reliably in my local Windows VM. 
See [2].

I think the fix is just to swap the order of steps (2) and (3), so we write to 
the wakeup fd last. Unfortunately I can't easily test this because I don't have 
a way to build CPython on Windows. But [2] has some IMHO pretty compelling 
evidence that this is what's happening.
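
A simplified Python model of the proposed ordering (hypothetical names; the real change is in C, in trip_signal): do all the signal bookkeeping first and make the wakeup-fd write the very last step, so a reader woken by the fd always finds the pending flag already set.

```python
import socket

is_tripped = False  # stands in for the interpreter's global flag

def model_trip_signal(wakeup_sock):
    global is_tripped
    is_tripped = True            # steps (1) and (3) first...
    wakeup_sock.send(b"\x00")    # ...write to the wakeup fd last (step 2)

recv_sock, send_sock = socket.socketpair()
send_sock.setblocking(False)

model_trip_signal(send_sock)

# "Main thread" side: woken by the fd, it checks for pending signals.
# With this ordering the check can no longer come up empty.
recv_sock.recv(1)
found_pending = is_tripped
```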

[1] 
https://github.com/python/cpython/blob/6fab78e9027f9ebd6414995580781b480433e595/Modules/signalmodule.c#L238-L291
[2] https://github.com/python-trio/trio/issues/119

--
messages: 291467
nosy: haypo, njs
priority: normal
severity: normal
status: open
title: Race condition in how trip_signal writes to wakeup fd
versions: Python 3.5, Python 3.6, Python 3.7

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30038>
___



[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches

2017-04-09 Thread Nathaniel Smith

Nathaniel Smith added the comment:

It looks like PyTorch got bitten by this: 
https://discuss.pytorch.org/t/pyslice-adjustindices-error/1475/11

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29943>
___



[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches

2017-04-09 Thread Nathaniel Smith

Nathaniel Smith added the comment:

> Can we consider 3.6.0 rather than 3.6.1 as broken release?

In the last week, pypi downloads were about evenly split between 3.6.0 and 
3.6.1 (2269969 for "3.6.1", 1927189 for "3.6.0", and those two were ~2 orders 
of magnitude more common than other strings like "3.6.1+", "3.6.0b2", etc. 
[1]). Not sure what that to conclude from that, but certainly if people start 
uploading 3.6.1-only wheels right now then it will break things for a lot of 
end users.

With my manylinux docker image maintainer hat on: we're currently shipping 
3.6.0. I'm extremely confident that if we stick with this we'll never get any 
complaints about the obscure bug with malicious __index__ implementations 
that's being fixed here. OTOH if we upgrade to 3.6.1, or any version with this 
ABI change, then we'll definitely get many complaints so long as there's anyone 
at all still using 3.6.0, which is probably forever. So I'm not sure what 
incentive we would have to ever upgrade to 3.6.1+ if this ABI change is 
kept?

(This isn't saying the bug is unimportant! But it sure is hard to sell its 
importance to folks trying to ship packages and support end-users...)



[1] Somewhat crude query I used in case it's useful for future reference:

SELECT
  REGEXP_EXTRACT(details.python, r"^([^\.]+\.[^\.]+\.[^\.]+)") as 
python_version,
  COUNT(*) as download_count,
FROM
  TABLE_DATE_RANGE(
[the-psf:pypi.downloads],
DATE_ADD(CURRENT_TIMESTAMP(), -7, "day"),
DATE_ADD(CURRENT_TIMESTAMP(), 0, "day")
  )
WHERE
  REGEXP_MATCH(details.python, r"^3\.6\.")
GROUP BY
  python_version,
ORDER BY
  download_count DESC
LIMIT 100

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29943>
___



[issue29988] (async) with blocks and try/finally are not as KeyboardInterrupt-safe as one might like

2017-04-04 Thread Nathaniel Smith

New submission from Nathaniel Smith:

You might hope the interpreter would enforce the invariant that for 'with' and 
'async with' blocks, either '__(a)enter__' and '__(a)exit__' are both called, 
or else neither of them is called. But it turns out that this is not true once 
KeyboardInterrupt gets involved – even if we write our (async) context manager 
in C, or otherwise guarantee that the actual '__(a)enter__' and '__(a)exit__' 
methods are immune to KeyboardInterrupt.

The invariant is *almost* preserved for 'with' blocks: there's one instruction 
(SETUP_WITH) that atomically (wrt signals) calls '__enter__' and then enters 
the implicit 'try' block, and there's one instruction (WITH_CLEANUP_START) that 
atomically enters the implicit 'finally' block and then calls '__exit__'. But 
there's a gap between exiting the 'try' block and WITH_CLEANUP_START where a 
signal can arrive and cause us to exit without running the 'finally' block at 
all. In this disassembly, the POP_BLOCK at offset 7 is the end of the 'try' 
block; if a KeyboardInterrupt is raised between POP_BLOCK and 
WITH_CLEANUP_START, then it will propagate out without '__exit__' being run:

In [2]: def f():
   ...: with a:
   ...: pass
   ...: 

In [3]: dis.dis(f)
  2   0 LOAD_GLOBAL  0 (a)
  3 SETUP_WITH   5 (to 11)
  6 POP_TOP

  3   7 POP_BLOCK
  8 LOAD_CONST   0 (None)
>>   11 WITH_CLEANUP_START
 12 WITH_CLEANUP_FINISH
 13 END_FINALLY
 14 LOAD_CONST   0 (None)
 17 RETURN_VALUE

For async context managers, the race condition is substantially worse, because 
the 'await' dance is inlined into the bytecode:

In [4]: async def f():
   ...: async with a:
   ...: pass
   ...: 

In [5]: dis.dis(f)
  2   0 LOAD_GLOBAL  0 (a)
  3 BEFORE_ASYNC_WITH
  4 GET_AWAITABLE
  5 LOAD_CONST   0 (None)
  8 YIELD_FROM
  9 SETUP_ASYNC_WITH 5 (to 17)
 12 POP_TOP

  3  13 POP_BLOCK
 14 LOAD_CONST   0 (None)
>>   17 WITH_CLEANUP_START
 18 GET_AWAITABLE
 19 LOAD_CONST   0 (None)
 22 YIELD_FROM
 23 WITH_CLEANUP_FINISH
 24 END_FINALLY
 25 LOAD_CONST   0 (None)
 28 RETURN_VALUE

Really the sequence from 3 BEFORE_ASYNC_WITH to 9 SETUP_ASYNC_WITH should be 
atomic wrt signal delivery, and from 13 POP_BLOCK to 22 YIELD_FROM likewise.

This probably isn't the highest priority bug in practice, but I feel like it'd 
be nice if this kind of basic language invariant could be 100% guaranteed, not 
just 99% guaranteed :-). And the 'async with' race condition is plausible to 
hit in practice, because if I have an '__aenter__' that's otherwise protected 
from KeyboardInterrupt, then it can run for some time, and any control-C during 
that time will get noticed just before the WITH_CLEANUP_START, so e.g. 'async 
with lock: ...' might complete while still holding the lock.

The traditional solution would be to define single "super-instructions" that do 
all of the work we want to be atomic. This would be pretty tricky here though, 
because WITH_CLEANUP_START is a jump target (so naively we'd need to jump into 
the "middle" of a hypothetical new super-instruction), and because the 
implementation of YIELD_FROM kind of assumes that it's a standalone instruction 
exposed directly in the bytecode. Probably there is some solution to these 
issues but some cleverness would be required.

An alternative approach would be to keep the current bytecode, but somehow mark 
certain stretches of bytecode as bad places to run signal handlers. The eval 
loop's "check for signal handlers" code is run rarely, so we could afford to do 
relatively expensive things like check a lookaside table that says "no signal 
handlers when 13 < f_lasti <= 22". Or we could steal a bit in the opcode 
encoding or something.

--
components: Interpreter Core
messages: 291148
nosy: ncoghlan, njs, yselivanov
priority: normal
severity: normal
status: open
title: (async) with blocks and try/finally are not as KeyboardInterrupt-safe as 
one might like
versions: Python 3.7

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29988>
___



[issue28629] Emit ResourceWarning when implicitly terminating a suspended frame?

2017-03-31 Thread Nathaniel Smith

Nathaniel Smith added the comment:

It does make sense to skip emitting a warning if there's no try or with block 
active.

Don't we already have the ability to check for this, though, without any 
extensions to the frame or code objects? That's what the public 
PyGen_NeedsFinalizing does, right? It's implemented as a for loop over the 
frame's block stack, which in most cases should be extremely small, so the 
performance cost of running this from generator.__del__ seems likely to be 
minimal.

(I think currently the implementation is not *quite* correct if there's a 
'yield from' active – in most cases it won't much matter because if A is 
yielding from B and A is collected, then B will probably be collected a moment 
later and have its own __del__ called. But this is not *quite* the same as what 
should happen, which is that collecting A should immediately call B.close(), 
which in principle might do something arbitrarily different than B.__del__. But 
adding a check for whether a 'yield from' is active would be pretty trivial.)

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue28629>
___



[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches

2017-03-29 Thread Nathaniel Smith

New submission from Nathaniel Smith:

In the process of fixing issue 27867, a new function PySlice_AdjustIndices was 
added, and PySlice_GetIndicesEx was converted into a macro that calls this new 
function. The patch was backported to both the 3.5 and 3.6 branches, was 
released in 3.6.1, and is currently slated to be released as part of 3.5.4.

Unfortunately, this breaks our normal ABI stability guarantees for micro 
releases: it means that if a C module that uses PySlice_GetIndicesEx is 
compiled against e.g. 3.6.1, then it cannot be imported on 3.6.0. This affects 
a number of high-profile packages (cffi, pandas, mpi4py, dnf, ...). The only 
workaround is that if you are distributing binary extension modules (e.g. 
wheels), then you need to be careful not to upgrade to 3.6.1. It's not possible 
for a wheel to declare that it requires 3.6.1-or-better, because CPython 
normally follows the rule that we don't make these kinds of changes. Oops.

CC'ing Ned and Larry, because it's possible this should trigger a 3.6.2, and I 
think it's a blocker for 3.5.4. CC'ing Serhiy as the author of the original 
patch, since you probably have the best idea how this could be unwound with 
minimal breakage :-).

python-dev discussion: 
https://mail.python.org/pipermail/python-dev/2017-March/147707.html

Fedora bug: https://bugzilla.redhat.com/show_bug.cgi?id=1435135

--
messages: 290796
nosy: larry, ned.deily, njs, serhiy.storchaka
priority: normal
severity: normal
status: open
title: PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches
versions: Python 3.5, Python 3.6

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29943>
___



[issue29926] time.sleep ignores _thread.interrupt_main()

2017-03-28 Thread Nathaniel Smith

Nathaniel Smith added the comment:

(oh, in case it wasn't obvious: the advantage of raise() over kill() and 
pthread_kill() is that raise() works everywhere, including Windows, so it would 
avoid platform specific logic. Or if you don't like raise() for some reason 
then you can get the same effect by doing handler = PyOS_getsig(SIGINT); if 
(handler) handler(SIGINT); )

(Also, this is all assuming the python 3 logic on Windows. On Windows versions 
of python 2 I can't see any way to make time.sleep interruptible without 
significant rewriting. The problem is that on py2 the special Event used to 
signal the arrival of a control-C is a private variable that gets hooked up 
directly to Windows's low-level control-C logic, and there's effectively no way 
to get at it from outside. You might think GenerateConsoleCtrlEvent would help, 
but it is broken in too many ways for me to fit in this text box.)

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29926>
___



[issue29926] time.sleep ignores _thread.interrupt_main()

2017-03-28 Thread Nathaniel Smith

Nathaniel Smith added the comment:

If you want to trigger the standard signal handling logic, then raise(SIGINT) 
is also an option. On unix it probably won't help because of issue 21895, but 
using raise() here + fixing 21895 by using pthread_kill in the c level signal 
handler would together give a pretty simple and robust way to pretend that the 
user hit control-C.

But... interrupt_main isn't documented to raise a SIGINT, it's documented to 
raise KeyboardInterrupt. These are very different, e.g. if the user has 
installed a custom handler or even set SIGINT to SIG_IGN or masked it. (In the 
later two cases, using pthread_kill to send a signal to the main thread *won't* 
interrupt any ongoing syscall.) This is *much* more difficult to make work 
correctly. And actually what IDLE wants is a fake control-C anyway!

So I'd suggest either redefining the semantics of interrupt_main to be "acts as 
if SIGINT was received" or else abandoning interrupt_main to its slightly-buggy 
semantics and adding a new function that does raise(SIGINT) and switching IDLE 
to use this new function. And fix issue 21895 in either case.

--
nosy: +njs

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29926>
___



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-22 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Letting Python-level signal handlers run in arbitrary threads is an interesting 
idea, but it's a non-trivial change to Python semantics that may well break 
some programs (previously it was guaranteed that signal handlers couldn't race 
with main thread code), and it doesn't fully fix the bug here (because it's 
common for Python programs to contain threads which don't run Python code, e.g. 
if they use zmq). I can imagine some potential benefits, but I feel like this 
needs some substantial rationale?

OTOH the pthread_kill-based fix I suggested is like 3 lines and AFAICT fixes 
the problem exactly.

> I dislike the idea of transfering the signal to the main thread from another 
> thread in the C signal handler using pthread_kill(). Most code behave very 
> badly when getting a signal, so getting a signal twice is likely to double 
> the pain.

This doesn't make much sense to me... CPython's pretty good at handling EINTR 
these days, and, well, the only case that's different is when you have a signal 
handler that needs to be run and the main thread is blocked in a syscall, which 
is exactly the case that's currently broken? 

> Moreover, pthread_sigmask() can block signals in the main thread for 
> deliberate reasons.

Sure, and with pthread_kill() this would have the effect that the signal would 
be queued by the kernel and delivered after the signal is unmasked. That seems like 
pretty sensible semantics to me; in fact it's exactly what you get in 
single-threaded code.
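
That queue-until-unmask behaviour is easy to see single-threaded with signal.pthread_sigmask (Unix-only sketch):

```python
import os
import signal

received = []
signal.signal(signal.SIGUSR1, lambda signum, frame: received.append(signum))

# Block SIGUSR1, then send it to ourselves: the handler must not run yet.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
os.kill(os.getpid(), signal.SIGUSR1)
blocked_count = len(received)     # still 0; the kernel queued the signal

# Unblocking delivers the queued signal; the Python-level handler runs at
# the interpreter's next check, so spin briefly before looking.
signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})
while not received:
    pass
unblocked_count = len(received)
```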

Compare that to right now where AFAICT pthread_sigmask() is pretty useless in 
the presence of threads, because of the thing where if one thread has a signal 
blocked then the kernel will pick another thread to deliver it to.

> From the point of view of the Python signal handler, the current "if 
> (PyThread_get_thread_ident() != main_thread) return 0;" code in the C signal 
> handler is somehow an implicit pthread_sigmask(signal.SIG_BLOCK, range(1, 
> signal.NSIG)) on all threads except of the main thread, whereas Unix gives a 
> fine control on these masks with the pthread_sigmask() function.

That code isn't in the C signal handler, it's in the code that the interpreter 
runs occasionally to check if the C signal handler has been called, right?
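That split is observable from Python: wherever the kernel runs the C-level handler, the Python-level handler is only ever invoked from the main thread's eval loop. A quick check (sketch, names are mine):

```python
import os
import signal
import threading
import time

ran_in = []
signal.signal(signal.SIGUSR1,
              lambda sig, frame: ran_in.append(threading.current_thread()))

# Send the signal from a worker thread; the C-level handler may run in
# either thread, but only the main thread's eval loop acts on the flag.
threading.Thread(target=lambda: os.kill(os.getpid(), signal.SIGUSR1)).start()

deadline = time.monotonic() + 2
while not ran_in and time.monotonic() < deadline:
    pass  # keep executing bytecode so the tripped flag gets polled

assert ran_in == [threading.main_thread()]
```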

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue21895>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-21 Thread Nathaniel Smith

Nathaniel Smith added the comment:

@haypo: okay, looked things over for a third time and this time I found my 
very silly error :-). So I'm now able to use set_wakeup_fd on Windows 
(https://github.com/python-trio/trio/pull/108), but not on Unix 
(https://github.com/python-trio/trio/issues/109).

In any case, the issue here remains that one shouldn't have to use 
set_wakeup_fd for a signal to interrupt time.sleep etc.

--




[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-21 Thread Nathaniel Smith

Nathaniel Smith added the comment:

@haypo: It's a socketpair. It works fine when I set up a toy test case using 
set_wakeup_fd + select, and it works fine in my real code when I use CFFI 
cleverness to register a signal handler that manually writes a byte to my 
wakeup socket, but when I pass that exact same socket to set_wakeup_fd in my 
real code, it doesn't work. It's pretty mysterious, and I have no particular 
reason to think that the problem is in CPython as opposed to some stupid 
mistake I'm making.
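For reference, the toy case that does work looks roughly like this (socketpair + set_wakeup_fd + select; the details and signal choice are mine, Unix-only):

```python
import os
import select
import signal
import socket

r, w = socket.socketpair()
r.setblocking(False)
w.setblocking(False)  # set_wakeup_fd requires a non-blocking fd
signal.set_wakeup_fd(w.fileno())
signal.signal(signal.SIGUSR1, lambda sig, frame: None)  # non-default handler

os.kill(os.getpid(), signal.SIGUSR1)
ready, _, _ = select.select([r], [], [], 1.0)
assert ready == [r]
# Since 3.5 the wakeup byte is the signal number itself.
assert r.recv(1) == bytes([signal.SIGUSR1])
signal.set_wakeup_fd(-1)  # restore
```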

--




[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-21 Thread Nathaniel Smith

Nathaniel Smith added the comment:

I don't really have a specific use case personally -- for trio, I haven't found 
a way to make use of set_wakeup_fd because of various issues[1], but I'm also 
not planning to use SIGCHLD, so this isn't very urgent.

In general set_wakeup_fd can be a workaround, but I wanted to note this 
upstream because it's a bug in Python's signal handler logic. Note that Python 
already goes to great lengths to make e.g. signal handlers run during 
time.sleep on Windows; they ought to just work on Unix too.

--

[1] (Off-topic but in case you're curious: I register actual signal handlers 
anyway because I follow the philosophy that the wakeup pipe should only be used 
for wakeups rather than transmitting information, so as long as signal handlers 
work I can do my own wakeups, + for some reason set_wakeup_fd doesn't work for 
me on Windows (no idea why, can't reproduce in simplified tests, might be my 
fault, need to debug), + set_wakeup_fd's handling of buffer overflow is broken 
for my purposes.)

--




[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-20 Thread Nathaniel Smith

Nathaniel Smith added the comment:

It turns out that this bug is more general than signal.pause, and has caused 
problems for a few different people:
  https://github.com/dabeaz/curio/issues/118#issuecomment-287735781
  https://github.com/dabeaz/curio/issues/118#issuecomment-287798241
  https://github.com/dabeaz/curio/issues/118#issuecomment-287887843

The basic problem is that CPython's signal handling strategy on Unix-likes 
assumes that if a signal is delivered to the process at a time when the main 
thread is blocked in a syscall, then that syscall will promptly exit with 
EINTR. So for example, time.sleep has some clever code on Windows to make it 
interruptible with control-C, but on Unix we assume that the kernel will break 
the sleep for us.

The problem with this is that POSIX only guarantees that *some* thread will 
receive the signal -- not necessarily the main thread. In practice it seems 
like most implementations do deliver most signals to the main thread or CPython 
wouldn't have gotten away with this for as long as it has, but in particular it 
seems like Linux is happy to deliver SIGCHLD to random other threads. So the 
C-level signal handler runs in whatever thread, it sets the flag for the main 
thread to run the Python-level signal handler... and then the main thread sits 
there blocked in sleep() or select() or pause() or whatever, and never checks 
the flag.

A simple solution would be to make sure signals are always delivered to the 
main thread, by adjusting the C-level signal handler to do something like:

if (current_thread != main_thread) {
    pthread_kill(main_thread, sig_num);
    return;
}
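A Python-level sketch of the effect this fix would have (names and timings are mine; it uses signal.pthread_kill to force delivery to the main thread by hand, which is exactly what the proposed C-level change would do automatically):

```python
import signal
import threading
import time

class Wakeup(Exception):
    pass

def handler(sig, frame):
    raise Wakeup  # raising makes the interrupted sleep abort

signal.signal(signal.SIGUSR1, handler)
main_ident = threading.get_ident()

def worker():
    time.sleep(0.2)
    # Deliver to the main thread specifically, instead of letting the
    # kernel pick an arbitrary thread.
    signal.pthread_kill(main_ident, signal.SIGUSR1)

threading.Thread(target=worker).start()
start = time.monotonic()
woke = False
try:
    time.sleep(5)  # interrupted after ~0.2s instead of blocking
except Wakeup:
    woke = True
assert woke and time.monotonic() - start < 4
```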

--
nosy: +njs
title: signal.pause() doesn't wake up on SIGCHLD in non-main thread -> 
signal.pause() and signal handlers don't react to SIGCHLD in non-main thread
versions: +Python 3.6, Python 3.7



