[issue29988] (async) with blocks and try/finally are not as KeyboardInterrupt-safe as one might like
Nathaniel Smith added the comment: To make sure I understand correctly: your concern is that the event loop is not implemented in C. So if you have this patch + an async CM that's implemented in C + the async CM never *actually* yields to the event loop, then that will be signal safe... but that last restriction is pretty silly, because what's the point of an async CM that never yields to the event loop. Right? I see what you mean, but I think this patch is still useful even if it doesn't protect the event loop itself. To get the full benefit, you also need some way to protect your event loop from unexpected KeyboardInterrupt, but that's something Trio already solves and asyncio potentially could. And we knew from the start that to get the *most* benefit from this patch we would also need some way to protect arbitrary chunks of Python code from KeyboardInterrupt so you could protect the __(a)exit__ method itself; needing to protect the loop too isn't a huge stretch beyond that. (It's possible that enough of uvloop is implemented in C to benefit from this?) -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29988> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue29988] (async) with blocks and try/finally are not as KeyboardInterrupt-safe as one might like
Nathaniel Smith added the comment: For purposes of writing a test case, can you install a custom Python-level signal handler, and make some assertion about where it runs? I.e., the calling frame should be inside the __aexit__ body, not anywhere earlier?
[issue30437] SSL_shutdown can return meaningless SSL_ERROR_SYSCALL
Nathaniel Smith added the comment: My reading of the man page is that if SSL_shutdown returns 0, this means that it succeeded at doing the first phase of shutdown. If there are errors then they should be ignored, because it actually succeeded. If you want to then complete the second phase of shutdown, of course, you have to call it again, but that's no different than any other use of SSL_shutdown. If two calls to SSL_shutdown both return zero, then that's definitely a bug in OpenSSL. A return value of zero means that previously the SSL_SENT_SHUTDOWN flag was not set, and now it is set, so that can only happen once per connection. But that's orthogonal to the SSL_ERROR_SYSCALL issue.
[issue31344] f_trace_opcodes frame attribute to switch to per-opcode tracing
Nathaniel Smith added the comment: Adding Ned to CC in case he wants to comment on the utility of per-opcode tracing from the perspective of coverage.py. -- nosy: +nedbat, njs
[issue29302] add contextlib.AsyncExitStack
Changes by Nathaniel Smith <n...@pobox.com>: -- nosy: +njs
[issue29988] (async) with blocks and try/finally are not as KeyboardInterrupt-safe as one might like
Nathaniel Smith added the comment: In bpo-30703 Antoine fixed signal handling so it doesn't use Py_AddPendingCall anymore. At this point the only time the interpreter itself uses Py_AddPendingCall is when there's an error writing to the signal_fd, which should never happen in normal usage. If there are third-party libraries making heavy use of Py_AddPendingCall then it could be an issue, but any such libraries should already be making sure they never have more than one Py_AddPendingCall pending at a time, because you already have to assume that the processing might be arbitrarily delayed.
[issue29988] (async) with blocks and try/finally are not as KeyboardInterrupt-safe as one might like
Nathaniel Smith added the comment: This would still provide value even if you have to do a GET_AWAITABLE in the protected region: the most common case is that __aenter__ is a coroutine function, which means that its __await__ is implemented in C and already protected against interrupts. Also, to make this fully useful we need some way to protect arbitrary callables against interrupts anyway (to handle the case where the signal arrives during the actual __enter__ or __exit__ body), so saying that we also need to be able to protect __await__ isn't a big deal.
[issue11410] Use GCC visibility attrs in PyAPI_*
Nathaniel Smith added the comment: Was this actually fixed, or did everyone just get tired and give up on the original patch? -- nosy: +njs
[issue31198] getaddrinfo: inconsistent handling of port=None
Nathaniel Smith added the comment: Ugh, apparently this weird behavior is actually mandated by the RFC :-(. RFC 3493: "The nodename and servname arguments are either null pointers or pointers to null-terminated strings. One or both of these two arguments must be a non-null pointer." So... never mind! -- resolution: -> not a bug stage: -> resolved status: open -> closed
[issue31198] getaddrinfo: inconsistent handling of port=None
Changes by Nathaniel Smith <n...@pobox.com>: -- nosy: +giampaolo.rodola
[issue31198] getaddrinfo: inconsistent handling of port=None
New submission from Nathaniel Smith: socket.getaddrinfo accepts None as a port argument, and translates it into 0. This is handy, because bind() understands 0 to mean "pick a port for me", and if you want bind to pick a port for you then port=None is a slightly more obvious way to say that than port=0. For example: >>> socket.getaddrinfo("127.0.0.1", None) [(<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('127.0.0.1', 0)), (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_DGRAM: 2>, 17, '', ('127.0.0.1', 0)), (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_RAW: 3>, 0, '', ('127.0.0.1', 0))] socket.getaddrinfo also accepts None as a host name; this is necessary because the underlying getaddrinfo(3) call has special handling for host=NULL, and we need some way to access it: >>> socket.getaddrinfo(None, 0) [(<AddressFamily.AF_INET6: 10>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('::1', 0, 0, 0)), (<AddressFamily.AF_INET6: 10>, <SocketKind.SOCK_DGRAM: 2>, 17, '', ('::1', 0, 0, 0)), (<AddressFamily.AF_INET6: 10>, <SocketKind.SOCK_RAW: 3>, 0, '', ('::1', 0, 0, 0)), (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('127.0.0.1', 0)), (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_DGRAM: 2>, 17, '', ('127.0.0.1', 0)), (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_RAW: 3>, 0, '', ('127.0.0.1', 0))] However, even though both of these features are supported separately... if you try to use them *together*, then socket.getaddrinfo errors out: >>> socket.getaddrinfo(None, None) socket.gaierror: [Errno -2] Name or service not known I expected that last call to be equivalent to socket.getaddrinfo(None, 0). -- components: Extension Modules messages: 300236 nosy: njs priority: normal severity: normal status: open title: getaddrinfo: inconsistent handling of port=None versions: Python 3.7
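The asymmetry described above can be reproduced in a few lines (a sketch; both successful calls resolve locally, so no network access should be needed, though the exact error raised for the combined call is platform-dependent):

```python
import socket

# port=None alone works: it is translated to port 0 in every sockaddr.
info = socket.getaddrinfo("127.0.0.1", None)
assert all(sockaddr[1] == 0 for *_, sockaddr in info)

# host=None alone works too (maps to the C-level host=NULL special case).
assert socket.getaddrinfo(None, 0)

# But the combination errors out instead of acting like (None, 0):
combined_failed = False
try:
    socket.getaddrinfo(None, None)
except socket.gaierror:
    combined_failed = True
assert combined_failed
```

(The issue was ultimately closed as "not a bug" since RFC 3493 requires at least one of the two arguments to be non-null, so this behavior persists in current Python.)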
[issue28414] SSL match_hostname fails for internationalized domain names
Changes by Nathaniel Smith <n...@pobox.com>: -- pull_requests: +3043
[issue28414] SSL match_hostname fails for internationalized domain names
Nathaniel Smith added the comment: > I haven't dug in deeply, but it sounds like we handle IDNs in CNs and SANs > differently? No -- Python's ssl module uses exactly the same hostname checking logic in both cases, and it's equally broken regardless. But, since CAs do all kinds of weird stuff with CNs, there's some chance that our brokenness and their brokenness will align and make things work by chance. Specifically, this will happen if the CA puts the U-encoded name in the CN field. Nick Lamb's concern is that CAs may be using this as justification for continuing to issue certs that are broken in this way. I don't know if that's true, but it's possible. > one solution would be to simply drop support for CNs in match_hostname That would indeed fix Nick Lamb's concern, but I'm dubious about this word "simply" :-). Obviously we should do this eventually, but it's going to break a bunch of people, you'll have to have a big fight about Python 2 and Redhat will probably refuse to take the patch and etc etc. OTOH fixing match_hostname to use A-labels would provide immediate benefits to Python's users (right now Python just... can't do SSL connections to IDNs) with minimal breakage, so you can call it a bug fix, and then worry about deprecating the CN field on its own schedule.
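For reference, the U-label/A-label distinction at issue here is easy to see with the stdlib's idna codec (note this codec implements IDNA 2003, not IDNA 2008, so some names encode differently; "münchen.example" is just an illustrative hostname):

```python
# U-label (Unicode) form of a hypothetical IDN hostname:
u_label = "münchen.example"

# A-label (ASCII-compatible "xn--") form, which is what appears in a
# certificate's SAN dNSName entries and what match_hostname would need
# to compare against:
a_label = u_label.encode("idna").decode("ascii")
print(a_label)  # xn--mnchen-3ya.example
```

Fixing match_hostname along these lines means converting the hostname the caller passed in to its A-label form before comparison, rather than comparing the U-label against the certificate's A-labels and always failing.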
[issue31119] Signal tripped flags need memory barriers
Nathaniel Smith added the comment: @arigo: Technically we also need that the writes to memory are observed to happen-before the write() call on the wakeup fd, which is not something that Intel is going to make any guarantees about. But *probably* this is also safe because the kernel has to use some kind of cross-CPU synchronization to wake up the read()ing thread, and that imposes a memory barrier.
[issue31119] Signal tripped flags need memory barriers
Nathaniel Smith added the comment: On further investigation (= a few hours staring at the ceiling last night), it looks like there's another explanation for my particular bug... which is good, because on further investigation (= a few hours squinting at google results) it looks like this probably can't happen on x86/x86-64 (I think? maybe?). It's still true though that you can't just throw in a 'volatile' and expect cross-thread synchronization to work -- that's not C's semantics, and it's not true on popular architectures like ARM.
[issue31119] Signal tripped flags need memory barriers
New submission from Nathaniel Smith: Sometimes, CPython's signal handler (signalmodule.c:signal_handler) runs in a different thread from the main thread. On Unix this is rare but it does happen (esp. for SIGCHLD), and on Windows it happens 100% of the time. It turns out that there is a subtle race condition that means such signals can be lost. To notify the main thread that a signal has been received, and which signal it is, CPython's signal handler sets two global variables: Handlers[sig_num].tripped = 1; is_tripped = 1 And then PyErr_CheckSignals does: if (is_tripped) { is_tripped = 0; for (i = 1; i < NSIG; i++) { if (Handlers[i].tripped) { ... run signal handler ... } } } As some comments in the source note, this logic depends strongly on the assumption that when the signal handler sets the two flags, they are immediately visible to PyErr_CheckSignals, and that the two assignments happen in the given order. For example, suppose that is_tripped were set before Handlers[i].tripped. Then we could have the following sequence: 1. Thread A sets is_tripped = 1 2. Thread B runs PyErr_CheckSignals 3. Thread B notices is_tripped == 1, and sets it to 0 4. Thread B scans through all the Handlers, sees that none are set, and carries on 5. Thread A sets Handlers[i].tripped = 1 In this case the signal is lost until another signal arrives, which may not happen until an arbitrary amount of time has passed. Also, the fix for bpo-30038 (making sure that we set the flags before writing to the wakeup fd) assumes that setting the flags before writing to the wakeup fd means that, well, the flags will be written before the fd. However, when you have code running in separate threads, none of this is actually guaranteed. In general, the compiler is free to re-order writes, and more perniciously, the hardware memory hierarchy is also free to arbitrarily delay when writes to RAM made by one CPU become visible to another CPU, which can also create arbitrary reorderings.
In an attempt to avoid these issues, signalmodule.c declares the signal flags as 'volatile sig_atomic_t'. However, the best one can hope for from this is that it'll stop the *compiler* from reordering writes. It does nothing to stop the *hardware* from reordering writes. Using volatile like this might be sufficient if the signal handler runs in the main thread, but not if it runs in another thread. The correct way to handle this is to use the primitives in pyatomic.h to get/set the tripped flags. I noticed this because of the same test case that was failing due to bpo-30038... this thing has been frustrating me for months because it's intermittent and never seems to happen on my local VM. But specifically what I'm seeing at this point is: Thread A (the thread where the signal handler runs): A1. takes a lock. A2. with the lock held, calls the C function raise(SIGINT), which directly calls the CPython C-level signal handler. (Verified by consulting the Windows C runtime source.) A3. writes to the wakeup fd & sets both tripped flags, in whatever order A4. releases the lock Simultaneously, in thread B (the main thread): B1. observes that the wakeup fd has been written to B2. takes the lock B3. releases the lock B4. calls repr(), which unconditionally calls PyObject_Repr, which unconditionally calls PyErr_CheckSignals I expect the call in B4 to run the Python-level signal handler, but this does not happen. Instead it runs some time later, causing an unexpected KeyboardInterrupt. My reasoning is: the lock guarantees that no matter what the internal details of the C-level signal handler actually are, they have to all be complete before my code checks for signals. In excruciating detail: - the write to the wakeup fd must happen-after the lock is acquired (A1) - therefore, step B1 happens-after step A1 - B2 happens-after B1, so B2 happens-after A1. - B2 takes the same lock as A1, so if B2 happens-after A1, it must also happen-after A4, when the lock is released. 
- step B4 happens-after B2, therefore it happens-after A4 and A3. Therefore the tripped flags should be visible and the Python-level signal handler should run. Yet this does not happen, so *something* really weird is going on. The hypothesis that it's the lack of memory barriers both explains how this could happen -- specifically I think thread B may be running PyErr_CheckSignals as part of the wakeup-fd reading logic, and it's clearing is_tripped before Handlers[i].tripped becomes visible, like in my code above. And it also explains why it happens in like 1/50 runs in Appveyor, but never on my local Windows 10 VM that only has 1 virtual CPU. But regardless of whether this is the cause of my particular weird bug, it clearly *is* a bug and should be fixed. -- messages: 299736 nosy: haypo, njs priority: normal severity: normal status: open title: Signal tripped flags need memory barriers
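The lost-signal interleaving in steps 1-5 above can be made concrete with a deterministic, single-threaded simulation (plain Python stand-ins for the C globals; the function names are invented, and a real reproduction would need two actual threads plus an unlucky memory ordering):

```python
# Stand-ins for signalmodule.c's globals.
NSIG = 3
handlers_tripped = [False] * NSIG
is_tripped = False

def store_is_tripped():
    # Thread A's store of is_tripped, observed first (the bad order).
    global is_tripped
    is_tripped = True

def store_handler_tripped(sig):
    # Thread A's store of Handlers[sig].tripped, observed later.
    handlers_tripped[sig] = True

def check_signals():
    # Thread B's PyErr_CheckSignals logic.
    global is_tripped
    ran = []
    if is_tripped:
        is_tripped = False
        for i in range(NSIG):
            if handlers_tripped[i]:
                handlers_tripped[i] = False
                ran.append(i)
    return ran

store_is_tripped()          # step 1
first = check_signals()     # steps 2-4: sees is_tripped, finds no handler set
store_handler_tripped(1)    # step 5
second = check_signals()    # is_tripped is already 0, so nothing runs

assert first == [] and second == []
assert handlers_tripped[1]  # flag sits unnoticed until the next signal
```

The fix is to make both flags atomic with the right ordering (release store in the handler, acquire load plus re-check in PyErr_CheckSignals), which is what the pyatomic.h primitives provide.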
[issue14243] tempfile.NamedTemporaryFile not particularly useful on Windows
Changes by Nathaniel Smith <n...@pobox.com>: -- nosy: +njs
[issue16487] Allow ssl certificates to be specified from memory rather than files.
Changes by Nathaniel Smith <n...@pobox.com>: -- nosy: +njs
[issue29926] IDLE: in shell, time.sleep ignores _thread.interrupt_main()
Nathaniel Smith added the comment: Then maybe the simplest solution is to scale back the claim :-). The important semantic change would be that right now, interrupt_main() is documented to cause KeyboardInterrupt to be raised in the main thread. So if you register a custom SIGINT handler that *doesn't* raise KeyboardInterrupt, interrupt_main() doesn't care: it doesn't invoke your custom handler, it just raises KeyboardInterrupt. My proposal would make it so that interrupt_main() does cause custom SIGINT handlers to be called, as well as triggering all of the interpreter's existing machinery for prompt signal delivery (waking up time.sleep etc.), so in that respect it would be much more similar to a "real" control-C than it is now. This is a non-trivial semantic change, so it would need to be documented, but I'm not attached to the "equivalent to control-C" phrasing. Even on UNIX, a "real" control-C is delivered to all processes in the foreground process group (or something like that, ttys are weird), but we would want interrupt_main() to only be delivered to the current process, so the equivalence wouldn't be exact there either. > That operation isn't buggy on its own to my knowledge I spent quite some time trying to make this work in order to test control-C handling in trio, and eventually gave up because I just could not make it work reliably. One notable effect is that if you call GenerateConsoleCtrlEvent(CTRL_C_EVENT, 0) in an appveyor test run, then it breaks appveyor -- at the end of the run it freezes with "Terminate batch job (Y/N)?" and then sits until the test run times out.
It's possible to work around this with some extremely obscure code (make sure you always spawn a subprocess with CREATE_NEW_CONSOLE or maybe CREATE_NEW_CONSOLE_GROUP etc.), but (a) we can't tell people that they need to do that before running code that uses interrupt_main(), and (b) even when I did that, then I still had bizarre issues I couldn't figure out, where sometimes the event just wouldn't be delivered unless I ran the interpreter with the -u option for unbuffered stdio (?!?!) (see [1]). I just don't think this is something that CPython can use. I eventually switched to simulating control-C events for testing by calling raise(SIGINT), and that's worked much much better. > I mentioned the possibility of calling CancelSynchronousIo in order to cancel > a console read like Ctrl+C does (but clunkier) -- again, because of the > equivalence claim. ERROR_OPERATION_ABORTED would need to be handled like > EINTR on Unix. This would entail small changes to a lot of code, so it does > need a separate issue if there's any support for this idea. It still wouldn't be equivalent because control-C only cancels console reads, but CancelSynchronousIo would cancel *any* blocking I/O operation that the main thread was doing, right? So e.g. if the main thread was blocked doing socket.recv, then control-C wouldn't interrupt that, but interrupt_main() / PyErr_SetInterrupt() would. I can certainly see an argument that Python's C-level signal handler *should* call CancelSynchronousIo in general, and then raise(SIGINT) and control-C would be equivalent because they both called CancelSynchronousIo, but yeah, that's way beyond the scope of this issue. 
[1] https://github.com/python-trio/trio/commit/95843654173e3e826c34d70a90b369ba6edf2c23#diff-345cfb6c136028f9514b67ee7bb8e035R11
[issue30038] Race condition in how trip_signal writes to wakeup fd
Changes by Nathaniel Smith <n...@pobox.com>: -- resolution: -> fixed stage: -> resolved status: open -> closed
[issue29926] IDLE: in shell, time.sleep ignores _thread.interrupt_main()
Nathaniel Smith added the comment: > A real Ctrl+C executes the registered control handlers for the process. Right, but it's *extremely* unusual for a python 3 program to have a control handler registered directly via SetConsoleCtrlHandler. This isn't an API that the interpreter uses or exposes. The vast majority of programs receive control-C notification by letting the C runtime convert the low level console event into a SIGINT, and then receiving this via the signal module. > To emulate this, PyErr_SetInterrupt could try calling > GenerateConsoleCtrlEvent(CTRL_C_EVENT, 0) to broadcast a Ctrl+C event. But as mentioned up thread, this is really flaky - and even when it works it tends to kill random other processes, which is *certainly* not what anyone expects from calling interrupt_main. You can't really call it experimentally – even trying to call it can do stuff like cause appveyor tests to lock up. Given these two issues, I think that emulating control-C at the signal level makes the best tradeoffs. Something that works 100% reliably for 99.99% of python programs is way better than something that is 80% reliable for 100% of programs. > One problem is that GenerateConsoleCtrlEvent(CTRL_C_EVENT, 0) doesn't cancel > a blocking console read like Ctrl+C does. Python's C handler could call > CancelSynchronousIo(hMainThread) to address this problem in general. > Unfortunately, when a console read is canceled in the client, it isn't > canceled in the console itself. The contents of the read will be discarded, > but it's a bit clunky that the user has to press enter. This might be something to address as a separate issue? I'm guessing this doesn't affect idle, and it doesn't affect time.sleep, which seem to be the main topics of this thread, plus it sounds like any solution would be mostly orthogonal. 
[issue30744] Local variable assignment is broken when combined with threads + tracing + closures
Nathaniel Smith added the comment: > I like it because it categorically eliminates the "tracing or not?" global > state dependence when it comes to manipulation of the return value of > locals() - manipulating that will either always affect the original execution > namespace immediately (modules, classes, exec, eval), or always be a > completely independent snapshot that can never lead to changes in the > original name bindings (functions, generators, coroutines). Maybe I was unclear...? My question is why you prefer locals-making-a-copy over locals-and-f_locals-for-function-frames-return-proxy-objects. It seems to me that both of these proposals eliminate the "tracing or not?" global state dependence (right?), so this can't be a reason to prefer one over the other. And the latter additionally eliminates the distinction between modules/classes/exec/eval and functions/generators/coroutines, while also avoiding introducing a distinction between locals() and f_locals.
[issue29926] IDLE: in shell, time.sleep ignores _thread.interrupt_main()
Nathaniel Smith added the comment: Sorry, I meant bpo-21895.
[issue29926] IDLE: in shell, time.sleep ignores _thread.interrupt_main()
Nathaniel Smith added the comment: In terms of general design cleanliness, I think it would be better to make `interrupt_main` work reliably than to have IDLE add workarounds for its unreliability. AFAICT the ideal, minimal redundancy solution would be: - interrupt_main just calls raise(SIGINT) - bpo-21895 is fixed by adding a few lines to Python's C-level signal handler on Unix-likes to check if it's running in the main thread, and if not then use pthread_kill to redirect the signal to the main thread. If trying to fix bpo-28195 is too controversial, then a minimal change would be: - in interrupt_main, if we're on Windows, call raise(SIGINT); otherwise, call pthread_kill(main_thread, SIGINT) And in either case we'd also have to document that interrupt_main is now equivalent to the user having hit control-C. But AFAICT that's what all its users really want anyway.
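The raise(SIGINT) half of this proposal can be sketched from Python with signal.raise_signal() (added in 3.8), which is a thin wrapper around C raise() and therefore does invoke a custom Python-level SIGINT handler, exactly the semantic change argued for above (this only demonstrates the main-thread case, not the pthread_kill redirection):

```python
import signal

received = []

def handler(signum, frame):
    # A custom SIGINT handler that records the signal instead of
    # raising KeyboardInterrupt.
    received.append(signum)

old = signal.signal(signal.SIGINT, handler)
try:
    # Like raise(SIGINT) in C: the registered handler runs, rather
    # than an unconditional KeyboardInterrupt being injected.
    signal.raise_signal(signal.SIGINT)
finally:
    signal.signal(signal.SIGINT, old)

assert received == [signal.SIGINT]
```

CPython did later move in this direction: _thread.interrupt_main() gained a signum parameter in 3.10 and now simulates an incoming signal rather than unconditionally raising KeyboardInterrupt.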
[issue30744] Local variable assignment is broken when combined with threads + tracing + closures
Nathaniel Smith added the comment: > Folks that actually *wanted* the old behaviour would then need to do either > "sys._getframe().f_locals" or "inspect.currentframe().f_locals". So by making locals() and f_locals have different semantics, we'd be adding yet another user-visible special-case? That seems unfortunate to me. > if you want to write access to a function namespace from outside the > function, you need to either implement an eval hook (not just a tracing hook) [...] > or else a decision to disallow write-backs to frame locals even from tracing > functions in 3.7+. Disallowing writeback from tracing functions would completely break bdb/pdb, so unless you're planning to rewrite bdb in C as an eval hook, then I don't think this is going to happen :-). Writing back to locals is a useful and important feature! I think I'm missing some rationale here for why you prefer this approach – it seems much more complicated in terms of user-visible semantics, and possibly implementation-wise as well.
[issue30744] Local variable assignment is broken when combined with threads + tracing + closures
Nathaniel Smith added the comment: Interesting idea! I'm not sure I fully understand how it would work though. What would you do for the frames that don't use the fast array, and where locals() currently returns the "real" namespace? How are you imagining that the trace function writeback would be implemented? Some sort of thread-local flag saying "we're inside a trace function for frame XX" that causes locals() and f_locals to switch to returning a "real" namespace object?
[issue30744] Local variable assignment is broken when combined with threads + tracing + closures
Nathaniel Smith added the comment: It isn't obvious to me whether the write-through proxy idea is a good one on net, but here's the rationale for why it might be. Currently, the user-visible semantics of locals() and f_locals are a bit complicated. AFAIK they aren't documented anywhere outside the CPython and PyPy source (PyPy is careful to match all these details), but let me take a stab at it: The mapping object you get from locals()/f_locals represents the relevant frame's local namespace. For many frames (e.g. module level, class level, anything at the REPL), it literally *is* the local namespace: changes made by executing bytecode are immediately represented in it, and changes made to it are immediately visible to executing bytecode. Except, for function frames, it acts more like a snapshot copy of the local namespace: it shows you the namespace at the moment you call locals(), but then future changes to either the code's namespace or the object don't affect each other. Except except, the snapshot might be automatically updated later to incorporate namespace changes, e.g. if I do 'ns = locals()' and then later on someone accesses the frame's f_locals attribute, then reading that attribute will cause my 'ns' object to be silently updated. But it's still a snapshot; modifications to the mapping aren't visible to the executing frame. Except**3, if you happen to modify the mapping object while you're inside a trace function callback, then *those* modifications are visible to the executing frame. (And also if a function is being traced then as a side-effect this means that now our 'ns' object above does stay constantly up to date.) Except**4, you don't actually have to be inside a trace function callback for your modifications to be visible to the executing frame – all that's necessary is that *some* thread somewhere is currently inside a trace callback (even if it doesn't modify or even look at the locals itself, as e.g. coverage.py doesn't). 
This causes a lot of confusion [1]. On top of that, we have this bug here. The writeback-only-if-changed idea would make it so that we at least correctly implement the semantics I described in the long paragraph above. But I wonder if maybe we should consider this an opportunity to fix the underlying problem, which is that allowing skew between locals() and the actual execution namespace is this ongoing factory for bugs and despair. Specifically, I'm wondering if we could make the semantics be: "locals() and f_locals return a dict-like object representing the local namespace of the given frame. Modifying this object and modifying the corresponding local variables are equivalent operations in all cases." (So I guess this would mean a proxy object that on reads checks the fast array first and then falls back to the dict, and on writes updates the fast array as well as the dict.) > you can still have race conditions between "read-update-writeback" operations > that affect the cells directly, as well as with those that use the new > write-through proxy. Sure, but that's just a standard concurrent access problem, no different from any other case where you have two different threads trying to mutate the same local variable or dictionary key at the same time. Everyone who uses threads knows that if you want to do that then you need a mutex, and if you don't use proper locking then it's widely understood how to recognize and debug the resulting failure modes. OTOH, the current situation where modifications to the locals object sometimes affect the namespace, and sometimes not, and sometimes they get overwritten, and sometimes they don't, and it sometimes depends on spooky unrelated things like "is some other thread currently being traced"? That's *way* more confusing that figuring out that there might be a race condition between 'x = 1' and 'locals()["x"] = 2'. 
Plus, pdb depends on write-through working, and there are lots of frames that don't even use the fast array and already have my proposed semantics. So realistically our choices are either "consistently write-through" or "inconsistently write-through".

[1] https://www.google.com/search?q=python+modify+locals&ie=utf-8&oe=utf-8

-- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30744> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30744] Local variable assignment is broken when combined with threads + tracing + closures
Nathaniel Smith added the comment: Some thoughts based on discussion with Armin in #pypy:

It turns out if you simply delete the LocalsToFast and FastToLocals calls in call_trampoline, then the test suite still passes. I'm pretty sure that pdb relies on this as a way to set local variables, though, so I think this is probably more of a bug in the test suite than anything else.

In some ways, it seems like the most attractive fix to this would be to have the Python-level locals() and frame.f_locals return a dict proxy object whose __setitem__ writes any mutations back to both the fast array and the real frame->f_locals object, so that LocalsToFast becomes unnecessary. As Armin points out, though, that would definitely be a semantic change: there may be code out there that relies on locals() returning a dict – technically it doesn't have to return a dict even now, but it almost always does – or that assumes it can mutate the locals() array without that affecting the function. I think this would arguably be a *good* thing; it would make the behavior of locals() and f_locals much less confusing in general. But he's right that it's something we can only consider for 3.7.

The only more-compatible version I can think of is: before calling the trace function, do FastToLocals, and also make a "shadow copy" of all the cellvars and freevars. Then after the trace function returns, when copying back from LocalsToFast, check against these shadow copies, and only write back cellvars/freevars that the trace function actually modified (where by modified we mean 'old is not new', just a pointer comparison). Since most functions have no cellvars or freevars, and since we would only need to do this when there's a Python-level trace function active, this shouldn't add much overhead. And it should be compatible enough to backport even to 2.7, I think.
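[Editor's note: a rough Python rendering of the proposed shadow-copy write-back check. The real change would live in C inside PyFrame_LocalsToFast; the single-element lists here are only a stand-in for real cell objects.]

    def writeback_changed_cells(shadow, namespace, cells):
        # Write back only the cell variables that the trace function actually
        # rebound, using an identity ('old is not new') comparison against the
        # shadow copy taken before the trace function ran.
        for name, old_value in shadow.items():
            new_value = namespace[name]
            if new_value is not old_value:
                cells[name][0] = new_value

    # Toy scenario: another thread bumped 'x' (through its shared cell) while
    # the trace function was running; the trace function never touched 'x'.
    cells = {"x": [2]}       # single-element list standing in for a cell
    shadow = {"x": 1}        # value captured by FastToLocals
    namespace = {"x": 1}     # dict seen (and left alone) by the tracer
    writeback_changed_cells(shadow, namespace, cells)
    assert cells["x"][0] == 2    # the concurrent update survives

    namespace["x"] = 99          # but a genuine rebinding by the tracer...
    writeback_changed_cells(shadow, namespace, cells)
    assert cells["x"][0] == 99   # ...is still written back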
-- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30744> ___
[issue30744] Local variable assignment is broken when combined with threads + tracing + closures
New submission from Nathaniel Smith: The attached script looks innocent, but gives wildly incorrect results on all versions of CPython I've tested. It does two things:

- spawns a thread which just loops, doing nothing
- in the main thread, repeatedly increments a variable 'x'

And most of the increments of the variable are lost! This requires two key things I haven't mentioned, but that you wouldn't expect to change anything. First, the thread function closes over the local variable 'x' (though it doesn't touch this variable in any way). And second, the thread function has a Python-level trace function registered (but this trace function is a no-op).

Here's what's going on: When you use sys.settrace() to install a Python-level trace function, it installs the C-level trace function "call_trampoline". And then whenever a trace event happens, call_trampoline calls PyFrame_FastToLocalsWithError, then calls the Python-level trace function, then calls PyFrame_LocalsToFast (see: https://github.com/python/cpython/blob/master/Python/sysmodule.c#L384-L395). This save/call/restore sequence is safe if all the variables being saved/restored are only visible to the current thread, which used to be true back in the days when local variables were really local. But it turns out nowadays (since, uh... python 2.1), local variables can be closed over, and thus can be visible from multiple threads simultaneously.
Which means we get the following sequence of events:

- In thread A, a trace event occurs (e.g., executing a line of code)
- In thread A, call_trampoline does PyFrame_FastToLocalsWithError, which saves a snapshot of the current value of 'x'
- In thread A, call_trampoline then starts calling the trace function
- In thread B, we increment 'x'
- In thread A, the trace function returns
- In thread A, call_trampoline then does PyFrame_LocalsToFast, which restores the old value of 'x', overwriting thread B's modifications

This means that merely installing a Python-level trace function – for example, by running with 'python -m trace' or under pdb – can cause otherwise correct code to give wrong answers, in an incredibly obscure fashion. In real life, I originally ran into this in a substantially more complicated situation involving PyPy and coverage.py and a nonlocal variable that was protected by a mutex, which you can read about here: https://bitbucket.org/pypy/pypy/issues/2591/

It turns out that since this only affects *Python*-level trace functions, merely recording coverage information using coverage.py doesn't normally trigger it, since on CPython coverage.py tries to install an accelerator module that uses a *C*-level trace function. Which is lucky for my particular case. But probably it should be fixed anyway :-).

CC'ing belopolsky as the interest area for the trace module, Mark Shannon because he seems to enjoy really pathological low-level bugs, and Benjamin and Yury as the "bytecode" interest area. I'm surprised that there's apparently no interest area for, like, "variables should retain the values they are assigned" -- apologies if I've CC'ed anyone incorrectly.
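[Editor's note: the attached thread-closure-bug-demo.py is not reproduced in this archive. A reconstruction from the description above might look like the following — the names and iteration count are guesses, and a threading.Event is used for shutdown because a plain closed-over bool could itself be lost to the very race being demonstrated. On an affected interpreter most increments vanish; on a fixed one the final count is 100000.]

    import sys
    import threading

    def main():
        x = 0
        stop = threading.Event()  # mutable flag: immune to the locals write-back

        def tracer(frame, event, arg):
            return tracer         # no-op Python-level trace function

        def spinner():
            sys.settrace(tracer)  # each event: FastToLocals ... LocalsToFast
            while not stop.is_set():
                x                 # reference x so this frame closes over it

        t = threading.Thread(target=spinner)
        t.start()
        for _ in range(100000):
            x += 1                # races with the spinner's locals restore
        stop.set()
        t.join()
        return x

    result = main()
    print(result)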
-- components: Interpreter Core files: thread-closure-bug-demo.py messages: 296751 nosy: Mark.Shannon, belopolsky, benjamin.peterson, njs, yselivanov priority: normal severity: normal status: open title: Local variable assignment is broken when combined with threads + tracing + closures versions: Python 2.7, Python 3.5, Python 3.6, Python 3.7 Added file: http://bugs.python.org/file46971/thread-closure-bug-demo.py ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30744> ___
[issue30623] python-nightly import numpy fails since recently
Nathaniel Smith added the comment: @Arek: It's great that you're testing your code against the latest 3.7 pre-release, because that helps give early warning of issues in CPython as it's developed, which helps everyone. BUT, you really cannot use in-development versions and expect everything to be perfectly supported and work well together all the time -- we need time and space to work things out and coordinate. If you can't tolerate failures in your 3.7-prerelease builds, then you should remove them from your CI entirely and wait for the final release (or at least sometime in the beta/rc period).

Anyway, the Numpy bug is here: https://github.com/numpy/numpy/issues/9227 This CPython PR is also relevant: https://github.com/python/cpython/pull/1236

Basically the bug is that numpy just made a release (1.13.0) which assumed that if it's running on some version of 3.7, then that PR is surely merged. But it turns out that that PR is not yet merged. We'll probably need to make the usual emergency x.x.1 release in a few days anyway, and the simplest fix would be to switch back to the underscore-prefixed versions then. But... this needs a bit of coordination, because if we do that then the same thing will happen again at some point when that PR is merged; the problem here is that numpy was trying to avoid that breakage and thus created different breakage :-). My suggestion would be that NumPy go ahead with releasing a fixed version ASAP, and *also* that CPython PR 1236 add back-compat for the underscore-prefixed versions. And maybe that it be merged soon if it's easy and going to happen anyway, since that will retroactively unbreak 1.13.0.

Bigger picture: I guess there's some irony that I warned that testing against prereleases can get complicated at the language summit based on numpy's experience, and now it's numpy's fault that cpython is getting grumpy users filing bugs at it... sorry about that.
It's a useful data point though that we should keep in mind, that if we're encouraging people to test prereleases, then what used to be a temporary breakage that no-one would notice can actually cause more problems now – again, if numpy hadn't broken this now, PR 1236 would have at some point, and with the same results. (In the past it's been common that numpy just doesn't support a new python release until a month or two after it comes out! Apparently those days have passed...) -- nosy: +brett.cannon, haypo ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30623> ___
[issue30623] python-nightly import numpy fails since recently
Nathaniel Smith added the comment: It's because Julian thinks _PyTraceMalloc_Untrack is going to lose its underscore in 3.7, so he made numpy assume that. Numpy issue, I'll see what we can do. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30623> ___
[issue30038] Race condition in how trip_signal writes to wakeup fd
Nathaniel Smith added the comment: But, by that definition, like... every change is backwards incompatible. I'm pretty confident that no-one was relying on this race condition. I can't even figure out what that would mean. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30038> ___
[issue30038] Race condition in how trip_signal writes to wakeup fd
Nathaniel Smith added the comment: I think you mean it's backwards *compatible*? There's definitely no change in APIs or ABIs or anything, all that changes is the order of some statements inside Python's signal handler. (We used to write to the magic wakeup fd and then set a flag saying a signal arrived; now we do the same things but in the opposite order. I wouldn't call it zero risk, just on the general principle that signal handlers are tricksy beasts, but I think it's about as low risk as a signal handler change can be.)

AFAICT the race condition also affects twisted and tornado, though I don't think they've noticed. The main symptom is that it makes control-C handling flaky on Windows, so it's a difficult thing to test for. I would prefer to see it backported to 3.6, but it's a weird edge case bug so the world is not going to end if the fix has to wait for 3.7. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30038> ___
[issue30038] Race condition in how trip_signal writes to wakeup fd
Nathaniel Smith added the comment: I guess now would be a good time to decide whether this should be backported to 3.6, with 3.6.2 coming out in a few days :-). (Or if not, then it can probably be closed?) -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30038> ___
[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches
Nathaniel Smith added the comment: Looks good to me, thanks Serhiy. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29943> ___
[issue30594] Refcounting mistake in _ssl.c
Changes by Nathaniel Smith <n...@pobox.com>: -- pull_requests: +2065 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30594> ___
[issue30594] Refcounting mistake in _ssl.c
Nathaniel Smith added the comment: Posted backports for 3.5 and 3.6. It looks like 2.7 is actually unaffected, because it doesn't have IDNA support, so there's no failure path in between when the reference is stored and when it's INCREFed. -- versions: -Python 2.7 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30594> ___
[issue30594] Refcounting mistake in _ssl.c
Changes by Nathaniel Smith <n...@pobox.com>: -- pull_requests: +2057 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30594> ___
[issue30594] Refcounting mistake in _ssl.c
Changes by Nathaniel Smith <n...@pobox.com>: -- pull_requests: +2058 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30594> ___
[issue17305] IDNA2008 encoding missing
Changes by Nathaniel Smith <n...@pobox.com>: -- nosy: +njs ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue17305> ___
[issue28414] SSL match_hostname fails for internationalized domain names
Nathaniel Smith added the comment: If the SSL module followed the pattern of encoding all str to bytes at the edges while leaving bytes alone, and used exclusively bytes internally (and in this case by "bytes" I mean "bytes objects containing A-labels"), then it would at least fix this bug and also make it possible for library authors to implement their own IDNA handling. Right now if you pass in a pre-encoded byte-string, exactly what ssl.py needs to compare to the certificate, then ssl.py will convert it *back* to text :-(. -- nosy: +njs ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue28414> ___
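[Editor's note: for concreteness, the str-to-A-label conversion at the edges can be sketched with the stdlib's "idna" codec — which implements IDNA 2003, not the IDNA 2008 encoding discussed in issue17305:]

    hostname = "bücher.example"

    # str -> bytes at the edge: encode U-labels to wire-format A-labels
    a_label = hostname.encode("idna")
    print(a_label)  # b'xn--bcher-kva.example'

    # bytes would then be used internally and compared against certificates;
    # decoding back to U-labels is only needed for display
    print(a_label.decode("idna"))  # bücher.example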
[issue30594] Refcounting mistake in _ssl.c
New submission from Nathaniel Smith: If you pass a server_hostname= that fails IDNA decoding to SSLContext.wrap_socket or SSLContext.wrap_bio, then the SSLContext object has a spurious Py_DECREF called on it, eventually leading to segfaults. Demo attached. -- assignee: christian.heimes components: SSL files: demo.py messages: 295380 nosy: alex, christian.heimes, dstufft, janssen, njs priority: normal severity: normal status: open title: Refcounting mistake in _ssl.c type: crash versions: Python 2.7, Python 3.5, Python 3.6, Python 3.7 Added file: http://bugs.python.org/file46932/demo.py ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30594> ___
[issue30594] Refcounting mistake in _ssl.c
Changes by Nathaniel Smith <n...@pobox.com>: -- pull_requests: +2056 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30594> ___
[issue30359] Annotating a function as returning an (async) context manager?
Nathaniel Smith added the comment: I can think of two downsides to using __annotations__ for this:

1) Obviously contextlib itself isn't going to add any kind of annotation in any versions before 3.7, but third-party projects might (like contextlib2, say). And these projects have been known to be used on Python versions that don't have __annotations__ support. At the moment I personally don't have to care about this because sphinxcontrib-trio doesn't support anything before 3.5, but I suspect this will eventually change. (E.g. sphinx has expressed some interest in having these changes upstreamed.) So this would mean that we'd still need some other ad hoc mechanism to use when running on old Pythons, and have to check for both that and the __annotations__. But that's probably doable, so eh. It'd be helpful if contextlib2 could join in on whatever protocol, though.

2) sphinxcontrib-trio actually interacts very poorly with __annotations__ right now [1]. But I mean, I need to figure out how to fix this anyway, so... not really your problem :-).

Neither of these downsides seems very compelling :-). So I guess contextlib should add some appropriate __annotations__, and *maybe* also add something like __returns_contextmanager__ = True if that's useful to maintain consistency with contextlib2 or similar? (Wasn't there some discussion about keeping type hints out of the stdlib for now? Is this likely to ruffle any feathers on those grounds?)

[1] https://github.com/python-trio/sphinxcontrib-trio/issues/4

-- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30359> ___
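[Editor's note: the ad hoc marker floated above is easy to sketch. Note that __returns_contextmanager__ is only a name proposed in this discussion, not an adopted protocol:]

    import contextlib

    def returns_contextmanager(func):
        # Mark the factory so documentation tools (e.g. sphinxcontrib-trio)
        # could render it as "with demo() as ...:" rather than a plain call.
        func.__returns_contextmanager__ = True
        return func

    @returns_contextmanager
    @contextlib.contextmanager
    def demo():
        yield "resource"

    print(getattr(demo, "__returns_contextmanager__", False))  # True
    with demo() as value:
        print(value)  # resource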
[issue30579] Allow traceback objects to be instantiated/mutated/annotated
Nathaniel Smith added the comment: My understanding is that the major difference between a real traceback object and a TracebackException object is that the latter is specialized for printing, so it can be lighter weight (no pinning of frame objects in memory), but loses some utility (can't do post-mortem debugging). If that's right, then that's definitely not a solution, because trio and jinja2 and import errors all need to support post-mortem debugging. I'm not against the idea of defining a traceback protocol, but it seems like a lot of work when the existing traceback objects are already perfectly good container objects that are just missing a few simple features. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30579> ___
[issue30579] Allow traceback objects to be instantiated/mutated/annotated
Nathaniel Smith added the comment: Uh, please ignore the random second paste of the jinja2 URL in the middle of the second to last paragraph. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30579> ___
[issue30579] Allow traceback objects to be instantiated/mutated/annotated
New submission from Nathaniel Smith: Currently, traceback objects don't expose any public constructor, are immutable, and don't have a __dict__ or allow subclassing, which makes it impossible to add extra annotations to them. It would be nice if these limitations were lifted, because there are rare but important cases where code needs to manipulate tracebacks directly, and currently these have to use awful stuff like ctypes. For example:

Jinja2: https://github.com/pallets/jinja/blob/bec0065c4e7e89e7d893eda7840ba0219824b23c/jinja2/debug.py#L298-L372
Trio: https://github.com/python-trio/trio/blob/496493afecc22d7d1a17175b6a2748a9c3510066/trio/_core/_multierror.py#L233-L318

(Notice that on PyPy there are no ctypes, but instead they have a special extension that's supported for this case only to keep jinja2 working.)

For the above cases, what's needed is the ability to instantiate and assign to the fields of traceback objects. In addition, in trio I'd like to be able to annotate traceback objects so that the traceback printing machinery can do things like hide "internal" tracebacks, or highlight places where exceptions jumped between tasks. This would be much easier if there were some way to attach data to tracebacks. Probably it doesn't make sense for traceback objects to have a __dict__ by default for speed/memory reasons, but we could add a dedicated metadata slot that is normally empty but can have arbitrary data assigned, or if they allowed subclassing then I could add a __dict__ in a subclass.

I'm CC'ing the "import machinery" interest list, because my understanding is that with the new import system there's a similar desire to hide "internal" traceback frames, and while the features requested in this bug report won't solve that problem directly they might (are intended to) provide some useful machinery for it. Feel free to un-CC if I'm wrong :-).
I think that this should be very straightforward to implement: it's just that no-one ever implemented setters etc. for tracebacks. -- components: Interpreter Core messages: 295235 nosy: brett.cannon, eric.snow, ncoghlan, njs priority: normal severity: normal status: open title: Allow traceback objects to be instantiated/mutated/annotated type: enhancement versions: Python 3.7 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30579> ___
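[Editor's note: the constructor part of this request did subsequently land — since Python 3.7, types.TracebackType(tb_next, tb_frame, tb_lasti, tb_lineno) is callable from Python and tb_next is writable. A small sketch of the requested capability:]

    import types

    try:
        raise ValueError("boom")
    except ValueError as exc:
        orig = exc.__traceback__
        # Build a fresh traceback object pointing at the same frame
        clone = types.TracebackType(None, orig.tb_frame,
                                    orig.tb_lasti, orig.tb_lineno)
        clone.tb_next = orig.tb_next  # chains can be re-linked after the fact

    print(clone.tb_lineno == orig.tb_lineno)  # True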
[issue30482] socket.getservbyname(), socket.getservbyport(), socket.getprotobyname() are not threadsafe
Changes by Nathaniel Smith <n...@pobox.com>: -- nosy: +njs ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30482> ___
[issue30491] Add a lightweight mechanism for detecting un-awaited coroutine objects
New submission from Nathaniel Smith: A common problem when working with async functions is to attempt to call them but forget the 'await', which eventually leads to a 'Warning: coroutine ... was never awaited' (possibly buried in the middle of a bunch of traceback shrapnel caused by follow-on errors). This can be confusing, lead to spuriously passing tests, and generally isn't a great experience for the dev who makes the mistake.

To improve this, I'd like to do things like have a test harness reliably detect when this has happened inside a test case, or have trio's main loop check occasionally to see if this has happened and immediately raise a useful error. (Unfortunately, it *doesn't* work to promote the warning to an error using the usual warnings filter machinery, because the warning is raised inside a __del__ method, which suppresses exception propagation.)

In principle this is possible with sys.set_coroutine_wrapper, but that adds non-trivial runtime overhead to every async function invocation, which is something we'd like to avoid. (It's OK for a "debug mode", but "modes" are also a poor dev experience: they certainly have their place, but whenever possible it's much nicer to give a proper error in the first place instead of relying on the dev to realize they need to enable debug mode and then remember how to do it.)

Therefore I propose that CPython keep a thread-local counter of how many coroutine objects currently exist in a created-but-not-yet-started state -- so corofn.__call__ would increment this counter, and the first call to coroobj.__next__/send/throw would decrement it. And there's some way to access it e.g. via a magic function in the sys module. Then test harnesses can assert that this is zero, trio could occasionally poll this from its run loop and assert that it's zero, etc., with very low overhead.
(This is a slight modification of the version I discussed with Yury and others at PyCon last week; the previous request was to keep a count of how many times the "coroutine '...' was never awaited" warning had been emitted. The problem with the original idea is that the warning message doesn't fire until the garbage collector has collected the coroutine object, and that might not happen at a convenient time if we're using PyPy, or if there are cycles, or in a test where the missing 'await' eventually leads to an exception whose traceback pins the coroutine object in memory just long enough for the warning to be detected on the *next* test and confuse everyone. Thanks to Matthias Bussonnier for spurring the new idea, and see discussion here: https://github.com/python-trio/trio/issues/79#issuecomment-304364010) -- components: asyncio messages: 294584 nosy: giampaolo.rodola, haypo, ncoghlan, njs, yselivanov priority: normal severity: normal status: open title: Add a lightweight mechanism for detecting un-awaited coroutine objects versions: Python 3.7 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30491> ___
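[Editor's note: on CPython, reference counting deallocates the dropped coroutine immediately, so the warning fires promptly and a harness can at least capture it (though, as the message explains, not promote it to an error). A sketch of such a check:]

    import gc
    import warnings

    async def task():
        return 42

    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        task()        # missing 'await': the coroutine is created, then dropped
        gc.collect()  # belt-and-braces for non-refcounted implementations

    unawaited = [w for w in caught if "never awaited" in str(w.message)]
    print(len(unawaited))  # 1 on CPython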
[issue12857] Expose called function on frame object
Nathaniel Smith added the comment:

> Yes, whenever you touch frames you're disabling the JIT for the call site
> (and maybe for more call sites up the stack, idk). So it doesn't matter what
> you use, `f_func` or `f_locals`, the performance will suffer big time. Is
> that acceptable for Trio?

Ah, but that's the trick: with f_func the only time I have to touch frames is from my SIGINT handler, and it's fine if *that* drops us out of the JIT (it probably has to in any case). The rest of the time, the interpreter would be automatically tracking all the state I need, so it's effectively free, and PyPy can do all its normal magic. The problem with using locals() / f_locals for this is that I have to constantly touch them during normal execution.

(Probably we shouldn't make a huge deal out of the PyPy case though – for me the main advantages of f_func are the simplicity and atomicity. But FWIW I also just pinged Armin on #pypy and he said that PyPy wouldn't have a problem supporting f_func.) -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue12857> ___
[issue29988] (async) with blocks and try/finally are not as KeyboardInterrupt-safe as one might like
Nathaniel Smith added the comment: On further thought, I think the way I'd write a test for this is:

(1) add a testing primitive that waits for N instructions and then injects a SIGINT. Probably this would require tweaking the definition of Py_MakePendingCalls like I described in my previous comment, and then some code like:

    int sigint_timebomb(void* count_ptr) {
        intptr_t count = (intptr_t) count_ptr;
        if (count == 0)
            raise(SIGINT);
        else
            Py_AddPendingCall(sigint_timebomb, (void*) (count - 1));
        return 0;
    }

(2) write a test like:

    reached_crashpad = False
    lock = threading.Lock()
    i = 0
    while not reached_crashpad:
        i += 1
        try:
            sigint_timebomb(i)
            with lock:
                1 + 1
            # this loop keeps executing instructions until any still-armed
            # timebombs go off
            while True:
                reached_crashpad = True
        except KeyboardInterrupt:
            # key invariant: signal can't leave the lock held
            assert not lock.locked()
        else:
            assert False  # can't happen

The idea here is that this sets off SIGINTs at every possible place throughout the execution of the 'with lock', while checking that the lock is always released, and without needing to hard code any information at all about what bytecode the compiler generated. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29988> ___
[issue12857] Expose called function on frame object
Nathaniel Smith added the comment:

> I'm not sure I understand how `f_func` would help to better handle Control-C
> in Trio. Nathaniel, could you please elaborate on that?

Sure. The issue is that I need to mark certain frames as "protected" from KeyboardInterrupt, in a way that my signal handler can see when walking the frame stack, so it can decide whether to raise a KeyboardInterrupt immediately or to wait until a safer moment. Right now, this is done by injecting a magic local variable into the frame object, so the code is effectively like:

    def a_protected_function():
        _trio_keyboard_interrupt_protection_magic_local_ = True
        ... actual function ...

But this has two problems:

(1) it's not atomic: a signal could arrive in between when the frame is created and when the magic local gets set. (This is especially a problem for __(a)exit__ methods -- if bpo-29988 is fixed then it will guarantee that __(a)exit__ gets called, but this isn't much help if __(a)exit__ can instantly abort without doing any actual cleanup :-).)

(2) The decorator code for setting this magic local ends up having to special-case functions/generators/async functions/async generators, so it's very complex and fragile: https://github.com/python-trio/trio/blob/1586818fbdd5c3468e15b64816db9257adae49c1/trio/_core/_ki.py#L109-L149 (This is possibly the function that took the most tries to get right in all of trio, because there are serious subtleties and interactions with multiple CPython bugs.) I suspect the way I'm currently munging frames at runtime also annoys PyPy's JIT.

OTOH if frames had f_func, then instead of having to munge frame.f_locals, I could set a magic attribute on the *functions* that are supposed to transition between protected/unprotected mode.
Then I could delete all that nasty code and replace it with something like:

    def enable_ki_protection(func):
        func._trio_keyboard_interrupt_protection_enabled = True
        return func

and in addition to being simpler, this would fix the race condition and improve performance too. (Note that now this decorator only runs at function definition time; I wouldn't have to do any tricky stuff at all during regular execution.) -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue12857> ___
[issue12857] Expose called function on frame object
Nathaniel Smith added the comment: Certainly which frame is being executed is an implementation detail, and I can see an argument from that that we shouldn't have a frame introspection API at all... but we do have one, and it has some pretty important use cases, like traceback printing. Obviously most people shouldn't be touching this API at all, but if you do need it then it IMO should be as useful as possible. I don't see a better way to handle my (admittedly a bit weird) control-C case within the interpreter; I'd be interested to hear what you have in mind. But even if there is a nice way to fix control-C, then the proposal here would still be useful for e.g. giving better tracebacks. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue12857> ___
[issue29988] (async) with blocks and try/finally are not as KeyboardInterrupt-safe as one might like
Nathaniel Smith added the comment:

> If all you need is that with foo: pass guarantees that either both or neither
> of __enter__ and __exit__ are called, for C context managers, and only C
> context managers, then the fix is trivial.

It would be nice to have it for 'async with foo: pass' as well, which is a little less trivial because the 'await' dance is inlined into the bytecode (see the initial post in this bug), but basically yes.

> Do you have any way to reliably test for this failure mode?

Unfortunately no, I haven't implemented one. Let's see, though... The test I wrote for issue30039 demonstrates one way to trigger a signal on a precise bytecode, by writing some code in C that calls raise(SIGNALNUMBER), and then calling it immediately before the bytecode where we want the signal to be raised (simulating the situation where the signal happens to arrive while the C function is running -- note that raise() is convenient because unlike kill() it works cross-platform, even on Windows). This might be sufficient for testing the 'async with' version; it looks like an __await__ method or iterator implemented in C and calling raise() would deliver a signal at a point that should be protected but isn't.

The tricky case is plain 'with'. We can write something like:

    with lock:
        raise_signal()

and this gives bytecode like:

      1           0 LOAD_NAME                0 (lock)
                  2 SETUP_WITH              12 (to 16)
                  4 POP_TOP

      2           6 LOAD_NAME                1 (raise_signal)
                  8 CALL_FUNCTION            0
                 10 POP_TOP
                 12 POP_BLOCK
                 14 LOAD_CONST               0 (None)
            >>   16 WITH_CLEANUP_START

So the problem is that at offset 8 is where we can run arbitrary code, but the race condition is if a signal arrives between offsets 12 and 16.
One possibility would be to set up a chain of Py_AddPendingCall handlers, something like:

    int chain1(void* _) { Py_AddPendingCall(chain2, 0); return 0; }
    int chain2(void* _) { Py_AddPendingCall(chain3, 0); return 0; }
    int chain3(void* _) { raise(SIGINT); return 0; }

(or to reduce brittleness, maybe use the void* to hold an int controlling the length of the chain, which would make it easy to run tests with chains of length 1, 2, 3, ...)

...except consulting ceval.c I see that currently this won't work, because it looks like if you call Py_AddPendingCall from inside a pending call callback, then Py_MakePendingCalls will execute the newly added callback immediately after the first one returns. It could be made to work by having Py_MakePendingCalls do a first pass to check the current length of the queue, and then use that as the bound on how many calls it makes. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29988> ___
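The proposed Py_MakePendingCalls change can be sketched in Python terms (hypothetical names; the real fix would live in ceval.c): bound the number of callbacks run by the queue length measured on entry, so a callback that re-adds another one runs on the *next* pass rather than immediately:

```python
from collections import deque

pending = deque()  # stands in for the interpreter's pending-call queue

def make_pending_calls():
    # snapshot the queue length first, so callbacks added during this
    # pass are deferred to the next pass instead of running right away
    for _ in range(len(pending)):
        pending.popleft()()

calls = []

def chain3():
    calls.append(3)

def chain2():
    calls.append(2)
    pending.append(chain3)

def chain1():
    calls.append(1)
    pending.append(chain2)

pending.append(chain1)
make_pending_calls()
assert calls == [1]        # chain2 was deferred, not run immediately
make_pending_calls()
assert calls == [1, 2]
make_pending_calls()
assert calls == [1, 2, 3]
```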
[issue30437] SSL_shutdown can return meaningless SSL_ERROR_SYSCALL
Nathaniel Smith added the comment:

Debian testing, x86-64, with:

    Python 3.5.3rc1 (default, Jan  3 2017, 04:40:57)
    [GCC 6.3.0 20161229] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import ssl
    >>> ssl.OPENSSL_VERSION
    'OpenSSL 1.1.0e 16 Feb 2017'

-- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30437> ___
[issue29988] (async) with blocks and try/finally are not as KeyboardInterrupt-safe as one might like
Nathaniel Smith added the comment: Right, fixing this bug alone can't make programs control-C safe in general. But there exist techniques to make __enter__/__exit__ methods safe WRT control-C, and if we fix this bug *and* apply those techniques then you can get some meaningful guarantees. For example, threading.Lock's __enter__ and __exit__ methods are written in C, so they are guaranteed to happen atomically WRT control-C. Right now, it's possible for a 'with lock: ...' to raise KeyboardInterrupt with the lock still held, which is surprising and not what we would like. If this bug were fixed, then it would be guaranteed that the lock was always released. (And I originally discovered this while implementing control-C handling in trio, which is pure-Python but also has a way to protect __(a)enter__/__(a)exit__ methods against control-C. ...though that won't be 100% reliable until bpo-12857 or similar is fixed. See https://vorpus.org/blog/control-c-handling-in-python-and-trio/ for exhaustive details.) -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29988> ___
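Concretely, the guarantee being sought is the one that already holds when the interrupt lands *inside* the with body (here simulated with an explicit raise standing in for a real signal); the remaining bug is the narrow window between the body and __exit__:

```python
import threading

lock = threading.Lock()
try:
    with lock:
        # stands in for a KeyboardInterrupt delivered while the body runs;
        # in this case __exit__ (i.e. lock.release()) is guaranteed to run
        raise KeyboardInterrupt
except KeyboardInterrupt:
    pass

# the lock was released on the way out of the with block
assert not lock.locked()
```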
[issue12857] Expose called function on frame object
Nathaniel Smith added the comment: I'd also like to make use of this in trio, as a way to get safer and less complicated control-C handling without having to implement things in C. (Exhaustive discussion: https://vorpus.org/blog/control-c-handling-in-python-and-trio/) @Nick: I understand your proposal is to add a field on the frame that for regular function calls points to the function object, and for generator/coroutines points to the generator/coroutine object. Is that right? Two possible concerns there: (a) this creates a reference cycle, because generator/coroutine objects own a reference to the frame object, (b) AFAICT you also currently can't get from a generator/coroutine object back to the function object, so this would break some (all?) of the use cases for this. I guess the solution to (a) is to make it a weak reference, and for (b) either keep f_func as always pointing to the function and make f_generator a separate field, or else make it f_origin and also add gi_func/co_func fields to generator/coroutine objects. (Which might be useful in any case.) Also, I'm not quite following the proposed use case. When a generator/coroutine is resumed, then send() steps through the whole yield stack and reconstructs the frame stack so it matches the generator/coroutine stack. (throw() currently doesn't do this, but that's a bug that breaks profilers/debuggers/etc - bpo-29590 - so we should fix it anyway, and I think Yury is planning to do that soon.) So if you get a frame from sys._getframe() or similar, then the stack is correct. And if a coroutine/generator isn't running, then the only way to get to the frame is by starting with the coroutine/generator object, so we don't really need a way to get back. 
Tracebacks are a different case again because they continue to hold frame references after the stack is unwound, but generally the traceback already has the correct stack because the exception propagates up the coroutine/generator stack, and also aren't the {gi,co}_frame references cleared as the exception propagates anyway? -- nosy: +njs ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue12857> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
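Point (b) above is easy to check empirically; as of current CPython, you can get from a generator to its frame and code object, but neither the frame nor the generator exposes the function object publicly:

```python
def gen():
    yield 1

g = gen()
frame = g.gi_frame

# the generator leads to its frame and code object...
assert frame.f_code is gen.__code__
# ...but there is no public attribute leading back to the function object
assert not hasattr(frame, "f_func")
assert not hasattr(g, "gi_func")
```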
[issue29334] ssl.SSLObject method getpeercert() is buggy, do_handshake() is strange
Nathaniel Smith added the comment: Oddly, I expected to run into this with my code using SSLObject in trio [1], but if I connect to python.org:443 and then 'await trio_ssl_stream.do_handshake(); trio_ssl_stream.getpeercert()' it works just fine ... even though when I run the sslbugs.py script I get the same weird results Greg reports. As far as I can tell the logic is identical. So I guess this might potentially be useful to narrow this down :-).

Test code that works:

    import trio

    @trio.run
    async def main():
        sock = trio.socket.socket()
        addr = await sock.resolve_remote_address(("python.org", 443))
        await sock.connect(addr)
        s = trio.SocketStream(sock)
        client = trio.ssl.SSLStream(
            s, trio.ssl.create_default_context(),
            server_hostname="python.org")
        await client.do_handshake()
        print(client.getpeercert())

[1] Currently in https://github.com/python-trio/trio/pull/107, eventually will be at https://github.com/python-trio/trio/blob/master/trio/ssl.py -- nosy: +njs ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29334> ___
[issue30437] SSL_shutdown can return meaningless SSL_ERROR_SYSCALL
New submission from Nathaniel Smith: The SSL_shutdown man page says that if it returns 0, and an SSL_ERROR_SYSCALL is set, then SSL_ERROR_SYSCALL should be ignored - or at least I think that's what it's trying to say. See the RETURN VALUES section. I think this means we should only raise this error if the return value is <0? Though I suppose we need to clear out the error queue in any case. I ended up changing the code that was triggering this for other reasons and now I'm not hitting it, so it's not exactly urgent for me, but FYI... I was getting SSLSyscallError exceptions from code using memory BIOs where everything was fine. The test case had one task continually sending data into an SSLObject-based stream while the other end called SSLObject.unwrap() and then ended up continually getting SSLWantRead and reading more data -- after a few cycles of reading it got SSLSyscallError instead. -- assignee: christian.heimes components: SSL messages: 294216 nosy: alex, christian.heimes, dstufft, janssen, njs priority: normal severity: normal status: open title: SSL_shutdown can return meaningless SSL_ERROR_SYSCALL versions: Python 2.7, Python 3.6, Python 3.7 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30437> ___
[issue29137] Fix fpectl-induced ABI breakage
Nathaniel Smith added the comment: @Dima: > @njs: to point out that usefulness of this module is not just wishful > thinking. I just used it to locate, up to the line in a Python extension > module written in C, a bug in Sagemath (that has perhaps 20 FPU-using > extensions, some of them as large as numpy). (Without using it we were > pulling out our hair for weeks over this) That's pretty cool :-). But from skimming your link, it sounds like it would have been sufficient in your case to add a call to "fesetmask(FP_X_INV)" using C or Cython or ctypes (or directly in gdb), and then running the program under gdb to get a backtrace where the SIGFPE was delivered? Is that correct? Or did your debugging depend on the specific fpectl machinery for responding to that signal? > PS. I would volunteer to fix it and maintain it, assuming I have some modest > funding to support such an activity. I'm not personally aware of any funding sources for this, if that's the question. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29137> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30038] Race condition in how trip_signal writes to wakeup fd
Nathaniel Smith added the comment: > (BTW do you happen to know any tricks to force CPython to do an immediate > PyErr_CheckSignals on Windows?) Never mind on this... it looks like calling repr() on any object is sufficient. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30038> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30038] Race condition in how trip_signal writes to wakeup fd
Nathaniel Smith added the comment: > While I suggest you to *not* use an event loop (wakeup fd pipe/socket handle > with select) and signal.signal(), you are true that there is a race condition > if you use select() with signal.signal() so I merged your change. Unfortunately this is the only 100% reliable way to do signal handling with an event loop so I guess you are unlikely to convince any of the projects that use it this way to change their minds :-) > Do you consider that it's worth it to backport the change to 3.5 and 3.6? I guess it'd be nice? At least for 3.6? I just had yet another random PR fail its tests due to this bug -- it's turning out to be quite difficult to make my tests reliable :-/. (BTW do you happen to know any tricks to force CPython to do an immediate PyErr_CheckSignals on Windows?) -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30038> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue29137] Fix fpectl-induced ABI breakage
Nathaniel Smith added the comment: Another option you might want to consider is proposing to add a proper fpu control flag setting/checking API to the math module. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29137> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue29137] Fix fpectl-induced ABI breakage
Nathaniel Smith added the comment: Also fixing the ABI issues that started this, and probably making an argument for why it makes sense for all of cpython's built-in float operations to check the fpu flags, and to do so using a weird longjmp-based mechanism that only some platforms support. The fact that it's disabled by default and has been broken for a decade+ without anyone noticing might be working against you here... You might get the impression that I think this is a bad idea. I do :-). But I am genuinely trying to be helpful; I'm sure people would be willing to listen to an argument, and if you want to make one then those are the sorts of issues you're likely to need some answer for. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29137> ___
[issue29137] Fix fpectl-induced ABI breakage
Nathaniel Smith added the comment: @Dima: are you volunteering to fix and maintain it? I can see why it's useful to have some way to get at the fpu flags, but I don't see how fpectl specifically helps with that issue, and fpectl has always been broken on x86-64. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29137> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30359] A standard convention for annotating a function as returning an (async) context manager?
New submission from Nathaniel Smith: sphinxcontrib-trio [1] does a few things; one of them is to enhance sphinx's autodoc support by trying to sniff out the types of functions so that it can automatically determine that something is, say, a generator, or an async classmethod. This runs into problems when it comes to context managers, for two reasons.

One reason is pretty obvious: the sniffing logic would like to be able to tell by looking at a function that it should be used as a context manager, so that it can document it as such, but there's no reliable way to do this.

The other reason is more subtle: the sniffing code has to walk the .__wrapped__ chain in order to detect things like async classmethods, but when it walks the chain for a @contextmanager function, it finds the underlying generator, and ends up thinking that this function should be documented as returning an iterator, which is just wrong. If we can detect context managers, we can know to ignore any misleading hints from the rest of the __wrapped__ chain.

Obviously this is always going to be a heuristic process; there's no 100% reliable way to tell what arbitrary wrappers are doing. But the heuristic works pretty well... it's just that right now the method of detecting context manager functions is pretty disgusting:

https://github.com/python-trio/sphinxcontrib-trio/blob/2d9e65187dc7a08863b68a78bdee4fb051f0b99e/sphinxcontrib_trio/__init__.py#L80-L90
https://github.com/python-trio/sphinxcontrib-trio/blob/2d9e65187dc7a08863b68a78bdee4fb051f0b99e/sphinxcontrib_trio/__init__.py#L241-L246

So it would be really nice if contextlib.contextmanager somehow marked its functions because:

- then I could (eventually) use that marker instead of making assumptions about contextmanager's internals and messing with code objects
- then we could (immediately) point to how the standard library does it as a standard for other projects to follow, instead of using this weird sphinxcontrib-trio-specific __returns_contextmanager__ thing that I just made up.
I don't really care what the marker is, though I suppose __returns_contextmanager__ = True is as good as anything and has the advantage of a massive installed base (2 projects). [1] http://sphinxcontrib-trio.readthedocs.io/ -- components: Library (Lib) messages: 293592 nosy: ncoghlan, njs, yselivanov priority: normal severity: normal status: open title: A standard convention for annotating a function as returning an (async) context manager? versions: Python 3.7 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30359> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
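The proposed convention is tiny; here's a sketch of what a library-side marker plus a tool-side check could look like (the __returns_contextmanager__ name is the sphinxcontrib-trio convention mentioned above; the decorator name is made up for illustration):

```python
from contextlib import contextmanager

def returns_contextmanager(fn):
    # the whole proposal: set a well-known attribute that documentation
    # tools can sniff for, instead of poking at code object internals
    fn.__returns_contextmanager__ = True
    return fn

@returns_contextmanager
@contextmanager
def managed():
    yield "resource"

# a tool like sphinxcontrib-trio can now classify the function directly
assert getattr(managed, "__returns_contextmanager__", False)

# and the function still works as a normal context manager
with managed() as value:
    assert value == "resource"
```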
[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches
Nathaniel Smith added the comment: @Jonathan: Even 3.6.1 was careful to retain compatibility with code built by 3.6.0. And your proposed 3.6.1-patched will generate binaries equivalent to the ones 3.6.0 generates. So I don't think you need to worry; 3.6.2 is not going to add a new and worse compatibility break, which is what it would take for your proposal to cause a problem. In fact, your 3.6.1-patched is equivalent to what I think 3.6.2 should do. I say go for it; patching 3.6.1 seems like a good solution. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29943> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30300] asyncio.Controller
Nathaniel Smith added the comment: Looks interesting! What's the advantage over running the server and the test in the same loop? The ability to use blocking operations in the tests, and to re-use an expensive-to-start server over multiple tests? (I've mostly used threads in tests to run blocking code for interoperability testing, and kept the async code in the main thread, so this is a bit novel to me.) -- nosy: +njs ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30300> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches
Nathaniel Smith added the comment: I don't find it helpful to think of it as declaring 3.6.0 broken vs declaring 3.6.1 broken. 3.6.0 is definitely good in the sense that if you build a module on it then it will import on both 3.6.0 and 3.6.1, and 3.6.1 is definitely good in the sense that if you're using it then you can import any module that was built on 3.6.0 or 3.6.1. I think 3.6.2 should combine the best aspects of 3.6.0 and 3.6.1, in the sense that modules built on 3.6.2 should be importable on any 3.6.x, and 3.6.2 should be able to import modules built on any 3.6.x. Fortunately this is very easy – it just means that 3.6.2 should continue to export the PySlice_AdjustIndices symbol (because existing 3.6.1 builds will need it to import on 3.6.2), but should remove the #define of PySlice_GetIndicesEx (so code built against 3.6.2 will not *require* PySlice_AdjustIndices to be present). Removing 3.6.0 and 3.6.1 from circulation is simply not possible; the best 3.6.2 can hope for is to interoperate with them as well as possible. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29943> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches
Nathaniel Smith added the comment: More collateral damage: apparently the workaround that Pandas used for this bug (#undef'ing PySlice_GetIndicesEx) broke PyPy, so now they need a workaround for the workaround: https://github.com/pandas-dev/pandas/pull/16192 Recording this here partly as a nudge since IIUC the breaking change is still in the 3.6 tree, and partly for the benefit of future projects that are also trying to work around this. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29943> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches
Nathaniel Smith added the comment: Pillow also had broken wheels up on pypi for a while; they've now put out a bug fix release that #undef's PySlice_GetIndicesEx, basically monkeypatching out the bugfix to get back to the 3.6.0 behavior: https://github.com/python-pillow/Pillow/issues/2479 -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29943> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches
Nathaniel Smith added the comment: Apparently this also broke pyqt for multiple users; here's the maintainers at conda-forge struggling to figure out the best workaround: https://github.com/conda-forge/pyqt-feedstock/pull/25 I really think it would be good to fix this in 3.6 sooner rather than later. Downstream projects are accumulating workarounds and pinning 3.6.0 as we speak. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29943> ___
[issue30151] Race condition in use of _PyOS_SigintEvent on windows
Nathaniel Smith added the comment: Oh, I should also say that this isn't actually affecting me, I just figured that once I was aware of the bug it was worth making a record here. Might be a good starter bug for someone trying to get into CPython's internals :-) -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30151> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30151] Race condition in use of _PyOS_SigintEvent on windows
New submission from Nathaniel Smith: As pointed out in this stackoverflow answer: http://stackoverflow.com/a/43578450/ and since I seem to be collecting signal-handling bugs these days :-), there's a race condition in how the interpreter uses _PyOS_SigintEvent to allow control-C to break out of functions like time.sleep on Windows.

Suppose we have a call to time.sleep(), and the user hits control-C while it's running. What's supposed to happen is:

- the windows implementation of pysleep in Modules/timemodule.c does ResetEvent(hInterruptEvent)
- then it blocks waiting for the interrupt event to be set *or* the timeout to expire
- the C-level signal handler runs in a new thread, which sets the "hey a signal arrived" flag and then sets the event
- the main thread wakes up because the event is set, and runs PyErr_CheckSignals()
- this notices that the signal has arrived and runs the Python-level handler, all is good

But what can happen instead is:

- before doing CALL_FUNCTION, the eval loop checks to see if any signals have arrived. They haven't.
- then the C implementation of time.sleep starts executing.
- then a signal arrives; the signal handler sets the flag and sets the event
- then the main thread clears the event again
- then it blocks waiting for the event to be set or the timeout to expire. But the C-level signal handler's already done and gone, so we don't realize that the flag is set and we should wake up and run the Python-level signal handler.

The solution is that immediately *after* calling ResetEvent(_PyOS_SigintEvent()) but *before* sleeping, we should call PyErr_CheckSignals(). This catches any signals that arrived before we called ResetEvent, and any signals that arrive after that will cause the event to become set and wake us up, so that eliminates the race condition.

This same race-y pattern seems to appear in Modules/timemodule.c, Modules/_multiprocessing/semaphore.c, and Modules/_winapi.c.
_winapi.c also handles the event in a weird way that doesn't make sense to me – if the user hits control-C it raises an OSError instead of running the signal handler? OTOH I *think* Modules/_io/winconsoleio.c already handles the race condition correctly, and I don't dare make a guess about whatever Parser/myreadline.c is doing. -- components: Interpreter Core messages: 292196 nosy: njs priority: normal severity: normal status: open title: Race condition in use of _PyOS_SigintEvent on windows versions: Python 3.5, Python 3.6, Python 3.7 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30151> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
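The fix can be illustrated with a pure-Python analogue of the C pattern (threading.Event standing in for the Win32 event, a list standing in for the interpreter's signal-arrived flag; all names here are hypothetical):

```python
import threading

sigint_event = threading.Event()
pending = []  # stands in for the interpreter's "signal arrived" flag

def check_signals():
    # analogue of PyErr_CheckSignals(): handle anything already flagged
    return bool(pending)

def interruptible_sleep(timeout):
    sigint_event.clear()
    # the fix: re-check the flag *after* clearing the event, so a signal
    # that raced with the clear() is not silently lost until the timeout
    if check_signals():
        return "interrupted"
    if sigint_event.wait(timeout):
        return "interrupted"
    return "timed out"

pending.append("SIGINT")  # signal arrived just before the sleep started
assert interruptible_sleep(5) == "interrupted"

pending.clear()
assert interruptible_sleep(0.01) == "timed out"
```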
[issue30141] If you forget to call do_handshake, then everything seems to work but hostname checking is disabled
Changes by Nathaniel Smith <n...@pobox.com>: -- title: If you forget to call do_handshake, then everything seems to work but hostname is disabled -> If you forget to call do_handshake, then everything seems to work but hostname checking is disabled ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30141> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30141] If you forget to call do_handshake, then everything seems to work but hostname is disabled
New submission from Nathaniel Smith: Basically what it says in the title... if you create an SSL object via wrap_socket with do_handshake_on_connect=False, or via wrap_bio, and then forget to call do_handshake and just go straight to sending and receiving data, then the encrypted connection is successfully established and everything seems to work. However, in this mode the hostname is silently *not* checked, so the connection is vulnerable to MITM attacks. (I guess from reading the SSL_read and SSL_write manpages that openssl is just silently doing the handshake automatically – very helpfully! – but it's only Python's do_handshake code that knows to check the hostname?) This doesn't affect correctly written programs that follow the documentation and either use do_handshake_on_connect=True (the default for wrap_socket) or explicitly call do_handshake, so it's not a super-scary bug. But IMHO it definitely shouldn't be this easy to silently fail-open. The attached test script sets up a TLS echo server that has a certificate for the host "trio-test-1.example.org" that's signed by a locally trusted CA, and then checks: - connecting to it with do_handshake and expecting the correct hostname: works, as expected - connecting to it with do_handshake and expecting a different hostname: correctly raises an error due to the mismatched hostnames - connecting to it withOUT do_handshake and expecting a different hostname: incorrectly succeeds at connecting, sending data, receiving data, etc., without any error and it checks using both ctx.wrap_socket(..., do_handshake_on_connect=False) and a little custom socket wrapper class defined using ctx.wrap_bio(...). I've only marked 3.5 and 3.6 as affected because those are the only versions I've tested, but I suspect other versions are affected as well. 
-- assignee: christian.heimes components: SSL files: ssl-handshake.zip messages: 292158 nosy: christian.heimes, njs priority: normal severity: normal status: open title: If you forget to call do_handshake, then everything seems to work but hostname is disabled versions: Python 3.5, Python 3.6 Added file: http://bugs.python.org/file46827/ssl-handshake.zip ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30141> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches
Nathaniel Smith added the comment: FYI: https://github.com/pandas-dev/pandas/pull/16066 -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29943> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30050] Please provide a way to disable the warning printed if the signal module's wakeup fd overflows
Nathaniel Smith added the comment: I haven't noticed the error in the wild because I don't use set_wakeup_fd on Linux/MacOS, because of this issue :-). But on MacOS literally all it would take is to receive two signals in quick succession, or to receive one signal at a moment when someone has recently scheduled a callback from a thread (which also writes to the wakeup fd). In principle it can also happen on Windows, but on Windows apparently the smallest buffer you get for a socketpair is like 500 kilobytes, so it doesn't come up except in rather artificial situations. And yes, a possible workaround is to use an artificially large buffer on Linux/MacOS as well. I'm not saying this is a huge showstopper. But it's silly for downstream libraries to carry around workarounds for issues in CPython when we could just fix CPython. > If the pipe becomes full, you loose signals: it's an hard error. I decided to > log an error into fd 2 because in a signal handler, the number of allowed > functions is very limited. This makes sense for asyncio, but not for most other async libraries in Python, because they *don't* lose signals when the pipe becomes full. That's why I'm suggesting that the signal module should have an option to let users decide whether they want the warning or not. (It might also make sense to switch asyncio to stop using the fd as a communications channel. Theoretically this would be the ideal solution, but I understand why it's not a priority.) > Related question: does asyncio support signals on Windows? I added support > for set_wakeup_fd() to Windows for asyncio, but then I forgot this TODO item > :-) No, asyncio doesn't use set_wakeup_fd on Windows, so code like 'loop.run_until_complete(asyncio.sleep(9))' can't be interrupted. I'd appreciate if you could wait a few days to fix this though, because the blog post that I'm almost finished writing about signal handling currently uses this as an example of how trio is more awesome than asyncio ;-). 
(This is a joke. Well, I mean, mostly. My draft does literally have that sleep example in it, but I won't be grumpy if you fix it first. Or at least, not *very* grumpy :-).) -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30050> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30050] Please provide a way to disable the warning printed if the signal module's wakeup fd overflows
New submission from Nathaniel Smith: When a wakeup fd is registered via signal.set_wakeup_fd, then the C level signal handler writes a byte to the wakeup fd on each signal received. If this write fails, then it prints an error message to the console. Some projects use the wakeup fd as a way to see which signals have occurred (asyncio, curio). Others use it only as a toggle for "a wake up is needed", and transmit the actual data out-of-line (twisted, tornado, trio – I guess it has something to do with the letter "t"). One way that writing to the wakeup fd can fail is if the pipe or socket's buffer is already full, in which case we get EWOULDBLOCK or WSAEWOULDBLOCK. For asyncio/curio, this is a problem: it indicates a lost signal! Printing to the console isn't a great solution, but it's better than letting the error pass silently. For twisted/tornado/trio, this is a normal and expected thing – the semantics we want are that after a signal is received then the fd will be readable, and if its buffer is full then it's certainly readable! So for them, EWOULDBLOCK/WSAEWOULDBLOCK are *success* conditions. Yet currently, the signal module insists on printing a scary message to the console whenever we succeed in this way. It would be nice if there were a way to disable this; perhaps something like: signal.set_wakeup_fd(fd, warn_on_full_buffer=False) This is particularly annoying for trio, because I try to minimize the size of the wakeup fd's send buffer to avoid wasting non-swappable kernel memory on what's essentially an overgrown bool. This ends up meaning that on Linux the buffer is 6 bytes, and on MacOS it's 1 byte. So currently I don't use the wakeup fd on Linux/MacOS, which is *mostly* OK but it would be better if we could use it. 
Trio bug with a few more details: https://github.com/python-trio/trio/issues/109 -- messages: 291532 nosy: haypo, njs priority: normal severity: normal status: open title: Please provide a way to disable the warning printed if the signal module's wakeup fd overflows versions: Python 3.7 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30050> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
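For reference, the basic wakeup-fd mechanism being discussed looks like this on POSIX (a sketch; os.kill and SIGUSR1 are Unix-only, and on Windows set_wakeup_fd requires a non-blocking socket):

```python
import os
import signal
import socket

# the end handed to set_wakeup_fd must be non-blocking
write_end, read_end = socket.socketpair()
write_end.setblocking(False)
read_end.settimeout(1)

# install a no-op Python-level handler so the C-level handler is active
old_handler = signal.signal(signal.SIGUSR1, lambda signum, frame: None)
signal.set_wakeup_fd(write_end.fileno())
try:
    os.kill(os.getpid(), signal.SIGUSR1)
    # the C-level signal handler wrote the signal number into the fd
    data = read_end.recv(16)
    assert data == bytes([signal.SIGUSR1])
finally:
    signal.set_wakeup_fd(-1)
    signal.signal(signal.SIGUSR1, old_handler)
```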
[issue30038] Race condition in how trip_signal writes to wakeup fd
Nathaniel Smith added the comment: The attached script wakeup-fd-racer.py fails consistently for me using CPython 3.6.0 on my Windows 10 VM:

    > python wakeup-fd-racer.py
    Attempt 0: start
    Attempt 0: FAILED, took 10.0160076 seconds
    select_calls = 2

(It may help that the VM only has 1 CPU? But the same test passes on Linux even when I use taskset to restrict it to 1 CPU. Maybe Windows has some scheduling heuristic where one thread writing to a socket when another thread is blocked on it triggers an immediate context switch.) -- Added file: http://bugs.python.org/file46797/wakeup-fd-racer.py ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30038> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30038] Race condition in how trip_signal writes to wakeup fd
Nathaniel Smith added the comment: If it helps, notice that the SetEvent(sigint_event) call used to wake up the main thread on windows is also performed unconditionally and after the call to Py_AddPendingEvent. From the point of view of twisted/tornado/trio, this is exactly the same as the write to the wakeup fd -- the only reason we use the wakeup fd instead of the sigint_event is that it's more convenient to wait on an fd than on an event object. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30038> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30038] Race condition in how trip_signal writes to wakeup fd
Nathaniel Smith added the comment: I think the idea in c13ef6664998 wasn't so much that we wanted the wakeup fd to be written to first, as that the way the code was written back then, the presence of 'if (is_tripped) return;' meant that it wasn't getting written to *at all* in some cases. Since then the code got restructured and the early return was removed, so this isn't an issue anymore. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30038> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30038] Race condition in how trip_signal writes to wakeup fd
Nathaniel Smith added the comment: Err, libuv obviously doesn't use a Python-level signal handler. I just meant to include them as another example of a library I checked that uses a self-pipe to handle signals but relies on out-of-band information to transmit what the actual signal is :-). -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30038> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30038] Race condition in how trip_signal writes to wakeup fd
Nathaniel Smith added the comment: Right. My claim would be that the PR I just submitted is the correct fix for bpo-21645 as well. The approach asyncio uses is very elegant, but unfortunately it assumes that the wakeup fd has infinite buffer, which isn't true. If enough signals or other callbacks are scheduled to the event loop quickly enough, then the buffer can overflow and signals can get lost. This is a very classic issue and source of confusion for everyone learning Unix – the reason that there are traditionally 32 unix signals, and that signals can be coalesced and SIGCHLD is so annoying, is that the kernel treats pending signals as flags, not a queue; this means that in the worst case it only takes 32 bits to store all pending signals. (Even the posix committee screwed this up when designing the real-time signals system, which is almost unusable as a result.) On Unix, if you send a process 10 SIGHUP and 1 SIGCHLD, then it might get only 1 SIGHUP and 1 SIGCHLD, but it's guaranteed to get that. With asyncio's model, it might get 10 SIGHUP and 0 SIGCHLD, or even 0 SIGHUP and 0 SIGCHLD (since other methods like call_soon_threadsafe also write to the wakeup fd). In any case, this is why other async libraries (at least twisted, libuv, tornado, trio) treat the wakeup fd as a boolean empty-or-not, and pass the actual information via some other out-of-band mechanism that involves a Python-level signal handler. So the patch here is necessary for everyone else to safely use set_wakeup_fd. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30038> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
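The out-of-band scheme described above can be sketched like so: the fd says only "wake up", while a Python-level handler records which signals arrived, coalescing them the way the kernel's own pending-signal flags do (illustrative code, POSIX-only):

```python
import os
import signal
import socket

pending = set()   # coalesced "pending signals", like the kernel's flags

def note_signal(signum, frame):
    pending.add(signum)   # the actual information travels out-of-band

rsock, wsock = socket.socketpair()
rsock.setblocking(False)
wsock.setblocking(False)

signal.signal(signal.SIGUSR1, note_signal)
signal.signal(signal.SIGUSR2, note_signal)
signal.set_wakeup_fd(wsock.fileno())

# Ten SIGHUP-alikes plus one SIGCHLD-alike: even if the wakeup
# buffer overflowed, the *set* above could not lose the SIGUSR2.
for _ in range(10):
    os.kill(os.getpid(), signal.SIGUSR1)
os.kill(os.getpid(), signal.SIGUSR2)

got = sorted(pending)
signal.set_wakeup_fd(-1)
```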
[issue30038] Race condition in how trip_signal writes to wakeup fd
Changes by Nathaniel Smith <n...@pobox.com>: -- pull_requests: +1226 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30038> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30039] Resuming a 'yield from' stack is broken if a signal arrives in the middle
Changes by Nathaniel Smith <n...@pobox.com>: -- pull_requests: +1224 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30039> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30039] Resuming a 'yield from' stack is broken if a signal arrives in the middle
New submission from Nathaniel Smith: If we have a chain of generators/coroutines that are 'yield from'ing each other, then resuming the stack works like:

- call send() on the outermost generator
- this enters _PyEval_EvalFrameDefault, which re-executes the YIELD_FROM opcode
- which calls send() on the next generator
- which enters _PyEval_EvalFrameDefault, which re-executes the YIELD_FROM opcode
- ...etc.

However, every time we enter _PyEval_EvalFrameDefault, the first thing we do is to check for pending signals, and if there are any then we run the signal handler. And if it raises an exception, then we immediately propagate that exception *instead* of starting to execute bytecode. This means that e.g. a SIGINT at the wrong moment can "break the chain" – it can be raised in the middle of our yield from chain, with the bottom part of the stack abandoned for the garbage collector. The fix is pretty simple: there's already a special case in _PyEval_EvalFrameEx where it skips running signal handlers if the next opcode is SETUP_FINALLY. (I don't see how this accomplishes anything useful, but that's another story.) If we extend this check to also skip running signal handlers when the next opcode is YIELD_FROM, then that closes the hole – now the exception can only be raised at the innermost stack frame. This shouldn't have any performance implications, because the opcode check happens inside the "slow path" after we've already determined that there's a pending signal or something similar for us to process; the vast majority of the time this isn't true. I'll post a PR in a few minutes that has a test case that demonstrates the problem and fails on current master, plus the fix. 
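A toy illustration of the invariant the fix restores (not the PR's actual test case): when an exception is delivered to a suspended 'yield from' chain, it should surface in the innermost frame, so every frame's finally block runs as the exception unwinds rather than the bottom of the stack being abandoned:

```python
# Build a two-deep 'yield from' chain and record cleanup order.
cleanups = []

def inner():
    try:
        yield "inner"
    finally:
        cleanups.append("inner")

def outer():
    try:
        yield from inner()
    finally:
        cleanups.append("outer")

g = outer()
next(g)                 # suspend the whole chain at the inner yield
try:
    # Deliver the exception at the innermost frame, the way a signal
    # should be delivered once the YIELD_FROM gap is closed:
    g.throw(KeyboardInterrupt)
except KeyboardInterrupt:
    pass
# Both finally blocks ran, innermost first.
```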
-- components: Interpreter Core messages: 291469 nosy: njs, yselivanov priority: normal severity: normal status: open title: Resuming a 'yield from' stack is broken if a signal arrives in the middle versions: Python 3.5, Python 3.6, Python 3.7 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30039> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30038] Race condition in how trip_signal writes to wakeup fd
New submission from Nathaniel Smith: In trip_signal [1], the logic goes:

1) set the flag saying that this particular signal was tripped
2) write to the wakeup fd
3) set the global is_tripped flag saying "at least one signal was tripped", and do Py_AddPendingCall (which sets some global flags that the bytecode interpreter checks on every pass through the loop)

So the problem here is that it's step (2) that wakes up the main thread to check for signals, but it's step (3) that actually arranges for the Python-level signal handler to run. (Step (1) turns out to be irrelevant, because no-one looks at the per-signal flags unless the global is_tripped flag is set. This might be why no-one noticed this bug through code inspection though – I certainly missed it, despite explicitly checking for it several times!) The result is that the following sequence of events is possible:

- signal arrives (e.g. SIGINT)
- trip_signal writes to the wakeup fd
- the main thread blocked in IO wait sees this, and wakes up
- the main thread checks for signals, and doesn't find any
- the main thread empties the wakeup fd
- the main thread goes back to sleep
- trip_signal sets the flags to request the Python-level signal handler be run
- the main thread doesn't notice, because it's asleep

It turns out that this is a real thing that can actually happen; it's causing an annoying intermittent failure in the trio testsuite on appveyor; and under the correct conditions I can reproduce it very reliably in my local Windows VM. See [2]. I think the fix is just to swap the order of steps (2) and (3), so we write to the wakeup fd last. Unfortunately I can't easily test this because I don't have a way to build CPython on Windows. But [2] has some IMHO pretty compelling evidence that this is what's happening. 
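The proposed reordering is the classic flag-before-wakeup pattern. A Python-level analog (illustrative, not CPython's actual C code): because the reader can only wake after the write to the fd, setting the flag first guarantees the woken reader observes it:

```python
import socket
import threading

rsock, wsock = socket.socketpair()
is_tripped = threading.Event()   # stands in for the C-level flag

def trip_signal_fixed():
    is_tripped.set()       # step (3) first: request the handler run
    wsock.send(b"\x00")    # step (2) last: wake the main thread

t = threading.Thread(target=trip_signal_fixed)
t.start()

rsock.recv(1)                      # "main thread" wakes on the fd...
flag_seen = is_tripped.is_set()    # ...and must now see the flag set
t.join()
```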
[1] https://github.com/python/cpython/blob/6fab78e9027f9ebd6414995580781b480433e595/Modules/signalmodule.c#L238-L291 [2] https://github.com/python-trio/trio/issues/119 -- messages: 291467 nosy: haypo, njs priority: normal severity: normal status: open title: Race condition in how trip_signal writes to wakeup fd versions: Python 3.5, Python 3.6, Python 3.7 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30038> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches
Nathaniel Smith added the comment: It looks like PyTorch got bitten by this: https://discuss.pytorch.org/t/pyslice-adjustindices-error/1475/11 -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29943> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches
Nathaniel Smith added the comment:

> Can we consider 3.6.0 rather than 3.6.1 as broken release?

In the last week, pypi downloads were about evenly split between 3.6.0 and 3.6.1 (2269969 for "3.6.1", 1927189 for "3.6.0", and those two were ~2 orders of magnitude more common than other strings like "3.6.1+", "3.6.0b2", etc. [1]). Not sure what to conclude from that, but certainly if people start uploading 3.6.1-only wheels right now then it will break things for a lot of end users. With my manylinux docker image maintainer hat on: we're currently shipping 3.6.0. I'm extremely confident that if we stick with this we'll never get any complaints about the obscure bug with malicious __index__ implementations that's being fixed here. OTOH if we upgrade to 3.6.1, or any version with this ABI change, then we'll definitely get many complaints so long as there's anyone at all still using 3.6.0, which is probably forever. So I'm not sure what incentive we would have to ever upgrade to 3.6.1+ if this ABI change is kept? (This isn't saying the bug is unimportant! But it sure is hard to sell its importance to folks trying to ship packages and support end-users...)

[1] Somewhat crude query I used in case it's useful for future reference:

    SELECT
      REGEXP_EXTRACT(details.python, r"^([^\.]+\.[^\.]+\.[^\.]+)") as python_version,
      COUNT(*) as download_count,
    FROM
      TABLE_DATE_RANGE(
        [the-psf:pypi.downloads],
        DATE_ADD(CURRENT_TIMESTAMP(), -7, "day"),
        DATE_ADD(CURRENT_TIMESTAMP(), 0, "day")
      )
    WHERE REGEXP_MATCH(details.python, r"^3\.6\.")
    GROUP BY python_version,
    ORDER BY download_count DESC
    LIMIT 100

-- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29943> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue29988] (async) with blocks and try/finally are not as KeyboardInterrupt-safe as one might like
New submission from Nathaniel Smith: You might hope the interpreter would enforce the invariant that for 'with' and 'async with' blocks, either '__(a)enter__' and '__(a)exit__' are both called, or else neither of them is called. But it turns out that this is not true once KeyboardInterrupt gets involved – even if we write our (async) context manager in C, or otherwise guarantee that the actual '__(a)enter__' and '__(a)exit__' methods are immune to KeyboardInterrupt. The invariant is *almost* preserved for 'with' blocks: there's one instruction (SETUP_WITH) that atomically (wrt signals) calls '__enter__' and then enters the implicit 'try' block, and there's one instruction (WITH_CLEANUP_START) that atomically enters the implicit 'finally' block and then calls '__exit__'. But there's a gap between exiting the 'try' block and WITH_CLEANUP_START where a signal can arrive and cause us to exit without running the 'finally' block at all. In this disassembly, the POP_BLOCK at offset 7 is the end of the 'try' block; if a KeyboardInterrupt is raised between POP_BLOCK and WITH_CLEANUP_START, then it will propagate out without '__exit__' being run:

    In [2]: def f():
       ...:     with a:
       ...:         pass
       ...:

    In [3]: dis.dis(f)
      2           0 LOAD_GLOBAL              0 (a)
                  3 SETUP_WITH               5 (to 11)
                  6 POP_TOP

      3           7 POP_BLOCK
                  8 LOAD_CONST               0 (None)
            >>   11 WITH_CLEANUP_START
                 12 WITH_CLEANUP_FINISH
                 13 END_FINALLY
                 14 LOAD_CONST               0 (None)
                 17 RETURN_VALUE

For async context managers, the race condition is substantially worse, because the 'await' dance is inlined into the bytecode:

    In [4]: async def f():
       ...:     async with a:
       ...:         pass
       ...:

    In [5]: dis.dis(f)
      2           0 LOAD_GLOBAL              0 (a)
                  3 BEFORE_ASYNC_WITH
                  4 GET_AWAITABLE
                  5 LOAD_CONST               0 (None)
                  8 YIELD_FROM
                  9 SETUP_ASYNC_WITH         5 (to 17)
                 12 POP_TOP

      3          13 POP_BLOCK
                 14 LOAD_CONST               0 (None)
            >>   17 WITH_CLEANUP_START
                 18 GET_AWAITABLE
                 19 LOAD_CONST               0 (None)
                 22 YIELD_FROM
                 23 WITH_CLEANUP_FINISH
                 24 END_FINALLY
                 25 LOAD_CONST               0 (None)
                 28 RETURN_VALUE

Really the sequence from 3 BEFORE_ASYNC_WITH to 9 
SETUP_ASYNC_WITH should be atomic wrt signal delivery, and from 13 POP_BLOCK to 22 YIELD_FROM likewise. This probably isn't the highest priority bug in practice, but I feel like it'd be nice if this kind of basic language invariant could be 100% guaranteed, not just 99% guaranteed :-). And the 'async with' race condition is plausible to hit in practice, because if I have an '__aenter__' that's otherwise protected from KeyboardInterrupt, then it can run for some time, and any control-C during that time will get noticed just before the WITH_CLEANUP_START, so e.g. 'async with lock: ...' might complete while still holding the lock. The traditional solution would be to define single "super-instructions" that do all of the work we want to be atomic. This would be pretty tricky here though, because WITH_CLEANUP_START is a jump target (so naively we'd need to jump into the "middle" of a hypothetical new super-instruction), and because the implementation of YIELD_FROM kind of assumes that it's a standalone instruction exposed directly in the bytecode. Probably there is some solution to these issues but some cleverness would be required. An alternative approach would be to keep the current bytecode, but somehow mark certain stretches of bytecode as bad places to run signal handlers. The eval loop's "check for signal handlers" code is run rarely, so we could afford to do relatively expensive things like check a lookaside table that says "no signal handlers when 13 < f_lasti <= 22". Or we could steal a bit in the opcode encoding or something. 
-- components: Interpreter Core messages: 291148 nosy: ncoghlan, njs, yselivanov priority: normal severity: normal status: open title: (async) with blocks and try/finally are not as KeyboardInterrupt-safe as one might like versions: Python 3.7 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29988> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue28629] Emit ResourceWarning when implicitly terminating a suspended frame?
Nathaniel Smith added the comment: It does make sense to skip emitting a warning if there's no try or with block active. Don't we already have the ability to check for this, though, without any extensions to the frame or code objects? That's what the public PyGen_NeedsFinalizing does, right? It's implemented as a for loop over the frame's block stack, which in most cases should be extremely small, so the performance cost of running this from generator.__del__ seems likely to be minimal. (I think currently the implementation is not *quite* correct if there's a 'yield from' active – in most cases it won't much matter because if A is yielding from B and A is collected, then B will probably be collected a moment later and have its own __del__ called. But this is not *quite* the same as what should happen, which is that collecting A should immediately call B.close(), which in principle might do something arbitrarily different than B.__del__. But adding a check for whether a 'yield from' is active would be pretty trivial.) -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue28629> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue29943] PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches
New submission from Nathaniel Smith: In the process of fixing issue 27867, a new function PySlice_AdjustIndices was added, and PySlice_GetIndicesEx was converted into a macro that calls this new function. The patch was backported to both the 3.5 and 3.6 branches, was released in 3.6.1, and is currently slated to be released as part of 3.5.4. Unfortunately, this breaks our normal ABI stability guarantees for micro releases: it means that if a C module that uses PySlice_GetIndicesEx is compiled against e.g. 3.6.1, then it cannot be imported on 3.6.0. This affects a number of high-profile packages (cffi, pandas, mpi4py, dnf, ...). The only workaround is that if you are distributing binary extension modules (e.g. wheels), then you need to be careful not to upgrade to 3.6.1. It's not possible for a wheel to declare that it requires 3.6.1-or-better, because CPython normally follows the rule that we don't make these kinds of changes. Oops. CC'ing Ned and Larry, because it's possible this should trigger a 3.6.2, and I think it's a blocker for 3.5.4. CC'ing Serhiy as the author of the original patch, since you probably have the best idea how this could be unwound with minimal breakage :-). python-dev discussion: https://mail.python.org/pipermail/python-dev/2017-March/147707.html Fedora bug: https://bugzilla.redhat.com/show_bug.cgi?id=1435135 -- messages: 290796 nosy: larry, ned.deily, njs, serhiy.storchaka priority: normal severity: normal status: open title: PySlice_GetIndicesEx change broke ABI in 3.5 and 3.6 branches versions: Python 3.5, Python 3.6 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29943> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue29926] time.sleep ignores _thread.interrupt_main()
Nathaniel Smith added the comment: (oh, in case it wasn't obvious: the advantage of raise() over kill() and pthread_kill() is that raise() works everywhere, including Windows, so it would avoid platform specific logic. Or if you don't like raise() for some reason then you can get the same effect by doing

    handler = PyOS_getsig(SIGINT);
    if (handler)
        handler(SIGINT);

) (Also, this is all assuming the python 3 logic on Windows. On Windows versions of python 2 I can't see any way to make time.sleep interruptible without significant rewriting. The problem is that on py2 the special Event used to signal the arrival of a control-C is a private variable that gets hooked up directly to Windows's low-level control-C logic, and there's effectively no way to get at it from outside. You might think GenerateConsoleCtrlEvent would help, but it is broken in too many ways for me to fit in this text box.) -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29926> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
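For what it's worth, modern Python exposes the raise() idea directly: signal.raise_signal (added in 3.8, after this report) delivers a signal to the calling thread and runs whatever Python-level handler is installed, as if the user had hit control-C:

```python
import signal

events = []

def on_sigint(signum, frame):
    events.append(signum)

old = signal.signal(signal.SIGINT, on_sigint)
signal.raise_signal(signal.SIGINT)   # handler runs in this thread
signal.signal(signal.SIGINT, old)    # restore the previous handler
```

With a custom handler installed as here, no KeyboardInterrupt is raised; with the default handler, the same call would produce one, which is exactly the "fake control-C" semantics discussed above.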
[issue29926] time.sleep ignores _thread.interrupt_main()
Nathaniel Smith added the comment: If you want to trigger the standard signal handling logic, then raise(SIGINT) is also an option. On Unix it probably won't help because of issue 21895, but using raise() here + fixing 21895 by using pthread_kill in the C-level signal handler would together give a pretty simple and robust way to pretend that the user hit control-C. But... interrupt_main isn't documented to raise a SIGINT, it's documented to raise KeyboardInterrupt. These are very different, e.g. if the user has installed a custom handler or even set SIGINT to SIG_IGN or masked it. (In the latter two cases, using pthread_kill to send a signal to the main thread *won't* interrupt any ongoing syscall.) This is *much* more difficult to make work correctly. And actually what IDLE wants is a fake control-C anyway! So I'd suggest either redefining the semantics of interrupt_main to be "acts as if SIGINT was received" or else abandoning interrupt_main to its slightly-buggy semantics and adding a new function that does raise(SIGINT) and switching IDLE to use this new function. And fix issue 21895 in either case. -- nosy: +njs ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29926> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread
Nathaniel Smith added the comment: Letting Python-level signal handlers run in arbitrary threads is an interesting idea, but it's a non-trivial change to Python semantics that may well break some programs (previously it was guaranteed that signal handlers couldn't race with main thread code), and it doesn't fully fix the bug here (because it's common for Python programs to contain threads which don't run Python code, e.g. if they use zmq). I can imagine some potential benefits, but I feel like this needs some substantial rationale? OTOH the pthread_kill-based fix I suggested is like 3 lines and AFAICT fixes the problem exactly.

> I dislike the idea of transfering the signal to the main thread from another thread in the C signal handler using pthread_kill(). Most code behave very badly when getting a signal, so getting a signal twice is likely to double the pain.

This doesn't make much sense to me... CPython's pretty good at handling EINTR these days, and, well, the only case that's different is when you have a signal handler that needs to be run and the main thread is blocked in a syscall, which is exactly the case that's currently broken?

> Moreover, pthread_sigmask() can block signals in the main thread for deliberate reasons.

Sure, and with pthread_kill() this would have the effect that the signal would be queued by the kernel and delivered after the signal is unmasked. That seems like pretty sensible semantics to me; in fact it's exactly what you get in single-threaded code. Compare that to right now where AFAICT pthread_sigmask() is pretty useless in the presence of threads, because of the thing where if one thread has a signal blocked then the kernel will pick another thread to deliver it to. 
> From the point of view of the Python signal handler, the current "if (PyThread_get_thread_ident() != main_thread) return 0;" code in the C signal handler is somehow an implicit pthread_sigmask(signal.SIG_BLOCK, range(1, signal.NSIG)) on all threads except of the main thread, whereas Unix gives a fine control on these masks with the pthread_sigmask() function.

That code isn't in the C signal handler, it's in the code that the interpreter runs occasionally to check if the C signal handler has been called, right? -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue21895> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread
Nathaniel Smith added the comment: @haypo: okay, looked over things over for a third time and this time I found my very silly error :-). So I'm now able to use set_wakeup_fd on Windows (https://github.com/python-trio/trio/pull/108), but not on Unix (https://github.com/python-trio/trio/issues/109). In any case, the issue here remains that one shouldn't have to use set_wakeup_fd for a signal to interrupt time.sleep etc. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue21895> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread
Nathaniel Smith added the comment: @haypo: It's a socketpair. It works fine when I set up a toy test case using set_wakeup_fd + select, and it works fine in my real code when I use CFFI cleverness to register a signal handler that manually writes a byte to my wakeup socket, but when I pass that exact same socket to set_wakeup_fd in my real code, it doesn't work. It's pretty mysterious, and I have no particular reason to think that the problem is in CPython as opposed to some stupid mistake I'm making. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue21895> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread
Nathaniel Smith added the comment: I don't really have a specific use case personally -- for trio, I haven't found a way to make use of set_wakeup_fd because of various issues[1], but I'm also not planning to use SIGCHLD, so this isn't very urgent. In general set_wakeup_fd can be a workaround, but I wanted to note this upstream because it's a bug in Python's signal handler logic. Note that Python already goes to great lengths to make e.g. signal handlers run during time.sleep on Windows; they ought to just work on Unix too. -- [1] (Off-topic but in case you're curious: I register actual signal handlers anyway because I follow the philosophy that the wakeup pipe should only be used for wakeups rather than transmitting information, so as long as signal handlers work I can do my own wakeups, + for some reason set_wakeup_fd doesn't work for me on Windows (no idea why, can't reproduce in simplified tests, might be my fault, need to debug), + set_wakeup_fd's handling of buffer overflow is broken for my purposes.) -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue21895> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread
Nathaniel Smith added the comment: It turns out that this bug is more general than signal.pause, and has caused problems for a few different people:

https://github.com/dabeaz/curio/issues/118#issuecomment-287735781
https://github.com/dabeaz/curio/issues/118#issuecomment-287798241
https://github.com/dabeaz/curio/issues/118#issuecomment-287887843

The basic problem is that CPython's signal handling strategy on Unix-likes assumes that if a signal is delivered to the process at a time when the main thread is blocked in a syscall, then that syscall will promptly exit with EINTR. So for example, time.sleep has some clever code on Windows to make it interruptible with control-C, but on Unix we assume that the kernel will break the sleep for us. The problem with this is that POSIX only guarantees that *some* thread will receive the signal -- not necessarily the main thread. In practice it seems like most implementations do deliver most signals to the main thread or CPython wouldn't have gotten away with this for as long as it has, but in particular it seems like Linux is happy to deliver SIGCHLD to random other threads. So the C-level signal handler runs in whatever thread, it sets the flag for the main thread to run the Python-level signal handler... and then the main thread sits there blocked in sleep() or select() or pause() or whatever, and never checks the flag. 
A simple solution would be to make sure signals are always delivered to the main thread, by adjusting the C-level signal handler to do something like:

    if (current_thread != main_thread) {
        pthread_kill(main_thread, sig_num);
        return;
    }

-- nosy: +njs title: signal.pause() doesn't wake up on SIGCHLD in non-main thread -> signal.pause() and signal handlers don't react to SIGCHLD in non-main thread versions: +Python 3.6, Python 3.7 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue21895> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
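A Python-level sketch of the same redirect (illustrative; the real fix would live in CPython's C handler, and pthread_kill is POSIX-only): a signal that fires in a non-main thread is forwarded to the main thread, where the Python-level handler then runs:

```python
import signal
import threading
import time

ran_in_main = []

def handler(signum, frame):
    # Record whether the Python-level handler ran in the main thread:
    ran_in_main.append(threading.current_thread() is threading.main_thread())

old = signal.signal(signal.SIGUSR1, handler)
main_ident = threading.main_thread().ident

def worker():
    # Stand-in for "the C-level handler fired in a non-main thread
    # and forwards the signal to the main thread":
    signal.pthread_kill(main_ident, signal.SIGUSR1)

t = threading.Thread(target=worker)
t.start()
t.join()

# The main thread notices the pending signal at its next checkpoint:
deadline = time.monotonic() + 5
while not ran_in_main and time.monotonic() < deadline:
    time.sleep(0.01)

signal.signal(signal.SIGUSR1, old)
```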