[Python-Dev] Intended invariants for signals in CPython

2020-06-24 Thread Yonatan Zunger via Python-Dev
Hi everyone,

I'm in the process of writing some code to defer signals during critical
regions, which has involved a good deal of reading through the CPython
implementation to understand the behaviors. Something I've found is that
there appears to be a lot of thoughtfulness about where the signal handlers
can be triggered, but this thoughtfulness is largely undocumented. I've put
together a working list of behaviors from staring at the code, but what I'd
like to figure out is which of these behaviors the devs think of as
intended to be invariants, versus which are just accidents of how the code
currently works and might change unpredictably.

And if there are things which are intended to be genuine invariants, would
it be reasonable to document these formally and make them part of the
language, not just for inside the CPython codebase?

What appears to be true is this:

   - Signal handlers are only invoked in the main thread (documented with
   the signal library)
   - High-level: Signal handlers may be invoked at any instruction
   boundary. External C libraries *may* invoke them as well, but there are
   no general guarantees. (Documented with the signal library)
   - Low-level: Certain functions can be described as "interruptible," and
   signal handlers may be invoked whenever these functions are called.
   - Signal handlers are thus partially reentrant: a signal handler may be
   interrupted by another signal iff it invokes an interruptible function.
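
As a small illustration of the first rule (handlers run only in the main thread), here is a minimal sketch. The helper names record_thread, try_install_from_worker and raise_in_main are made up for this example; signal.raise_signal assumes Python 3.8 or later, and the snippet must itself be run from the main thread.

```python
import signal
import threading

handler_threads = []  # names of the threads in which the handler actually ran


def record_thread(signum, frame):
    # Python-level handler: record which thread is executing it.
    handler_threads.append(threading.current_thread().name)


def try_install_from_worker():
    """Call signal.signal() from a non-main thread and collect the error."""
    errors = []

    def worker():
        try:
            signal.signal(signal.SIGINT, record_thread)
        except ValueError as exc:  # documented behaviour outside the main thread
            errors.append(exc)

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return errors


def raise_in_main():
    """Install the handler from the main thread and send SIGINT to ourselves."""
    previous = signal.signal(signal.SIGINT, record_thread)
    try:
        signal.raise_signal(signal.SIGINT)
        for _ in range(1000):  # give the interpreter instruction boundaries
            pass
    finally:
        signal.signal(signal.SIGINT, previous)
```

Calling try_install_from_worker() and then raise_in_main() should leave exactly one ValueError in the returned list and the main thread's name in handler_threads.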

In particular, the thing whose intentionality I'm not sure about is whether
the notion of an interruptible function or instruction is meant to be an
actual property of the language and/or of the CPython runtime, or whether
it's actually intended that only the "high-level" rule above be true, and
that all signal handlers should be considered to be fully reentrant at all
times. The comments in sysmodule.c about avoiding triggering
PyErr_CheckSignals() suggest that there definitely is some thinking about
this within the CPython code itself.

The reason it would be useful to document this is so that if I'm trying to
write a fairly generic library that handles signals (like the one I'm doing
now) I can reason about where I need to be defensive about an instruction
being interrupted by yet another signal, and maybe avoid calls to certain
functions which are known to be interruptible, much like I would avoid
calling malloc() in a C signal handler.

In the current implementation, the interruptible functions and instructions
are:

Big categories:

   - Any function which calls PyErr_SetFromErrno, *if* errno == EINTR.
   (Catalogue needs to be made of these -- it's a much smaller set than the
   set of all calls to PyErr_SetFromErrno)
   - Basically any open, read, or write method of a raw or buffered file
   object.
   - Likewise, any open, read, or write method on a socket.
   - In any interactive console readline, or in input().
   - object.__str__, object.__repr__, and PyObject_Print, and anything that
   falls back to these.

Specific instructions:

   -
   - Multiplication, division, or stringification of long integers.

More specific functions:

   - In `multiprocessing.shared_memory`, SharedMemory.__init__, .close, and
   .unlink.
   - In `multiprocessing.semaphore`, Semaphore.acquire. (But interestingly,
   *not* threading.Semaphore.acquire)
   - In `signal`, pause, signal, sigwaitinfo, sigtimedwait, pthread_kill,
   and pthread_sigmask.
   - In `fcntl`, fcntl and ioctl.
   - In `traceback`, any of the print methods.
   - In `faulthandler`, dump_traceback.
   - In `select`, all of the methods (select, epoll, etc.).
   - In `time`, sleep.
   - In `curses`, whenever you look for key input.
   - In `tkinter`, during the main loop of a Tcl/Tk app.
   - During an SSL handshake.
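
To make one entry of this list concrete, here is a hedged sketch of time.sleep being interrupted. It is POSIX-only (it relies on SIGALRM and signal.setitimer), the names on_alarm and sleep_with_alarm are invented for the example, and the timings are arbitrary. Since PEP 475 (Python 3.5), the sleep resumes for the remaining time once the handler returns without raising.

```python
import signal
import time

events = []  # order in which things happened


def on_alarm(signum, frame):
    events.append("handler")


def sleep_with_alarm(alarm_after=0.05, sleep_for=0.3):
    """Arm a one-shot timer, then sleep past it; return the elapsed time."""
    previous = signal.signal(signal.SIGALRM, on_alarm)
    try:
        events.append("sleep-start")
        signal.setitimer(signal.ITIMER_REAL, alarm_after)
        started = time.monotonic()
        time.sleep(sleep_for)  # the handler is expected to run during this call
        elapsed = time.monotonic() - started
        events.append("sleep-end")
        return elapsed
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # disarm in case it never fired
        signal.signal(signal.SIGALRM, previous)
```

The handler entry lands between the two sleep markers, which is the sense in which sleep is an interruptible function here.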

-- 

Yonatan Zunger

Distinguished Engineer and Chief Ethics Officer

He / Him

[email protected]

100 View St, Suite 101

Mountain View, CA 94041

Humu.com · LinkedIn · Twitter

___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/W5LGEEWGGO7ODIAJXM54YSI2PZR5UO6Y/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Intended invariants for signals in CPython

2020-06-24 Thread Yonatan Zunger via Python-Dev
... Reading through more of the code, I realized that I greatly
underestimated the number of interruptible operations.

That said, the meta-question still applies: Are there things which are
generally intended *not* to be interruptible by signals, and if so, is
there some consistent way of indicating this?


[Python-Dev] Re: Intended invariants for signals in CPython

2020-06-25 Thread Yonatan Zunger via Python-Dev
I'm taking it from this thread that suppressing signals in a small window
is not something anyone in their right mind would really want to attempt.
:) (Or that if they did, it would have to be through a proper change to the
runtime, not something higher-level)
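
For comparison only, and not the interpreter-level suppression discussed in this thread: on POSIX the OS signal mask can already hold a signal pending over a short window. The sketch below assumes a single-threaded process (the mask is per-thread) and Python 3.8+ for signal.raise_signal; signals_blocked, on_usr1 and demo are hypothetical names.

```python
import signal
from contextlib import contextmanager

received = []  # signal numbers seen by the Python-level handler


def on_usr1(signum, frame):
    received.append(signum)


@contextmanager
def signals_blocked(*signums):
    """Keep the given signals pending at the OS level inside the block."""
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, signums)
    try:
        yield
    finally:
        # Restoring the old mask lets any pending signal be delivered.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


def demo():
    previous = signal.signal(signal.SIGUSR1, on_usr1)
    try:
        with signals_blocked(signal.SIGUSR1):
            signal.raise_signal(signal.SIGUSR1)  # stays pending while blocked
            seen_inside = list(received)
        for _ in range(1000):  # instruction boundaries for the handler to run
            pass
        return seen_inside, list(received)
    finally:
        signal.signal(signal.SIGUSR1, previous)
```

This defers delivery rather than dropping the signal, and it does nothing for signals aimed at other threads, so it is only a partial answer to the problem described above.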

On Thu, Jun 25, 2020 at 7:14 AM Antoine Pitrou wrote:

>
> On 25/06/2020 at 16:00, Guido van Rossum wrote:
> > On Thu, Jun 25, 2020 at 02:02 Antoine Pitrou wrote:
> >
> > > ...  The intent, though, is that any function
> > > waiting on an external event (this can be a timer, a socket, a
> > > lock, a directory...) should be interruptible so that Ctrl-C works in
> > > an interactive prompt.
> >
> > That’s not really true though right? Locks can block the REPL.
>
> On POSIX they don't.  On Windows it's a long-standing bug:
> https://bugs.python.org/issue29971
>
> Regards
>
> Antoine.


[Python-Dev] Re: Intended invariants for signals in CPython

2020-06-25 Thread Yonatan Zunger via Python-Dev
Also, just to sanity-check that I understand things correctly: Python
signal handlers *are* reentrant, in that a signal handler can be
interrupted by another signal, is that right? Is there any general
recommendation on how to write signal handlers in order to manage that?
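
A minimal sketch of the nesting being asked about, assuming POSIX (SIGUSR1/SIGUSR2), Python 3.8+ and a call from the main thread. The handler names and run_nested_demo are made up for illustration, and the second signal is raised from inside the first handler to make the timing deterministic.

```python
import signal

trace = []  # order in which handler code ran


def outer_handler(signum, frame):
    trace.append("outer-start")
    # Deliver a second, different signal while this handler is still running.
    signal.raise_signal(signal.SIGUSR2)
    for _ in range(1000):  # give the interpreter a chance to check for signals
        pass
    trace.append("outer-end")


def inner_handler(signum, frame):
    trace.append("inner")


def run_nested_demo():
    old1 = signal.signal(signal.SIGUSR1, outer_handler)
    old2 = signal.signal(signal.SIGUSR2, inner_handler)
    try:
        signal.raise_signal(signal.SIGUSR1)
        for _ in range(1000):
            pass
    finally:
        signal.signal(signal.SIGUSR1, old1)
        signal.signal(signal.SIGUSR2, old2)
    return trace
```

If the inner entry shows up between the two outer markers, the first handler was interrupted by the second one.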

(Antoine, I *so* wish I could be doing less with signals and signal
handlers right now. Alas, I have a combination of a SIGTERM-happy runtime
environment and a long-story situation involving wacky multiprocessing to
avoid issues in someone else's C library that make that impossible. So
instead I'm trying to write a general library to help simplify the task,
and so thinking about a lot of slightly nutty corner cases...)


[Python-Dev] Re: Intended invariants for signals in CPython

2020-06-25 Thread Yonatan Zunger via Python-Dev
I had not -- thank you!

On Thu, Jun 25, 2020 at 1:49 PM Chris Jerdonek wrote:

> On Wed, Jun 24, 2020 at 5:15 PM Yonatan Zunger via Python-Dev wrote:
>
>> That said, the meta-question still applies: Are there things which are
>> generally intended *not* to be interruptible by signals, and if so, is
>> there some consistent way of indicating this?
>>
>
> Yonatan, Nathaniel Smith wrote an interesting post a few years ago that
> includes some background about signal handling:
> https://vorpus.org/blog/control-c-handling-in-python-and-trio/
> Have you seen that?
>
> --Chris
>


[Python-Dev] Re: Intended invariants for signals in CPython

2020-06-25 Thread Yonatan Zunger via Python-Dev
HOLY CRAP THIS IS MADNESS. I kind of love it. :)

And it's related to some other problems that have been on my mind (how to
"paint" stack frames with user-defined variables, with those variables then
being used by things like CPU/heap profilers as smart annotations), and I
have to say it's a damned clever solution to the problem.


[Python-Dev] Re: Intended invariants for signals in CPython

2020-06-25 Thread Yonatan Zunger via Python-Dev
What, weird edge cases involving *signals?* Never! :)

Here's a nice simple one: it takes at least a few opcodes to set said
global flag, during which (depending on the whims of how eval_breaker gets
set) yet another signal might get raised and handled.
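
To show where that window sits, here is a hedged sketch of the global-flag approach. All names (in_handler, deferred, process, guarded_handler, run_guarded_demo) are hypothetical, it assumes POSIX and Python 3.8+, and it is not a safe implementation: the comments mark the gaps where another signal could slip through.

```python
import signal
from collections import deque

in_handler = False  # global reentrancy flag
deferred = deque()  # signals that arrived while a handler was running
handled = []        # order in which signals were actually processed


def process(signum):
    handled.append(("start", signum))
    if signum == signal.SIGUSR1:
        signal.raise_signal(signal.SIGUSR2)  # simulate a signal mid-handler
        for _ in range(1000):
            pass
    handled.append(("end", signum))


def guarded_handler(signum, frame):
    global in_handler
    if in_handler:
        deferred.append(signum)  # nested call: remember it and return quickly
        return
    # A signal handled between the check above and the assignment below
    # would not see the flag; this is the few-opcode window described above.
    in_handler = True
    try:
        process(signum)
        while deferred:
            process(deferred.popleft())
    finally:
        # A signal deferred after the loop exits but before this reset
        # would be left unprocessed in the queue.
        in_handler = False


def run_guarded_demo():
    old1 = signal.signal(signal.SIGUSR1, guarded_handler)
    old2 = signal.signal(signal.SIGUSR2, guarded_handler)
    try:
        signal.raise_signal(signal.SIGUSR1)
        for _ in range(1000):
            pass
    finally:
        signal.signal(signal.SIGUSR1, old1)
        signal.signal(signal.SIGUSR2, old2)
    return handled
```

In this deterministic run the nested SIGUSR2 is processed only after SIGUSR1 has finished; real signal timing gives no such guarantee.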

I did just make a post to python-ideas about the possibility of adding a
"sys.suppress_signals" method; it seems like it would be surprisingly easy
in CPython (basically by just adding another check at the start of
_PyErr_CheckSignalsTstate) but would also be a truly impressive footgun.
Not sure if I'm going to try to climb that particular mountain yet, but I
figured I'd see what obvious holes other people could poke in it.

Thanks for your help!

On Thu, Jun 25, 2020 at 1:27 PM Antoine Pitrou wrote:

> On Thu, 25 Jun 2020 11:18:13 -0700
> Yonatan Zunger via Python-Dev wrote:
> > Also, just to sanity-check that I understand things correctly: Python
> > signal handlers *are* reentrant, in that a signal handler can be
> > interrupted by another signal, is that right? Is there any general
> > recommendation on how to write signal handlers in order to manage that?
>
> To be honest, I've never thought about that.  If you need to care about
> reentrancy, you should perhaps use some kind of global flag to detect
> it (hopefully you won't run into weird edge cases...).
>
> > (Antoine, I *so* wish I could be doing less with signals and signal
> > handlers right now. Alas, I have a combination of a SIGTERM-happy runtime
> > environment and a long-story situation involving wacky multiprocessing to
> > avoid issues in someone else's C library that make that impossible. So
> > instead I'm trying to write a general library to help simplify the task,
> > and so thinking about a lot of slightly nutty corner cases...)
>
> Ha, I wish you good luck with that :-)
>
> Best regards
>
> Antoine.


[Python-Dev] Re: Intended invariants for signals in CPython

2020-06-29 Thread Yonatan Zunger via Python-Dev
Whew. Nick, Antoine, and Chris, thanks to each of you for your feedback --
with it, I *think* I've managed to write a pure-Python signal suppression
library. I'm nowhere near confident enough in its handling of corner cases
yet to release it to the general public, but hopefully I'll be able to
acquire that faith in it over time and do that.

(It ended up involving: a new & improved Semaphore class with some extra
features such as pausability; a signal handler that puts things in a
SimpleQueue [thanks, Antoine] and dequeues them when the semaphore is empty
and paused; and creative use of with instead of try/finally, to leverage
some of the ideas in the blog post Chris linked and to manage reentrancy.
With all that, the main thread can meaningfully know when it needs to defer
dealing with a signal until later so that the threads can safely finish.
Whew.)
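
The queueing part of that design can be sketched roughly as follows. This is not the library itself, just a minimal illustration with made-up names (pending, enqueue_signal, drain, demo); it leaves out the semaphore entirely and assumes POSIX and Python 3.8+. The CPython docs note that SimpleQueue.put has a reentrant C implementation, which is what makes it attractive inside a handler.

```python
import queue
import signal

pending = queue.SimpleQueue()  # put() is reentrant in CPython's C implementation


def enqueue_signal(signum, frame):
    # Do the minimum inside the handler: record the signal and return.
    pending.put(signum)


def drain(handle):
    """Process every queued signal; call only at a point known to be safe."""
    count = 0
    while True:
        try:
            signum = pending.get_nowait()
        except queue.Empty:
            return count
        handle(signum)
        count += 1


def demo():
    seen = []
    previous = signal.signal(signal.SIGUSR1, enqueue_signal)
    try:
        for attempt in range(2):
            signal.raise_signal(signal.SIGUSR1)
            for _ in range(1000):  # let the Python-level handler run
                pass
        before = list(seen)  # nothing handled yet, only queued
        drained = drain(seen.append)
    finally:
        signal.signal(signal.SIGUSR1, previous)
    return before, drained, seen
```

The real work happens in drain(), which the caller invokes once the critical region is over.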

Good rule of thumb: If your Python code has comments talking about specific
opcodes, you are writing some Really Interesting Python Code. :)

Yonatan

On Sat, Jun 27, 2020 at 10:26 PM Nick Coghlan wrote:

> On Fri., 26 Jun. 2020, 7:02 am Chris Jerdonek wrote:
>
>> On Wed, Jun 24, 2020 at 5:15 PM Yonatan Zunger via Python-Dev wrote:
>>
>>> That said, the meta-question still applies: Are there things which are
>>> generally intended *not* to be interruptible by signals, and if so, is
>>> there some consistent way of indicating this?
>>>
>>
>> Yonatan, Nathaniel Smith wrote an interesting post a few years ago that
>> includes some background about signal handling:
>> https://vorpus.org/blog/control-c-handling-in-python-and-trio/
>>
>
> Related to that is this CPython bug report:
> https://bugs.python.org/issue29988
>
> The short version is that Greg Smith and I tried to close some of the
> remaining signal safety holes a couple of years ago, and I made it as far
> as building better tools for provoking the bugs (this is the origin of
> per-opcode tracing hooks in CPython), but we never came up with an actual
> solution.
>
> So the workaround remains to run anything that absolutely cannot be
> interrupted by poorly timed signals in a subthread, and dedicate the main
> thread to signal handling.
>
> Cheers,
> Nick.
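
A rough sketch of the workaround Nick describes, with the main thread dedicated to signal handling and the uninterruptible work in a subthread. The names stop_requested, request_stop, critical_work and main_thread_supervisor are invented for the example, the workload is a placeholder, and the signal is raised by hand to simulate bad timing (Python 3.8+).

```python
import signal
import threading

stop_requested = threading.Event()


def request_stop(signum, frame):
    # Runs in the main thread only; the worker never executes this code.
    stop_requested.set()


def critical_work(results):
    """Placeholder for work that must not be interrupted by a handler."""
    total = 0
    for i in range(100_000):
        total += i
    results.append(total)


def main_thread_supervisor():
    previous = signal.signal(signal.SIGTERM, request_stop)
    results = []
    worker = threading.Thread(target=critical_work, args=(results,))
    try:
        worker.start()
        signal.raise_signal(signal.SIGTERM)  # simulate a poorly timed signal
        worker.join()  # let the critical work finish before acting on it
        for _ in range(1000):  # make sure the handler has had a chance to run
            pass
    finally:
        signal.signal(signal.SIGTERM, previous)
    return results, stop_requested.is_set()
```

The worker's result is complete even though the signal arrived while it was running, and the main thread can act on stop_requested afterwards.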