[Python-Dev] Re: [Steering-council] Re: PEP 651, Robust Stack Overflow Handling, Rejection notice

Mark Shannon Mon, 31 Jan 2022 02:45:34 -0800


On 31/01/2022 5:23 am, Gregory P. Smith wrote:

-cc: python-steering-council

On Fri, Mar 5, 2021 at 4:26 PM Guido van Rossum <gu...@python.org 
<mailto:gu...@python.org>> wrote:

    On Fri, Mar 5, 2021 at 11:11 AM Brett Cannon <br...@python.org 
<mailto:br...@python.org>> wrote:

        Speaking for myself ...


    Ditto ...

        On Fri, Mar 5, 2021 at 7:04 AM Mark Shannon <m...@hotpy.org 
<mailto:m...@hotpy.org>> wrote:
        [...]

            In some cases, the PEP would have improved the situation.

            For example:
            sys.setrecursionlimit(5000)
            def f():
                  f()

            Currently, it raises a RecursionError on linux, but crashes the
            interpreter on Windows.
            With PEP 651 it would have raised a RecursionError on both 
platforms.

            Am I missing something here?


        So your example shows a user already comfortable in raising their 
recursion limit to work around needing more stack space to reach completion. 
What is stopping the user from continuing to raise the limit until they still 
reach their memory limit even with PEP 651? If you're worried about runaway 
recursion you will very likely hit that with the default stack depth already, 
so I personally don't see how a decoupled stack counter from the C stack 
specifically makes it any easier/better to detect runaway recursion. And if I 
need more recursion than the default, you're going to bump the recursion depth 
anyway, which weakens the protection in either the C or decoupled counter 
scenarios. Sure, it's currently platform-specific, but plenty of people want to 
push that limit based on their machine anyway and don't need consistency on 
platforms they will never run on, i.e. I don't see a huge benefit to being able 
to say that an algorithm consistently won't go past 5000 calls on
        all platforms compared to what the C stack protection already gives us (not to 
say there's zero benefit, but it isn't massive or widespread either IMO). I personally 
just don't see many people saying, "I really want to limit my program to an exact 
call stack depth of 5000 on all platforms which is beyond the default, but anything under 
is fine and anything over -- regardless of what the system can actually handle -- is 
unacceptable".

        Tack on the amount of changes required to give a cross-platform stack 
count and limit check compared to the benefit being proposed, and to me that 
pushes what the PEP is proposing into net-negative payoff.


    To me, the point of that example is as a reminder that currently fiddling 
with the recursion limit can cause segfaults.

    Mark's PEP proposes two, somewhat independent, changes: (1) don't consume C 
stack on pure Python-to-Python (pp2p) function calls; (2) implement fool-proof 
C stack overflow checks.

    Change (2) makes it safe for users to mess with the stack overflow limit 
however they see fit. Despite (1), the limit for pp2p calls remains at 1000 so 
that users who unintentionally write some naively recursive code don't have to 
wait until they fill up all of memory before they get a traceback. (Of course 
they could also write a while-True loop that keeps adding an item to a list and 
they'd end up in the same situation. But in my experience that situation is 
less painful to deal with than accidental stack overflow, and I'd shudder at 
the thought of a traceback of a million lines.)

    Given that we have (1), why is (2) still needed? Because there are ways to 
recursively call Python code that aren't pp2p calls. By a pp2p (pure 
Python-to-Python) call, I mean any direct call, e.g. a method call or a 
function call. But what about other operations that can invoke Python code? 
E.g. if we have a dict d and a class C, we could create an instance of C and 
use it to index d, e.g. d[C()]. This operation is not a p2pp call -- the 
BINARY_SUBSCR opcode calls the dict's `__getitem__` method, and that calls the 
key's `__hash__` method. Here's a silly demonstration:
    ```
    def f(c):
         d = {}
         return d[c]

    class C:
         def __hash__(self):
             return f(self)

    f(C())
    ```
    Note that the "traceback compression" implemented for simple recursive 
calls fails here -- I just ran this code and got 2000 lines of output.

    The way I imagine Mark wants to implement pp2p calls means that in this 
case each recursion step *does* add several other C stack frames, and this 
would be caught by the limit implemented in (2). I see no easy way around this 
-- after all the C code involved in the recursion could be a piece of 3rd party 
C code that itself is not at fault.

    So we could choose to implement only (2), robust C stack overflow checks. 
This would require a bunch of platform-specific code, and there might be 
platforms where we don't know how to implement this (I vaguely recall a UNIX 
version where main() received a growable stack but each thread only had a fixed 
64kb stack), but those would be no worse off than today.

    Or we could choose to implement only (1), eliminating C stack usage for pp2p calls. But in that 
case we'd still need a recursion limit for non-pp2p calls. We could eliminate the recursion limit 
entirely for pp2p calls, but then we'd probably end up with user complaints -- I could imagine a 
framework that executes arbitrary user code and attempts recovery after receiving an exception (for 
example a REPL). Such a framework can never be 100% foolproof (e.g. `while True: l.append(1)`, or 
`os._exit(0)`), but accidental recursion would be a reasonable thing to attempt to curtail, so we'd 
still need to implement a limit on pp2p calls. Since the new pp2p implementation allows for a much 
larger recursion limit, Mark's design with two separate limits and exceptions makes sense -- 
"recursion" to limit pp2p call recursion, and "stack overflow" to limit the C 
stack overflow. To make things more reliable, it makes sense to throw in (2), better checks for C 
stack overflow, but that
    could be done independently -- we could just implement the two separate limits and 
exceptions and C-stack-frame-free pp2p calls without (2), and we'd be no worse off than 
today with regard to the safety of "wild" recursion like my example above.


/(Resurrecting a 9 month old thread a bit as C stack overflows came up in the context 
of https://github.com/python/cpython/pull/30855 
<https://github.com/python/cpython/pull/30855>)/

Regarding (2)... robust C stack overflow checks are hard - if even possible. At 
most you can get an improvement but never be able to guarantee it. What I've 
seen proposed for that in this thread I believe won't deliver in some real 
scenarios. Any code can spawn threads from extension modules or embedding code 
with whatever C stack size it wants. Those threads can (read: do) call back 
into the CPython interpreter. Any assumption of a constant minimum C stack size 
is incorrect unless you go with the absolute platform minimum possible value 
(too small). 512k is considered huge.

One way other language runtimes deal with stack overflows is via a 
sigaltstack'd SIGSEGV handler that introspects what caused the signal (!) to 
see if it was maybe a stack overflow and if so fixes things up (hello Golang 
dynamic stack implementation) to recover before resuming execution. This is 
difficult and potentially not even possible to guarantee correct when you don't 
control the entire runtime as Golang does.

There has been some interesting talk about techniques recently in The Orange Site's 
https://news.ycombinator.com/item?id=28682945 
<https://news.ycombinator.com/item?id=28682945> thread which has cropped up since this 
discussion. The idea around allocating our own C stack and swapping over to that is 
"neat" - but it'd be a challenge as we're a re-entrant runtime so we'd effectively 
need to grow a list of newly allocated C stacks for use upon every entry where we find one of 
our thread's stacks is not the one in use. I suspect such an implementation would probably be 
more complex than the feature is worth - but I'll leave judgment on that up to after anyone 
actually manages to implement a cross platform version of such an idea.

-gps


Robust stack overflow checks are hard in the most general case, yes.

But on Windows, or any posix system, we can ask the O/S what the stack bounds 
are. It's a lot simpler than many other features we try to support portably.

Cheers,
Mark.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/DEMG2NPV5FQP2FFKTEGOM7OAMPC6P6PW/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-Dev] Re: [Steering-council] Re: PEP 651, Robust Stack Overflow Handling, Rejection notice

Reply via email to