On Wed, Mar 28, 2012 at 06:45, Victor Stinner <victor.stin...@gmail.com> wrote:
> We have a crash in our product when tracing is enabled by
> sys.settrace() and threading.settrace(). If a Python generator is
> created in a C thread, calling the generator later in another thread
> may crash if Python tracing is enabled.
>
>  - the C thread calls PyGILState_Ensure() which creates a temporary
> Python thread state
>  - a generator is created, the generator has a reference to a Python
> frame which keeps a reference to the temporary Python thread state
>  - the C thread calls PyGILState_Releases() which destroys the
> temporary Python thread state
>  - when the generator is called later in another thread, call_trace()
> reads the Python thread state from the generator frame, which is the
> destroyed frame => it does crash on a pointer dereference if the
> memory was reallocated (by malloc()) and the data were erased
>

This is timely.  We've seen several similar bugs in our 3.1.2-based
embedded product--in fact, I'm tracking down what might be another
this week.

The common theme is that C extension code that depends on
PyGILState_Ensure() for GIL management runs the risk of causing latent
crashes in any situation that involves preserving a frame beyond the
lifetime of the thread that created it.

In our case, the culprits have usually been the frames attached to
exceptions thrown across the extension->Python->embedding app
boundaries.

From a theoretical standpoint, I can't quite decide what the real error is:

1) the fact that PyGILState_Release() destroys a temporary thread
state that may still be referenced by some objects, or
2) the fact that some code is trying to keep frame objects after the
creating thread state no longer exists.

This week I've been leaning toward 2), but then I realized that
keeping frames post-thread-death is not that uncommon (for example,
debuggers and other diagnostic techniques like
http://nedbatchelder.com/blog/200711/rethrowing_exceptions_in_python.html).

Locally we added some (unfortunate) code to our 3.1.2 port to wrap
PyGILState_Ensure(), which I thought had sidestepped the issue for us:

void takeGIL()
{
    PyGILState_Ensure();
    // This has the side effect of keeping such thread states alive until
    // the interpreter is finalized; however, all thread state objects get
    // unconditionally deleted during Py_Finalize, so they won't leak.
    PyThreadState* pThreadState = PyGILState_GetThisThreadState();
    if (pThreadState->gilstate_counter == 1)
    {
        ++pThreadState->gilstate_counter;
    }
}

But clearly that can't be a correct answer (and it may not even be a
functioning one, given that I'm seeing a similar issue again).
-- 
Tim Lesher <tles...@gmail.com>
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to