#36714: Async signals lose ContextVar state due to use of asyncio.gather
----------------------------------+------------------------------------
     Reporter:  Mykhailo Havelia  |                    Owner:  (none)
         Type:  Bug               |                   Status:  new
    Component:  HTTP handling     |                  Version:  dev
     Severity:  Normal            |               Resolution:
     Keywords:  asyncio, signals  |             Triage Stage:  Accepted
    Has patch:  0                 |      Needs documentation:  0
  Needs tests:  0                 |  Patch needs improvement:  0
Easy pickings:  0                 |                    UI/UX:  0
----------------------------------+------------------------------------
Comment (by Mykhailo Havelia):

 Replying to [comment:10 Carlton Gibson]:

 Hi βœ‹

 > I continue to think that it would be a shame to not at least try to
 maintain the concurrent signal dispatch here.

 I've spent quite a bit of time trying to make this work, and I believe I
 now have enough evidence that, unfortunately, we can't do it πŸ˜….

 Consider the example below:

 {{{
 import asyncio
 import contextvars
 import time

 from asgiref.sync import async_to_sync, sync_to_async


 async def sync_to_async_version():
     context = contextvars.copy_context()

     def sync_fn():
         # some cpu-bound task
         time.sleep(0.1)

     async def async_fn():
         # to enforce release an event loop (or replace with
 time.sleep(0.1))
         asyncio.sleep(0)
         asyncio.sleep(0)

     async with asyncio.TaskGroup() as tg:
         tg.create_task(sync_to_async(context=context)(sync_fn)())
         tg.create_task(async_fn(), context=context)


 async_to_sync(sync_to_async_version)()
 }}}


 Running this results in (probably 2-3 times to reproduce):

 {{{
 RuntimeError: cannot enter context: <_contextvars.Context object at
 0x7f08aa52c100> is already entered
 }}}
 == Why this happens

 The core issue is that a single `contextvars.Context` cannot be entered
 more than once at the same time. Under normal circumstances, this error
 does not occur because the asyncio event loop is cooperative and single-
 threaded. Tasks are interleaved, not executed in parallel. Internally, the
 loop drives each coroutine step by step like this:

 {{{
 ctx.run(coro.send(None))
 }}}

 (see:
 
[https://github.com/python/cpython/blob/d3c888b4ec15dbd7d6b6ef4f15b558af77c228af/Lib/asyncio/tasks.py#L252
 asyncio/tasks.py])

 Because everything runs on the same thread, the context is entered and
 exited sequentially, so no conflict occurs.

 In our case, however, `asgiref.sync_to_async` uses a `ThreadPoolExecutor`
 to run the synchronous function on a separate OS thread. Here’s the
 problematic sequence:

 - The thread pool starts executing
 [https://github.com/django/asgiref/blob/main/asgiref/sync.py#L491-L501
 `ctx.run(long_sync_task)`], and the Context becomes entered on the worker
 thread.
 - While that thread is still running, the OS scheduler switches back to
 the event loop thread.
 - The event loop continues executing the coroutine and tries to resume it
 using:
 {{{ctx.run(coro.send(None))}}}
 - This attempts to enter the same Context again, while it's already active
 on another thread, and because Context objects are not re-entrant, Python
 raises:
 {{{RuntimeError: cannot enter context: <Context> is already entered}}}

 == Key takeaways

 We cannot control OS thread scheduling, so we can't avoid conflicts when
 the same Context is shared across threads. Consequently, the only safe way
 to share a Context between sync and async signal handlers is to run sync
 handlers and async handlers serially rather than concurrently.

 Can I prepare the MR with these changes?

 **BTW:**

 There are a lot of passing tests in the PR
 ([https://github.com/django/django/pull/20288/files#diff-
 f15e01d2278c7416164820f9d95d2f02859da046ba67c8473320ecdd7ee140ceR651-R883
 see]), mainly because the handlers run too quickly to trigger the error.
 In real-world projects, signal handlers are usually slower, which makes
 these race conditions much more likely to appear πŸ˜….
-- 
Ticket URL: <https://code.djangoproject.com/ticket/36714#comment:21>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/django-updates/0107019ae014b2d0-d8a0c571-0c4a-43ed-8135-41283b88f8e8-000000%40eu-central-1.amazonses.com.

Reply via email to