dcelasun
    I think it's worth it. The context passes through most parts of the stack, 
and having a test to ensure it's propagated properly feels important.
    Also, we can at least reduce the flakiness with longer timers, say a 2s sleep 
and a 1s timeout. The test suite already takes quite a while to run, so an 
additional second should be fine.


