I suppose the websocket case ought to follow conventions similar to the
kernel's TCP API, where `close` returns immediately but the kernel
continues to send packets behind the scenes. It could look something
like this:


with move_on_after(10):
    msg = await get_ws_message(url)

async def get_ws_message(url):
    async def close():
        # attempt a graceful, protocol-level close first
        if sock and sock.is_connected and ...:
            await sock.send(build_close_packet())
            await sock.recv()  # or something
        if sock:
            sock.close()

    sock = socket.socket()
    try:
        await sock.connect(url)
        data = await sock.recv(...)
        return decode(data)
    finally:
        # detach the close handshake so the caller isn't blocked past
        # its own deadline, but give it a 30-second budget of its own
        with move_on_after(30):
            someio.spawn_task(close())


I believe the concern is more general than supporting "broken" protocols,
like websocket.

When someone writes `with move_on_after(N): a = await foo()`, it can be
understood in two ways:

* perform foo for at most N seconds, cancelling it if it runs longer, or
* I want the result within N seconds, or I will proceed without it

The latter doesn't imply that foo should be interrupted, only that the
caller wishes to proceed without the result. It makes sense when the
action involves an unrelated, long-running process, where `foo()` is
something like `anext(some_async_generator)`.
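
For the second reading, here's a minimal sketch with asyncio (the
helper name is made up; contrast asyncio.wait_for, which cancels the
awaitable on timeout):


import asyncio

async def result_within(aw, timeout):
    # "Give me the result within `timeout` seconds, or I'll proceed
    # without it": on timeout the task is left running, not cancelled.
    task = asyncio.ensure_future(aw)
    done, _pending = await asyncio.wait({task}, timeout=timeout)
    return task.result() if task in done else None

# e.g.: msg = await result_within(some_async_generator.__anext__(), 10)


The detached task's eventual result (or exception) remains the
caller's responsibility.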

Both solve the original concern: that the caller should not block for
more than N seconds. I suppose one can be implemented in terms of the
other.

Perhaps the latter is what `shield` should do? That is, detach the
computation, as opposed to blocking the caller past the caller's own
deadline?
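
One way to approximate that detaching behaviour today: hand the tail
of the work to a nursery that outlives the cancel scope. A sketch
(open_websocket, close_gracefully and decode are assumed helpers):


async def get_ws_message(url, nursery):
    sock = await open_websocket(url)  # assumed helper
    try:
        return decode(await sock.recv())
    finally:
        # hand the close handshake to a longer-lived task instead of
        # shielding; the caller's deadline only covers the recv above
        nursery.start_soon(close_gracefully, sock)


Since start_soon doesn't await anything, the handoff works even while
the surrounding scope is being cancelled.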

What do you all think?


On Mon, 15 Jan 2018 at 6:45 AM, Nick Badger <nbadg...@gmail.com> wrote:

>> However, I think this is probably a code smell. Like all code smells,
>> there are probably cases where it's the right thing to do, but when
>> you see it you should stop and think carefully.
>
>
> Huh. That's a really good point. But I'm not sure the source of the smell
> is the code that needs the shield logic -- I think this might instead be
> indicative of upstream code smell. Put a bit more concretely: if you're
> writing a protocol for an unreliable network (and of course, every network
> is unreliable), requiring a closure operation to transmit something over
> that network is inherently problematic, because it inevitably leads to
> multiple-stage timeouts or ungraceful shutdowns.
>
> Clearly, changing anything upstream is out of scope here. So if the smell
> is, in fact, "upwind", there's not really much you could do about that in
> asyncio, Curio, Trio, etc, other than minimize the additional smell you
> need to accommodate smelly protocols. Unfortunately, I'm not sure there's
> any one approach to that problem that isn't application-specific.
>
>
> Nick Badger
> https://www.nickbadger.com
>
> 2018-01-14 3:33 GMT-08:00 Nathaniel Smith <n...@pobox.com>:
>
>> On Fri, Jan 12, 2018 at 4:17 AM, Chris Jerdonek
>> <chris.jerdo...@gmail.com> wrote:
>> > Thanks, Nathaniel. Very instructive, thought-provoking write-up!
>> >
>> > One thing occurred to me around the time of reading this passage:
>> >
>> >> "Once the cancel token is triggered, then all future operations on
>> that token are cancelled, so the call to ws.close doesn't get stuck. It's a
>> less error-prone paradigm. ... If you follow the path we did in this blog
>> post, and start by thinking about applying a timeout to a complex operation
>> composed out of multiple blocking calls, then it's obvious that if the
>> first call uses up the whole timeout budget, then any future calls should
>> fail immediately."
>> >
>> > One case that's not clear how to address is the following.
>> > It's something I've wrestled with in the context of asyncio, and it
>> > doesn't seem to be raised as a possibility in your write-up.
>> >
>> > Say you have a complex operation that you want to be able to time out
>> > or cancel, but the process of cleanup / cancelling might also require
>> > a certain amount of time that you'd want to allow time for (likely a
>> > smaller time in normal circumstances). Then it seems like you'd want
>> > to be able to allocate a separate timeout for the clean-up portion
>> > (independent of the timeout allotted for the original operation).
>> >
>> > It's not clear to me how this case would best be handled with the
>> > primitives you described. In your text above ("then any future calls
>> > should fail immediately"), without any changes, it seems there
>> > wouldn't be "time" for any clean-up to complete.
>> >
>> > With asyncio, one way to handle this is to await on a task with a
>> > smaller timeout after calling task.cancel(). That lets you assign a
>> > different timeout to waiting for cancellation to complete.
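>> >
>> > For concreteness, that pattern looks something like this (timeout
>> > values made up):
>> >
>> > task = asyncio.ensure_future(complex_operation())
>> > done, _ = await asyncio.wait({task}, timeout=30)
>> > if task not in done:
>> >     task.cancel()
>> >     # separate, smaller budget for the cancellation to complete
>> >     await asyncio.wait({task}, timeout=5)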
>>
>> You can get these semantics using the "shielding" feature, which the
>> post discusses a bit later:
>>
>> try:
>>     await do_some_stuff()
>> finally:
>>     # Always give this 30 seconds to clean up, even if we've
>>     # been cancelled
>>     with trio.move_on_after(30) as cscope:
>>         cscope.shield = True
>>         await do_cleanup()
>>
>> Here the inner scope "hides" the code inside it from any external
>> cancel scopes, so it can continue executing even if the overall
>> context has been cancelled.
>>
>> However, I think this is probably a code smell. Like all code smells,
>> there are probably cases where it's the right thing to do, but when
>> you see it you should stop and think carefully. If you're writing code
>> like this, then it means that there are multiple different layers in
>> your code that are implementing timeout policies, that might end up
>> fighting with each other. What if the caller really needs this to
>> finish in 15 seconds? So if you have some way to move the timeout
>> handling into the same layer, then I suspect that will make your
>> program easier to understand and maintain. OTOH, if you decide you
>> want it, the code above works :-). I'm not 100% sure here; I'd
>> definitely be interested to hear about more use cases.
>>
>> One thing I've thought about that might help is adding a kind of "soft
>> cancelled" state to the cancel scopes, inspired by the "graceful
>> shutdown" mode that you'll often see in servers where you stop
>> accepting new connections, then try to finish up old ones (with some
>> time limit). So in this case you might mark 'do_some_stuff()' as being
>> cancelled immediately when we entered the 'soft cancel' phase, but let
>> the 'do_cleanup' code keep running until the grace period expired and
>> the region was hard-cancelled. This idea isn't fully baked yet though.
>> (There's some more mumbling about this at
>> https://github.com/python-trio/trio/issues/147.)
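>>
>> A rough approximation with today's scopes is to nest a shorter
>> "soft" deadline inside the hard one (the helpers here are made up):
>>
>> async def serve_with_grace():
>>     with trio.move_on_after(30):      # hard deadline for everything
>>         with trio.move_on_after(20):  # soft: stop taking new work
>>             await accept_new_connections()
>>         # grace period: whatever remains of the outer 30 seconds
>>         await finish_existing_connections()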
>>
>> -n
>>
>> --
>> Nathaniel J. Smith -- https://vorpus.org
>
_______________________________________________
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/
