> On Jun 12, 2015, at 2:37 PM, Martin Teichmann <[email protected]> 
> wrote:
> 
> Hi Glyph, hi Guido, hi everyone.
> 
> You have two very different points of critique,
> let me respond to both of them:
> 
>> StreamWriter.drain cannot be called from different tasks (asyncio tasks,
>> that is) at the same time.
>> 
>> 
>> In my opinion, this is fine.  It should probably be some exception other 
>> than AssertionError, but it should be an exception.  Do not try to 
>> manipulate a StreamWriter or a StreamReader from multiple tasks at once.  If 
>> you encourage people to do this, it is a recipe for corrupted output streams 
>> and interleaved writes.
> 
> No, it isn't. Asyncio is completely single threaded, 

I am aware of that; that's why I said "tasks" and not "threads".

> only one task is running at any given time, until the next
> yield from. So no writes can ever be interleaved,
> unless you explicitly yield from something.

Yes.  However, if you were to consider something like

for each in range(10):
    writer.write(b"A")
    yield from writer.drain() # be polite! don't buffer too much at once!

you might reasonably expect, reasoning about this coroutine by itself, that you 
will see a string of 10 "A"s on the wire.  But of course you might get 
"ABABABABABABABABABAB" if another coroutine were doing the same thing with "B" 
elsewhere.
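
To make that concrete, here is a rough sketch (spam, writer, and loop are
just illustrative names, not anything from asyncio itself):

import asyncio

@asyncio.coroutine
def spam(writer, byte):
    # The same polite loop as above, parameterized on which byte to send.
    for each in range(10):
        writer.write(byte)
        yield from writer.drain()

# Run two of these against the *same* writer; whenever drain() actually
# has to wait for the transport, control can pass to the other task, and
# the bytes interleave on the wire:
#
#   loop.run_until_complete(asyncio.gather(spam(writer, b"A"),
#                                          spam(writer, b"B")))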

This becomes a practical concern when you have a protocol that involves 
potentially large messages (like everybody's favorite, HTTP) where yielding to 
drain() before writing chunks of those messages would be good form, but would 
be highly dangerous if you had concurrent writers.  Which, in the future, you 
likely will have, in the form of HTTP/2.
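
To sketch why that is so dangerous (the 4-byte length prefix here is a
made-up framing, just simpler to show than real HTTP chunking):

import asyncio

@asyncio.coroutine
def send_framed(writer, payload):
    # Hypothetical framing: a 4-byte big-endian length, then the payload,
    # drained every 4096 bytes; perfectly good form for a single task.
    writer.write(len(payload).to_bytes(4, "big"))
    for i in range(0, len(payload), 4096):
        writer.write(payload[i:i + 4096])
        yield from writer.drain()
    # But if a second task calls send_framed() on the same writer while
    # this one is suspended in drain(), its length header lands in the
    # middle of this payload.  Interleaving corrupts the framing, not just
    # the ordering, and the peer can no longer parse the stream at all.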

Keep in mind also that if you have multiple writers to a stream, you might get 
interleaved writes even if you are not calling drain(); anything you 'yield 
from' might suspend your coroutine and switch to another writer.
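
For instance (asyncio.sleep(0) below just stands in for any real suspension
point, like a timer or a database query):

import asyncio

@asyncio.coroutine
def chatty(writer, record):
    writer.write(record[:1])
    # Any 'yield from' is a potential task switch; it does not have to be
    # drain().  Another writer can slip its bytes in right here.
    yield from asyncio.sleep(0)
    writer.write(record[1:])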

>> Agreed. These streams should not be accessed "concurrently" by different 
>> coroutines. A single coroutine that repeatedly calls write() and eventually 
>> drain() is in full control over how many bytes it writes before calling 
>> drain(), and thus it can easily ensure the memory needed for buffering is 
>> strictly bounded. But if multiple independent coroutines engage in this 
>> pattern for the same stream, the amount of buffer space is not under control 
>> of any single coroutine. 
> 
> Guido definitely has a point here.
> But this problem is solvable: in a system where the writer is slow -
> and this is the situation in which you typically want to use drain() -
> all tasks will normally be waiting in drain() until the writer has
> reached its low-water mark again. Now my patch will resume all tasks.
> That is wrong. The correct solution is to resume only one task (first
> come, first served probably being best), and the others only once the
> writer is fine again.

It's theoretically solvable, but for social reasons this is still not something 
that you want to encourage.  Once you start having a fair queueing system 
around drain(), it raises the question of having a fair queueing system for 
writing whole large messages, which means you then need some kind of 
higher-level lock on the stream.  That becomes very confusing: every stream 
becomes its own full-fledged task scheduler.
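
Just to illustrate what that higher-level lock would look like (a sketch
only; the per-stream lock is something you would have to arrange yourself,
not an existing asyncio facility):

import asyncio

write_lock = asyncio.Lock()  # imagine one of these per stream

@asyncio.coroutine
def send_message(writer, chunks):
    # Serialize whole messages: only the task holding the lock may write
    # and drain, so messages never interleave.  But now every stream
    # carries its own queue of waiting tasks, i.e. its own little task
    # scheduler, which is exactly the confusion described above.
    with (yield from write_lock):
        for chunk in chunks:
            writer.write(chunk)
            yield from writer.drain()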

-glyph
