Hello,

On Thu, 11 Jun 2015 02:05:23 -0700 (PDT)
Martin Teichmann <[email protected]> wrote:

[]

> What I am doing is the following: several tasks in my program are
> generating big amounts of data to be shipped out on a StreamWriter.
> This can easily overload the receiver of all that data. This is why
> every task, after calling writer.write also calls
> "yield from writer.drain()". Unfortunately, while draining
> another task may write to the same stream writer, also wants to call
> drain. This raises an AssertionError.

This is a big problem, and one I have wanted to write about for a long
time. The root of the problem, however, is not drain() but the
synchronous write() method, whose semantics seem designed to make DoS
attacks easy on the platform where the code runs: it is required to
buffer unlimited amounts of data, which is not possible on any physical
platform and will only lead to excessive virtual memory swapping and
out-of-memory kills on real systems (hence the reference to DoS).

Can we please, please have an async_write() method? Two boundary
implementations of it would be:

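# Same behavior as currently - unlimited buffering
def async_write(self, data):
    return self.write(data)
    yield  # never reached; its presence makes this a generator (coroutine)


# Memory-conscious implementation
def async_write(self, data):
    self.write(data)
    yield from self.drain()  # wait for the write buffer to drain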

But the point is that it is the asyncio implementation that gets to
decide on the behavior. For example, there can be an implementation which
decides how much to buffer based on physical memory availability, which
is a global shared resource and thus needs to be scheduled globally.
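
As a rough sketch of what such global scheduling could look like, the
following caps the total number of bytes in flight across all writers.
Everything here is an assumption made for illustration: ByteBudget,
SlowWriter and awrite_with_budget() are hypothetical names, the limit is a
fixed number rather than a measurement of available memory, and each chunk
is assumed to be no larger than the limit (a larger one would wait forever).

```python
import asyncio


class ByteBudget:
    """Hypothetical process-wide cap on bytes buffered by all writers."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0
        self.peak = 0  # highest value of `used` seen so far
        self._cond = asyncio.Condition()

    async def acquire(self, n):
        async with self._cond:
            # Wait until the new chunk fits under the global limit.
            await self._cond.wait_for(lambda: self.used + n <= self.limit)
            self.used += n
            self.peak = max(self.peak, self.used)

    async def release(self, n):
        async with self._cond:
            self.used -= n
            self._cond.notify_all()  # wake writers waiting for room


class SlowWriter:
    """Stand-in for a stream writer; not the real asyncio StreamWriter."""

    def __init__(self):
        self.sent = 0
        self._pending = 0

    def write(self, data):
        self._pending += len(data)

    async def drain(self):
        await asyncio.sleep(0)  # pretend the peer consumes the data
        self.sent += self._pending
        self._pending = 0


async def awrite_with_budget(writer, data, budget):
    # Reserve room in the global budget before buffering anything.
    await budget.acquire(len(data))
    try:
        writer.write(data)
        await writer.drain()
    finally:
        await budget.release(len(data))


async def demo_budget():
    budget = ByteBudget(limit=100)  # created inside the running loop
    writers = [SlowWriter() for _ in range(5)]
    await asyncio.gather(
        *(awrite_with_budget(w, b"x" * 40, budget) for w in writers)
    )
    return budget.peak, sum(w.sent for w in writers)
```

Five tasks each write 40 bytes, but the shared budget never lets more than
100 bytes be pending at once, regardless of which writer they belong to.
A per-Transport limit cannot express that kind of system-wide constraint.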

Contrast that with Transport-local flow control, which is how the problem
is supposed to be tackled currently. A particular Transport has no
insight into how the complete system is functioning, so it cannot
schedule resources efficiently. Moreover, Transports are a separate,
tangled API. People who want to use asyncio want to deal with nice
Python coroutines, and suddenly face a completely foreign "Transport API"
picked up from a 20-year-old legacy package. The only reasonable response
to expect from people is that they will ignore all of its complexities,
which is the road to DoS, as described above.


So, as I already argued previously on this list and on python-dev, please
kindly structure the asyncio API in such a way that alternative
implementations are possible which support the asyncio coroutine
paradigm but are devoid of extra layers not directly related to this
native paradigm. Such layers should not be on the critical path to being
able to use asyncio in this native way. Examples of such layers are
Future (discussed before) and Transport (discussed here).

An implementation of these ideas is available in uasyncio, an asyncio
implementation for MicroPython. The async_write() method above is called
awrite() there, for brevity:
https://github.com/micropython/micropython-lib/blob/master/uasyncio/uasyncio/__init__.py#L102

There is also a compatibility layer which implements a uasyncio-compatible
API on top of CPython asyncio (by monkey-patching):

https://github.com/micropython/micropython-lib/blob/master/cpython-uasyncio/uasyncio.py#L83



Thanks for your consideration of this issue!



-- 
Best regards,
 Paul                          mailto:[email protected]
