On Mon, Aug 25, 2014 at 6:25 AM, Martin Richard <[email protected]> wrote:
> On Monday, August 25, 2014 2:03:36 PM UTC+2, Victor Stinner wrote:
>>
>> Hi,
>>
>> It's probably a bug.
>>
>
> Ok, should I open an issue?
>

Hold on, it doesn't seem to be a bug, although it may be a poorly
designed feature.

create_server() has a backlog parameter (defaulting to 100) and must
call listen() to implement it. AFAIU listen() is needed to set the
socket in listening mode. If we skipped the listen() call, requiring
the caller to make it, we would have a backwards incompatibility, and
that may cause problems.

Such a backwards incompatible change is not 100% disallowed: asyncio is
in "provisional mode" until Python 3.5, meaning small API adjustments
are allowed based on issues that came up during the 3.4 cycle.

But before I accept this as an exception I'd like to understand your
use case better. Have you actually run into a situation where the
previously established backlog was important yet impossible to retrieve
(so you could not possibly pass the correct backlog parameter to
create_server())?

>> > I also have questions about StreamWriter and the flow control system.
>> >
>> > I understood that I am expected to yield from writer.drain() after any
>> > call to writer.write(), so the flow control mechanism can make the
>> > calling task wait until the buffer gets downsized to the low-water
>> > limit.
>>
>> Nope, see the documentation:
>>
>> "drain():
>> Wait until the write buffer of the underlying transport is flushed."
>>
>> > I don't understand why the writer.write[lines]() functions are not
>> > coroutines which actually yield from writer.drain(), nor why the
>> > "yield from writer.drain()" is not performed before the call to
>> > write().
>>
>> The purpose of a buffer is performances. You may be able to pack
>> multiple small writes into a single call to socket.send(). Flushing
>> after each call to stream.write() would call socket.send() each time,
>> which is less efficient.
>>
>
> That is what the documentation says, but it's almost a contradiction with
> the next sentence: I fail to understand why it doesn't wait if the protocol
> is not paused. The protocol will only be paused if the buffer reaches the
> high water limit, thus, drain() will indeed not wait for the underlying
> buffer to be flushed in most cases.
> If this is true, I also don't get how we can be notified that the
> high-water limit has been reached using StreamWriter(). If as a user i keep
> calling write(), I can always fill my buffer without knowing that the other
> end can not keep up.
>

Right. The situation where you aren't required to call drain() is
pretty specific, but it is also pretty common -- it is for those
situations where the nature of your application implies that you won't
be writing a lot of data before you have a natural point where your
code yields anyway. For example in an http client there's probably a
pretty low practical limit (compared to the typical buffer size) of the
size of all headers combined, so you won't need to drain() between
headers, even if you use a separate write() call for each header.
(However, once you are sending unlimited data, e.g. a request body, you
should probably insert drain() calls.)

So the guideline is, if you call write() in an unbounded loop that
doesn't contain yield-from, you should definitely call drain(); if you
call write() just a few times with bounded data, you don't need to
bother.

FWIW, there is a subtle API usability issue that made me design write()
this way. A lot of code calls write() without checking for the return
value, so if write() was a coroutine, forgetting to add "yield from" in
front of a write() call would be pretty painful to debug.
Input calls don't have this problem (at least not to the same extent) --
you rarely call read() or readline() without immediately doing
something with the result, so if you forget the yield-from with one of
these your code will most likely crash instead of being silent or
hanging.

>> In fact, you don't need to wait for drain(), asyncio automatically
>> flushs the buffer "in background". drain() is only required when you
>> have to respect a protocol, for example write and then read when the
>> write is done.
>>
>> > On a related topic, is there a reason why StreamWriter does not have a
>> > flush() coroutine, or any other way to wait until the buffer is empty?
>> > The only workaround I've got for this is to temporarily force high and
>> > low water limits to 0 so writer.drain() will wait until the buffer is
>> > actually empty.
>>
>> Limits are only used to pause the protocol. The protocol is not
>> directly related to the buffer.
>>
>
> So on which object do these limits apply?
>

On the StreamWriter object.

> An example of situation which I can't solve is when I want to run the loop
> until the transport wrote everything. I think there is currently no way to
> synchronize on this event.
>

Why do you want to do that? It seems you are still struggling with
figuring out how to use asyncio well, hence your requests for features
it does not want to provide. Or are you trying to wrap it into an
existing API that you cannot change for backwards compatibility
reasons? In that case perhaps you should try to use bare protocols and
transports instead of stream readers and writers.

--
--Guido van Rossum (python.org/~guido)
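
A minimal sketch of the drain() guideline discussed above, for
illustration only. It assumes a Python version with async/await (the
later spelling of this thread's yield-from). The names write_headers(),
write_body() and RecordingWriter are hypothetical; RecordingWriter is a
stand-in that mimics only the write()/drain() surface of
asyncio.StreamWriter so the example runs without a network connection.

```python
import asyncio


class RecordingWriter:
    """Hypothetical stand-in for asyncio.StreamWriter (write/drain only)."""

    def __init__(self):
        self.calls = []

    def write(self, data):
        # Like StreamWriter.write(): a plain method that only buffers.
        self.calls.append(("write", data))

    async def drain(self):
        # Like StreamWriter.drain(): a coroutine. The real one may suspend
        # the caller while writing is paused by flow control.
        self.calls.append(("drain",))


def write_headers(writer, headers):
    # Bounded data, a handful of small writes: no drain() in between.
    for name, value in headers:
        writer.write(f"{name}: {value}\r\n".encode("ascii"))
    writer.write(b"\r\n")


async def write_body(writer, chunks):
    # Potentially unbounded loop: drain() after each write so a slow
    # peer applies back-pressure instead of growing the buffer.
    for chunk in chunks:
        writer.write(chunk)
        await writer.drain()


async def main():
    writer = RecordingWriter()
    write_headers(writer, [("Host", "example.com"), ("Accept", "*/*")])
    await write_body(writer, [b"chunk-1", b"chunk-2"])
    return writer.calls


calls = asyncio.run(main())
```

Because the two functions touch only write() and drain(), they should
work unchanged with a real StreamWriter obtained from
asyncio.open_connection().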
