Hi,

I think the peculiarities of doing file I/O in asyncio applications aren't 
as well documented as they could be.

The PEP (3156) doesn't say much, except that disk files can't be used with 
I/O callbacks. The terms 'file descriptor' and 'file-like object' are used a 
whole lot, but they don't apply to what a non-expert user would consider a 
file. Fine, PEPs aren't aimed at novice users anyway, I think.

The main docs (https://docs.python.org/3/library/asyncio.html) again don't 
say much. 

https://docs.python.org/3/library/asyncio-eventloop.html#connect-pipes 
mentions that pipes are file-like objects, and links to the definition of a 
file-like object (https://docs.python.org/3/glossary.html#term-file-object) 
where it basically states that file-like objects can be files. Technically 
true, but potentially misleading to a novice. Everyone on this list knows 
how file descriptors relate to sockets, files and pipes, but someone at my 
company who needs to develop with asyncio, and who is not an expert, might 
conclude that passing an open disk file to connect_read_pipe or 
connect_write_pipe will work.

Then, Googling around we find:

* https://code.google.com/p/tulip/wiki/ThirdParty#Filesystem - finally, a 
clear message: asyncio does *not* support asynchronous operations on the 
filesystem. Even if files are opened with O_NONBLOCK, read and write will 
block.
* http://stackoverflow.com/questions/26916294/python-asyncio-read-file-and-execute-another-activity-at-intervals 
- says it can't be done.
* https://groups.google.com/forum/#!topic/python-tulip/MvpkQeetWZA - can't 
be done. Other systems use threads. Use a threadpool (somehow).
* http://stackoverflow.com/questions/3908809/the-state-of-linux-async-io - 
useful background on the state of asynchronous disk I/O on Linux, if you 
broaden the search.

What I'm proposing is this:
* for the official docs:
  * a note at the beginning making it clear that asyncio does not, in 
    general, work on files, and that the terms 'file descriptor' and 
    'file-like object' are used in a more specific, UNIX-like sense; also a 
    link to a "Develop with asyncio" chapter, which I'm about to describe
  * a new "Develop with asyncio" chapter which explains how files should be 
    handled, using run_in_executor and maybe separate processes (either 
    subprocess or a process pool executor) - see the sketches after this 
    list
* for the examples:
  * an example of serving a small file (read into memory first) using the 
    default run_in_executor
  * an example of serving a large file using the default run_in_executor
  * an example of serving a large file using a separate process pool 
    executor
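
To make the default-executor examples concrete, here is a rough sketch of 
what I have in mind (3.4-style coroutines; read_file, handle_request and 
'example.txt' are placeholder names, and actually serving the data is 
elided):

    import asyncio

    def read_file(path):
        # Plain blocking read; it runs in the loop's default executor
        # (a thread pool), so the event loop itself never blocks on disk.
        with open(path, 'rb') as f:
            return f.read()

    @asyncio.coroutine
    def handle_request(loop, path):
        # Passing None selects the loop's default executor.
        data = yield from loop.run_in_executor(None, read_file, path)
        # ... write 'data' to a transport/StreamWriter here ...
        return len(data)

    loop = asyncio.get_event_loop()
    size = loop.run_until_complete(handle_request(loop, 'example.txt'))
    print('read', size, 'bytes without blocking the loop')
    loop.close()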
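
And for the large-file case with a separate process pool, something along 
these lines (again just a sketch; read_chunk, CHUNK, 'big-file.bin' and the 
worker count are illustrative, and writing to the client is elided):

    import asyncio
    from concurrent.futures import ProcessPoolExecutor

    CHUNK = 1024 * 1024  # 1 MiB per executor call; purely illustrative

    def read_chunk(path, offset):
        # Runs in a worker process, so a slow or contended disk never
        # stalls the event loop (or even the main process).
        with open(path, 'rb') as f:
            f.seek(offset)
            return f.read(CHUNK)

    @asyncio.coroutine
    def serve_large_file(loop, pool, path):
        offset = 0
        while True:
            data = yield from loop.run_in_executor(pool, read_chunk,
                                                   path, offset)
            if not data:
                break
            # ... write 'data' to the client's transport here ...
            offset += len(data)
        return offset

    if __name__ == '__main__':
        loop = asyncio.get_event_loop()
        pool = ProcessPoolExecutor(max_workers=2)
        total = loop.run_until_complete(
            serve_large_file(loop, pool, 'big-file.bin'))
        print('served', total, 'bytes')
        pool.shutdown()
        loop.close()

Reading in chunks keeps each executor call short and the pickled results 
small, which I suspect matters more for a process pool than for the thread 
pool case above.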

What do you folks think? I'm willing to work on all of these if you think 
it's a good idea.
