Hi, I think the peculiarities of doing file I/O in asyncio applications aren't as well documented as they could be.
The PEP doesn't say much, except that disk files can't be used with I/O callbacks. The terms "file descriptor" and "file-like object" are used a lot, but they don't refer to what a non-expert user would consider a file. Fair enough; PEPs aren't aimed at novice users anyway.

The main docs (https://docs.python.org/3/library/asyncio.html) again don't say much. https://docs.python.org/3/library/asyncio-eventloop.html#connect-pipes mentions that pipes are file-like objects and links to the definition of a file-like object (https://docs.python.org/3/glossary.html#term-file-object), which states that file-like objects can be files. Technically true, but potentially misleading to a novice. Everyone on this list knows how file descriptors relate to sockets, files and pipes, but someone at my company who needs to develop with asyncio and is not an expert might well assume that passing an open file to connect_read_pipe/connect_write_pipe will work.

Googling around, we find:

* https://code.google.com/p/tulip/wiki/ThirdParty#Filesystem - finally, a clear message: asyncio does *not* support asynchronous operations on the filesystem. Even if files are opened with O_NONBLOCK, read and write will block.
* http://stackoverflow.com/questions/26916294/python-asyncio-read-file-and-execute-another-activity-at-intervals - says it can't be done.
* https://groups.google.com/forum/#!topic/python-tulip/MvpkQeetWZA - can't be done; other systems use threads, so use a thread pool (somehow).
* http://stackoverflow.com/questions/3908809/the-state-of-linux-async-io - broadening the search leads to the wider question of the state of Linux async I/O.

What I'm proposing is this:

* for the official docs:
* * a note at the beginning making clear that asyncio doesn't generally work on disk files, and that the terms "file descriptor" and "file-like object" are used in a more specific, UNIX-style sense, plus a link to the "Develop with asyncio" chapter proposed below
* * a new "Develop with asyncio" chapter which explains how files should be handled, using run_in_executor and possibly separate processes (either subprocess or a process pool executor)?
* for the examples (a rough sketch of these follows at the end of this message):
* * an example of serving a small file (read into memory first) using the default run_in_executor
* * an example of serving a large file using the default run_in_executor
* * an example of serving a large file using a separate process pool executor

What do you folks think? I'm willing to work on all of these if you think it's a good idea.
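To make the run_in_executor examples concrete, here is a rough, untested sketch of the kind of thing I have in mind, written in the pre-3.5 yield-from style. The file paths, chunk size and helper names (read_whole_file, read_chunk, serve_small_file, serve_large_file, handle_client) are placeholders I made up for illustration; the only stdlib pieces involved are loop.run_in_executor, asyncio.start_server and concurrent.futures.ProcessPoolExecutor.

import asyncio
import concurrent.futures

# Placeholder paths and chunk size, for illustration only.
SMALL_FILE = 'small.txt'
LARGE_FILE = 'large.bin'
CHUNK_SIZE = 64 * 1024

def read_whole_file(path):
    # Plain blocking read; meant to run in an executor, not on the event loop.
    with open(path, 'rb') as f:
        return f.read()

def read_chunk(path, offset, size):
    # Blocking positional read of one chunk; also meant for an executor.
    with open(path, 'rb') as f:
        f.seek(offset)
        return f.read(size)

@asyncio.coroutine
def serve_small_file(writer, loop):
    # Small file: read it into memory in one executor call, then write it out.
    data = yield from loop.run_in_executor(None, read_whole_file, SMALL_FILE)
    writer.write(data)
    yield from writer.drain()

@asyncio.coroutine
def serve_large_file(writer, loop, executor=None):
    # Large file: read chunk by chunk so the whole file is never held in memory.
    # executor=None uses the loop's default thread pool; passing a
    # ProcessPoolExecutor moves the blocking reads out of the process instead.
    offset = 0
    while True:
        chunk = yield from loop.run_in_executor(
            executor, read_chunk, LARGE_FILE, offset, CHUNK_SIZE)
        if not chunk:
            break
        writer.write(chunk)
        yield from writer.drain()
        offset += len(chunk)

@asyncio.coroutine
def handle_client(reader, writer):
    loop = asyncio.get_event_loop()
    yield from serve_small_file(writer, loop)
    # or: yield from serve_large_file(writer, loop, proc_pool)
    writer.close()

def main():
    loop = asyncio.get_event_loop()
    # For the process-pool variant, create the executor once and reuse it:
    # proc_pool = concurrent.futures.ProcessPoolExecutor()
    server = loop.run_until_complete(
        asyncio.start_server(handle_client, '127.0.0.1', 8888))
    try:
        loop.run_forever()
    finally:
        server.close()
        loop.run_until_complete(server.wait_closed())
        loop.close()

if __name__ == '__main__':
    main()

The only difference between the thread-pool and process-pool variants is which executor is handed to run_in_executor; for a process pool the callable and its arguments have to be picklable, which is why the sketch passes a path and an offset rather than an open file object.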
