I think you are absolutely right, Tin. As a newcomer to asyncio who had played with Node.js before, I was very surprised that asyncio does not support file I/O. (It was also interesting to learn that Node.js uses threads to deal with files, because there are no portable, reliable async APIs for file operations.)
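
For what it's worth, the same thread-based approach is already available in asyncio through run_in_executor. Below is a minimal sketch, not a definitive recipe. It assumes a recent Python (3.7+, for asyncio.run and get_running_loop), and the helper names read_file_blocking and read_file are made up for illustration:

```python
import asyncio
import os
import tempfile


def read_file_blocking(path):
    # Ordinary blocking read; it runs in a worker thread, not on the event loop.
    with open(path, "rb") as f:
        return f.read()


async def read_file(path):
    # Hypothetical helper: passing None selects the loop's default executor,
    # which is a thread pool.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, read_file_blocking, path)


async def main():
    # Create a small temporary file so the sketch is self-contained.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello from a worker thread\n")
    try:
        return await read_file(tmp.name)
    finally:
        os.remove(tmp.name)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The event loop stays free to run other coroutines while the worker thread is blocked in read(). For a large file, the same pattern could be applied chunk by chunk rather than reading everything into memory at once.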
As a first step, your documentation fixes would be a great contribution. Eventually asyncio should implement its own asynchronous API for handling files, like Node.js did.

Best,

Luciano

On Sat, Feb 28, 2015 at 12:47 PM, Tin Tvrtković <[email protected]> wrote:
> Hi,
>
> I think the peculiarities of doing file I/O in asyncio applications aren't
> as well documented as they could be.
>
> The PEP doesn't say much, except that disk files can't be used with I/O
> callbacks. The terms file descriptor and file-like object are used a whole
> lot, but they don't apply to what a non-expert user would consider a file.
> Fine, PEPs aren't aimed at novice users anyway, I think.
>
> The main docs (https://docs.python.org/3/library/asyncio.html) again don't
> say much.
>
> https://docs.python.org/3/library/asyncio-eventloop.html#connect-pipes
> mentions that pipes are file-like objects, and links to the definition of a
> file-like object (https://docs.python.org/3/glossary.html#term-file-object)
> where it basically states that file-like objects can be files. Technically
> true, but might be misleading to a novice. Everyone on this list might know
> the relationship of file descriptors and sockets/files/pipes, but someone at
> my company needing to develop with asyncio, who is not an expert, might
> think passing an open file to connect_read/write_pipe might work.
>
> Then, Googling around we find:
>
> * https://code.google.com/p/tulip/wiki/ThirdParty#Filesystem - finally, a
> clear message: asyncio does not support asynchronous operations on the
> filesystem. Even if files are opened with O_NONBLOCK, read and write will
> block
> *
> http://stackoverflow.com/questions/26916294/python-asyncio-read-file-and-execute-another-activity-at-intervals
> - says it can't be done.
> * https://groups.google.com/forum/#!topic/python-tulip/MvpkQeetWZA - can't
> be done. Other systems use threads. Use a threadpool (somehow).
> * http://stackoverflow.com/questions/3908809/the-state-of-linux-async-io -
> if you broaden your search
>
> What I'm proposing is this:
> * for the official docs:
> * * a note at the beginning making clear asyncio doesn't generally work on
> files, and that the terms 'file descriptor' and 'file-like object' are used
> in a more specific, UNIX-type of sense. Also a link to a "Develop with
> asyncio" chapter which I'm about to mention
> * * A new "Develop with asyncio" chapter which explains how files should be
> handled, using run_in_executor and maybe separate processes (either
> subprocess or a process pool executor)?
> * for the examples:
> * * an example of serving a small file (read into memory first) using the
> default run_in_executor
> * * an example of serving a large file using the default run_in_executor
> * * an example of serving a large file using a separate process pool
> executor
>
> What do you folks think? I'm willing to work on all of these if you think
> it's a good idea.

--
Luciano Ramalho
Twitter: @ramalhoorg

Professor em: http://python.pro.br
Twitter: @pythonprobr
