Re: Asynchronous IO
Hi,

On Fri, Apr 13, 2001 at 04:45:07AM -0400, Dan Maas wrote:

> IIRC the problem with implementing asynchronous *disk* I/O in Linux today is
> that the filesystem code assumes synchronous I/O operations that block the
> whole process/thread. So implementing "real" asynch I/O (without the
> overhead of creating a process context for each operation) would require
> re-writing the filesystems as non-blocking state machines. Last I heard this
> was a long-term goal, but nobody's done the work yet

SGI and Ben LaHaise both have kernel async IO functionality working, and
Ingo Molnar's Tux code has support for doing certain filesystem lookup
operations asynchronously too.

--Stephen

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Asynchronous IO
--On Friday, April 13, 2001 04:45:07 -0400 Dan Maas <[EMAIL PROTECTED]> wrote:

> IIRC the problem with implementing asynchronous *disk* I/O in Linux today
> is that the filesystem code assumes synchronous I/O operations that block
> the whole process/thread. So implementing "real" asynch I/O (without the
> overhead of creating a process context for each operation) would require
> re-writing the filesystems as non-blocking state machines. Last I heard
> this was a long-term goal, but nobody's done the work yet (aside from
> maybe the SGI folks with XFS?). Or maybe I don't know what I'm talking
> about...

If the FS supports generic read then this is not a problem. This is what
SGI's KAIO does as well as Bart's work.

> Bart, glad to hear you are working on an event interface, sounds cool! One
> feature that I really, really, *really* want to see implemented is the
> ability to block on a set of any "waitable kernel objects" with one
> syscall - not just file descriptors, but also SysV semaphores and message
> queues, UNIX signals and child processes, file locks, pthreads condition
> variables, asynch disk I/O completions, etc. I am dying for a clean way to
> accomplish this that doesn't require more than one thread... (Win32 and
> FreeBSD kick our butts here with MsgWaitForMultipleObjects() and
> kevent()...) IMHO cleaning up this API deficiency is just as important as
> optimizing the extreme case of socket I/O with zillions of file
> descriptors...

Actually, sigwaitinfo() has zero problem waiting on multiple signals. If you
are using real-time signals, each signal can pass a pointer to the relevant
object, so even if you're only blocking on a single signal you can receive
info about several objects.

--Chris
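As a rough illustration of the sigwaitinfo() point above, here is a minimal sketch in which one real-time signal number carries pointers to different objects. The struct waitable type and the three helper names are made up for this example; only sigprocmask(), sigqueue() and sigwaitinfo() are real POSIX calls. It assumes a single-threaded process (a threaded program would use pthread_sigmask() instead).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical application object that a queued signal refers to. */
struct waitable {
    const char *name;
    int ready;
};

/* Block SIGRTMIN so instances stay queued until sigwaitinfo() collects
 * them. Returns 0 on success, -1 on error. */
int block_rt_signal(void)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/* Queue one SIGRTMIN to this process, carrying a pointer to obj. */
int post_object(struct waitable *obj)
{
    union sigval v;
    v.sival_ptr = obj;
    return sigqueue(getpid(), SIGRTMIN, v);
}

/* Wait for the next queued SIGRTMIN and return the object it carries,
 * or NULL if sigwaitinfo() fails (for example with EINTR). */
struct waitable *wait_object(void)
{
    sigset_t set;
    siginfo_t info;
    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);
    if (sigwaitinfo(&set, &info) < 0)
        return NULL;
    return info.si_value.sival_ptr;
}
```

A caller would run block_rt_signal() once at startup, then loop on wait_object(); each return value identifies which object needs attention. Queued instances of the same real-time signal are delivered in the order they were sent, so one blocked signal number is enough to learn about several objects.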
Re: Asynchronous IO
IIRC the problem with implementing asynchronous *disk* I/O in Linux today is
that the filesystem code assumes synchronous I/O operations that block the
whole process/thread. So implementing "real" asynch I/O (without the
overhead of creating a process context for each operation) would require
re-writing the filesystems as non-blocking state machines. Last I heard this
was a long-term goal, but nobody's done the work yet (aside from maybe the
SGI folks with XFS?). Or maybe I don't know what I'm talking about...

Bart, glad to hear you are working on an event interface, sounds cool! One
feature that I really, really, *really* want to see implemented is the
ability to block on a set of any "waitable kernel objects" with one
syscall - not just file descriptors, but also SysV semaphores and message
queues, UNIX signals and child processes, file locks, pthreads condition
variables, asynch disk I/O completions, etc. I am dying for a clean way to
accomplish this that doesn't require more than one thread... (Win32 and
FreeBSD kick our butts here with MsgWaitForMultipleObjects() and
kevent()...) IMHO cleaning up this API deficiency is just as important as
optimizing the extreme case of socket I/O with zillions of file
descriptors...

Regards,
Dan
Re: Asynchronous io
Hi CJ,

you should really read the thread titled "Linux's implementation of poll()
not scalable?" in the LKML archives. Here is a link:

http://www.uwsg.iu.edu/hypermail/linux/kernel/0010.3/0003.html

There are many problems with the /dev/something interface for events, and
all of them are described in that thread.

I have worked on a way suggested by Linus to get rid of the hit in
performance when using select() and poll(). I have a working model for TCP
sockets (as that is what I wanted to speed up - a TCP based proxy). My
implementation is still in alpha but is available here:

http://www.jukie.net/~bart/kernel/fdevent/

Now, before anyone gets too excited... I spoke with Linus about this and he
suggested that I speak with Ben LaHaise, who is working on async io using
some modifications to the wait queue. I have sent him mail but have not
heard from Ben - I guess he must be as busy as the rest of us with a full
mailbox of messages that he has no time to reply to. :)

My implementation introduces two new system calls: bind_event and
get_events (as described in Linus' email above). The project is still in an
alpha stage so I don't have any benchmarks. I am working on this on my own
time so progress is moving at a slow pace... unfortunately.

Regards,
Bart.

On Thu, 12 Apr 2001, CJ wrote:

> //Linux really needs a clean basis for asynchronous and
> //unbuffered i/o libraries. Something like the fork/thread
> //clone(), but to replace select() and aio_* polling. This
> //might be a start. And it is just a file and very like a
> //pipe or socket.
>
> //Suppose we add /dev/qio with 64 byte sectors as follows:
>
> struct qio{             //64 byte i/o request
>   u16   flags;          //0.0 request block variant, SEEK_SET...
>   u16   verb;           //0.2 open,close,read,mmap,sync,write,
>                         //ioctl
>                         //mallocIO&read,write&freeIO,
>                         //mallocIO,freeIO
>                         //autothread might be an ioctl()
>   u16   errno;          //0.4 per request status
>   u16   completehow;    //0.6 queue,AST,pipe,SIGIO,SIGIO||delete ok
>   u64   offset;         //1
>   u32   length;         //2.0 bytes requested
>   u32   timeout;        //2.4 in ms or us?
>   u32   transferred;    //3.0 bytes
>   u32   qiohandle;      //3.4 for cancel or polling
>   void* handle;         //4 (open & close might write)
>   void* buffer;         //5
>   void* callback;       //6 optimize special cases w/ completehow
>   void* callparam;      //7
> };                      //all fields are read xor write
>
> //Writing to the device would schedule i/o, reading would reap
> //completions. Bad writes would give the byte offset to the
> //rejected sector field if detected synchronously. Multiple
> //sector writes would be truncated on the first bad sector.
> //Accepted writes would be buffered in the kernel.
>
> //Each open creates a new queue, each write is read in the
> //same queue. Any number of threads can read or write a queue.
>
> //some cases might be simplified by kernel processed completions,
> //such as VMS AST emulation, or putting results in a pipe. Hence
> //completehow, which might use callback and callparam.
>
> //timeout?
> //canceling i/o?
> //Sun aio emulation?
> //VMS qio emulation?
> //MS IOCP emulation?
> //malloc()&free() safe across threads?
> //Should O_DIRECT error unless properly aligned etc.?

--
WebSig: http://www.jukie.net/~bart/sig/
Asynchronous io
//Linux really needs a clean basis for asynchronous and
//unbuffered i/o libraries. Something like the fork/thread
//clone(), but to replace select() and aio_* polling. This
//might be a start. And it is just a file and very like a
//pipe or socket.

//Suppose we add /dev/qio with 64 byte sectors as follows:

struct qio{             //64 byte i/o request
  u16   flags;          //0.0 request block variant, SEEK_SET...
  u16   verb;           //0.2 open,close,read,mmap,sync,write,
                        //ioctl
                        //mallocIO&read,write&freeIO,
                        //mallocIO,freeIO
                        //autothread might be an ioctl()
  u16   errno;          //0.4 per request status
  u16   completehow;    //0.6 queue,AST,pipe,SIGIO,SIGIO||delete ok
  u64   offset;         //1
  u32   length;         //2.0 bytes requested
  u32   timeout;        //2.4 in ms or us?
  u32   transferred;    //3.0 bytes
  u32   qiohandle;      //3.4 for cancel or polling
  void* handle;         //4 (open & close might write)
  void* buffer;         //5
  void* callback;       //6 optimize special cases w/ completehow
  void* callparam;      //7
};                      //all fields are read xor write

//Writing to the device would schedule i/o, reading would reap
//completions. Bad writes would give the byte offset to the
//rejected sector field if detected synchronously. Multiple
//sector writes would be truncated on the first bad sector.
//Accepted writes would be buffered in the kernel.

//Each open creates a new queue, each write is read in the
//same queue. Any number of threads can read or write a queue.

//some cases might be simplified by kernel processed completions,
//such as VMS AST emulation, or putting results in a pipe. Hence
//completehow, which might use callback and callparam.

//timeout?
//canceling i/o?
//Sun aio emulation?
//VMS qio emulation?
//MS IOCP emulation?
//malloc()&free() safe across threads?
//Should O_DIRECT error unless properly aligned etc.?
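To make the 64-byte layout in the proposal above concrete, here is a hedged userspace sketch of the request block with compile-time size checks. It is not a kernel interface, and /dev/qio does not exist. Several things here are assumptions rather than part of the proposal: the name struct qio_req, the verb codes, the helper qio_prep_read(), the rename of the errno field to status (errno is a macro once <errno.h> is included, so it cannot safely be a member name), and the widening of the four pointer fields to 64-bit integers so the struct stays 64 bytes on 32-bit ABIs too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical verb codes; the proposal lists verbs but assigns no values. */
enum qio_verb { QIO_OPEN = 1, QIO_CLOSE, QIO_READ, QIO_WRITE };

/* Sketch of the proposed request block. The word numbers in the comments
 * follow the proposal's "word.byte" offsets (8-byte words). */
struct qio_req {
    uint16_t flags;        /* 0.0 */
    uint16_t verb;         /* 0.2 */
    uint16_t status;       /* 0.4 "errno" in the proposal (renamed, see above) */
    uint16_t completehow;  /* 0.6 */
    uint64_t offset;       /* 1   */
    uint32_t length;       /* 2.0 bytes requested */
    uint32_t timeout;      /* 2.4 */
    uint32_t transferred;  /* 3.0 bytes */
    uint32_t qiohandle;    /* 3.4 */
    uint64_t handle;       /* 4   pointer fields stored as 64-bit (assumption) */
    uint64_t buffer;       /* 5   */
    uint64_t callback;     /* 6   */
    uint64_t callparam;    /* 7   */
};

/* Fail the build if the layout drifts from the 64-byte "sector". */
static_assert(sizeof(struct qio_req) == 64, "request must be 64 bytes");
static_assert(offsetof(struct qio_req, offset) == 8, "offset is word 1");
static_assert(offsetof(struct qio_req, handle) == 32, "handle is word 4");

/* Fill in a read request for len bytes at file offset off into buf. */
void qio_prep_read(struct qio_req *r, void *buf, uint32_t len, uint64_t off)
{
    memset(r, 0, sizeof *r);
    r->verb = QIO_READ;
    r->offset = off;
    r->length = len;
    r->buffer = (uint64_t)(uintptr_t)buf;
}
```

With void* members, as in the proposal, the struct is 64 bytes only where pointers are 8 bytes wide; on a 32-bit ABI it would shrink to 48 bytes, which is why the sketch uses fixed-width integers for those fields.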