Re: Asynchronous IO

2001-04-20 Thread Stephen C. Tweedie

Hi,

On Fri, Apr 13, 2001 at 04:45:07AM -0400, Dan Maas wrote:
> IIRC the problem with implementing asynchronous *disk* I/O in Linux today is
> that the filesystem code assumes synchronous I/O operations that block the
> whole process/thread. So implementing "real" asynch I/O (without the
> overhead of creating a process context for each operation) would require
> re-writing the filesystems as non-blocking state machines. Last I heard this
> was a long-term goal, but nobody's done the work yet

SGI and Ben LaHaise both have kernel async IO functionality working,
and Ingo Molnar's Tux code has support for doing certain filesystem
lookup operations asynchronously too.  

--Stephen
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Asynchronous IO

2001-04-13 Thread Christopher Smith

--On Friday, April 13, 2001 04:45:07 -0400 Dan Maas <[EMAIL PROTECTED]> wrote:
> IIRC the problem with implementing asynchronous *disk* I/O in Linux today
> is that the filesystem code assumes synchronous I/O operations that block
> the whole process/thread. So implementing "real" asynch I/O (without the
> overhead of creating a process context for each operation) would require
> re-writing the filesystems as non-blocking state machines. Last I heard
> this was a long-term goal, but nobody's done the work yet (aside from
> maybe the SGI folks with XFS?). Or maybe I don't know what I'm talking
> about...

If the FS supports generic read then this is not a problem. This is what 
SGI's KAIO does as well as Bart's work.

> Bart, glad to hear you are working on an event interface, sounds cool! One
> feature that I really, really, *really* want to see implemented is the
> ability to block on a set of any "waitable kernel objects" with one
> syscall - not just file descriptors, but also SysV semaphores and message
> queues, UNIX signals and child processes, file locks, pthreads condition
> variables, asynch disk I/O completions, etc. I am dying for a clean way to
> accomplish this that doesn't require more than one thread... (Win32 and
> FreeBSD kick our butts here with MsgWaitForMultipleObjects() and
> kevent()...) IMHO cleaning up this API deficiency is just as important as
> optimizing the extreme case of socket I/O with zillions of file
> descriptors...

Actually, sigwaitinfo() has no problem waiting on multiple signals. If you
are using real-time signals, each signal can carry a pointer to the relevant
object, so even if you are blocking on only a single signal you can receive
info about several objects.



--Chris



Re: Asynchronous IO

2001-04-13 Thread Dan Maas

IIRC the problem with implementing asynchronous *disk* I/O in Linux today is
that the filesystem code assumes synchronous I/O operations that block the
whole process/thread. So implementing "real" asynch I/O (without the
overhead of creating a process context for each operation) would require
re-writing the filesystems as non-blocking state machines. Last I heard this
was a long-term goal, but nobody's done the work yet (aside from maybe the
SGI folks with XFS?). Or maybe I don't know what I'm talking about...

Bart, glad to hear you are working on an event interface, sounds cool! One
feature that I really, really, *really* want to see implemented is the
ability to block on a set of any "waitable kernel objects" with one
syscall - not just file descriptors, but also SysV semaphores and message
queues, UNIX signals and child processes, file locks, pthreads condition
variables, asynch disk I/O completions, etc. I am dying for a clean way to
accomplish this that doesn't require more than one thread... (Win32 and
FreeBSD kick our butts here with MsgWaitForMultipleObjects() and
kevent()...) IMHO cleaning up this API deficiency is just as important as
optimizing the extreme case of socket I/O with zillions of file
descriptors...

Regards,
Dan




Re: Asynchronous io

2001-04-12 Thread Bart Trojanowski

Hi CJ,
  you should really read the thread titled "Linux's implementation of
poll() not scalable?" in the LKML archives, here is a link:
http://www.uwsg.iu.edu/hypermail/linux/kernel/0010.3/0003.html
There are many problems with the /dev/something interface for events, and
they are all described in that thread.

  I have worked on a way suggested by Linus to get rid of the hit in
performance when using select() and poll().  I have a working model for
TCP sockets (as that is what I wanted to speed up - a TCP based proxy).
My implementation is still in alpha but is available here:
  http://www.jukie.net/~bart/kernel/fdevent/

  Now, before anyone gets too excited... I spoke with Linus about this and
he suggested that I speak with Ben LaHaise, who is working on async io
using some modifications to the wait queue.  I have sent him mail but have
not heard from Ben - I guess he must be as busy as the rest of us with a
full mailbox of messages that he has no time to reply to. :)

  My implementation introduces two new system calls: bind_event and
get_events (as described in Linus' email above).  The project is still in
an alpha stage so I don't have any benchmarks.  I am working on this on my
own time so progress is slow... unfortunately.

Regards,
Bart.

On Thu, 12 Apr 2001, CJ wrote:

> //Linux really needs a clean basis for asynchronous and
> //unbuffered i/o libraries.  Something like the fork/thread
> //clone(), but to replace select() and aio_* polling.  This
> //might be a start. And it is just a file and very like a
> //pipe or socket.
>
> //Suppose we add /dev/qio with 64 byte sectors as follows:
>
> struct qio{             //64 byte i/o request
>     u16 flags;          //0.0 request block variant, SEEK_SET...
>     u16 verb;           //0.2 open,close,read,mmap,sync,write,
>                         //ioctl
>                         //mallocIO&read,write&freeIO,
>                         //mallocIO,freeIO
>                         //autothread might be an ioctl()
>     u16 errno;          //0.4 per request status
>     u16 completehow;    //0.6 queue,AST,pipe,SIGIO,SIGIO||delete ok
>     u64 offset;         //1
>     u32 length;         //2.0 bytes requested
>     u32 timeout;        //2.4 in ms or us?
>     u32 transferred;    //3.0 bytes
>     u32 qiohandle;      //3.4 for cancel or polling
>     void* handle;       //4 (open & close might write)
>     void* buffer;       //5
>     void* callback;     //6 optimize special cases w/ completehow
>     void* callparam;    //7
> };                      //all fields are read xor write
>
> //Writing to the device would schedule i/o, reading would reap
> //completions.  Bad writes would give the byte offset to the
> //rejected sector field if detected synchronously.  Multiple
> //sector writes would be truncated on the first bad sector.
> //Accepted writes would be buffered in the kernel.
>
> //Each open creates a new queue, each write is read in the
> //same queue.  Any number of threads can read or write a queue.
>
> //some cases might be simplified by kernel processed completions,
> //such as VMS AST emulation, or putting results in a pipe. Hence
> //completehow, which might use callback and callparam.
>
> //timeout?
> //canceling i/o?
> //Sun aio emulation?
> //VMS qio emulation?
> //MS IOCP emulation?
> //malloc()&free() safe across threads?
> //Should O_DIRECT error unless properly aligned, etc.?

-- 
WebSig: http://www.jukie.net/~bart/sig/






Asynchronous io

2001-04-12 Thread CJ

//Linux really needs a clean basis for asynchronous and 
//unbuffered i/o libraries.  Something like the fork/thread
//clone(), but to replace select() and aio_* polling.  This 
//might be a start. And it is just a file and very like a
//pipe or socket.

//Suppose we add /dev/qio with 64 byte sectors as follows: 

struct qio{             //64 byte i/o request
    u16 flags;          //0.0 request block variant, SEEK_SET...
    u16 verb;           //0.2 open,close,read,mmap,sync,write,
                        //ioctl
                        //mallocIO&read,write&freeIO,
                        //mallocIO,freeIO
                        //autothread might be an ioctl()
    u16 errno;          //0.4 per request status
    u16 completehow;    //0.6 queue,AST,pipe,SIGIO,SIGIO||delete ok
    u64 offset;         //1
    u32 length;         //2.0 bytes requested
    u32 timeout;        //2.4 in ms or us?
    u32 transferred;    //3.0 bytes
    u32 qiohandle;      //3.4 for cancel or polling
    void* handle;       //4 (open & close might write)
    void* buffer;       //5
    void* callback;     //6 optimize special cases w/ completehow
    void* callparam;    //7
};                      //all fields are read xor write

//Writing to the device would schedule i/o, reading would reap
//completions.  Bad writes would give the byte offset to the 
//rejected sector field if detected synchronously.  Multiple 
//sector writes would be truncated on the first bad sector.
//Accepted writes would be buffered in the kernel.

//Each open creates a new queue, each write is read in the
//same queue.  Any number of threads can read or write a queue.

//some cases might be simplified by kernel processed completions, 
//such as VMS AST emulation, or putting results in a pipe. Hence
//completehow, which might use callback and callparam.

//timeout?  
//canceling i/o?  
//Sun aio emulation?  
//VMS qio emulation?  
//MS IOCP emulation?
//malloc()&free() safe across threads?
//Should O_DIRECT error unless properly aligned, etc.?