Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-08-11 Thread Ulrich Drepper
Sébastien Dugué wrote:
> aio completion notification

I looked over this now but I don't think I understand everything.  Or I
don't see how it all is integrated.  And no, I'm not looking at the
proposed glibc code since that would mean being tainted.


> Details:
> ---
>
>   A struct sigevent *aio_sigeventp is added to struct iocb in
> include/linux/aio_abi.h
>
>   An enum {IO_NOTIFY_SIGNAL = 0, IO_NOTIFY_THREAD_ID = 1} is added in
> include/linux/aio.h:
>
>   - IO_NOTIFY_SIGNAL means that the signal is to be sent to the
> requesting thread
>
>   - IO_NOTIFY_THREAD_ID means that the signal is to be sent to a
> specific thread.

This has been proved to be sufficient in the timer code which basically
has the same problem.  But why do you need separate constants?  We have
the various SIGEV_* constants, among them SIGEV_THREAD_ID.  Just use
these constants for the values of ki_notify.


>   The following fields are added to struct kiocb in include/linux/aio.h:
>
>   - pid_t ki_pid: target of the signal
>
>   - __u16 ki_signo: signal number
>
>   - __u16 ki_notify: kind of notification, IO_NOTIFY_SIGNAL or
>  IO_NOTIFY_THREAD_ID
>
>   - uid_t ki_uid, ki_euid: filled with the submitter credentials

These two fields aren't needed for the POSIX interfaces.  Where does the
requirement come from?  I don't say they should be removed, they might
be useful, but if the costs are non-negligible then they could go away.


>   - check whether the submitting thread wants to be notified directly
> (sigevent->sigev_notify_thread_id is 0) or wants the signal to be sent
> to another thread.
> In the latter case a check is made to assert that the target thread
> is in the same thread group

Is this really how it's implemented?  This is not how it should be.
Either a signal is sent to a specific thread in the same process (this
is what SIGEV_THREAD_ID is for) or the signal is sent to the calling
process.  Sending a signal to the process means that from the kernel's
POV any thread which doesn't have the signal blocked can receive it.
The final decision is made by the kernel.  There is no mechanism to send
the signal to another process.

So, for the purpose of the POSIX AIO code the ki_pid value is only
needed when the SIGEV_THREAD_ID bit is set.

It could be an extension and I don't mind it being introduced.  But
again, it's not necessary and if it adds costs then it could be left
out.  It is something which could easily be introduced later if the need
arises.
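
To make the suggestion concrete, here is a minimal userspace sketch, not code from the patch. It assumes ki_notify stores the SIGEV_* value as-is and that ki_pid is filled only for SIGEV_THREAD_ID. The names struct kiocb_notify_model and notify_setup_model() are made up for illustration, and the fallback value for SIGEV_THREAD_ID is only there for C libraries that do not expose the constant.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>

#ifndef SIGEV_THREAD_ID
#define SIGEV_THREAD_ID 4	/* Linux value, for libcs that do not expose it */
#endif

/* Hypothetical model of the notification fields kept in struct kiocb. */
struct kiocb_notify_model {
	pid_t ki_pid;			/* only meaningful for SIGEV_THREAD_ID */
	unsigned short ki_signo;	/* 0 means "no signal" */
	unsigned short ki_notify;	/* a SIGEV_* value, no private enum */
};

/* Returns 0 on success, -1 for a notification type this sketch rejects. */
static int notify_setup_model(struct kiocb_notify_model *k,
			      int sigev_notify, int signo, pid_t tid)
{
	assert(k != NULL);
	memset(k, 0, sizeof(*k));

	switch (sigev_notify) {
	case SIGEV_NONE:
		k->ki_notify = SIGEV_NONE;
		return 0;		/* caller polls with io_getevents() */
	case SIGEV_SIGNAL:
		break;			/* process-directed: kernel picks the thread */
	case SIGEV_THREAD_ID:
		k->ki_pid = tid;	/* thread-directed: remember the target */
		break;
	default:
		return -1;		/* e.g. SIGEV_THREAD is left to userspace */
	}
	k->ki_signo = (unsigned short)signo;
	k->ki_notify = (unsigned short)sigev_notify;
	return 0;
}
```

With this shape no translation table between SIGEV_* and a second set of constants is needed, and the process-directed case carries no target at all.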


>   listio support
>

I really don't understand the kernel interface for this feature.


> Details:
> ---
>
>   An IOCB_CMD_GROUP is added to the IOCB_CMD enum in include/linux/aio_abi.h
>
>   A struct lio_event is added in include/linux/aio.h
>
>   A struct lio_event *ki_lio is added to struct kiocb in include/linux/aio.h

So you have a pointer in the structure for the individual requests.  I
assume you use the atomic counter to trigger the final delivery.  I
further assume that if lio_wait is set the calling thread is suspended
until all requests are handled and that the final notification in this
case means that thread gets woken.

This is all fine.
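
For reference, the counter scheme I am assuming looks roughly like the sketch below. It is a userspace model using C11 atomics, not the kernel code; struct lio_group_model and both function names are invented for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the shared listio control block. */
struct lio_group_model {
	atomic_int pending;	/* requests of the group still in flight */
	int notified;		/* how often the final notification fired */
};

static void lio_group_init(struct lio_group_model *g, int nr_requests)
{
	assert(nr_requests > 0);
	atomic_init(&g->pending, nr_requests);
	g->notified = 0;
}

/* Called once per completed request. Returns true only for the request
 * that brings the counter to zero, i.e. the one that triggers the final
 * delivery (signal, or wakeup of the waiting thread). */
static bool lio_request_done(struct lio_group_model *g)
{
	/* atomic_fetch_sub returns the previous value; 1 means we were last. */
	if (atomic_fetch_sub(&g->pending, 1) == 1) {
		g->notified++;
		return true;
	}
	return false;
}
```

Since only one caller can observe the previous value 1, the final notification fires exactly once regardless of completion order.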

But how do you pass the requests to the kernel?  If you have a new
lio_listio-like syscall it'll be easy.  But I haven't seen anything like
this mentioned.

The alternative is to pass the requests one-by-one in which case I don't
see how you create the reference to the lio_listio control block.  This
approach seems to be slower.

If all requests are passed at once, do you have the equivalent of
LIO_NOP entries?


How can we support the extension where we wait for a number of requests
which need not be all of them?  I.e., I submit N requests and want to be
notified when at least M (M <= N) have completed.  I am not yet clear about
the actual semantics we should implement (e.g., do we send another
notification after the first one?) but it's something which IMO should
be taken into account in the design.
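
One possible shape for this, as a sketch only: count completions upwards and notify when the count reaches M. The model below sends a single notification and deliberately leaves the open question (a second notification once all N are done) unanswered. All names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: a listio group that notifies after min_nr completions. */
struct lio_partial_model {
	int total;		/* N: requests submitted */
	int min_nr;		/* M: completions needed, 1 <= M <= N */
	atomic_int done;	/* completions seen so far */
};

static void lio_partial_init(struct lio_partial_model *g, int total, int min_nr)
{
	assert(min_nr >= 1 && min_nr <= total);
	g->total = total;
	g->min_nr = min_nr;
	atomic_init(&g->done, 0);
}

/* Called once per completed request. Returns true exactly once, for the
 * completion that brings the count up to min_nr. */
static bool lio_partial_complete(struct lio_partial_model *g)
{
	int now = atomic_fetch_add(&g->done, 1) + 1;

	assert(now <= g->total);
	return now == g->min_nr;
}
```

Setting min_nr equal to total gives back today's all-or-nothing behavior, so the extension would not need a separate code path.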


Finally, and this is very important, does your code send out the
individual request notifications and then, in the end, the lio_listio
completion?  I think Suparna wrote this is the case but I want to make sure.


Overall, this looks much better than the old code.  If the answers to my
questions show that the behavior is compatible with the POSIX AIO code
I'm certainly very much in favor of adding the kernel code.

-- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖





[take2 3/4] kevent: AIO, aio_sendfile() implementation.

2006-08-01 Thread Evgeniy Polyakov

This patch includes asynchronous propagation of file's data into VFS
cache and aio_sendfile() implementation.
Network aio_sendfile() works lazily - it asynchronously populates pages
into the VFS cache (which can be used for various tricks with adaptive
readahead) and then uses the usual ->sendfile() callback.

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/fs/bio.c b/fs/bio.c
index 6a0b9ad..a3ee530 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -119,7 +119,7 @@ void bio_free(struct bio *bio, struct bi
 /*
  * default destructor for a bio allocated with bio_alloc_bioset()
  */
-static void bio_fs_destructor(struct bio *bio)
+void bio_fs_destructor(struct bio *bio)
 {
bio_free(bio, fs_bio_set);
 }
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index fb4d322..9316551 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -685,6 +685,7 @@ ext2_writepages(struct address_space *ma
 }
 
 const struct address_space_operations ext2_aops = {
+   .get_block  = ext2_get_block,
.readpage   = ext2_readpage,
.readpages  = ext2_readpages,
.writepage  = ext2_writepage,
diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index c5ee9f0..d9210d4 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -1699,6 +1699,7 @@ static int ext3_journalled_set_page_dirt
 }
 
 static const struct address_space_operations ext3_ordered_aops = {
+   .get_block  = ext3_get_block,
.readpage   = ext3_readpage,
.readpages  = ext3_readpages,
.writepage  = ext3_ordered_writepage,
diff --git a/fs/file_table.c b/fs/file_table.c
index 0131ba0..b649317 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -112,6 +112,9 @@ struct file *get_empty_filp(void)
 	if (security_file_alloc(f))
 		goto fail_sec;
 
+#ifdef CONFIG_KEVENT_POLL
+	kevent_storage_init(f, &f->st);
+#endif
 	tsk = current;
 	INIT_LIST_HEAD(&f->f_u.fu_list);
 	atomic_set(&f->f_count, 1);
@@ -159,6 +162,9 @@ void fastcall __fput(struct file *file)
 	might_sleep();
 
 	fsnotify_close(file);
+#ifdef CONFIG_KEVENT_POLL
+	kevent_storage_fini(&file->st);
+#endif
 	/*
 	 * The function eventpoll_release() should be the first called
 	 * in the file cleanup chain.
diff --git a/fs/inode.c b/fs/inode.c
index 0bf9f04..fdbd0ba 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -21,6 +21,7 @@ #include <linux/pagemap.h>
 #include <linux/cdev.h>
 #include <linux/bootmem.h>
 #include <linux/inotify.h>
+#include <linux/kevent.h>
 #include <linux/mount.h>
 
 /*
@@ -165,12 +166,18 @@ #endif
 		}
 		memset(&inode->u, 0, sizeof(inode->u));
 		inode->i_mapping = mapping;
+#if defined CONFIG_KEVENT
+		kevent_storage_init(inode, &inode->st);
+#endif
 	}
 	return inode;
 }
 
 void destroy_inode(struct inode *inode)
 {
+#if defined CONFIG_KEVENT_INODE || defined CONFIG_KEVENT_SOCKET
+	kevent_storage_fini(&inode->st);
+#endif
 	BUG_ON(inode_has_buffers(inode));
 	security_inode_free(inode);
 	if (inode->i_sb->s_op->destroy_inode)
diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c
index 12dfdcf..f8dca72 100644
--- a/fs/reiserfs/inode.c
+++ b/fs/reiserfs/inode.c
@@ -3001,6 +3001,7 @@ int reiserfs_setattr(struct dentry *dent
 }
 
 const struct address_space_operations reiserfs_address_space_operations = {
+   .get_block = reiserfs_get_block,
.writepage = reiserfs_writepage,
.readpage = reiserfs_readpage,
.readpages = reiserfs_readpages,

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2561020..65eb438 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -240,6 +240,9 @@ #include <linux/mutex.h>
 #include <asm/atomic.h>
 #include <asm/semaphore.h>
 #include <asm/byteorder.h>
+#ifdef CONFIG_KEVENT
+#include <linux/kevent_storage.h>
+#endif
 
 struct hd_geometry;
 struct iovec;
@@ -352,6 +355,8 @@ struct address_space;
 struct writeback_control;
 
 struct address_space_operations {
+	int (*get_block)(struct inode *inode, sector_t iblock,
+			struct buffer_head *bh_result, int create);
 	int (*writepage)(struct page *page, struct writeback_control *wbc);
 	int (*readpage)(struct file *, struct page *);
 	void (*sync_page)(struct page *);
@@ -546,6 +551,10 @@ #ifdef CONFIG_INOTIFY
 	struct mutex		inotify_mutex;	/* protects the watches list */
 #endif
 
+#ifdef CONFIG_KEVENT_INODE
+	struct kevent_storage	st;
+#endif
+
 	unsigned long		i_state;
 	unsigned long		dirtied_when;	/* jiffies of first dirtying */
 
@@ -698,6 +707,9 @@ #ifdef CONFIG_EPOLL
 	struct list_head	f_ep_links;
 	spinlock_t		f_ep_lock;
 #endif /* #ifdef CONFIG_EPOLL */
+#ifdef CONFIG_KEVENT_POLL
+	struct kevent_storage	st;
+#endif
 	struct address_space	*f_mapping;
 };
 extern spinlock_t files_lock;
diff --git 

Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-31 Thread Suparna Bhattacharya
On Thu, Jul 27, 2006 at 11:44:23AM -0700, Ulrich Drepper wrote:
> Badari Pulavarty wrote:
> > Before we spend too much time cleaning up and merging into mainline -
> > I would like an agreement that what we add is good enough for glibc
> > POSIX AIO.
>
> I haven't seen a description of the interface so far.  Would be good if

Did Sébastien's mail with the description help?

> it existed.  But I briefly mentioned one quirk in the interface about
> which Suparna wasn't sure whether it's implemented/implementable in the
> current interface.
>
> If a lio_listio call is made the individual requests are handled just as
> if they had been issued separately.  I.e., the notification specified in the
> individual aiocb is performed when the specific request is done.  Then,
> once all requests are done, another notification is made, this time
> controlled by the sigevent parameter of lio_listio.

Looking at the code in the lio kernel patch, this should already be covered:

	if (iocb->ki_signo)
		__aio_send_signal(iocb);

+	if (iocb->ki_lio)
+		lio_check(iocb->ki_lio);

That is, it first checks the notification in the individual iocb, and then
the one for the LIO.
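
A small single-threaded model of that ordering, for illustration only. It assumes lio_check() decrements a pending count and raises the group notification when the count reaches zero; that behavior, and every name below, is an assumption rather than the patch's code.

```c
#include <assert.h>
#include <stddef.h>

enum { EV_REQUEST_SIGNAL = 1, EV_GROUP_NOTIFY = 2 };

/* Hypothetical stand-in for struct lio_event. */
struct lio_model {
	int pending;
};

/* Hypothetical stand-in for the two kiocb fields used on completion. */
struct kiocb_model {
	int ki_signo;
	struct lio_model *ki_lio;
};

/* Records the order in which notifications would be raised. */
struct trace {
	int ev[16];
	size_t n;
};

static void record(struct trace *t, int ev)
{
	assert(t->n < 16);
	t->ev[t->n++] = ev;
}

static void complete_model(struct kiocb_model *iocb, struct trace *t)
{
	/* 1. per-request notification, if one was asked for */
	if (iocb->ki_signo)
		record(t, EV_REQUEST_SIGNAL);

	/* 2. group bookkeeping; fires once, after the last member */
	if (iocb->ki_lio && --iocb->ki_lio->pending == 0)
		record(t, EV_GROUP_NOTIFY);
}
```

For a group of two requests that both ask for a signal, the trace comes out as request, request, group, which is the order Ulrich asked about.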

 
 
> Another feature which I always wanted: the current lio_listio call
> returns in blocking mode only if all requests are done.  In non-blocking
> mode it returns immediately and the program needs to poll the aiocbs.
> What is needed is something in the middle.  For instance, if multiple
> read requests are issued the program might be able to start working as
> soon as one request is satisfied.  I.e., a call similar to lio_listio
> would be nice which also takes another parameter specifying how many of
> the NENT aiocbs have to finish before the call returns.

I imagine the kernel could enable this by incorporating this additional
parameter for IOCB_CMD_GROUP in the ABI (in the default case this should be the
same as the total number of iocbs submitted to lio_listio). Now, should the
"at least NENT" check apply only to LIO_WAIT, or also to the LIO_NOWAIT
notification case?

BTW, the native io_getevents does support a min_nr wakeup already, except that
it applies to any iocb on the io_context, and not just a given lio_listio call.

Regards
Suparna


-- 
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-28 Thread Sébastien Dugué
On Thu, 2006-07-27 at 08:28 -0700, Badari Pulavarty wrote:
 Sébastien Dugué wrote:
  On Wed, 2006-07-26 at 09:22 -0700, Badari Pulavarty wrote:

  Ulrich Drepper wrote:
  
  Christoph Hellwig wrote:


  My personal opinion on existing AIO is that it is not the right design.
  Benjamin LaHaise agrees with me (if I understood him right),


  I completely agree with that as well.
  
  
  I agree, too, but the current code is not the last of the line.  Suparna
  has a set of patches which make the current kernel aio code work much
  better and especially make it really usable to implement POSIX AIO.
 
  In Ottawa we were talking about submitting it and Suparna will.  We just
  thought about a little longer timeframe.  I guess it could be
  accelerated since she mostly has the patch done.  But I don't know her
  schedule.
 
  Important here is, don't base any decision on the current aio
  implementation.


  Ulrich,
 
  Suparna mentioned your interest in making POSIX glibc aio work with
  kernel-aio at OLS. We thought taking a re-look at the (kernel side)
  work BULL did would be a nice starting point. I re-based those patches
  to 2.6.18-rc2 and sent them to Zach Brown for review before sending
  them out to the list.
 
  These patches do NOT make AIO any cleaner. All they do is add
  functionality that makes POSIX AIO easier to support. These are
 
  [ PATCH 1/3 ]  Adding signal notification for event completion
 
  [ PATCH 2/3 ]  lio (listio) completion semantics
 
  [ PATCH 3/3 ] cancel_fd support
  
 
Badari,
 
Thanks for refreshing those patches, they have been sitting here
  for quite some time now and collected dust.
 
I also think Suparna's patchset for doing buffered AIO would be
  a real plus here.
 

  Suparna explained these in the following article:
 
  http://lwn.net/Articles/148755/
 
  If you think this is a reasonable direction/approach for the kernel and
  you would take care of the glibc side of things, I can spend time on
  these patches, getting them into reasonable shape and pushing for
  inclusion.
  
 
Ulrich, if you want to have a look at how those patches are put to
  use in libposix-aio, have a look at http://sourceforge.net/projects/paiol.
 
It could be a starting point for glibc.
 
Thanks,
 
Sébastien.
 

 Sebastien,
 
 Suparna mentioned that Ulrich wants us to concentrate on kernel-side
 support, so that he can look at the glibc side of things (along with
 other work he is already doing). So, if we can get an agreement on what
 kind of kernel support is needed, we can focus our efforts on the kernel
 side first and leave glibc enablement to the capable hands of Uli :)
 

  That's fine with me. 

  Sébastien.

-- 
-

  Sébastien Dugué            BULL/FREC:B1-247
  phone: (+33) 476 29 77 70  Bullcom: 229-7770

  mailto:[EMAIL PROTECTED]

  Linux POSIX AIO: http://www.bullopensource.org/posix
   http://sourceforge.net/projects/paiol

-



Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-28 Thread Sébastien Dugué
On Thu, 2006-07-27 at 11:44 -0700, Ulrich Drepper wrote:
 Badari Pulavarty wrote:
  Before we spend too much time cleaning up and merging into mainline -
  I would like an agreement that what we add is good enough for glibc
  POSIX AIO.
 
 I haven't seen a description of the interface so far.  Would be good if
 it existed.  But I briefly mentioned one quirk in the interface about
 which Suparna wasn't sure whether it's implemented/implementable in the
 current interface.
 
 If a lio_listio call is made the individual requests are handled just as
 if they had been issued separately.  I.e., the notification specified in the
 individual aiocb is performed when the specific request is done.  Then,
 once all requests are done, another notification is made, this time
 controlled by the sigevent parameter of lio_listio.
 
 
 Another feature which I always wanted: the current lio_listio call
 returns in blocking mode only if all requests are done.  In non-blocking
 mode it returns immediately and the program needs to poll the aiocbs.
 What is needed is something in the middle.  For instance, if multiple
 read requests are issued the program might be able to start working as
 soon as one request is satisfied.  I.e., a call similar to lio_listio
 would be nice which also takes another parameter specifying how many of
 the NENT aiocbs have to finish before the call returns.

  You're right here, that definitely would be a plus.


-- 
-

  Sébastien Dugué            BULL/FREC:B1-247
  phone: (+33) 476 29 77 70  Bullcom: 229-7770

  mailto:[EMAIL PROTECTED]

  Linux POSIX AIO: http://www.bullopensource.org/posix
   http://sourceforge.net/projects/paiol

-



Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-28 Thread Sébastien Dugué
On Thu, 2006-07-27 at 14:02 -0700, Badari Pulavarty wrote:
 On Thu, 2006-07-27 at 11:44 -0700, Ulrich Drepper wrote:
  Badari Pulavarty wrote:
   Before we spend too much time cleaning up and merging into mainline -
   I would like an agreement that what we add is good enough for glibc
   POSIX AIO.
  
  I haven't seen a description of the interface so far.  Would be good if
  it existed.  But I briefly mentioned one quirk in the interface about
  which Suparna wasn't sure whether it's implemented/implementable in the
  current interface.
 
 Sebastien, could you provide a description of interfaces you are
 adding ? Since you did all the work, it would be appropriate for
 you to do it :)
 

  I will clean up what description I have and send it soon.

  Sébastien.


-- 
-

  Sébastien Dugué            BULL/FREC:B1-247
  phone: (+33) 476 29 77 70  Bullcom: 229-7770

  mailto:[EMAIL PROTECTED]

  Linux POSIX AIO: http://www.bullopensource.org/posix
   http://sourceforge.net/projects/paiol

-



Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-28 Thread Sébastien Dugué
On Thu, 2006-07-27 at 14:02 -0700, Badari Pulavarty wrote:
 On Thu, 2006-07-27 at 11:44 -0700, Ulrich Drepper wrote:
  Badari Pulavarty wrote:
   Before we spend too much time cleaning up and merging into mainline -
   I would like an agreement that what we add is good enough for glibc
   POSIX AIO.
  
  I haven't seen a description of the interface so far.  Would be good if
  it existed.  But I briefly mentioned one quirk in the interface about
  which Suparna wasn't sure whether it's implemented/implementable in the
  current interface.
 
 Sebastien, could you provide a description of interfaces you are
 adding ? Since you did all the work, it would be appropriate for
 you to do it :)
 

  Here are the descriptions for the AIO completion notification and
listio patches. Hope I did not leave out too much.

  Sébastien.

-- 
-

  Sébastien Dugué            BULL/FREC:B1-247
  phone: (+33) 476 29 77 70  Bullcom: 229-7770

  mailto:[EMAIL PROTECTED]

  Linux POSIX AIO: http://www.bullopensource.org/posix
   http://sourceforge.net/projects/paiol

-


 aio completion notification

Summary:
---

  The current 2.6 kernel does not support notification of user space via
an RT signal upon an asynchronous IO completion. The POSIX specification
states that when an AIO request completes, a signal can be delivered to
the application as notification.

  The aioevent patch adds a struct sigevent *aio_sigeventp to the iocb.
The relevant fields (pid, signal number and value) are stored in the kiocb
for use when the request completes.

  That sigevent structure is filled by the application as part of the AIO
request preparation. Upon request completion, the kernel notifies the
application using those sigevent parameters. If SIGEV_NONE has been specified,
then the old behaviour is retained and the application must rely on polling
the completion queue using io_getevents().
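
  As an illustration of the application side, the sketch below fills a
sigevent with the standard POSIX fields only. The helper name
completion_sigevent() is made up, and treating a signal number of 0 as
"no notification" is a convention of this sketch. How the structure is
attached to the request (the proposed aio_sigeventp pointer) is left out,
since that field exists only in the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Build the sigevent an application would hand over with its request.
 * cookie is returned to the handler through siginfo (si_value). */
static struct sigevent completion_sigevent(int signo, void *cookie)
{
	struct sigevent sev;

	assert(signo >= 0);
	memset(&sev, 0, sizeof(sev));
	if (signo == 0) {
		/* No signal wanted: fall back to polling io_getevents(). */
		sev.sigev_notify = SIGEV_NONE;
		return sev;
	}
	sev.sigev_notify = SIGEV_SIGNAL;
	sev.sigev_signo = signo;
	sev.sigev_value.sival_ptr = cookie;
	return sev;
}
```

  A caller would typically pass an RT signal such as SIGRTMIN and a pointer
to its own per-request bookkeeping as the cookie.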

Details:
---

  A struct sigevent *aio_sigeventp is added to struct iocb in
include/linux/aio_abi.h

  An enum {IO_NOTIFY_SIGNAL = 0, IO_NOTIFY_THREAD_ID = 1} is added in
include/linux/aio.h:

- IO_NOTIFY_SIGNAL means that the signal is to be sent to the
  requesting thread 

- IO_NOTIFY_THREAD_ID means that the signal is to be sent to a
  specific thread.

  The following fields are added to struct kiocb in include/linux/aio.h:

- pid_t ki_pid: target of the signal

- __u16 ki_signo: signal number

- __u16 ki_notify: kind of notification, IO_NOTIFY_SIGNAL or
   IO_NOTIFY_THREAD_ID

- uid_t ki_uid, ki_euid: filled with the submitter credentials

- sigval_t ki_sigev_value: value stuffed in siginfo

  These fields are only valid if ki_signo != 0.



  In io_submit_one(), if the application provided a sigevent then
iocb_setup_sigevent() is called which does the following:

- save current->uid and current->euid in the kiocb fields ki_uid and
  ki_euid for use in the completion path to check permissions

- check access to the user sigevent

- extract the needed fields from the sigevent (pid, signo, and value).
  If the signal number passed from userspace is 0 then no notification
  is to occur and ki_signo is set to 0

- check whether the submitting thread wants to be notified directly
  (sigevent->sigev_notify_thread_id is 0) or wants the signal to be sent
  to another thread.
  In the latter case a check is made to assert that the target thread
  is in the same thread group

- fill in the kiocb fields (ki_pid, ki_signo, ki_notify and
  ki_sigev_value) for that request.
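
  The steps above can be modelled in userspace roughly as follows. This is
a sketch, not the code from the patch: struct task_model,
struct kiocb_model and setup_sigevent_model() are hypothetical, the
user-memory access check is omitted, and the -1 return value is a
placeholder for whatever error the real code reports.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>

enum { IO_NOTIFY_SIGNAL = 0, IO_NOTIFY_THREAD_ID = 1 };	/* as described above */

struct task_model {		/* hypothetical thread identity */
	pid_t tid;
	pid_t tgid;
};

struct kiocb_model {		/* only the fields listed above */
	pid_t ki_pid;
	unsigned short ki_signo, ki_notify;
	uid_t ki_uid, ki_euid;
	union sigval ki_sigev_value;
};

/* target is NULL when sigev_notify_thread_id is 0. Returns 0 or -1. */
static int setup_sigevent_model(struct kiocb_model *k,
				const struct task_model *submitter,
				const struct task_model *target,
				uid_t uid, uid_t euid,
				int signo, union sigval value)
{
	assert(k != NULL && submitter != NULL);
	memset(k, 0, sizeof(*k));
	k->ki_uid = uid;	/* kept for the permission check at completion */
	k->ki_euid = euid;

	if (signo == 0)
		return 0;	/* ki_signo stays 0: no notification */

	if (target == NULL) {
		k->ki_pid = submitter->tid;
		k->ki_notify = IO_NOTIFY_SIGNAL;
	} else {
		if (target->tgid != submitter->tgid)
			return -1;	/* not in the same thread group */
		k->ki_pid = target->tid;
		k->ki_notify = IO_NOTIFY_THREAD_ID;
	}
	k->ki_signo = (unsigned short)signo;
	k->ki_sigev_value = value;
	return 0;
}
```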

  Upon request completion, in aio_complete(), if ki_signo is not 0, then
__aio_send_signal() is called which sends the signal as follows:

- fill in the siginfo struct to be sent to the application

- check whether we have permission to signal the given thread

- send the signal

listio support


Summary:
---
  
  The lio patch adds POSIX listio completion notification support. It builds
on support provided by the aio event patch and adds an IOCB_CMD_GROUP
command to sys_io_submit().

  The purpose of IOCB_CMD_GROUP is to group together the following requests in
the list up to the end of the list.

  As part of listio submission, the user process prepends to a list of requests
an empty special aiocb with an aio_lio_opcode of IOCB_CMD_GROUP, filling only
the aio_sigevent fields.
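
  The list layout can be pictured with the sketch below. It is a model
only: the opcode values, struct req_model and both helpers are
hypothetical, and the sigevent is reduced to a bare signal number.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes; the real value of IOCB_CMD_GROUP is not given here. */
enum { OP_READ_MODEL, OP_WRITE_MODEL, OP_GROUP_MODEL };

struct req_model {
	int opcode;
	int signo;	/* stands in for the sigevent; 0 means none */
};

/* Write the marker followed by the n requests into out[], which must have
 * room for n + 1 entries. Returns the number of entries written. */
static size_t build_group_list(const struct req_model *reqs, size_t n,
			       int group_signo, struct req_model *out)
{
	size_t i;

	out[0].opcode = OP_GROUP_MODEL;	/* empty apart from the sigevent */
	out[0].signo = group_signo;
	for (i = 0; i < n; i++)
		out[i + 1] = reqs[i];
	return n + 1;
}

/* Number of requests covered by the marker at index pos: everything after
 * it, up to the end of the list. */
static size_t group_size(const struct req_model *list, size_t len, size_t pos)
{
	assert(pos < len && list[pos].opcode == OP_GROUP_MODEL);
	return len - pos - 1;
}
```

  The marker carries no I/O of its own, so a list of n requests is
submitted as n + 1 entries.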



Details:
---

  An IOCB_CMD_GROUP is added to the IOCB_CMD enum in include/linux/aio_abi.h

  A struct lio_event is added in include/linux/aio.h

  A struct lio_event *ki_lio is added to struct kiocb in include/linux/aio.h


 In sys_io_submit(), upon detecting such an IOCB_CMD_GROUP marker iocb, an

Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-27 Thread Badari Pulavarty

Sébastien Dugué wrote:

On Wed, 2006-07-26 at 09:22 -0700, Badari Pulavarty wrote:
  

Ulrich Drepper wrote:


Christoph Hellwig wrote:
  
  

My personal opinion on existing AIO is that it is not the right design.
Benjamin LaHaise agrees with me (if I understood him right),
  
  

I completely agree with that as well.



I agree, too, but the current code is not the last of the line.  Suparna
has a set of patches which make the current kernel aio code work much
better and especially make it really usable to implement POSIX AIO.

In Ottawa we were talking about submitting it and Suparna will.  We just
thought about a little longer timeframe.  I guess it could be
accelerated since she mostly has the patch done.  But I don't know her
schedule.

Important here is, don't base any decision on the current aio
implementation.
  
  

Ulrich,

Suparna mentioned your interest in making POSIX glibc aio work with
kernel-aio at OLS. We thought taking a re-look at the (kernel side)
work BULL did would be a nice starting point. I re-based those patches
to 2.6.18-rc2 and sent them to Zach Brown for review before sending
them out to the list.

These patches do NOT make AIO any cleaner. All they do is add
functionality that makes POSIX AIO easier to support. These are

[ PATCH 1/3 ]  Adding signal notification for event completion

[ PATCH 2/3 ]  lio (listio) completion semantics

[ PATCH 3/3 ] cancel_fd support



  Badari,

  Thanks for refreshing those patches, they have been sitting here
for quite some time now and collected dust.

  I also think Suparna's patchset for doing buffered AIO would be
a real plus here.

  

Suparna explained these in the following article:

http://lwn.net/Articles/148755/

If you think this is a reasonable direction/approach for the kernel and
you would take care of the glibc side of things, I can spend time on
these patches, getting them into reasonable shape and pushing for
inclusion.



  Ulrich, if you want to have a look at how those patches are put to
use in libposix-aio, have a look at http://sourceforge.net/projects/paiol.

  It could be a starting point for glibc.

  Thanks,

  Sébastien.

  

Sebastien,

Suparna mentioned that Ulrich wants us to concentrate on kernel-side
support, so that he can look at the glibc side of things (along with
other work he is already doing). So, if we can get an agreement on what
kind of kernel support is needed, we can focus our efforts on the kernel
side first and leave glibc enablement to the capable hands of Uli :)


Thanks,
Badari



Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-27 Thread Zach Brown

 Suparna mentioned that Ulrich wants us to concentrate on kernel-side
 support, so that he can look at glibc side of things (along with
 other work he is already doing). So, if we can get an agreement on
 what kind of kernel support is needed - we can focus our efforts on
 kernel side first and leave glibc enablement to capable hands of Uli
 :)

Yeah, and the existing patches still need some cleanup.  Badari, did you
still want me to look into that?

We need someone to claim ultimate responsibility for getting these
patches suitable for merging :).  I'm happy to do that if Suparna isn't
already on it.

- z


Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-27 Thread Badari Pulavarty
On Thu, 2006-07-27 at 11:14 -0700, Zach Brown wrote:
  Suparna mentioned that Ulrich wants us to concentrate on kernel-side
  support, so that he can look at glibc side of things (along with
  other work he is already doing). So, if we can get an agreement on
  what kind of kernel support is needed - we can focus our efforts on
  kernel side first and leave glibc enablement to capable hands of Uli
  :)
 
 Yeah, and the existing patches still need some cleanup.  Badari, did you
 still want me to look into that?
 
 We need someone to claim ultimate responsibility for getting these
 patches suitable for merging :).  I'm happy to do that if Suparna isn't
 already on it.

Zach,

Thanks for volunteering!! Sebastien & I should be able to help you.

Before we spend too much time cleaning up and merging into mainline -
I would like an agreement that what we add is good enough for glibc
POSIX AIO. I hate to waste everyone's time and add complexity to the
kernel - if glibc side is not going to happen :(

Thanks,
Badari



Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-27 Thread Ulrich Drepper
Badari Pulavarty wrote:
 Before we spend too much time cleaning up and merging into mainline -
 I would like an agreement that what we add is good enough for glibc
 POSIX AIO.

I haven't seen a description of the interface so far.  Would be good if
it existed.  But I briefly mentioned one quirk in the interface about
which Suparna wasn't sure whether it's implemented/implementable in the
current interface.

If a lio_listio call is made the individual requests are handled just as
if they had been issued separately.  I.e., the notification specified in the
individual aiocb is performed when the specific request is done.  Then,
once all requests are done, another notification is made, this time
controlled by the sigevent parameter of lio_listio.


Another feature which I always wanted: the current lio_listio call
returns in blocking mode only if all requests are done.  In non-blocking
mode it returns immediately and the program needs to poll the aiocbs.
What is needed is something in the middle.  For instance, if multiple
read requests are issued the program might be able to start working as
soon as one request is satisfied.  I.e., a call similar to lio_listio
would be nice which also takes another parameter specifying how many of
the NENT aiocbs have to finish before the call returns.

-- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖





Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-27 Thread Badari Pulavarty
On Thu, 2006-07-27 at 11:44 -0700, Ulrich Drepper wrote:
 Badari Pulavarty wrote:
  Before we spend too much time cleaning up and merging into mainline -
  I would like an agreement that what we add is good enough for glibc
  POSIX AIO.
 
 I haven't seen a description of the interface so far.  Would be good if
 it existed.  But I briefly mentioned one quirk in the interface about
 which Suparna wasn't sure whether it's implemented/implementable in the
 current interface.

Sebastien, could you provide a description of interfaces you are
adding ? Since you did all the work, it would be appropriate for
you to do it :)

 If a lio_listio call is made the individual requests are handled just as
 if they had been issued separately.  I.e., the notification specified in the
 individual aiocb is performed when the specific request is done.  Then,
 once all requests are done, another notification is made, this time
 controlled by the sigevent parameter of lio_listio.
 
 
 Another feature which I always wanted: the current lio_listio call
 returns in blocking mode only if all requests are done.  In non-blocking
 mode it returns immediately and the program needs to poll the aiocbs.
 What is needed is something in the middle.  For instance, if multiple
 read requests are issued the program might be able to start working as
 soon as one request is satisfied.  I.e., a call similar to lio_listio
 would be nice which also takes another parameter specifying how many of
 the NENT aiocbs have to finish before the call returns.

Looks reasonable.

Thanks,
Badari



[3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-26 Thread Evgeniy Polyakov

This patch includes asynchronous propagation of file's data into VFS
cache and aio_sendfile() implementation.
Network aio_sendfile() works lazily - it asynchronously populates pages
into the VFS cache (which can be used for various tricks with adaptive
readahead) and then uses the usual ->sendfile() callback.

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/fs/bio.c b/fs/bio.c
index 6a0b9ad..a3ee530 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -119,7 +119,7 @@ void bio_free(struct bio *bio, struct bi
 /*
  * default destructor for a bio allocated with bio_alloc_bioset()
  */
-static void bio_fs_destructor(struct bio *bio)
+void bio_fs_destructor(struct bio *bio)
 {
bio_free(bio, fs_bio_set);
 }
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 04af9c4..295fce9 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -685,6 +685,7 @@ ext2_writepages(struct address_space *ma
 }
 
 struct address_space_operations ext2_aops = {
+	.get_block		= ext2_get_block,
 	.readpage		= ext2_readpage,
 	.readpages		= ext2_readpages,
 	.writepage		= ext2_writepage,
diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index 2edd7ee..e44f5ad 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -1700,6 +1700,7 @@ static int ext3_journalled_set_page_dirt
 }
 
 static struct address_space_operations ext3_ordered_aops = {
+	.get_block		= ext3_get_block,
 	.readpage		= ext3_readpage,
 	.readpages		= ext3_readpages,
 	.writepage		= ext3_ordered_writepage,
diff --git a/fs/file_table.c b/fs/file_table.c
index bcea199..8759479 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -113,6 +113,9 @@ struct file *get_empty_filp(void)
 	if (security_file_alloc(f))
 		goto fail_sec;
 
+#ifdef CONFIG_KEVENT_POLL
+	kevent_storage_init(f, &f->st);
+#endif
 	tsk = current;
 	INIT_LIST_HEAD(&f->f_u.fu_list);
 	atomic_set(&f->f_count, 1);
@@ -160,6 +163,9 @@ void fastcall __fput(struct file *file)
 	might_sleep();
 
 	fsnotify_close(file);
+#ifdef CONFIG_KEVENT_POLL
+	kevent_storage_fini(&file->st);
+#endif
 	/*
 	 * The function eventpoll_release() should be the first called
 	 * in the file cleanup chain.
diff --git a/fs/inode.c b/fs/inode.c
index 3a2446a..0493935 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -22,6 +22,7 @@ #include <linux/pagemap.h>
 #include <linux/cdev.h>
 #include <linux/bootmem.h>
 #include <linux/inotify.h>
+#include <linux/kevent.h>
 #include <linux/mount.h>
 
 /*
@@ -166,12 +167,18 @@ #endif
 		}
 		memset(&inode->u, 0, sizeof(inode->u));
 		inode->i_mapping = mapping;
+#if defined CONFIG_KEVENT
+		kevent_storage_init(inode, &inode->st);
+#endif
 	}
 	return inode;
 }
 
 void destroy_inode(struct inode *inode) 
 {
+#if defined CONFIG_KEVENT_INODE || defined CONFIG_KEVENT_SOCKET
+	kevent_storage_fini(&inode->st);
+#endif
 	BUG_ON(inode_has_buffers(inode));
 	security_inode_free(inode);
 	if (inode->i_sb->s_op->destroy_inode)
diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c
index 9857e50..578 100644
--- a/fs/reiserfs/inode.c
+++ b/fs/reiserfs/inode.c
@@ -2997,6 +2997,7 @@ int reiserfs_setattr(struct dentry *dent
 }
 
 struct address_space_operations reiserfs_address_space_operations = {
+	.get_block = reiserfs_get_block,
 	.writepage = reiserfs_writepage,
 	.readpage = reiserfs_readpage,
 	.readpages = reiserfs_readpages,

diff --git a/include/linux/fs.h b/include/linux/fs.h
index ecc8c2c..248f6a1 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -236,6 +236,9 @@ #include <linux/mutex.h>
 #include <asm/atomic.h>
 #include <asm/semaphore.h>
 #include <asm/byteorder.h>
+#ifdef CONFIG_KEVENT
+#include <linux/kevent_storage.h>
+#endif
 
 struct hd_geometry;
 struct iovec;
@@ -348,6 +351,8 @@ struct address_space;
 struct writeback_control;
 
 struct address_space_operations {
+	int  (*get_block)(struct inode *inode, sector_t iblock,
+			struct buffer_head *bh_result, int create);
 	int (*writepage)(struct page *page, struct writeback_control *wbc);
 	int (*readpage)(struct file *, struct page *);
 	void (*sync_page)(struct page *);
@@ -526,6 +531,10 @@ #ifdef CONFIG_INOTIFY
 	struct mutex		inotify_mutex;	/* protects the watches list */
 #endif
 
+#ifdef CONFIG_KEVENT_INODE
+	struct kevent_storage	st;
+#endif
+
 	unsigned long		i_state;
 	unsigned long		dirtied_when;	/* jiffies of first dirtying */
 
@@ -659,6 +668,9 @@ #ifdef CONFIG_EPOLL
 	struct list_head	f_ep_links;
 	spinlock_t		f_ep_lock;
 #endif /* #ifdef CONFIG_EPOLL */
+#ifdef CONFIG_KEVENT_POLL
+	struct kevent_storage	st;
+#endif
 	struct address_space	*f_mapping;
 };
 extern spinlock_t files_lock;
diff --git 

Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-26 Thread Christoph Hellwig
On Wed, Jul 26, 2006 at 01:18:15PM +0400, Evgeniy Polyakov wrote:
 
 This patch includes asynchronous propagation of file's data into VFS
 cache and aio_sendfile() implementation.
 Network aio_sendfile() works lazily - it asynchronously populates pages
 into the VFS cache (which can be used for various tricks with adaptive
 readahead) and then uses the usual ->sendfile() callback.
 
 Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]
 
 diff --git a/fs/bio.c b/fs/bio.c
 index 6a0b9ad..a3ee530 100644
 --- a/fs/bio.c
 +++ b/fs/bio.c
 @@ -119,7 +119,7 @@ void bio_free(struct bio *bio, struct bi
  /*
   * default destructor for a bio allocated with bio_alloc_bioset()
   */
 -static void bio_fs_destructor(struct bio *bio)
 +void bio_fs_destructor(struct bio *bio)
  {
   bio_free(bio, fs_bio_set);
  }
 diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
 index 04af9c4..295fce9 100644
 --- a/fs/ext2/inode.c
 +++ b/fs/ext2/inode.c
 @@ -685,6 +685,7 @@ ext2_writepages(struct address_space *ma
  }
  
  struct address_space_operations ext2_aops = {
 + .get_block  = ext2_get_block,

No way in hell.  For whatever you do please provide an interface at
the readpage/writepage/sendfile/etc abstraction layer.  get_block is
nothing that can be exposed to the common code.



Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-26 Thread Christoph Hellwig
On Wed, Jul 26, 2006 at 01:18:15PM +0400, Evgeniy Polyakov wrote:
 
 This patch includes asynchronous propagation of file's data into VFS
 cache and aio_sendfile() implementation.
 Network aio_sendfile() works lazily - it asynchronously populates pages
 into the VFS cache (which can be used for various tricks with adaptive
 readahead) and then uses the usual ->sendfile() callback.

And please don't base this on sendfile.  Please make the splice infrastructure
asynchronous without duplicating all the code but rather make the existing
code async and the existing synchronous call wait on them to finish, similar
to how we handle async/sync direct I/O.  And to be honest, I don't think
adding all this code is acceptable if it can't replace the existing aio
code while keeping the interface.  So while your interface looks pretty
sane the implementation needs a lot of work still :)

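
The structure suggested above (one asynchronous code path, with the synchronous call implemented as "submit, then wait for completion") can be illustrated in plain userspace C. This is only an analogy built on pthreads, not kernel code and not the splice or direct I/O implementation. All names (struct request, submit_async(), submit_sync(), struct waiter) are hypothetical, and the "I/O" is a stand-in computation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* One request; "done" is invoked by the async path when it finishes. */
struct request {
    int input;
    int result;
    void (*done)(struct request *req);
    void *private;                      /* owned by whoever set "done" */
};

static void *worker(void *arg)
{
    struct request *req = arg;

    assert(req->done != NULL);
    req->result = req->input * 2;       /* stand-in for the real I/O */
    req->done(req);                     /* completion notification */
    return NULL;
}

/* The only real implementation: start the work and return immediately. */
int submit_async(struct request *req)
{
    pthread_t tid;

    if (pthread_create(&tid, NULL, worker, req) != 0)
        return -1;
    pthread_detach(tid);
    return 0;
}

/* What a synchronous caller sleeps on. */
struct waiter {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool finished;
};

static void wake_waiter(struct request *req)
{
    struct waiter *w = req->private;

    pthread_mutex_lock(&w->lock);
    w->finished = true;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

/* The synchronous call adds no second code path: it submits and waits. */
int submit_sync(int input)
{
    struct waiter w = {
        PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false
    };
    struct request req = {
        .input = input, .done = wake_waiter, .private = &w
    };

    if (submit_async(&req) != 0)
        return -1;
    pthread_mutex_lock(&w.lock);
    while (!w.finished)
        pthread_cond_wait(&w.cond, &w.lock);
    pthread_mutex_unlock(&w.lock);
    return req.result;
}
```

A caller that wants asynchronous behaviour supplies its own done callback to submit_async(); a caller that wants the old blocking semantics uses submit_sync(), so the two never diverge.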


Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-26 Thread David Miller
From: Christoph Hellwig [EMAIL PROTECTED]
Date: Wed, 26 Jul 2006 11:04:31 +0100

 And to be honest, I don't think adding all this code is acceptable
 if it can't replace the existing aio code while keeping the
 interface.  So while your interface looks pretty sane the
 implementation needs a lot of work still :)

Networking and disk AIO have significantly different needs.

Therefore, I really don't see it as reasonable to expect
a merge of these two things.  It doesn't make any sense.

I do agree that this stuff needs to be cleaned up, all the get_block
etc. hacks have to be pulled out and abstracted properly.  That part
of the kevent changes is indeed still crap :)


Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-26 Thread Christoph Hellwig
On Wed, Jul 26, 2006 at 02:08:49PM +0400, Evgeniy Polyakov wrote:
 On Wed, Jul 26, 2006 at 11:00:13AM +0100, Christoph Hellwig ([EMAIL PROTECTED]) wrote:
struct address_space_operations ext2_aops = {
   + .get_block  = ext2_get_block,
  
  No way in hell.  For whatever you do please provide an interface at
  the readpage/writepage/sendfile/etc abstraction layer.  get_block is
  nothing that can be exposed to the common code.
 
 Compare this with sync read methods - all they do is exactly the same
 operations with low-level blocks, which are combined into nice exported
 function, so there is _no_ readpage layer - it calls only one function
 which works with blocks.

No.  The abstraction layer there is ->readpage(s).  _A_ common implementation
works with a get_block callback from the filesystem, but there are various
others.  We've been there before, up to mid-2.3.x we had a get_block inode
operation and we got rid of it because it is the wrong abstraction.

 So it is not a technical problem, but political one.

It's a technical problem, and it's called get your abstractions right.  And
on top of that a political one and that's called get your abstraction coherent.
If you managed to argue all of us into accepting that get_block is the right
abstraction (and as I mentioned above that's technically not true) you'd
still have the burden to update everything to use the same abstraction.
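
The layering argument above can be shown with a toy userspace model. It is a sketch only: struct demo_aops, struct demo_file, blockfs_* and netfs_* are hypothetical names and do not correspond to real kernel structures. The point it illustrates is that the common code sees only a readpage operation; one filesystem happens to implement it with a private block-mapping helper, while another has no blocks at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 16

struct demo_file;

/* The abstraction the common code is allowed to see. */
struct demo_aops {
    int (*readpage)(struct demo_file *f, size_t index, char *page);
};

struct demo_file {
    const struct demo_aops *aops;
    const char *backing;        /* "disk" contents; unused by netfs */
};

/* Block-based filesystem: the mapping helper stays private to it.
 * Toy mapping: neighbouring pages are stored swapped on "disk". */
static size_t blockfs_get_block(size_t index)
{
    return (index ^ 1) * DEMO_PAGE_SIZE;
}

static int blockfs_readpage(struct demo_file *f, size_t index, char *page)
{
    memcpy(page, f->backing + blockfs_get_block(index), DEMO_PAGE_SIZE);
    return 0;
}

/* A filesystem with no notion of blocks: content comes from elsewhere. */
static int netfs_readpage(struct demo_file *f, size_t index, char *page)
{
    (void)f;
    memset(page, 'A' + (int)(index % 26), DEMO_PAGE_SIZE);
    return 0;
}

const struct demo_aops blockfs_aops = { .readpage = blockfs_readpage };
const struct demo_aops netfs_aops = { .readpage = netfs_readpage };

/* Common code: works for both because it never asks about blocks. */
int generic_read(struct demo_file *f, size_t index, char *page)
{
    assert(f != NULL && f->aops != NULL && f->aops->readpage != NULL);
    return f->aops->readpage(f, index, page);
}
```

Had generic_read() called a get_block operation directly, the second filesystem could not have been supported without inventing fake blocks.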


Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-26 Thread Evgeniy Polyakov
On Wed, Jul 26, 2006 at 11:04:31AM +0100, Christoph Hellwig ([EMAIL PROTECTED]) 
wrote:
 On Wed, Jul 26, 2006 at 01:18:15PM +0400, Evgeniy Polyakov wrote:
  
  This patch includes asynchronous propagation of file's data into VFS
  cache and aio_sendfile() implementation.
  Network aio_sendfile() works lazily - it asynchronously populates pages
  into the VFS cache (which can be used for various tricks with adaptive
  readahead) and then uses the usual ->sendfile() callback.
 
 And please don't base this on sendfile.  Please make the splice infrastructure
 asynchronous without duplicating all the code but rather make the existing
 code async and the existing synchronous call wait on them to finish, similar
 to how we handle async/sync direct I/O.  And to be honest, I don't think
 adding all this code is acceptable if it can't replace the existing aio
 code while keeping the interface.  So while your interface looks pretty
 sane the implementation needs a lot of work still :)

Kevent was created well before splice and friends, so I used what there
was :)

I stopped working on AIO, since neither the existing implementation nor
mine was able to outperform sync speeds (one of the major problems
in my implementation is get_user_pages() overhead, which can be
completely eliminated with physical memory allocation being done in
advance in userspace, like Ulrich described).
My personal opinion on existing AIO is that it is not the right design.
Benjamin LaHaise agrees with me (if I understood him right), but he
failed to move AIO outside the repeated-call model (2.4 had a state machine
based one, and out-of-tree 2.6 patches have that design too).
In theory existing AIO (with all the POSIX userspace API) can be replaced
with kevent (it will even take less space), but I would present it as a
TODO item, since kevent itself has nothing to do with AIO.

Kevent is a generic event processing mechanism; AIO, network AIO and all
others are just kernel users of its functionality.

-- 
Evgeniy Polyakov


Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-26 Thread Evgeniy Polyakov
On Wed, Jul 26, 2006 at 11:13:56AM +0100, Christoph Hellwig ([EMAIL PROTECTED]) 
wrote:
 On Wed, Jul 26, 2006 at 02:08:49PM +0400, Evgeniy Polyakov wrote:
  On Wed, Jul 26, 2006 at 11:00:13AM +0100, Christoph Hellwig ([EMAIL PROTECTED]) wrote:
 struct address_space_operations ext2_aops = {
+   .get_block  = ext2_get_block,
   
   No way in hell.  For whatever you do please provide an interface at
   the readpage/writepage/sendfile/etc abstraction layer.  get_block is
   nothing that can be exposed to the common code.
  
  Compare this with sync read methods - all they do is exactly the same
  operations with low-level blocks, which are combined into nice exported
  function, so there is _no_ readpage layer - it calls only one function
  which works with blocks.
 
 No.  The abstraction layer there is ->readpage(s).  _A_ common implementation
 works with a get_block callback from the filesystem, but there are various
 others.  We've been there before, up to mid-2.3.x we had a get_block inode
 operation and we got rid of it because it is the wrong abstraction.

Well, kevent can work not on its own, but with the common implementation,
which works with get_block(). No problem here.

  So it is not a technical problem, but political one.
 
 It's a technical problem, and it's called get your abstractions right.  And
 on top of that a political one and that's called get your abstraction coherent.
 If you managed to argue all of us into accepting that get_block is the right
 abstraction (and as I mentioned above that's technically not true) you'd
 still have the burden to update everything to use the same abstraction.

Christoph, I completely understand your point of view.
There is absolutely no technical problem in creating a common async
implementation, placing it where the existing sync one lives, and calling it
from the readpage() level.

It just requires allowing the BIO callbacks to be changed from the default
one, and (probably) even the sync readpage can be used.

-- 
Evgeniy Polyakov


Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-26 Thread Christoph Hellwig
On Wed, Jul 26, 2006 at 02:19:21PM +0400, Evgeniy Polyakov wrote:
 I stopped working on AIO, since neither the existing implementation nor
 mine was able to outperform sync speeds (one of the major problems
 in my implementation is get_user_pages() overhead, which can be
 completely eliminated with physical memory allocation being done in
 advance in userspace, like Ulrich described).
 My personal opinion on existing AIO is that it is not the right design.
 Benjamin LaHaise agrees with me (if I understood him right),

I completely agree with that as well.

 but he
 failed to move AIO outside the repeated-call model (2.4 had a state machine
 based one, and out-of-tree 2.6 patches have that design too).
 In theory existing AIO (with all the POSIX userspace API) can be replaced
 with kevent (it will even take less space), but I would present it as a
 TODO item, since kevent itself has nothing to do with AIO.

And replacing the existing aio code is exactly what I want you to do.  We
can't keep adding more and more code forever without getting rid of the old
mess.

And yes, the asynchronous pagecache population bit in your patchkit has a lot
to do with aio.  It's some variant of aio done right (or at least less bad).

I suspect the right way to go ahead is to drop that bit for now (it's by
far the worst code in the patchkit anyway) and then we can redo it later to
not get abstractions wrong and duplicate lots of code but also replace the
aio code.  I don't expect you to do that alone, you'll probably need quite
a bit of help from us FS and VM people.


Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-26 Thread Avi Kivity

David Miller wrote:


From: Christoph Hellwig [EMAIL PROTECTED]
Date: Wed, 26 Jul 2006 11:04:31 +0100

 And to be honest, I don't think adding all this code is acceptable
 if it can't replace the existing aio code while keeping the
 interface.  So while your interface looks pretty sane the
 implementation needs a lot of work still :)

Networking and disk AIO have significantly different needs.

Surely, there needs to be a unified polling interface to support single 
threaded designs.


--
error compiling committee.c: too many arguments to function

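
To make the single-threaded argument concrete: a one-thread program can only block in one place, so every event source it cares about has to be reachable from the same wait call. The sketch below uses poll() and assumes both sources are exposed as pollable file descriptors, which is exactly the property being asked for in this thread. The pipes are stand-ins for a network socket and a disk-completion source; wait_any(), demo_unified_wait() and the two flag names are hypothetical.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

enum { NET_READY = 1, DISK_READY = 2 };

/*
 * Block once for both sources.  Returns a mask of NET_READY/DISK_READY
 * (0 on timeout) or -1 on error.
 */
int wait_any(int net_fd, int disk_fd, int timeout_ms)
{
    struct pollfd fds[2] = {
        { .fd = net_fd,  .events = POLLIN },
        { .fd = disk_fd, .events = POLLIN },
    };
    int mask = 0;

    assert(net_fd >= 0 && disk_fd >= 0);
    if (poll(fds, 2, timeout_ms) < 0)
        return -1;
    if (fds[0].revents & POLLIN)
        mask |= NET_READY;
    if (fds[1].revents & POLLIN)
        mask |= DISK_READY;
    return mask;
}

/* Demo: only the "disk" side has an event pending. */
int demo_unified_wait(void)
{
    int net[2], disk[2], mask;

    if (pipe(net) != 0)
        return -1;
    if (pipe(disk) != 0) {
        close(net[0]);
        close(net[1]);
        return -1;
    }
    if (write(disk[1], "x", 1) != 1)
        mask = -1;
    else
        mask = wait_any(net[0], disk[0], 1000);

    close(net[0]);
    close(net[1]);
    close(disk[0]);
    close(disk[1]);
    return mask;
}
```

If one of the two sources could only be waited on through a separate blocking call, the single thread would have to pick one and starve the other, or fall back to busy polling.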


Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-26 Thread Ulrich Drepper
Christoph Hellwig wrote:
 My personal opinion on existing AIO is that it is not the right design.
 Benjamin LaHaise agrees with me (if I understood him right),
 
 I completely agree with that as well.

I agree, too, but the current code is not the last of the line.  Suparna
has a set of patches which make the current kernel aio code work much
better and especially make it really usable to implement POSIX AIO.

In Ottawa we were talking about submitting it and Suparna will.  We just
thought about a little longer timeframe.  I guess it could be
accelerated since she mostly has the patch done.  But I don't know her
schedule.

Important here is, don't base any decision on the current aio
implementation.

-- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖





Re: [3/4] kevent: AIO, aio_sendfile() implementation.

2006-07-26 Thread Phillip Susi

Christoph Hellwig wrote:

Networking and disk AIO have significantly different needs.

Therefore, I really don't see it as reasonable to expect
a merge of these two things.  It doesn't make any sense.


I'm not sure about that.  The current aio interface isn't exactly nice
for disk I/O either.  I'm more than happy to have a discussion about
that aspect.




I agree that it makes perfect sense for a merger because aio and 
networking have very similar needs.  In both cases, the caller hands the 
kernel a buffer and wants the kernel to either fill it or consume it, 
and to be able to do so asynchronously.  You also want to maximize 
performance in both cases by taking advantage of zero copy IO.


I wonder though, why do you say the current aio interface isn't nice for 
disk IO?  It seems to work rather nicely to me, and is much better than 
the posix aio interface.

