James Carlson wrote:

> 
> Yes, but I _thought_ it was clear that you wanted to get rid of
> select/poll in the path of the I/O, on the grounds that it results in
> too many system calls.  Right?
> 
  ...
> 
> I'm saying that an interface that accepts multiple file descriptors in
> an attempt to avoid poll-and-figure-out-which-one behavior, but that
> handles only one kind of transaction (I/O) and only one kind of
> descriptor (a socket) might be too limited to use easily.
> 
> What's the intended usage model?
> 

Kacheong's design involves either a call to port_getn or a call to
select/poll, followed by a call to his new function, recvfrom_list():

while (1) {
        select(...);            /* get list of active sockets */
        recvfrom_list();
        process_all_data();
}

This technique retrieves one available message on each active socket
at the cost of two system calls per loop iteration.  If the traffic
is not evenly distributed, more system calls are required, to the
point where two system calls are needed per message if all the
traffic arrives on a single socket.
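
As a concrete (if minimal) sketch, that loop might look like the
following in C.  recvfrom_list() is the proposed call, so the
signature used here is purely an assumption for illustration; the
rest is the standard select(3C)/socket API, with error handling
elided:

#include <sys/select.h>
#include <sys/socket.h>

/*
 * Hypothetical signature for the proposed call: receive one pending
 * message from each fd in fds[], filling msgs[], returning the
 * number of messages received.  The real proposal may differ.
 */
extern int recvfrom_list(int nfds, const int fds[], struct msghdr msgs[]);

void
sync_loop(int nsocks, const int socks[], struct msghdr msgs[])
{
        fd_set rset;
        int ready[FD_SETSIZE];
        int i, maxfd, nready;

        for (;;) {
                FD_ZERO(&rset);
                maxfd = -1;
                for (i = 0; i < nsocks; i++) {
                        FD_SET(socks[i], &rset);
                        if (socks[i] > maxfd)
                                maxfd = socks[i];
                }
                /* System call 1: which sockets are readable? */
                if (select(maxfd + 1, &rset, NULL, NULL, NULL) <= 0)
                        continue;
                nready = 0;
                for (i = 0; i < nsocks; i++)
                        if (FD_ISSET(socks[i], &rset))
                                ready[nready++] = socks[i];
                /* System call 2: one message per readable socket. */
                (void) recvfrom_list(nready, ready, msgs);
                /* process_all_data() would consume msgs[0..nready-1]. */
        }
}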

The synchronous approach suffers from the following limitations:

* As noted above, it doesn't handle asymmetric loadings well.
* It does not easily admit the use of threads; only one thread
   can meaningfully poll on a given set of fds at a time.
* The lack of buffer pre-posting mandates an additional copy of
   the data when recvfrom_list() is called.

The asynchronous model is this:

queue_list_of_async_recvfroms(port, ...);

in one or more threads:

        while (1) {
                port_getn(port, ...);
                process_all_data();
                queue_list_of_async_recvfroms(port, ...);
        }
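
Here queue_list_of_async_recvfroms() is shorthand for the proposed
interface, not a shipping call.  The closest approximation you can
write against the event-ports API that exists today is a
readiness-based loop using port_associate(3C) and port_getn(3C);
PORT_SOURCE_FD associations are one-shot, so the re-association at
the bottom of the loop plays the role of re-queueing the receives.
A minimal sketch (MAXEV and the single recv per event are
simplifications):

#include <sys/types.h>
#include <sys/socket.h>
#include <port.h>
#include <poll.h>

#define MAXEV   32

void
readiness_loop(int port, int nsocks, const int socks[])
{
        port_event_t ev[MAXEV];
        char buf[2048];
        uint_t nget, i;

        /* Arm every socket once; PORT_SOURCE_FD is one-shot. */
        for (i = 0; i < (uint_t)nsocks; i++)
                (void) port_associate(port, PORT_SOURCE_FD,
                    (uintptr_t)socks[i], POLLIN, NULL);

        for (;;) {
                nget = 1;       /* wait for at least one event */
                if (port_getn(port, ev, MAXEV, &nget, NULL) != 0)
                        break;
                for (i = 0; i < nget; i++) {
                        int fd = (int)ev[i].portev_object;

                        /* Still one recv per message; pre-posted
                         * buffers are what would remove this call. */
                        (void) recv(fd, buf, sizeof (buf), 0);
                        /* Re-arm: the analogue of re-queueing the
                         * async recvfrom in the proposed model. */
                        (void) port_associate(port, PORT_SOURCE_FD,
                            (uintptr_t)fd, POLLIN, NULL);
                }
        }
}

Note that this still pays a recv() per message on top of port_getn();
removing that per-message call is exactly what buffer pre-posting in
the proposed interface would buy us.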

In the case of evenly distributed inputs, the asynchronous model
also results in two system calls to retrieve a message on each
socket.  However, if we're clever enough to post more than one
pending recvfrom on each socket (using more buffers), we can also
handle asymmetric loadings without increasing the work per message
(see the AIO sketch following the list below).  We also gain the
following:

* Buffers are pre-posted, so with a proper implementation and
   hardware, copies may be avoided.  Even if copies are required,
   they can be done by other CPUs/threads/IO DMA engines.  This
   means that even single-threaded apps should see a significant
   reduction in UDP latency on MP hardware.
* Seamless threading is possible without any user locking required
   to manage pending IO lists; the event port mechanism provides a
   user-specified pointer to accompany each IO operation.
* Other pending IO operations (disk, poll(2), listen, accept, etc.)
   can be handled in the same event loop, since event ports are
   designed to unify the dispatching paradigm.
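
For comparison, completions for pre-posted buffers can already be
delivered to an event port via POSIX AIO's SIGEV_PORT notification,
sketched below.  Whether aio_read(3RT) is actually usable on sockets
is part of the gap under discussion here, and it returns no peer
address, so treat this strictly as an illustration of the
completion-and-repost pattern (PERSOCK, BUFSZ, and the rxbuf_t
layout are invented for the sketch):

#include <aio.h>
#include <port.h>
#include <string.h>

#define PERSOCK 4               /* pre-posted buffers per descriptor */
#define BUFSZ   2048

typedef struct rxbuf {
        struct aiocb    cb;
        port_notify_t   pn;
        char            data[BUFSZ];
} rxbuf_t;

/* Pre-post one receive buffer; completion arrives via port_get(). */
static int
post_recv(int port, int fd, rxbuf_t *rx)
{
        (void) memset(&rx->cb, 0, sizeof (rx->cb));
        rx->cb.aio_fildes = fd;
        rx->cb.aio_buf = rx->data;
        rx->cb.aio_nbytes = BUFSZ;
        rx->pn.portnfy_port = port;
        rx->pn.portnfy_user = rx;       /* hands rx back to us below */
        rx->cb.aio_sigevent.sigev_notify = SIGEV_PORT;
        rx->cb.aio_sigevent.sigev_value.sival_ptr = &rx->pn;
        return (aio_read(&rx->cb));
}

/* Completion loop: process each filled buffer, then re-post it. */
void
completion_loop(int port)
{
        port_event_t ev;
        rxbuf_t *rx;
        ssize_t n;

        for (;;) {
                if (port_get(port, &ev, NULL) != 0)
                        break;
                if (ev.portev_source != PORT_SOURCE_AIO)
                        continue;
                rx = ev.portev_user;
                n = aio_return(&rx->cb);
                /* ... process n bytes in rx->data ... */
                (void) post_recv(port, rx->cb.aio_fildes, rx);
        }
}

At startup one would call post_recv() PERSOCK times per socket; with
several buffers outstanding per descriptor, a burst on one socket
completes into already-posted buffers, which is the asymmetric-load
point made above.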

Supporting asynchronous IO on sockets also admits significant
performance wins on the transmit side, since zero-copy sends become
straightforward.

The downsides of asynchronous I/O are that some thought needs to be
given to transmit/receive buffer management, and that it may
represent a new programming pattern for networking developers.

- Bart



-- 
Bart Smaalders                  Solaris Kernel Performance
[EMAIL PROTECTED]               http://blogs.sun.com/barts