James Carlson wrote:
> Yes, but I _thought_ it was clear that you wanted to get rid of
> select/poll in the path of the I/O, on the grounds that it results
> in too many system calls.  Right?
> ...
> I'm saying that an interface that accepts multiple file descriptors
> in an attempt to avoid poll-and-figure-out-which-one behavior, but
> that handles only one kind of transaction (I/O) and only one kind
> of descriptor (a socket) might be too limited to use easily.
>
> What's the intended usage model?

Kacheong's design involves either a call to port_getn or a call to
select/poll, followed by a call to his new function, recvfrom_list():

    while (1) {
        select(...);       /* get list of active sockets */
        recvfrom_list();
        process_all_data();
    }

This technique will retrieve 1 available message on each socket at the
cost of 2 system calls.  If the traffic is not evenly distributed, more
system calls are required, to the point where 2 system calls are
required per message if all the traffic arrives on a single socket.

This suffers from the following limitations:

* As noted above, it doesn't handle asymmetric loadings well.

* It does not easily admit the use of threads; only one thread can
  meaningfully poll on a set of fds at a time.

* The lack of buffer pre-posting mandates an additional copy of the
  data when recvfrom_list() is called.

The asynchronous model is this:

    queue_list_of_async_recvfroms(port, ...);

    /* in one or more threads: */
    while (1) {
        port_getn(port, ...);
        process_all_data();
        queue_list_of_async_recvfroms(port, ...);
    }

In the case of evenly distributed inputs, this also results in 2 system
calls to retrieve a message on each port.  However, if we're clever
enough to post more than one pending recvfrom on each socket (using
more buffers), we can also handle asymmetric loadings w/o increasing
the work per message.  We also gain the following:

* Buffers are pre-posted, so with a proper implementation & hardware,
  copies may be avoided.  Even if copies are required, they can be done
  by other CPUs/threads/IO DMA engines.  This means that even
  single-threaded apps should see a significant reduction in UDP
  latency on MP hardware.

* Seamless threading is possible w/o any user locking required to
  manage pending IO lists; the event port mechanism provides a
  user-specified pointer to accompany each IO operation.

* Other pending IO operations (disk, poll(2), listen, accept, etc.)
  can be handled in the same event loop, since event ports are
  designed to unify the dispatching paradigm.

Supporting asynchronous IO on sockets also admits significant
performance wins on the transmit side, since zero copy is easily done.
The downsides of asynchronous I/O are that some thought needs to be
given to transmit/receive buffer management, and that it may represent
a new programming pattern for networking developers.

- Bart

--
Bart Smaalders                  Solaris Kernel Performance
[EMAIL PROTECTED]               http://blogs.sun.com/barts
_______________________________________________
networking-discuss mailing list
networking-discuss@opensolaris.org