Re: [Request for comments] new poll API

2002-07-29 Thread Jeff Trawick
Brian Pane [EMAIL PROTECTED] writes:

 To continue the recent discussions on the problems in the current
 apr_poll API, here's a patch for apr_poll.h that illustrates my
 proposed fix.
 
 What I'm proposing here is to split the API into two parts:
 
   - A lightweight, single-function poll API for use (only!)
 with very small sets of descriptors.

Do we really need this API?  What is the sort of APR application for
which the heavy-duty API is harmful?

-- 
Jeff Trawick | [EMAIL PROTECTED]
Born in Roswell... married an alien...


RE: [Request for comments] new poll API

2002-07-29 Thread Bill Stoddard


 To continue the recent discussions on the problems in the current
 apr_poll API, here's a patch for apr_poll.h that illustrates my
 proposed fix.

 What I'm proposing here is to split the API into two parts:

   - A lightweight, single-function poll API for use (only!)
 with very small sets of descriptors.  This is basically
 the current implementation, but with a limit on the number
 of descriptors that it will accept.  With this limit, we
 can put the temporary pollfd array on the stack in apr_poll()
 to eliminate the memory leak.

   - A general-purpose poll API for larger sets of descriptors.
 This one is an abstract data type, accessible only through
 functions, so that we can do internal optimizations in the
 future (like replacing poll with /dev/poll on supported
 platforms).

Hummm. I was thinking we would create an entirely new set of APR calls
to encapsulate an event drivent network i/o interface (/dev/poll,
KQEnqueue/KQDequeue, IOCompletion ports, etc.). I'll try to work up an API
this week.

Bill



RE: [Request for comments] new poll API

2002-07-29 Thread Sascha Schumann
 Hummm. I was thinking we would create an entirely new set of APR calls
 to encapsulate an event drivent network i/o interface (/dev/poll,
 KQEnqueue/KQDequeue, IOCompletion ports, etc.). I'll try to work up an API
 this week.

Here are some resources which you might consider during the
creation of a new API.

http://www.kegel.com/c10k.html#frameworks

lists some existing event-driven engines, including a C++
implementation by Dan Kegel.

http://libevent.sourceforge.net

The architecture is supposed to handle level/state triggered
event systems.  It is still in the design phase, although
working code exists in the undernet ircd server.

http://www.acme.com/software/thttpd/

Thttpd has its own events abstraction, including (recurring)
timers and socket readiness (kqueue, poll, select).

- Sascha



RE: [Request for comments] new poll API

2002-07-29 Thread Ryan Bloom
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 
 Brian Pane [EMAIL PROTECTED] writes:
 
  To continue the recent discussions on the problems in the current
  apr_poll API, here's a patch for apr_poll.h that illustrates my
  proposed fix.
 
  What I'm proposing here is to split the API into two parts:
 
- A lightweight, single-function poll API for use (only!)
  with very small sets of descriptors.
 
 Do we really need this API?  What is the sort of APR application for
 which the heavy-duty API is harmful?

I am biting my tongue here, but Jeff, you are the person who
specifically stated that the heavy-duty API was too slow for us to use.

Ryan




Re: [Request for comments] new poll API

2002-07-29 Thread Jeff Trawick
Ryan Bloom [EMAIL PROTECTED] writes:

  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
  
  Brian Pane [EMAIL PROTECTED] writes:
  
   To continue the recent discussions on the problems in the current
   apr_poll API, here's a patch for apr_poll.h that illustrates my
   proposed fix.
  
   What I'm proposing here is to split the API into two parts:
  
 - A lightweight, single-function poll API for use (only!)
   with very small sets of descriptors.
  
  Do we really need this API?  What is the sort of APR application for
  which the heavy-duty API is harmful?
 
 I am biting my tongue here, but Jeff, you are the person who
 specifically stated that the heavy-duty API was too slow for us to use.

I said it was too slow and/or cumbersome to use in a particular
situation that does not correspond to something an APR app would do,
so I don't consider that a valid use-case for justifying the simpler
API.  (An APR app should be using an APR timeout socket option for
that situation.)

Like I said above, I'd first like to see a description of an APR app
that is harmed by doing the setup and using the accessor functions.
This should be helpful in determining how important it is to support
the simpler API flavor.

-- 
Jeff Trawick | [EMAIL PROTECTED]
Born in Roswell... married an alien...


RE: [Request for comments] new poll API

2002-07-29 Thread Ryan Bloom
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 Ryan Bloom [EMAIL PROTECTED] writes:
 
   From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
  
   Brian Pane [EMAIL PROTECTED] writes:
  
To continue the recent discussions on the problems in the
current
apr_poll API, here's a patch for apr_poll.h that illustrates my
proposed fix.
   
What I'm proposing here is to split the API into two parts:
   
  - A lightweight, single-function poll API for use (only!)
with very small sets of descriptors.
  
   Do we really need this API?  What is the sort of APR application
for
   which the heavy-duty API is harmful?
 
  I am biting my tongue here, but Jeff, you are the person who
  specifically stated that the heavy-duty API was too slow for us to
use.
 
 I said it was too slow and/or cumbersome to use in a particular
 situation that does not correspond to something an APR app would do,
 so I don't consider that a valid use-case for justifying the simpler
 API.  (An APR app should be using an APR timeout socket option for
 that situation.)

Let me see if I understand.  Apr_poll() is too slow for APR to use, but
because you don't expect APR apps to use it too often, that is okay.
Sorry, that doesn't hold water.

 Like I said above, I'd first like to see a description of an APR app
 that is harmed by doing the setup and using the accessor functions.
 This should be helpful in determining how important it is to support
 the simpler API flavor.

Well, Greg Ames used to say all the time that apr_poll was seriously
hurting the performance of Apache.

Ryan




Re: [Request for comments] new poll API

2002-07-29 Thread Jeff Trawick
Ryan Bloom [EMAIL PROTECTED] writes:

  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
  Ryan Bloom [EMAIL PROTECTED] writes:
  
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
   
Brian Pane [EMAIL PROTECTED] writes:
   
 To continue the recent discussions on the problems in the
 current
 apr_poll API, here's a patch for apr_poll.h that illustrates my
 proposed fix.

 What I'm proposing here is to split the API into two parts:

   - A lightweight, single-function poll API for use (only!)
 with very small sets of descriptors.
   
Do we really need this API?  What is the sort of APR application
 for
which the heavy-duty API is harmful?
  
   I am biting my tongue here, but Jeff, you are the person who
   specifically stated that the heavy-duty API was too slow for us to
 use.
  
  I said it was too slow and/or cumbersome to use in a particular
  situation that does not correspond to something an APR app would do,
  so I don't consider that a valid use-case for justifying the simpler
  API.  (An APR app should be using an APR timeout socket option for
  that situation.)
 
 Let me see if I understand.  Apr_poll() is too slow for APR to use, but
 because you don't expect APR apps to use it too often, that is okay.
 Sorry, that doesn't hold water.

You are grasping for generalizations and taking a lot of liberties
with the facts along the way.

  Like I said above, I'd first like to see a description of an APR app
  that is harmed by doing the setup and using the accessor functions.
  This should be helpful in determining how important it is to support
  the simpler API flavor.
 
 Well, Greg Ames used to say all the time that apr_poll was seriously
 hurting the performance of Apache.

Are you suggesting that Apache is a use-case to support the simpler
flavor?  I'm not sure how.

I suspect that rebuilding the native pollset (e.g., struct pollfd
array) every time apr_poll() is called is harmful to Apache.

I suspect that rebuilding the abstract pollset (apr_pollfd_t array) in
its entirety after calling poll() is harmful to Apache, which is only
going to process the first match.

-- 
Jeff Trawick | [EMAIL PROTECTED]
Born in Roswell... married an alien...


RE: [Request for comments] new poll API

2002-07-29 Thread Ryan Bloom

I am biting my tongue here, but Jeff, you are the person who
specifically stated that the heavy-duty API was too slow for us
to
  use.
  
   I said it was too slow and/or cumbersome to use in a particular
   situation that does not correspond to something an APR app would
do,
   so I don't consider that a valid use-case for justifying the
simpler
   API.  (An APR app should be using an APR timeout socket option for
   that situation.)
 
  Let me see if I understand.  Apr_poll() is too slow for APR to use,
but
  because you don't expect APR apps to use it too often, that is okay.
  Sorry, that doesn't hold water.
 
 You are grasping for generalizations and taking a lot of liberties
 with the facts along the way.

Sorry, but I don't think so.  The facts are simple.  You said that
apr_poll was both too slow and too complex for use inside of APR.  To
fix that, I re-implemented apr_poll() to resolve both of those issues.
Now, (because there is an infinitely fixable memory leak), we are
throwing out the re-write, and going back to the abstract type, which
was too slow and too complex for use inside of APR.  So, where am I
mis-remembering?

   Like I said above, I'd first like to see a description of an APR
app
   that is harmed by doing the setup and using the accessor
functions.
   This should be helpful in determining how important it is to
support
   the simpler API flavor.
 
  Well, Greg Ames used to say all the time that apr_poll was seriously
  hurting the performance of Apache.
 
 Are you suggesting that Apache is a use-case to support the simpler
 flavor?  I'm not sure how.
 
 I suspect that rebuilding the native pollset (e.g., struct pollfd
 array) every time apr_poll() is called is harmful to Apache.

It should actually perform better than the previous version of the code,
for small pollsets, which is what most people use with Apache.

 I suspect that rebuilding the abstract pollset (apr_pollfd_t array) in
 its entirety after calling poll() is harmful to Apache, which is only
 going to process the first match.

I am unable to parse this at all.  If you are talking about the current
implementation, then one of the advantages is that you don't need to do
that anymore.  If you are talking about the previous implementation,
then I really don't see which side of the discussion you are on.

Ryan




Re: [Request for comments] new poll API

2002-07-29 Thread Brian Pane
Ryan Bloom wrote:
I am biting my tongue here, but Jeff, you are the person who
specifically stated that the heavy-duty API was too slow for us
 

to
 

use.
 

I said it was too slow and/or cumbersome to use in a particular
situation that does not correspond to something an APR app would
   

do,
 

so I don't consider that a valid use-case for justifying the
   

simpler
 

API.  (An APR app should be using an APR timeout socket option for
that situation.)
   

Let me see if I understand.  Apr_poll() is too slow for APR to use,
 

but
 

because you don't expect APR apps to use it too often, that is okay.
Sorry, that doesn't hold water.
 

You are grasping for generalizations and taking a lot of liberties
with the facts along the way.
   

Sorry, but I don't think so.  The facts are simple.  You said that
apr_poll was both too slow and too complex for use inside of APR.
Jeff said it was too slow and complex for a specific use case
inside APR.  You generalized that to apr_poll is too slow for
APR to use, but the generalization isn't valid.  There are
many cases inside APR where the implementation uses an inline
variant of some other APR API for performance reasons (the
array and string APIs are two examples that come to mind
immediately).  Such cases don't mean that the API being bypassed
is unsuitable for use as a public API.  They simply mean that
the code that's bypassing them has used an appropriate
optimization to deliver high performance to its own callers.
--Brian



Re: [Request for comments] new poll API

2002-07-29 Thread Brian Pane
Jeff Trawick wrote:
Brian Pane [EMAIL PROTECTED] writes:
 

To continue the recent discussions on the problems in the current
apr_poll API, here's a patch for apr_poll.h that illustrates my
proposed fix.
What I'm proposing here is to split the API into two parts:
 - A lightweight, single-function poll API for use (only!)
   with very small sets of descriptors.
   

Do we really need this API?  What is the sort of APR application for
which the heavy-duty API is harmful?
This API really just adds programming convenience.  There's nothing
it provides that can't be done with the general-purpose API.  But the
main limitation of the general-purpose API is that, because it involves
some memory allocation, you can't use it for spontaneous, one-time poll
calls (like checking for readability on a single input socket if read
just returned EAGAIN) unless you can guarantee that the number of such
calls during the lifetime of the request pool has an upper limit.  With
the general-purpose poll API, the solution is to hold onto your poll
object and re-use it if you have to poll the same descriptor(s) again.
The lightweight API just provides a simpler alternative for cases where
keeping track of the pollset object(s) would be prohibitively complex.
--Brian



Re: [Request for comments] new poll API

2002-07-29 Thread Jeff Trawick
Ryan Bloom [EMAIL PROTECTED] writes:

  I suspect that rebuilding the native pollset (e.g., struct pollfd
  array) every time apr_poll() is called is harmful to Apache.
 
 It should actually perform better than the previous version of the code,
 for small pollsets, which is what most people use with Apache.

I started off agreeing that the new implementation is faster for small
pollsets, but I'm not sure that is the case when you consider
steady-state operations.  We save the overhead of the function call to
look up the returned events after the poll call but we pick up the
overhead of the internal call to get_event() just before poll() is
called.*

Hopefully everybody agrees that the current implementation is harmful
for big pollsets.

If APR had a small-pollset API and a big-pollset API, I suspect we'd
be better off in Apache just using the big-pollset API rather than
deciding at run-time which API to pick since implementing a choice
would likely introduce an extra function call which would erase any
small benefit of being able to use the small-pollset API.

  I suspect that rebuilding the abstract pollset (apr_pollfd_t array) in
  its entirety after calling poll() is harmful to Apache, which is only
  going to process the first match.
 
 I am unable to parse this at all.  If you are talking about the current
 implementation, then one of the advantages is that you don't need to do
 that anymore.

Here is the code I referred to as rebuilding the abstract pollset
(apr_pollfd_t array):

for (i = 0; i  num; i++) {
aprset[i].rtnevents = get_revent(pollset[i].revents);
}

*yeah, calling from Apache to APR is more expensive than an internal
APR call, but are we digging that deep to find the benefit?

-- 
Jeff Trawick | [EMAIL PROTECTED]
Born in Roswell... married an alien...


RE: [Request for comments] new poll API

2002-07-29 Thread Ryan Bloom
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 
 Ryan Bloom [EMAIL PROTECTED] writes:
 
   I suspect that rebuilding the native pollset (e.g., struct pollfd
   array) every time apr_poll() is called is harmful to Apache.
 
  It should actually perform better than the previous version of the
code,
  for small pollsets, which is what most people use with Apache.
 
 I started off agreeing that the new implementation is faster for small
 pollsets, but I'm not sure that is the case when you consider
 steady-state operations.  We save the overhead of the function call to
 look up the returned events after the poll call but we pick up the
 overhead of the internal call to get_event() just before poll() is
 called.*

That get_event() call can easily be removed in 99% of the cases (see
below).

 If APR had a small-pollset API and a big-pollset API, I suspect we'd
 be better off in Apache just using the big-pollset API rather than
 deciding at run-time which API to pick since implementing a choice
 would likely introduce an extra function call which would erase any
 small benefit of being able to use the small-pollset API.

In how many situations would we actually need to use the big-pollset
API?  I would much rather just write the code to use the small-pollset
API, and possible #ifdef the big-pollset API.  It would be faster to use
the small-pollset API, and if you _must_ have the large-pollset, then
you configure for it.

   I suspect that rebuilding the abstract pollset (apr_pollfd_t
array) in
   its entirety after calling poll() is harmful to Apache, which is
only
   going to process the first match.
 
  I am unable to parse this at all.  If you are talking about the
current
  implementation, then one of the advantages is that you don't need to
do
  that anymore.
 
 Here is the code I referred to as rebuilding the abstract pollset
 (apr_pollfd_t array):
 
 for (i = 0; i  num; i++) {
 aprset[i].rtnevents = get_revent(pollset[i].revents);
 }
 
 *yeah, calling from Apache to APR is more expensive than an internal
 APR call, but are we digging that deep to find the benefit?

This is a straight bit-mask check, and in most cases, can be optimized
out to a no-op.  We have to write the optimization, but 99% of the time,
the function call can be removed.  As for the first/all rtnevents
decision, Apache should be using all of the results.  We don't
currently, but we could and should.




Re: [Request for comments] new poll API

2002-07-29 Thread Jeff Trawick
Brian Pane [EMAIL PROTECTED] writes:

 Jeff Trawick wrote:
 
 Brian Pane [EMAIL PROTECTED] writes:
 
 
 To continue the recent discussions on the problems in the current
 apr_poll API, here's a patch for apr_poll.h that illustrates my
 proposed fix.
 
 What I'm proposing here is to split the API into two parts:
 
   - A lightweight, single-function poll API for use (only!)
 with very small sets of descriptors.
 
 
 Do we really need this API?  What is the sort of APR application for
 which the heavy-duty API is harmful?
 
 
   But the
 main limitation of the general-purpose API is that, because it involves
 some memory allocation, you can't use it for spontaneous, one-time poll
 calls (like checking for readability on a single input socket if read
 just returned EAGAIN) unless you can guarantee that the number of such
 calls during the lifetime of the request pool has an upper limit.

This is the sort of thing I thought we'd start discussing a couple of
hours ago :)

The current implementation is useful if the user has to find out if
the socket is readable/writable WITHOUT CONSUMING THE DATA and it is
inconvenient to keep track of the APR representation of the
pollset. If they are going to turn right around and consume the data
then they might as well play with socket options and call apr_recv().
If it is convenient to keep track of the APR representation of a
pollset, then it doesn't matter either way.

I'm not sure that I see the compelling use-case but for various
reasons it is probably best for me to assume that there are plenty of
them.  Hopefully having a separate API won't be confusing to APR
programmers.

ugh :)

-- 
Jeff Trawick | [EMAIL PROTECTED]
Born in Roswell... married an alien...


Re: [Request for comments] new poll API

2002-07-29 Thread Brian Pane
Jeff Trawick wrote:
Brian Pane [EMAIL PROTECTED] writes:
 

Jeff Trawick wrote:
   

Brian Pane [EMAIL PROTECTED] writes:
 

To continue the recent discussions on the problems in the current
apr_poll API, here's a patch for apr_poll.h that illustrates my
proposed fix.
What I'm proposing here is to split the API into two parts:
- A lightweight, single-function poll API for use (only!)
  with very small sets of descriptors.
   

Do we really need this API?  What is the sort of APR application for
which the heavy-duty API is harmful?
 

 But the
main limitation of the general-purpose API is that, because it involves
some memory allocation, you can't use it for spontaneous, one-time poll
calls (like checking for readability on a single input socket if read
just returned EAGAIN) unless you can guarantee that the number of such
calls during the lifetime of the request pool has an upper limit.
   

This is the sort of thing I thought we'd start discussing a couple of
hours ago :)
The current implementation is useful if the user has to find out if
the socket is readable/writable WITHOUT CONSUMING THE DATA and it is
inconvenient to keep track of the APR representation of the
pollset. If they are going to turn right around and consume the data
then they might as well play with socket options and call apr_recv().
If it is convenient to keep track of the APR representation of a
pollset, then it doesn't matter either way.
I'm not sure that I see the compelling use-case but for various
reasons it is probably best for me to assume that there are plenty of
them.  Hopefully having a separate API won't be confusing to APR
programmers.
ugh :)
 

hmmm...another characteristic of the use case in which the current
API is useful is that there's exactly one descriptor involved.  Would
that yield a useful simplification of the two-API plan?  I'm thinking
of something like:
 - apr_pollset API for general-purpose use (abstract interface,
   does its own memory management, etc)
 - lightweight API that just checks a single descriptor for readability
   or writability
Of course, if we can only come up with a single use case for the second
API, and it happens to be apr_wait_for_io_or_timeout(), then let's just
inline the poll/select call there and forget about the lightweight API
until we find another use case.
--Brian



Re: [Request for comments] new poll API

2002-07-29 Thread Jeff Trawick
Brian Pane [EMAIL PROTECTED] writes:

 Jeff Trawick wrote:
...
 The current implementation is useful if the user has to find out if
 the socket is readable/writable WITHOUT CONSUMING THE DATA and it is
 inconvenient to keep track of the APR representation of the
 pollset. If they are going to turn right around and consume the data
 then they might as well play with socket options and call apr_recv().
 If it is convenient to keep track of the APR representation of a
 pollset, then it doesn't matter either way.
 
 I'm not sure that I see the compelling use-case but for various
 reasons it is probably best for me to assume that there are plenty of
 them.  Hopefully having a separate API won't be confusing to APR
 programmers.
 
 ugh :)
 
 
 
 hmmm...another characteristic of the use case in which the current
 API is useful is that there's exactly one descriptor involved.

strangely, I started to comment that I thought it should be a much
more *severely* limited number of descriptors...  an application with
11 descriptors to check on today may very well have 21 descriptors to
check on tomorrow and the programmer may be frazzled at having to
switch APIs (or worse yet, recompile APR with a bigger limit and thus
frazzle our users :) )

meanwhile, an application just checking on one descriptor is more
likely to still be able to use the simple API tomorrow...

 that yield a useful simplification of the two-API plan?  I'm thinking
 of something like:
 
   - apr_pollset API for general-purpose use (abstract interface,
 does its own memory management, etc)
 
   - lightweight API that just checks a single descriptor for readability
 or writability

That feels like a good thing to do.

 Of course, if we can only come up with a single use case for the second
 API, and it happens to be apr_wait_for_io_or_timeout(), then let's just
 inline the poll/select call there and forget about the lightweight API
 until we find another use case.

I usually would want to wait until there are more compelling scenarios
to justify putting in an API.  A slight difference in this situation
is that we already have the API that I try to justify and what needs
to be added is the other one :)

-- 
Jeff Trawick | [EMAIL PROTECTED]
Born in Roswell... married an alien...


Re: [Request for comments] new poll API

2002-07-29 Thread William A. Rowe, Jr.
At 01:17 PM 7/29/2002, Brian Pane wrote:
hmmm...another characteristic of the use case in which the current
API is useful is that there's exactly one descriptor involved.  Would
that yield a useful simplification of the two-API plan?  I'm thinking
of something like:
 - apr_pollset API for general-purpose use (abstract interface,
   does its own memory management, etc)
This is extremely effective for any poll-based server daemon.  We need
this so that the pollset can be manipulated for a group of listeners, and
it will become doubly necessary that it's very abstract if we will ever
support the async dispatch model in a poll-like fashion.
 - lightweight API that just checks a single descriptor for readability
   or writability
We have that [this is what started the argument.]  But I'd argue that up
to five or six sources are very useful, consider two ideas;
1. apr_poll learns to support brigades, so that a filter stack might be
   plugged in later for cgi output.  This probably is a combination of the
   APR_SPECULATIVE interface [is there anything on the stack] and
   a bit of intimate knowledge of the raw socket if there is nothing useful
   that's already sufficiently complete that is pending on the filter stack.
2. cgi needs to poll the cgi's stdout / stderr and the client input stack,
   while polling the cgi's stdin and client response stack for ready-to-write.
This is a pretty tightly knit group of fd's that should be optimized for the
small-set case.
Of course, if we can only come up with a single use case for the second
API, and it happens to be apr_wait_for_io_or_timeout(), then let's just
inline the poll/select call there and forget about the lightweight API
until we find another use case.
Well, I've offered the bigger-than-one case, which will be true for many
apps that poll on stdin/stdout and some other socket or file entity.  I really
can't conceive of more than about 6 fd's in any obvious small-set case.
But I agree, go with an abstract implementation, and make our existing
apr_wait_for_io_or_timeout into the first special case.  It's so darned simple
that we shouldn't be trying to overwhelm it with multi-fd logic.
I *really* empathize with Ryan that we should eat our own dogfood.  But
our dogfood is built for an application like the server's accept pollset, or
the several handles of a good cgi implementation.  We shouldn't convince
ourselves that the right solution for a one-of-many is the same solution
for a one-of-few, or a one-of-one.  And we shouldn't clobber the good of
the many for the good of the few, or the one :-)
Bill




Re: [Request for comments] new poll API

2002-07-29 Thread Brian Pane
William A. Rowe, Jr. wrote:
 - lightweight API that just checks a single descriptor for readability
   or writability

We have that [this is what started the argument.]  But I'd argue that up
to five or six sources are very useful, consider two ideas;
1. apr_poll learns to support brigades, so that a filter stack might be
   plugged in later for cgi output.  This probably is a combination of 
the
   APR_SPECULATIVE interface [is there anything on the stack] and
   a bit of intimate knowledge of the raw socket if there is nothing 
useful
   that's already sufficiently complete that is pending on the filter 
stack.

2. cgi needs to poll the cgi's stdout / stderr and the client input 
stack,
   while polling the cgi's stdin and client response stack for 
ready-to-write.

This is a pretty tightly knit group of fd's that should be optimized 
for the
small-set case.

In both of these cases, though, I think the abstract API
would work just fine--and it might well be a better choice
than transparent API.  In both examples, the group of descriptors
would be long-lived (within the context of the request, at
least).  E.g., in the CGI case, we'd be doing a poll on the
same stdout/stderr repeatedly until we got the input.  And
the CGI handler is maintaining state between all those poll
calls, so it could easily use an instance of the general-purpose
abstract pollset object.  And if we use that pollset multiple
times in a typical CGI request, it's more efficient to use the
encapsulated API than the transparent one, because the
encapsulated one can amortize the O(n) cost of setting up the
internall pollfd array over multiple poll invocations.
The cases where we'd really need a multi-descriptor transparent
API are those where the application can't easily keep track of
a pollset object between calls.  I haven't been able to think of
a good example yet, though.
Of course, if we can only come up with a single use case for the second
API, and it happens to be apr_wait_for_io_or_timeout(), then let's just
inline the poll/select call there and forget about the lightweight API
until we find another use case.

Well, I've offered the bigger-than-one case, which will be true for many
apps that poll on stdin/stdout and some other socket or file entity.  
I really
can't conceive of more than about 6 fd's in any obvious small-set case.

But I agree, go with an abstract implementation, and make our existing
apr_wait_for_io_or_timeout into the first special case.  It's so 
darned simple
that we shouldn't be trying to overwhelm it with multi-fd logic.

+1
Brian