Ryan Bloom wrote:

So, the problem that this design is trying to solve is a memory leak.  I
would rather just focus on fixing that problem.  So, there are a couple
of things here,  1)  The apr_pool_t that is in the current code is only
there because I needed it for the apr_palloc(), if that call is removed
so should the apr_pool_t.  2)  This is a multi-pronged approach.  The
assumption is that modern platforms will get the most benefit, but all
platforms will have the leak removed.  With no further ado, the approach
I would suggest.

The idea is that we will still end up creating the pollfd_t on every
call to apr_poll(), this does have a performance penalty, but on modern
platforms, that performance penalty is constant.  The only real
difference is how you allocate the array.  There are three options:

1)  C99 specifies Variable Length Arrays (VLAs) specifically to solve
this problem.  If your compiler supports it (gcc does on Linux at least,
as does MSVC I believe), then we create an array on the stack with a
size of the num argument.  (Obviously, we would need some autoconf
macros to determine if the platform supports VLAs.)

2)  For platforms that don't support VLAs, we create an array of size
FOO (value to be determined later).  Because this is allocated on the
stack, we just need to fill it out and call poll().

3)  If the caller needs more than FOO elements, we use malloc() and
free() to allocate the pollfd_t array.  This is an isolated call to
malloc/free, only required on platforms that don't support VLAs, and can
be optimized out if the user sets FOO correctly at compile time.  (For
Apache, the default will most likely be < 10, but we can experiment for
the best value).


The only objection I have to this approach is that anybody who calls poll with a large number of file descriptors (and thus hits case 3 above) probably has done a lot of work to design a highly concurrent application that they think will run efficiently. And they'll be in for an unpleasant surprise when they profile it and find out that we're doing a malloc/free on every poll request.

IMHO, it's better to say that we won't support more than "FOO"
descriptors through this API than to introduce a major performance
degradation for apps that go beyond that threshold.

I suppose we could have two APIs: the transparent one for small
numbers of descriptors (low overhead, but not scalable), and the
function-encapsulated API for general use (higher overhead, but
very scalable, possibly with support for /dev/poll in the future).

If we went with the two-API approach, it would resolve my
veto on the current API, because the memory leak would go
away and the O(n) scalability problem wouldn't matter if
n was never allowed to exceed, say, 10.

--Brian




Reply via email to