On 06 Feb 2010, at 2:30 AM, William A. Rowe Jr. wrote:
On 2/5/2010 4:35 PM, Graham Leggett wrote:
All of these modules, including mod_remoteip in trunk, take a piece of
information from a request (a header value typically), and then copy
the value upstream to the parent connection, blowing away the real
value of the IP address.
Look again. It preserves both at the parent connection (to optimize the
same-same match between 2 consecutive requests, think pipelining).
In my original message I said "I have to deal with a number of
modules", but we can focus on mod_remoteip for now.
Why does this not just use a simple pool cleanup on request end to
restore the values, instead of all the song and dance to try and keep
the previous value for restoration in the next request? Attaching it
to a pool is fine, restoring it in the next request isn't clean.
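A minimal sketch of the cleanup-based restore described above. It does not use the real httpd or APR types: sketch_conn, sketch_pool, sketch_restore and the functions below are hypothetical stand-ins, and the tiny cleanup list only imitates what a request pool does. In a real module the registration would presumably go through apr_pool_cleanup_register() on r->pool.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a connection record (not the real conn_rec). */
typedef struct {
    const char *remote_ip;
} sketch_conn;

/* Hypothetical stand-in for a pool: just a LIFO list of cleanup callbacks. */
typedef struct sketch_cleanup {
    void (*fn)(void *data);
    void *data;
    struct sketch_cleanup *next;
} sketch_cleanup;

typedef struct {
    sketch_cleanup *cleanups;
} sketch_pool;

/* Register a callback to run when the pool is destroyed. */
int sketch_cleanup_register(sketch_pool *p, void (*fn)(void *), void *data)
{
    sketch_cleanup *c = malloc(sizeof(*c));
    if (c == NULL)
        return -1;
    c->fn = fn;
    c->data = data;
    c->next = p->cleanups;
    p->cleanups = c;
    return 0;
}

/* Run the cleanups, most recently registered first, and free the list. */
void sketch_pool_destroy(sketch_pool *p)
{
    while (p->cleanups != NULL) {
        sketch_cleanup *c = p->cleanups;
        p->cleanups = c->next;
        c->fn(c->data);
        free(c);
    }
}

/* Remembers which connection to fix up and what value to put back. */
typedef struct {
    sketch_conn *conn;
    const char *saved_ip;
} sketch_restore;

/* Cleanup callback: put the original address back on the connection. */
void restore_remote_ip(void *data)
{
    sketch_restore *r = data;
    r->conn->remote_ip = r->saved_ip;
}

/* Override the connection's IP for one request only. The restore is tied
 * to the request pool, so nothing is left over for the next request.
 * The caller supplies the slot; a real module would allocate it from the
 * request pool. */
int override_remote_ip(sketch_pool *request_pool, sketch_conn *conn,
                       sketch_restore *slot, const char *header_ip)
{
    slot->conn = conn;
    slot->saved_ip = conn->remote_ip;
    if (sketch_cleanup_register(request_pool, restore_remote_ip, slot) != 0)
        return -1;
    conn->remote_ip = header_ip;
    return 0;
}
```

Destroying the request pool restores the connection's address, so the next request on the same connection starts from the real value without having to undo anything itself.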
I do notice mod_remoteip can only be set server wide, and not per
directory. This makes it impossible to use while servers are in
transition, where one URL prefix is served via the old load balancer /
reverse proxy, and another URL prefix is served via a different
load balancer / reverse proxy.
This blown away IP address now becomes the IP address for all further
requests on the same connection, which, if they are coming from a load
balancer, are very unlikely to come from the same original client.
Certainly not true of mod_remoteip by design.
For the duration of the request, the value is replaced. For the
subsequent request, the value is reset. See line 252/253; is there a
simple bug somewhere or are you speaking from direct observation?
I'm sitting with one (non ASF) module that copies the header directly
into the connection (!!!). I have a second (non ASF) module that seems
to be a later incarnation of the first one, which at least gets the
pool lifetimes right and allocates the new IP out of the connection
pool. It remains a leak, though, particularly as the module is designed
to accept connections from a load balancer with a connection pool,
where every request will have a different IP.
In trying to unravel this module mess I have inherited, I noticed that
all these modules are allowing the request to poke around inside the
connection, which didn't seem very clean to me.
Instead, what I propose are fields inside the request itself, copied
by reference from the connection, and modules can then fiddle with
these request specific IP addresses to their heart's content, knowing
that the scope of the information is accurate.
It also solves the problem that in cases where you want to log the
original browser IP *and* the IP of the load balancer, you can do so.
Right now, the real IP is obscured completely.
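A minimal sketch of the request-scoped fields proposed above, assuming made-up stand-ins: peer_conn, scoped_request, the client_ip field and the functions below are hypothetical names for illustration, not existing httpd structures or APIs. The request starts with a reference to the connection's address, a module overrides only the request's copy, and both addresses stay available for logging.

```c
#include <stdio.h>

/* Hypothetical stand-in for the connection: holds the address of the peer
 * that actually connected (for example the load balancer). */
typedef struct {
    const char *remote_ip;
} peer_conn;

/* Hypothetical stand-in for the request, with its own request-scoped IP. */
typedef struct {
    peer_conn *conn;
    const char *client_ip;
} scoped_request;

/* A new request begins by pointing at the connection's address:
 * copied by reference, no allocation. */
void scoped_request_init(scoped_request *r, peer_conn *c)
{
    r->conn = c;
    r->client_ip = c->remote_ip;
}

/* What a header-parsing module would do under this proposal:
 * change the request only, never the connection. */
void scoped_request_set_client_ip(scoped_request *r, const char *header_ip)
{
    if (header_ip != NULL && header_ip[0] != '\0')
        r->client_ip = header_ip;
}

/* Format both addresses into buf; returns whatever snprintf returns. */
int scoped_request_format_log(const scoped_request *r, char *buf, size_t len)
{
    return snprintf(buf, len, "client=%s peer=%s",
                    r->client_ip, r->conn->remote_ip);
}
```

Because the override lives in the request, a second request initialised on the same connection sees the connection's own address again, with no restore step needed.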
Regards,
Graham