Hi, 

>> The only way I can see where I will have the ability
>>  to insert those HTTP inbound request headers is if
>>  my filter runs between the CORE_IN and the HTTP_IN
>>  input filter,
>
> I'm afraid this is true...  Does anyone else have a better idea?

After a few days of research, it seems that only a connection 
or protocol input filter would give me access to the headers + POST data.
The problem is that once a filter starts requesting
data from CORE_IN, that data is consumed by CORE_IN unless the
read is done in SPECULATIVE mode. Even SPECULATIVE mode
will not work for a large request body (e.g. 1 MB), as it always
returns the same beginning chunk of data (the max internal buffer size 
being 8192 bytes).
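
To make the difference concrete, here is a toy model of the two read
styles. It is plain C and does not use the httpd API: toy_source,
toy_read and toy_peek are names I made up, and the real CORE_IN works
on bucket brigades, so treat it only as a sketch of the behaviour I am
describing.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for CORE_IN: a byte source with a read cursor.
 * Hypothetical names; this is not how httpd stores the data. */
typedef struct {
    const char *data;
    size_t len;
    size_t pos;   /* number of bytes already consumed */
} toy_source;

/* Consuming read: copy up to n bytes and advance the cursor. */
static size_t toy_read(toy_source *s, char *out, size_t n)
{
    size_t avail = s->len - s->pos;
    if (n > avail)
        n = avail;
    memcpy(out, s->data + s->pos, n);
    s->pos += n;
    return n;
}

/* Speculative read: copy up to n bytes but leave the cursor alone,
 * so the next peek starts from the same place again. */
static size_t toy_peek(const toy_source *s, char *out, size_t n)
{
    size_t avail = s->len - s->pos;
    if (n > avail)
        n = avail;
    memcpy(out, s->data + s->pos, n);
    return n;
}
```

Two toy_peek() calls in a row return the same leading bytes, which is
the limitation I keep running into: peeking never gets me past the
first chunk, and toy_read() is the only call that moves forward.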

>
>> in which case I will need a properly
>>  populated request_rec* structure to be able to use
>>  the ap_XXX APIs (typical things would be get the
>>  mime-type, content-length of the POST data, protocol
>>  version, etc).
>
> No, just modify the protocol data...  If you want to insert a specific 
> header field, just construct the text form of it as if the browser sent 
> it and return it to the filter on top at the right point in the data 
> stream.  You're going to be keeping up with data you've read anyway so 
> you can return it at the right time to the calling filter.  Be careful 
> you have configuration to avoid buffering the entire .iso file that 
> somebody tries to copy to their DAV filesystem.  And your configuration 
> may be roll-your-own w.r.t. selecting which request objects to operate 
> on since you may have to make your decisions before Apache has read the 
> entire request header.

Being an input filter, my code doesn't 
get invoked unless someone asks for some data. The first request
comes from read_request_line(), which
calls ap_get_brigade() in GETLINE mode. It is then
that my filter gets called, and at that point it would like to read the
request line + HTTP headers + POST data (a big request
body or chunked encoding will make things
even harder), and then perhaps add additional headers 
so that the web application can access them through 
an apr_table_get(r->headers_in, ...) call. I could save those headers
in my context (f->ctx), and then modify the brigade 
when ap_get_mime_headers_core() is called (assuming
that the next call that asks for data after 
read_request_line() will always be ap_get_mime_headers_core()).
Is that a fair assumption?
Once done, I would return the brigade with the request line
that read_request_line() originally asked for.
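
The header insertion itself seems to be the easy part. As a minimal
sketch on a flat buffer (insert_header is a hypothetical helper of
mine, not an httpd function, and the real filter would have to do this
on buckets and cope with the header block arriving in pieces):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of httpd: copy a raw request into
 * `out`, adding one extra header line just before the blank line that
 * ends the header block. `hdr_line` must already end in "\r\n".
 * Returns the number of bytes written, or 0 if the end of the headers
 * was not found or `out` is too small. */
static size_t insert_header(const char *req, size_t req_len,
                            const char *hdr_line,
                            char *out, size_t out_cap)
{
    size_t hdr_len = strlen(hdr_line);
    size_t split = 0;
    size_t i;

    /* Look for "\r\n\r\n"; the new header goes after the first CRLF. */
    for (i = 0; i + 4 <= req_len; i++) {
        if (memcmp(req + i, "\r\n\r\n", 4) == 0) {
            split = i + 2;
            break;
        }
    }
    if (split == 0 || req_len + hdr_len > out_cap)
        return 0;

    memcpy(out, req, split);                      /* request line + headers */
    memcpy(out + split, hdr_line, hdr_len);       /* the added header       */
    memcpy(out + split + hdr_len, req + split,    /* blank line + body      */
           req_len - split);
    return req_len + hdr_len;
}
```

The request line and the body pass through untouched; only the header
block grows by one line, so the filter above me should just see a
slightly longer set of headers.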

The question is: how can I ask CORE_IN for the request line,
HTTP headers and request body without causing them to be 
consumed? 
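
If consuming is unavoidable, the only alternative I can picture is the
one suggested above: read the data for real, keep it in my own context,
and replay it line by line when the caller asks in GETLINE mode. A
rough sketch of that replay step, again on a flat buffer with made-up
names (stash, stash_getline), not the brigade API:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stash of bytes my filter has already pulled from the
 * filter below it. */
typedef struct {
    const char *buf;
    size_t len;
    size_t off;   /* next byte not yet handed back to the caller */
} stash;

/* Hand back one line (up to and including '\n') per call, roughly what
 * a GETLINE read expects. Returns the line length, or 0 when no
 * complete line is left or `out` is too small. */
static size_t stash_getline(stash *s, char *out, size_t out_cap)
{
    const char *start = s->buf + s->off;
    const char *nl = memchr(start, '\n', s->len - s->off);
    size_t n;

    if (nl == NULL)
        return 0;
    n = (size_t)(nl - start) + 1;
    if (n > out_cap)
        return 0;
    memcpy(out, start, n);
    s->off += n;
    return n;
}
```

The first call would satisfy read_request_line(), and the following
calls would feed the header lines one at a time. This does nothing
about the size of the stash for a large body, which is the part that
still worries me.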

The other worry is memory usage. Since all allocation will
be done from the connection pool, and with keepalive that connection
can stay alive for a long time, I can see memory
usage growing steadily. Could that be addressed by running 
as a protocol filter that gets inserted in the create_request()
hook?

>>  Since my filter will run before the
>>  HTTP_IN filter would have had a chance to parse the
>>  request line and the request headers, it will most
>>  likely cause confusion within the Apache internal
>>  state.

> The simple answer is that if you return data in the right format to 
> HTTP_IN, then there won't be a problem :)  

I wish it were as easy as that :)
