Hi,

>> The only way I can see where I will have the ability
>> to insert those HTTP inbound request headers is if
>> my filter runs between the CORE_IN and the HTTP_IN
>> input filter,
>
> I'm afraid this is true... Does anyone else have a better idea?

After a few days of research, it seems that only a connection or
protocol input filter would give me access to the headers and the POST
data. The problem is that once a filter starts requesting data from
CORE_IN, that data is marked as consumed by CORE_IN unless the read is
done in SPECULATIVE mode. Even SPECULATIVE mode will not work for a
large request body (e.g. 1 MB), as it will always return the same
beginning chunk of data (the max internal buffer size being 8192).

>> in which case I will need a properly
>> populated request_rec* structure to be able to use
>> the ap_XXX APIs (typical things would be get the
>> mime-type, content-length of the POST data, protocol
>> version, etc).
>
> No, just modify the protocol data... If you want to insert a specific
> header field, just construct the text form of it as if the browser sent
> it and return it to the filter on top at the right point in the data
> stream. You're going to be keeping up with data you've read anyway so
> you can return it at the right time to the calling filter. Be careful
> you have configuration to avoid buffering the entire .iso file that
> somebody tries to copy to their DAV filesystem. And your configuration
> may be roll-your-own w.r.t. selecting which request objects to operate
> on since you may have to make your decisions before Apache has read the
> entire request header.

Being an input filter, my code doesn't get invoked unless someone asks
for data. The first such request comes from read_request_line(), which
calls ap_get_brigade() in GETLINE mode. It is then that my filter gets
called, and at that point it would like to read the request line, the
HTTP headers and the POST data (a big request body or chunked encoding
will make things even harder at this point), and then perhaps add
additional headers so that the web application can access them through
an ap_table_get(r->headers_in) call.

I could save those headers in my context (f->ctx), and then modify the
brigade when ap_get_mime_headers_core() is called (assuming that the
next call that asks for data after read_request_line() will always be
ap_get_mime_headers_core()). Is that a fair assumption? Once done, I
would return the brigade with the request line that read_request_line()
originally asked for. The question is: how can I ask CORE_IN for the
request line, HTTP headers and request body without causing them to be
consumed?

The other worry is memory usage. Since all allocations will be made from
the connection pool, and with keepalive a connection can stay alive for
a long time, I can see memory usage continually increasing. Could that
be addressed by running as a protocol filter that gets inserted in the
create_request() hook?

>> Since my filter will run before the
>> HTTP_IN filter would have had a chance to parse the
>> request line and the request headers, it will most
>> likely cause confusion within the Apache internal
>> state.
>
> The simple answer is that if you return data in the right format to
> HTTP_IN, then there won't be a problem :)

I wish it was as easy as that :)
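
To make the "keep state in f->ctx and hand back the text form of the
header" idea concrete, here is a rough sketch in plain C. It is only a
simulation under stated assumptions: it uses no Apache headers, a C
string stands in for the data CORE_IN would supply, and each call to
filter_getline() stands in for one GETLINE-mode read from the filter
above. The names inject_ctx, upstream_getline, filter_getline and the
X-Injected header are all made up for illustration. The sketch returns
the extra header on the first read after the request line, so it does
not depend on which caller asks next, and it never reads ahead into the
headers or the body.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the per-filter context kept in f->ctx. */
enum inject_state { WANT_REQUEST_LINE, WANT_INJECT, PASS_THROUGH };

struct inject_ctx {
    const char *upstream;      /* raw bytes as the client sent them */
    size_t pos;                /* how much of upstream has been consumed */
    const char *extra_header;  /* header to inject, CRLF-terminated */
    enum inject_state state;
};

/* Copy the next CRLF-terminated line from upstream into out.
 * Returns the line length, 0 at end of data, -1 if out is too small. */
static int upstream_getline(struct inject_ctx *ctx, char *out, size_t outsz)
{
    const char *start = ctx->upstream + ctx->pos;
    const char *eol;
    size_t len;

    if (*start == '\0')
        return 0;
    eol = strstr(start, "\r\n");
    len = eol ? (size_t)(eol - start) + 2 : strlen(start);
    if (len + 1 > outsz)
        return -1;
    memcpy(out, start, len);
    out[len] = '\0';
    ctx->pos += len;
    return (int)len;
}

/* One simulated GETLINE-mode read by the filter above us. */
static int filter_getline(struct inject_ctx *ctx, char *out, size_t outsz)
{
    size_t len;
    int n;

    assert(ctx != NULL && out != NULL);

    switch (ctx->state) {
    case WANT_REQUEST_LINE:
        /* First read: hand back the request line untouched. */
        n = upstream_getline(ctx, out, outsz);
        if (n > 0)
            ctx->state = WANT_INJECT;
        return n;
    case WANT_INJECT:
        /* Second read: return our header as if the client had sent it.
         * Nothing is consumed from upstream on this call. */
        len = strlen(ctx->extra_header);
        if (len + 1 > outsz)
            return -1;
        memcpy(out, ctx->extra_header, len + 1);
        ctx->state = PASS_THROUGH;
        return (int)len;
    default:
        /* Every later read passes upstream data through unchanged. */
        return upstream_getline(ctx, out, outsz);
    }
}
```

With upstream set to "POST /app HTTP/1.1\r\nHost: example.com\r\n\r\n"
and extra_header set to "X-Injected: 1\r\n", successive calls would
return the request line, the injected header, the Host header and then
the blank line. A real filter would have to do the same with buckets and
brigades and handle the other read modes, which this sketch ignores.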
