On Wed, May 4, 2011 at 12:39, <r...@tuxteam.de> wrote:
> On Wed, May 04, 2011 at 11:36:35AM +0200, Sorin Manolache wrote:
>> On Wed, May 4, 2011 at 11:34, <r...@tuxteam.de> wrote:
>> > Hello list,
>> >
>> > as the subject line says, I'm trying to run a subrequest through
>> > mod_proxy and need to post-process the subrequest's response data.
>> > Looking at older posts on this list it seems as if the only way to
>> > accomplish this is:
>> >
>> > (1) create a subrequest with ap_sub_req_lookup_uri(...)
>> >
>> > (2) modify parts of the created subrequest (filename, handler,
>> > proxyreq etc.)
>> >
>> > (3) install a filter that captures the response data
>> >
>> > (4) run that subrequest
>>
>> Use it in conjunction with RewriteRules:
>>
>> RewriteCond %{IS_SUBREQ} true
>> RewriteRule ^/some_name$ http://backend.host.net/path?query_string [P]
>
> Hmm, I don't seem to get what you do differently compared with my
> approach:
>
>> request_rec *subr = ap_sub_req_method_uri("GET", "/some_name", r, NULL);
>
> Same as my (1)
> Here, "/some_name" is still an arbitrary URI and _not_ the proxy URI I
> want to query. BTW, this does clutter the URL namespace, a big no-no in
> my use case ...
>
>> ap_add_output_filter(post_processing_filter_name, filter_context,
>>                      subr, subr->connection);
>
> Same as my (3)
>
>> int status = ap_run_sub_req(subr);
>> int http_status = subr->status;
>> // optional: subr->main = r;
>> if (ap_is_HTTP_ERROR(status) || ap_is_HTTP_ERROR(http_status)) {
>>     // some error handling
>> }
>
> And you still need to _run_ the subrequest to get at the response
> status etc.
>
>>
>> There are some subtleties here:
>>
>> 1. The rewrite rules are run in the translate_name hook. If you want
>> to use %{ENV:request_note_name} in your rewrite rule, you have to copy
>> them somehow (for example in another translate_name callback that is
>> run before the mod_rewrite callbacks) from the main request notes to
>> the subrequest notes.
>>
>> 2. Subrequests are not kept alive. In order to keep them alive, you
>> could try to hook APR_OPTIONAL_HOOK(proxy, fixups, &proxy_fixups,
>> NULL, NULL, APR_HOOK_MIDDLE). In the proxy_fixups callback, you can
>> set subr->main = NULL; Then, after ap_run_sub_req, you can re-set
>> subr->main = r (the "optional" line in the code example above).
>
> But that means losing all request context in the subrequest! One of
> the main reasons to use mod_proxy instead of
> some-arbitrary-webclient-lib is the fact that mod_proxy passes all
> incoming headers to the backend server. A must in my case.

The request_rec structure of the subrequest is already correctly set
up when I cut its link to the main request.

>> I'm
>> using this trick but I do not know all its consequences.
>
> Hmmm - bold. The costs of server downtime might easily exceed my
> monthly income in this case :-)

I didn't mean that I'm really clueless. I trawled through the Apache
sources quite extensively before I decided to do it. And there's a
commercial/financial stake in my case too.

If you look at mod_proxy's sources, there are 4 places in which r->main
is checked: two in ap_proxy_http_request, one in
ap_proxy_backend_broke and one in mod_proxy_ajp.c.

At the first check in ap_proxy_http_request, If-Match, If-None-Match,
If-Range, If-Modified-Since and If-Unmodified-Since are not passed
through in the subrequest.

At the second check, for subrequests:
*) the connection is marked to be closed after the request
*) Content-Length and Transfer-Encoding are removed
*) the main request body, if any, is not forwarded to the
   subrequest's backend.

So if you set subreq->main to NULL you won't have the effects listed
above.

In ap_proxy_backend_broke, if r is a subrequest and the backend broke,
the main request response is marked as non-cacheable.

I didn't look into mod_proxy_ajp.c.

Sorin
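
Illustrative note: the r->main checks described above can be pictured with a small stand-alone toy model. This is only a sketch of the behaviour as summarised in this message, not code from the mod_proxy sources. The struct toy_request and the toy_* functions are hypothetical names invented for the illustration, and the real request_rec has many more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical stand-in for request_rec: only the field discussed here. */
struct toy_request {
    struct toy_request *main;  /* NULL for a main request, parent otherwise */
};

/* Conditional headers that, per the message, are dropped for subrequests. */
static const char *const toy_conditional_headers[] = {
    "If-Match", "If-None-Match", "If-Range",
    "If-Modified-Since", "If-Unmodified-Since"
};

/* Returns 1 if the header would be forwarded to the backend, 0 if dropped. */
static int toy_forwards_header(const struct toy_request *r, const char *name)
{
    size_t i;
    size_t n = sizeof toy_conditional_headers / sizeof toy_conditional_headers[0];

    assert(r != NULL && name != NULL);
    if (r->main == NULL)
        return 1;              /* treated as a main request: nothing filtered */
    for (i = 0; i < n; i++) {
        if (strcasecmp(name, toy_conditional_headers[i]) == 0)
            return 0;          /* first check: conditional headers dropped */
    }
    if (strcasecmp(name, "Content-Length") == 0 ||
        strcasecmp(name, "Transfer-Encoding") == 0)
        return 0;              /* second check: body-framing headers removed */
    return 1;
}

/* Second check: backend connection is closed after a subrequest. */
static int toy_closes_backend_connection(const struct toy_request *r)
{
    assert(r != NULL);
    return r->main != NULL;
}

/* Second check: the request body is only forwarded for main requests. */
static int toy_forwards_body(const struct toy_request *r)
{
    assert(r != NULL);
    return r->main == NULL;
}
```

In this model, setting the main pointer to NULL before the checks run makes all three functions behave as for a main request, which is the effect the subr->main = NULL trick relies on.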