Re: using mod_proxy for subrequests

Sorin Manolache Wed, 04 May 2011 05:01:04 -0700

On Wed, May 4, 2011 at 12:39,  <r...@tuxteam.de> wrote:
> On Wed, May 04, 2011 at 11:36:35AM +0200, Sorin Manolache wrote:
>> On Wed, May 4, 2011 at 11:34,  <r...@tuxteam.de> wrote:
>> > Hello list,
>> >
>> > as the subject line says, I'm trying to run a subrequest through
>> > mod_proxy and need to post-process the subrequest's response data.
>> > Looking at older posts on this list it seems as if the only way to
>> > accomplish this is:
>> >
>> > (1)  create a subrequest with ap_sub_req_lookup_uri(...)
>> >
>> > (2) modify parts of the created subrequest (filename, handler, proxyreq
>> > etc.)
>> >
>> > (3) Install a filter that captures the response data
>> >
>> > (4) run that subrequest
>>
>> Use it in conjunction with RewriteRules:
>>
>> RewriteCond     %{IS_SUBREQ}            true
>> RewriteRule     ^/some_name$    http://backend.host.net/path?query_string [P]
>
> Hmm, I don't seem to get what you do differently compared with my
> approach:
>
>
>> request_rec *subr = ap_sub_req_method_uri("GET", "/some_name", r, NULL);
>
> Same as my (1)
> Here, "/some_name" is still an arbitrary URI and _not_ the proxy URI I
> want to query. BTW, this does clutter the URL namespace, a big no-no in
> my usecase ...
>
>> ap_add_output_filter(post_processing_filter_name, filter_context,
>> subr, subr->connection);
>
> Same as my (3)
>
>> int status = ap_run_sub_req(subr);
>> int http_status = subr->status;
>> // optional: subr->main = r;
>> if (ap_is_HTTP_ERROR(status) || ap_is_HTTP_ERROR(http_status)) {
>>    // some error handling
>> }
>
> And you still need to _run_ the subrequest to get at the response
> status etc.
>
>>
>> There are some subtleties here:
>>
>> 1. The rewrite rules are run in the translate_name hook. If you want
>> to use %{ENV:request_note_name} in your rewrite rule, you have to copy
>> the notes somehow (for example in another translate_name callback that
>> is run before the mod_rewrite callbacks) from the main request notes to
>> the subrequest notes.
>>
>> 2. Subrequests are not kept alive. In order to keep them alive, you
>> could try to hook APR_OPTIONAL_HOOK(proxy, fixups, &proxy_fixups,
>> NULL, NULL, APR_HOOK_MIDDLE). In the proxy_fixups callback, you can
>> set subr->main = NULL; Then, after ap_run_sub_req, you can re-set
>> subr->main = r (the "optional" line in the code example above).
>
> But that means losing all request context in the subrequest! One of
> the main reasons to use mod_proxy instead of
> some-arbitrary-webclient-lib is the fact that mod_proxy passes all
> incoming headers to the backend server. A must in my case.


The request_rec structure of the subrequest is already correctly set
up when I cut its link to the main request.

>> I'm
>> using this trick but I do not know all its consequences.
>
> Hmmm - bold. The costs of server downtime might easily exceed my
> monthly income in this case :-)

I didn't mean that I'm really clueless. I trawled through the Apache
sources quite extensively before I decided to do it. And there's a
commercial/financial stake in my case too.

If you look at mod_proxy's sources, there are four places in which
r->main is checked: two in ap_proxy_http_request, one in
ap_proxy_backend_broke and one in mod_proxy_ajp.c.

At the first check in ap_proxy_http_request, If-Match, If-None-Match,
If-Range, If-Modified-Since and If-Unmodified-Since are not passed
through in the subrequest.

At the second check in ap_proxy_http_request, for subrequests:

*) the connection is marked to be closed after the request
*) Content-Length and Transfer-Encoding are removed
*) the main request body, if any, is not forwarded to the subrequest's backend.

So if you set subr->main to NULL, none of the effects listed above apply.
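
To make the two ap_proxy_http_request checks concrete, here is a toy
model in plain C. It is only a sketch of the behaviour described above,
not mod_proxy's code: struct toy_request and the toy_* functions are
made-up names, and the real checks work on request_rec and the proxy
connection rather than on a single pointer.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for request_rec: only the link to the main request. */
struct toy_request {
    const struct toy_request *main;   /* NULL for a main request */
};

/* HTTP header names are case-insensitive, so compare them that way. */
static bool same_header(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return false;
        a++;
        b++;
    }
    return *a == *b;   /* true only if both strings ended together */
}

/* Does the header survive the two subrequest checks described above? */
bool toy_keeps_header(const struct toy_request *r, const char *name)
{
    static const char *const dropped[] = {
        /* first check: conditional headers */
        "If-Match", "If-None-Match", "If-Range",
        "If-Modified-Since", "If-Unmodified-Since",
        /* second check: body framing headers */
        "Content-Length", "Transfer-Encoding",
    };
    size_t i;

    if (r->main == NULL)
        return true;   /* not a subrequest: neither check fires */
    for (i = 0; i < sizeof dropped / sizeof dropped[0]; i++) {
        if (same_header(name, dropped[i]))
            return false;
    }
    return true;
}

/* Second check: the backend connection is closed after a subrequest. */
bool toy_closes_backend_connection(const struct toy_request *r)
{
    return r->main != NULL;
}

/* Second check: the main request body is not forwarded for a subrequest. */
bool toy_forwards_request_body(const struct toy_request *r)
{
    return r->main == NULL;
}
```

With sub.main pointing at the main request, toy_keeps_header(&sub,
"If-Range") is false and toy_closes_backend_connection(&sub) is true.
After sub.main = NULL both answers flip, which is what the trick relies
on.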

In ap_proxy_backend_broke, if r is a subrequest and the backend broke,
the main request's response is marked as non-cacheable.

I didn't look into mod_proxy_ajp.c.

Sorin
