I've spent an afternoon nailing down a problem, and it's turned out to be
some anomalous behaviour which I think ought to be documented. (Note that
I'm talking about Apache-1.3 here, but a brief glance at the 2.1.6 source
suggests it has the same issue.)

Issue: the REQUEST_URI value passed to a CGI environment is _not_ the same
as the %{REQUEST_URI} substitution in mod_rewrite. See here:

[main/util_script.c]

    ap_table_setn(e, "REQUEST_URI", original_uri(r));

[modules/standard/mod_rewrite.c]

    else if (strcasecmp(var, "REQUEST_URI") == 0) { /* non-standard */
        result = r->uri;

Looks like an innocuous difference, doesn't it? :-)

Where it bit me is when a request has been associated with an action
handler, and so the original URI is not the same as the rewritten one. You
can have

    AddType application/x-php php
    Action application/x-php /common-cgi/php_wrapper

and if you actually *execute* php_wrapper on this box, then the script gets
the correct REQUEST_URI as expected. However if you try to do something
clever in mod_rewrite, such as trigger the request to be proxied to another
machine where it will be executed:

    RewriteRule ^/common-cgi/php_wrapper http://foo.com%{REQUEST_URI} [P]

then %{REQUEST_URI} is not what you expect: it is the internally redirected
URI (r->uri), not the one the client sent. (In fact it was made worse in
my case because a user had a .htaccess file doing per-directory rewrites,
and I really needed to proxy the original URI, not the rewritten one.)
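
To make the difference concrete, here is a hypothetical request traced
through the setup above (the path /docs/info.php is made up for
illustration, and this assumes no other rewrites get in the way):

    request line:               GET /docs/info.php?x=1 HTTP/1.0
    REQUEST_URI seen by CGI:    /docs/info.php?x=1
    %{REQUEST_URI} in rewrite:  /common-cgi/php_wrapper/docs/info.php

The last value is r->uri after the Action handler's internal redirect, so
the proxy rule above sends the wrapper path to foo.com rather than the URI
the client asked for.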

Now, what the CGI's original_uri() function does is to take the original
request line, strip off everything before the first space and after the
second space, and use that. You can simulate this in mod_rewrite:

    RewriteCond %{THE_REQUEST} "^[^ ]+ +(/[^ ]*)"
    RewriteRule ^ - [E=REQUEST_URI:%1]

and so I have a solution to my problem. There doesn't seem to be any other
mod_rewrite expansion apart from THE_REQUEST which contains the original
URI.
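
The capture can also be used directly in the proxy rule rather than going
through an environment variable. This is only a sketch, reusing the foo.com
host and wrapper path from the example above. With a single pair of
parentheses in the RewriteCond pattern the capture is %1, and it includes
any query string, since it is taken straight from the request line:

    RewriteCond %{THE_REQUEST} "^[^ ]+ +(/[^ ]*)"
    RewriteRule ^/common-cgi/php_wrapper http://foo.com%1 [P]

If you go the environment variable route instead, the value can be read
back in a later rule as %{ENV:REQUEST_URI}.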

I very much doubt you can change the behaviour of either the %{REQUEST_URI}
mod_rewrite function or the CGI REQUEST_URI environment variable without
breaking people everywhere, but perhaps a nice juicy note could be put into
the documentation, or at least in the source code at the two points above?
It might save someone else having to spend a whole afternoon scratching
their head :-)

Also, I think it would be nice in mod_rewrite to have %{ORIGINAL_URI} which
gives the same value as REQUEST_URI in a CGI environment. However I can live
without that, given that it can be simulated using a regular expression as
shown above.

Regards,

Brian Candler.
