I'm having a doozy of a time with a mod_rewrite setup, and I'd love some extra eyes and brains. Here's my dealio.

I have a complex setup in httpd.conf. First, a RewriteRule is hit:

    RewriteRule /some/stuff/(.*) /some/servlet?url=/new/stuff/($1) [QSA,PT,NE,L]

Then /some/servlet runs, and after doing what it does, it reconnects to the HTTP server (at 127.0.0.1) using the URL in the "url=" query param. Finally, this second HTTP connection hits a ScriptAlias:

    ScriptAlias /new/stuff /final/uri.cgi

I am confident that the steps themselves are fine, so please try to avoid recommendations like "simplify your ruleset". If my scenario is revealing a real bug, let's address that, please. :-)

I'm requesting a URL:

    http://server/some/stuff/foo%25bar/

I set a breakpoint in mod_rewrite.c:hook_uri2file(), and I see that r->uri has already been URI-unescaped to "/some/stuff/foo%bar". I presume this is the server core doing this? One of the first things mod_rewrite does is to say:

    if (r->filename == NULL)
        copy r->uri to r->filename;

Now, my rewrite rule uses the [PT] flag, so after finishing the first rewrite (applying the rule), mod_rewrite does the reverse copy, copying r->filename to r->uri. Keep in mind that r->uri (and now r->filename) has already been URI-unescaped.

Now my servlet is activated. Once again, the URI is unescaped. Logs from my servlet indicate a query string of "url=/new/stuff/foo\xBAr" (the "%ba" has been decoded into a single byte). The servlet does its stuff, and then performs the turnaround it's been asked to perform. It URI-escapes the URL and hits "http://localhost/new/stuff/foo%bar". Finally, the ScriptAlias catches this URL, but of course it's wrong -- off by one level of URI-escaping.

I tried patching mod_rewrite so that the passthrough step re-encodes the URI. That worked okay for the URL we've been talking about, but a) it's the wrong thing to do given the definition of the PT flag, and b) it isn't a complete solution. Why (b)?

Well, because I have this other request that I make sometimes:

    http://server/some/stuff/foo%26bar/

When I make that request, mod_rewrite is again handed an unescaped URI, "/some/stuff/foo&bar". So when I try to re-escape it with ap_escape_uri, it still comes out "/some/stuff/foo&bar". Yep: because '&' is valid in URIs, it doesn't get re-escaped. So by the time my servlet is called, its query string has "url=/new/stuff/foo&bar", which means the "url" param's value is "/new/stuff/foo". Well, that won't work.

I can't remove the [NE] from that RewriteRule, because sometimes my original URLs carry query string data themselves, which is combined (using the [QSA] flag) with the servlet params. Since query string data is never unescaped by the server core or by mod_rewrite, removing [NE] would cause my original URL's query data to be doubly escaped (this happened in the past, and is the reason the [NE] was added in the first place).

I'm completely out of ideas save for the one I don't want to face up to: fixing Apache's "let's taint r->uri by unescaping it before our handlers get a chance to see the thing" bug.

Help?
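
For anyone who wants to poke at the two failure modes without a debugger, here is a rough Python sketch of the escaping levels involved. It is only a model, not Apache's code: Python's urllib.parse stands in for the server core and for the servlet container's query parsing, all function names are made up for illustration, quote(..., safe="/&") is just an approximation of the ap_escape_uri behaviour described above (it leaves '&' alone), and the literal parentheses around $1 in my rule are left out to keep things short.

```python
from urllib.parse import parse_qs, quote, unquote

PREFIX = "/some/stuff/"  # the part matched by the RewriteRule pattern


def core_unescape(path: str) -> str:
    """Model of the server core unescaping the request path once."""
    return unquote(path)


def reescape(uri: str) -> str:
    """Rough stand-in for ap_escape_uri (assumption): '%' is escaped, '&' is not."""
    return quote(uri, safe="/&")


def rewrite(uri: str) -> str:
    """Model of the RewriteRule with [NE]: the substitution is not escaped again."""
    return "url=/new/stuff/" + uri[len(PREFIX):]


def servlet_url_param(query: str) -> str:
    """Model of the servlet reading the 'url' parameter (one more unescape)."""
    return parse_qs(query)["url"][0]


def simulate(request_path: str, patched: bool = False) -> str:
    """Follow a request path up to the value the servlet sees for 'url'."""
    uri = core_unescape(request_path)
    if patched:
        # the experimental "re-encode on passthrough" patch
        uri = reescape(uri)
    return servlet_url_param(rewrite(uri))


if __name__ == "__main__":
    # Unpatched: "%ba" is unescaped a second time and turns into an invalid byte.
    print(repr(simulate("/some/stuff/foo%25bar/")))
    # Patched: the '%' survives, so this case looks fixed.
    print(repr(simulate("/some/stuff/foo%25bar/", patched=True)))
    # Patched or not: '&' is never re-escaped, so it splits the query string.
    print(repr(simulate("/some/stuff/foo%26bar/", patched=True)))
```

In this model the patched run gives "/new/stuff/foo%bar/" for the first URL, but the second URL still ends up as "/new/stuff/foo" either way, which is the (b) problem.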
