There are at least two bug reports about mod_rewrite unescaping URLs and
then passing the unescaped backreferences into the rewrite target (bugs
34602 and 32328 deal with this in httpd-2, and I recall another bug report
for 1.3).
As far as I have found out, the real problem is that httpd unescapes the
URLs before passing them to the mapper modules.
E.g. the rule
RewriteRule ^/(.*)$ /index.php?title=$1 [L]
rewrites URLs like /Foo to /index.php?title=Foo, but as soon as the URL
contains an escaped special character, such as an escaped hash, plus or
ampersand, the result is not as intended: /Foo%2BBar (/Foo+Bar urlencoded)
is rewritten to /index.php?title=Foo+Bar, which the script decodes as
"Foo Bar" (instead of /index.php?title=Foo%2BBar), and even worse,
/Foo%23Bar (/Foo#Bar urlencoded) is rewritten to /index.php?title=Foo#Bar
(instead of /index.php?title=Foo%23Bar), so that everything after the hash
is ignored.
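To make the failure mode concrete, here is a minimal Python sketch of the
behaviour described above. It is only a model, not httpd code: the function
names are made up for illustration, and I am assuming that the script
parses the query string the usual way ('+' means space, '#' starts a
fragment).

```python
import re
from urllib.parse import unquote, urlsplit, parse_qs

def rewrite_unescaped(raw_path):
    """Hypothetical model: decode the path first, then substitute $1 as-is."""
    decoded = unquote(raw_path)          # the path is unescaped before mapping
    m = re.match(r"^/(.*)$", decoded)    # pattern from the RewriteRule above
    return "/index.php?title=" + m.group(1)

def title_seen_by_script(target):
    """Assumption: the script decodes the query like a typical CGI/PHP layer."""
    query = urlsplit(target).query       # anything after '#' is a fragment
    return parse_qs(query).get("title", [""])[0]

for path in ("/Foo", "/Foo%2BBar", "/Foo%23Bar"):
    target = rewrite_unescaped(path)
    print(path, "->", target, "->", repr(title_seen_by_script(target)))
```

In this model /Foo%2BBar ends up as the title "Foo Bar" and /Foo%23Bar as
just "Foo", which matches the two symptoms above.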
I know that there are workarounds for this problem, such as using the
untouched %{THE_REQUEST} variable in a rewrite condition or inspecting the
original request in the script that gets executed (as Wikimedia does), but
these are suboptimal.
I have written a patch that tries to address this problem. To remain
backwards-compatible I did not change the original rewrite behaviour but
instead added a new flag to indicate that backreferences should be
escaped.
Adding the flag [B] or [backrefescaping] to a RewriteRule makes
mod_rewrite escape the backreferences in the rewrite target, e.g.
RewriteRule ^/(.*)$ /index.php?title=$1 [L,B]
forces the backreferenced parts to be re-encoded when the rewrite target
is constructed.
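The intended effect of the flag can be sketched in Python as follows. Again
this is only an illustration and not the patch itself: the function names
are hypothetical, and percent-encoding every reserved character via
quote(..., safe="") is my assumption here; the exact set of characters the
patch escapes may differ.

```python
import re
from urllib.parse import quote, unquote, urlsplit, parse_qs

def rewrite_escaped(raw_path):
    """Hypothetical model of [B]: re-encode $1 before substituting it."""
    decoded = unquote(raw_path)               # path is still unescaped first
    m = re.match(r"^/(.*)$", decoded)         # pattern from the RewriteRule
    backref = quote(m.group(1), safe="")      # assumption: escape everything reserved
    return "/index.php?title=" + backref

def title_seen_by_script(target):
    """Assumption: the script decodes the query like a typical CGI/PHP layer."""
    return parse_qs(urlsplit(target).query).get("title", [""])[0]

for path in ("/Foo%2BBar", "/Foo%23Bar"):
    target = rewrite_escaped(path)
    print(path, "->", target, "->", repr(title_seen_by_script(target)))
```

With the re-encoding step the script receives "Foo+Bar" and "Foo#Bar"
respectively, i.e. the titles that were originally requested.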
The patch can be found here:
http://issues.apache.org/bugzilla/attachment.cgi?id=20217
Note that it is against 2.2.4 because I couldn't get the SVN version to
work.
Here is the patch for the doc (against SVN HEAD):
--- httpd/docs/manual/mod/mod_rewrite.xml.orig 2007-05-18 19:28:17.796875000 +0200
+++ httpd/docs/manual/mod/mod_rewrite.xml 2007-05-18 19:18:25.078125000 +0200
@@ -1176,6 +1176,19 @@
following flags: </p>
<ul>
+ <li>'<strong><code>backrefescaping|B</code></strong>'
+ Escapes the backreferences in the substitution string for
+ use as query string arguments.
+<example>
+RewriteRule ^(.*)$ index.php?show=$1 [B,L]
+</example>
+ If you do not use this flag, escaping of the URL will be done
+ before the backreference is placed. This will not work if the initial
+ URL contains any special characters that need escaping.
+ In the given example, loading the URL http://example.com/C++ would
+ do an internal redirect to index.php?show=C%2B%2B instead of
+ index.php?show=C++ (which would possibly not give the result intended).
+ </li>
<li>'<strong><code>chain|C</code></strong>'
(<strong>c</strong>hained with next rule)<br />
This flag chains the current rule with the next rule
--
Günther