Yes. I have run into this before. Mongrel will error on an invalid HTTP
URI, with one common case being characters not properly escaped, which
is what your example is. When one of the developers of my app brought
this up before, he was told by the Mongrel developer that this was
intentional, and would not be changed.
I didn't like this then, and I don't like it now, for a variety of
reasons, including that my app needs to respond to URLs sent by third
parties that are not under my control. Perhaps the current Mongrel
developers (IS there even any active development on Mongrel?) have a
different opinion, and this could be changed or made configurable.
In the meantime, I have gotten around it with some mod_rewrite rules in
Apache in front of Mongrel, which take illegal URLs and escape/rewrite
them to be legal. Except, due to some weird behavior (bugs?) in Apache
and mod_rewrite around escaping, and the difficulty of controlling
escaping in the Apache conf, I actually had to use an external Perl
script too. Here's what I do:
Apache conf, applying to Mongrel URLs (which in my setup are all URLs on
a given Apache virtual host):
RewriteEngine on
RewriteMap query_escape prg:/data/web/findit/Umlaut/distribution/script/rewrite_map.pl
#RewriteLock /var/lock/subsys/apache.rewrite.lock
RewriteCond %{query_string} ^(.*[\>\<].*)$
RewriteRule ^(.*)$ $1?${query_escape:%1} [R,L,NE]
The rewrite_map.pl file:
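#!/usr/bin/perl
use strict;
use warnings;
$| = 1; # Turn off buffering, so each reply reaches Apache right away
while (<STDIN>) {
    s/>/%3E/g;   # '>'
    s/</%3C/g;   # '<'
    s/\//%2F/g;  # '/'
    s/\\/%5C/g;  # backslash
    s/ /\+/g;    # space becomes '+'
    print $_;
}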
##
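For anyone who would rather not use Perl, here is a rough Python sketch of the same kind of map program. It assumes the usual prg: map behavior: Apache starts the program once, writes one lookup key per line to its stdin, and waits for exactly one reply line on stdout, which is why the output is flushed after every line. The names escape_query and serve are made up for this illustration; the substitutions mirror the Perl script.

```python
import sys

# Same substitutions, in the same order, as the Perl script.
SUBSTITUTIONS = [
    (">", "%3E"),
    ("<", "%3C"),
    ("/", "%2F"),
    ("\\", "%5C"),
    (" ", "+"),
]


def escape_query(query):
    """Escape the characters the Perl script handles (hypothetical helper)."""
    for raw, escaped in SUBSTITUTIONS:
        query = query.replace(raw, escaped)
    return query


def serve(instream, outstream):
    """Answer one lookup per input line, flushing after each reply."""
    for line in instream:
        outstream.write(escape_query(line.rstrip("\n")) + "\n")
        outstream.flush()  # the caller blocks until it can read the reply


# Used as a map program, the script would end with:
#     serve(sys.stdin, sys.stdout)
```

Keeping the escaping in its own function makes it easy to check from a Python prompt before pointing RewriteMap at the script.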
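Looks like I'm not actually escaping bare '%' chars, since I hadn't run
into those before in the URLs I need to handle. It would be trickier to
add a regexp for that, since you need to distinguish an improper % from
a % that actually starts a percent-encoded escape (a % followed by two
hex digits). Maybe something like:
s/%(?![0-9A-Fa-f]{2})/%25/g;
The negative lookahead only checks the two characters after the %, it
does not consume or replace them, so a % is rewritten to %25 only when
it is not followed by two hex digits.
'/%25' would be a valid URI path representing the % char. '/%' is not.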
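To make the stray-% idea concrete, here is a small Python sketch, untested against real traffic. It treats any % that is not followed by two hex digits as improper and rewrites it to %25, using a negative lookahead so the following characters are left alone. The name escape_stray_percent is hypothetical.

```python
import re

# A '%' that is NOT followed by two hex digits is treated as a stray percent.
STRAY_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def escape_stray_percent(url_part):
    """Rewrite stray '%' characters to '%25', leaving valid escapes alone."""
    return STRAY_PERCENT.sub("%25", url_part)


print(escape_stray_percent("/%"))        # /%25
print(escape_stray_percent("/%3Cb"))     # /%3Cb  (already a valid escape)
print(escape_stray_percent("/50%off"))   # /50%25off
```

One limitation: a literal % that happens to be followed by two hex digits, as in '/100%beef', cannot be told apart from a real escape and is left as it is.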
Hope this helps,
Jonathan
Robbie Allen wrote:
If you append an extra percent sign to a URL that gets passed to
mongrel, it will return a Bad Request error. Kind of odd that
"http://localhost/%" causes a "Bad Request" instead of a "Not Found"
error.
Here is the error from the mongrel log:
HTTP parse error, malformed request (127.0.0.1):
#<Mongrel::HttpParserError: Invalid HTTP format, parsing fails.>
I'm using Nginx in front of Mongrel. I understand this is a bad URL,
but is there any way to have Mongrel ignore lone percent signs? Or
perhaps an Nginx rewrite rule that will encode extraneous percent signs?
--
Jonathan Rochkind
Digital Services Software Engineer
The Sheridan Libraries
Johns Hopkins University
410.516.8886
rochkind (at) jhu.edu
_______________________________________________
Mongrel-users mailing list
Mongrel-users@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-users