"Sean M. Burke" <[EMAIL PROTECTED]> writes:

> 5) URL-encoding.  There's no URLencoding here, but I've seen
> plenty of it before.  It's pretty much resolvable with URI.pm:
> 
> use URI;
> sub obf { local $_ = $_[0]; s/([^\/])/sprintf '%%%02x', ord $1/eg; $_ }
> my $obf = 'http://' . obf('www.perl.com/pub/a/2001/08/27/bjornstad.html');
> 
> print "obf: $obf\n";
> my $x = URI->new($obf);
> print "normalized: ", $x, "\n";
> print "canonical: ", $x->canonical, "\n";
> 
> Output (wrapped for readability):
> 
> obf: http://%77%77%77%2e%70%65%72%6c%2e%63%6f%6d/%70%75%62/%61/%32
>   %30%30%31/%30%38/%32%37/%62%6a%6f%72%6e%73%74%61%64%2e%68%74%6d%6c
> normalized: http://%77%77%77%2e%70%65%72%6c%2e%63%6f%6d/%70%75%62/%61/%32
>   %30%30%31/%30%38/%32%37/%62%6a%6f%72%6e%73%74%61%64%2e%68%74%6d%6c
> canonical: http://www.perl.com/pub/a/2%30%301/%308/27/bjornstad.html
> 
> That's with URI.pm 1.11.  Hm, odd that "%32%30%30%31" canonizes as
> "2%30%301", not "2001".  Gisle?

Just another case where I'm missing the mythical ?? operator :-)
The difference between the two middle digits and the others is that
they are false: "%30" maps to "0", so the || falls through to $1 and
the escape is left as it was.
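
A minimal sketch of the difference, assuming a cut-down
%unreserved_escape table that holds only the digits (the real table in
URI/http.pm covers more characters; the variable names $path, $with_or
and $with_exists are just for illustration):

```perl
use strict;
use warnings;

# Hypothetical cut-down table: only the escapes %30..%39 for "0".."9".
my %unreserved_escape;
$unreserved_escape{sprintf "%%%02X", ord($_)} = $_ for 0 .. 9;

my $path = "%32%30%30%31";

# Old form: the looked-up value "0" is false, so || falls back to $1.
(my $with_or = $path) =~
    s/(%[0-9A-F]{2})/$unreserved_escape{$1} || $1/ge;

# Patched form: test whether the key exists, not whether the value is true.
(my $with_exists = $path) =~
    s/(%[0-9A-F]{2})/exists $unreserved_escape{$1} ?
                            $unreserved_escape{$1} : $1/ge;

print "||:     $with_or\n";       # 2%30%301
print "exists: $with_exists\n";   # 2001
```

On Perls that have the defined-or operator (5.10 and later),
$unreserved_escape{$1} // $1 should behave the same as the exists form
here, since a missing key yields undef rather than a false string.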

This patch fixes the problem:

Index: URI/http.pm
===================================================================
RCS file: /cvsroot/libwww-perl/uri/URI/http.pm,v
retrieving revision 1.3
diff -u -p -u -r1.3 http.pm
--- URI/http.pm 1998/09/11 09:54:04     1.3
+++ URI/http.pm 2001/09/01 02:16:35
@@ -25,7 +25,8 @@ sub canonical
                $unreserved_escape{sprintf "%%%02X", ord($_)} = $_;
            }
        }
-       $$other =~ s/(%[0-9A-F]{2})/$unreserved_escape{$1} || $1/ge;
+       $$other =~ s/(%[0-9A-F]{2})/exists $unreserved_escape{$1} ?
+                                          $unreserved_escape{$1} : $1/ge;
        $other->path("/") if $slash_path;
     }
     $other;

Regards,
Gisle
