On 20 Aug 2001, Gisle Aas wrote:
> 
> As it seems unlikely that any _working_ code could be using this
> function it might be ok to simply change the default.  But this has

Well, they could be encoding ISINDEX queries with it, I suppose.  Not
escaping reserved chars is ok *if* the receiver isn't treating any of
them as metachars.  But switching to behavior 2 below would not break
such usage.

> been this way for such a long time (more than 6 years) so I still
> hesitate a bit.

There is the point that there would then be both correct and broken
versions floating around.  But I think that would still be better than
having only broken ones.

> I see 2 ways of changing the default:
> 
>    1) Remove % from the current set.  (URI.pm already considers
>       % to be part of URIC, although this is a bit internal).
>    2) Go with your suggestion: [^A-Za-z0-9\-_.!~*'()]
> 
> It looks like 1) is more likely to not break code, but perhaps 2) is a
> more useful default.

Actually, I'd expect 2 to be safer, too.  Any current code that doesn't
want "%" to be escaped must be using a non-default set already, so we
are left with the ones that do want it escaped, and thus are presumably
performing the initial escaping of data before combining it into a URI.
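
To make the "initial escaping of data" case concrete, here is a minimal
sketch of what set 2 does to a raw value.  The escape_with() helper is
hypothetical and written out by hand for illustration; it is not the
URI::Escape implementation, and it assumes plain ASCII input.

```perl
use strict;
use warnings;

# Hypothetical helper (not the URI::Escape code): percent-encode every
# character of $str that matches the character class $unsafe.
# Assumes ASCII/byte input, so ord() always fits in two hex digits.
sub escape_with {
    my ($str, $unsafe) = @_;
    $str =~ s/([$unsafe])/sprintf("%%%02X", ord($1))/ge;
    return $str;
}

# Set 2 from the proposal: escape everything except unreserved characters.
my $set2 = q{^A-Za-z0-9\-_.!~*'()};

# A raw value that has not yet been combined into a URI.
my $value   = "50% off & more";
my $escaped = escape_with($value, $set2);

print "$escaped\n";    # 50%25%20off%20%26%20more
```

The literal "%" and "&" in the data both come out escaped, which is
what you want for a value that will later be placed in a query string.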

For those applications, 2 is the correct set.  The only problematic
cases would be applications that erroneously do something like:

  $uri = "$script?".uri_escape(join "&", map "$_=$q{$_}", keys %q);

But those programs are already buggy, and in some sense I'd say it'd be
better to flush out these bugs rather than let them stay hidden.
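
As a rough sketch of how such a bug would surface under set 2, compare
escaping the joined string with escaping each name and value separately.
The escape_with() helper, the %q contents and the script path are all
made up for illustration; the helper is not the URI::Escape code.

```perl
use strict;
use warnings;

# Hypothetical helper (not the URI::Escape code): percent-encode every
# character of $str that matches the character class $unsafe.
sub escape_with {
    my ($str, $unsafe) = @_;
    $str =~ s/([$unsafe])/sprintf("%%%02X", ord($1))/ge;
    return $str;
}

my $set2   = q{^A-Za-z0-9\-_.!~*'()};    # set 2 from the proposal
my $script = "/cgi-bin/search";          # made-up script path
my %q      = (name => "Tom & Jerry", q => "a=b");

# Buggy pattern: the whole query is escaped after joining, so under
# set 2 the "=" and "&" separators get escaped along with the data.
my $buggy = "$script?"
          . escape_with(join("&", map "$_=$q{$_}", sort keys %q), $set2);

# Escaping each name and value first keeps the separators intact.
my @pairs;
for my $key (sort keys %q) {
    push @pairs, escape_with($key, $set2) . "=" . escape_with($q{$key}, $set2);
}
my $fixed = "$script?" . join("&", @pairs);

print "$buggy\n";    # /cgi-bin/search?name%3DTom%20%26%20Jerry%26q%3Da%3Db
print "$fixed\n";    # /cgi-bin/search?name=Tom%20%26%20Jerry&q=a%3Db
```

With set 2 the buggy form collapses into a single garbled parameter,
so the mistake becomes visible instead of silently passing through.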

-- 
Ilmari Karonen - http://www.sci.fi/~iltzu/
"... programs that work in spite of themselves are not what should be
guiding language design."  -- Sean M. Burke on the perl5-porters list

