On 20 Aug 2001, Gisle Aas wrote:
>
> As it seems unlikely that any _working_ code could be using this
> function it might be ok to simply change the default. But this has
Well, they could be encoding ISINDEX queries with it, I suppose. Not
escaping reserved chars is ok *if* the receiver isn't treating any of
them as metachars. But switching to behavior 2 below would not break
such usage.
> been this way for such a long time (more than 6 years) so I still
> hesitate a bit.
There is the point that there would then be both correct and broken
versions floating around. But I think that would still be better than
having only broken ones.
> I see 2 ways of changing the default:
>
> 1) Remove % from the current set. (URI.pm already considers
> % to be part of URIC, although this is a bit internal).
> 2) Go with your suggestion: [^A-Za-z0-9\-_.!~*'()]
>
> It looks like 1) is more likely to not break code, but perhaps 2) is a
> more useful default.
Actually, I'd expect 2 to be safer, too. Any current code that doesn't
want "%" to be escaped must be using a non-default set already, so we
are left with the ones that do want it escaped, and thus are presumably
performing the initial escaping of data before combining it into a URI.
For those applications, 2 is the correct set. The only problematic
cases would be applications that erroneously do something like:

    $uri = "$script?".uri_escape(join "&", map "$_=$q{$_}", keys %q);

But those programs are already buggy, and in some sense I'd say it'd be
better to flush out these bugs rather than let them stay hidden.
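
To make the difference concrete, here is a rough sketch of what such a program does versus what it presumably meant to do. The helper escape_unreserved() is only a hypothetical stand-in for uri_escape() with set 2 as the default, and the script name and form data are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical stand-in for uri_escape() with option 2 as its default set:
# escape every character except the unreserved ones.
sub escape_unreserved {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.!~*'()])/sprintf("%%%02X", ord($1))/ge;
    return $text;
}

my $script = "search.cgi";                    # made-up script name
my %q = (name => "R&D", note => "50% off");   # made-up form data

# Buggy: escaping the already-joined string also escapes the "=" and "&"
# separators, so the receiver can no longer split the query into pairs.
my $buggy = "$script?" . escape_unreserved(
    join "&", map "$_=$q{$_}", sort keys %q
);

# Presumably intended: escape each key and value on its own, and only then
# combine them with the separators.
my $fixed = "$script?" . join "&",
    map { escape_unreserved($_) . "=" . escape_unreserved($q{$_}) }
    sort keys %q;

print "$buggy\n";   # search.cgi?name%3DR%26D%26note%3D50%25%20off
print "$fixed\n";   # search.cgi?name=R%26D&note=50%25%20off
```

With the current default the first form happens to leave "=" and "&" alone, which is exactly why the bug stays hidden until a value contains one of those characters.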
--
Ilmari Karonen - http://www.sci.fi/~iltzu/
"... programs that work in spite of themselves are not what should be
guiding language design." -- Sean M. Burke on the perl5-porters list