On 04.05.2010 at 11:09, Gisle Aas wrote:

> I regret that I let \C sneak into the URI module.

I think I may have understood why one might consider \C a bad idea in that method, and maybe in general.

The fact that character strings in Perl are encoded in UTF-8 internally is an implementation detail, and you shouldn't have to care about it or make any assumptions about it. But by using \C to derive an encoded version (a byte string) from a character string, and maybe even taking it for granted that you'll get a UTF-8 byte string, you tie your interface to that implementation detail. The behaviour of your code would change as soon as Perl moved on to, say, UTF-16 as its internal encoding. (Which is highly unlikely, but that's another story.)

Is it this (theoretically fragile) implicitness in handling character strings that makes \C a bad idea?

But it is probably not as bad an idea as relying on the default platform encoding in Java ("default charset" in Java API doc lingo), which may differ from country to country and from installation to installation.

http://java.sun.com/javase/6/docs/api/java/lang/String.html#String%28byte[]%29

--
Michael.Ludwig (#) XING.com
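
P.S. To make the Java comparison concrete, here is a minimal sketch. The class name CharsetDemo and its two helper methods are made up for illustration; only String.getBytes(Charset), the String(byte[]) and String(byte[], Charset) constructors, and Charset.forName / Charset.defaultCharset are real API. It contrasts decoding with the implicit default charset against decoding with an explicitly named one.

```java
import java.nio.charset.Charset;

public class CharsetDemo {
    // Naming the charset explicitly gives the same result on every installation.
    static final Charset UTF8 = Charset.forName("UTF-8");

    // Hypothetical helper: character string -> UTF-8 byte string.
    public static byte[] encodeUtf8(String s) {
        return s.getBytes(UTF8);
    }

    // Hypothetical helper: UTF-8 byte string -> character string.
    public static String decodeUtf8(byte[] bytes) {
        return new String(bytes, UTF8);
    }

    public static void main(String[] args) {
        // G, r, u-umlaut, sharp s, e; written with escapes to keep the source ASCII.
        String text = "Gr\u00fc\u00dfe";
        byte[] utf8 = encodeUtf8(text);

        // The two non-ASCII characters take two bytes each in UTF-8.
        System.out.println("UTF-8 byte count: " + utf8.length);

        // Implicit: String(byte[]) decodes with the platform default charset,
        // so whether this reproduces the original depends on the installation.
        String viaDefault = new String(utf8);
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("round trip via default ok: " + text.equals(viaDefault));

        // Explicit: decoding with the charset used for encoding round-trips.
        System.out.println("round trip via UTF-8 ok: " + text.equals(decodeUtf8(utf8)));
    }
}
```

On an installation whose default charset is not UTF-8, I would expect the "via default" line to report false while the explicit one still reports true. The explicit variant is the analogue of encoding a Perl character string with a named encoding instead of peeking at its internal bytes.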