Re: BOM and principle of least surprise

Nick Ing-Simmons Tue, 11 May 2004 00:53:25 -0700

Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>Nick Ing-Simmons wrote:
>
>> Larry Wall <[EMAIL PROTECTED]> writes:
>> 
>>>Right now, the meaning of "text" is subject to severe distortions
>>>due to legacy issues.  But in the long run, "text" is going to mean
>>>Unicode, and that probably means a UTF-8 file encoding at least in
>>>the western world, 
>> 
>> 
>> Microsoft seem to be somewhat focused on some 16-bit form.
>> 
>> This thread started as a complaint that perl5 can't read a
>> script saved as UCS-2/UTF-16 or whatever Windows uses.
>
>Uh, really?  Perl 5.8+ should be able to do that, automatically.



On 18th March, Erland Sommarskog <[EMAIL PROTECTED]> wrote:
>
>Using a thing like utf8 to determine the encoding of character literals
>is not a good idea. Suddenly someone saves the file in a different 
>encoding, and guess what happens. And as long as Perl does not act
>on byte-order marks, how would it be able to read a script that has
>been saved in UTF16-LE, which is the normal way of saving Unicode data
>on Windows?

I haven't tried this myself...

>
>I thought the issue was about Perl not automatically guessing the
>UTF-16 encoding of input data.

That is a related but separate issue.
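
For the data case, acting on a byte-order mark is not much code. Here is a
minimal sketch in Python (not Perl, and not a description of what perl does
internally); decode_with_bom and its utf-8 default are names and choices
assumed for illustration only:

```python
import codecs

# Hypothetical helper for illustration; not part of Perl or any library.
# Order matters: the UTF-32 LE BOM starts with the same two bytes as the
# UTF-16 LE BOM, so the longer marks are tested first.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_with_bom(data, default="utf-8"):
    """Return (text, encoding), choosing the encoding from a leading BOM.

    If no BOM is present, fall back to `default` (an assumption; a real
    tool would take this from the locale or from the user).
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            # Strip the BOM so it does not show up as U+FEFF in the text.
            return data[len(bom):].decode(name), name
    return data.decode(default), default
```

A BOM only says something when it is there. Input without one still has to be
guessed or declared, and UTF-16LE text whose first character is U+0000 would
be mistaken for UTF-32LE by this check.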
