Re: perlunitut - feedback appreciated

H. Peter Anvin Mon, 12 Nov 2001 10:18:55 -0800

Followup to:  <[EMAIL PROTECTED]>
By author:    Jarkko Hietaniemi <[EMAIL PROTECTED]>
In newsgroup: linux.utf8
>
> On Sun, Nov 11, 2001 at 12:57:27PM -0800, Edward Cherlin wrote:
> > Thanks. The Perl implementors and you have done a very good job. I have a
> > few suggestions and one complaint.
> > 
> > The most important issue is chr().
> > 
> > >Note that C<chr(...)> for arguments less than 0x100 (decimal 256) will
> > >return an eight-bit character for backward compatibility with older
> > >Perls (in ISO 8859-1 platforms it can be argued to be producing
> > >Unicode even then, just not Unicode encoded in UTF-8 -- the ISO 8859-1
> > >is equivalent to the first 256 characters of Unicode).  For C<chr()>
> > >arguments of 0x100 or more, Unicode will always be produced.
> > 
> > My complaint: There should be a pure Unicode alternative to this kludge.
> > Obviously, it is not hard to write one in Perl, but it should be part of the
> > implementation.
> 
> Note that for most of the time, the difference whether chr() generates
> ISO 8859-1 or UTF-8 encoded Unicode for the range 0x80..0xff shouldn't
> matter, since the upgrading of the 8-bit to UTF-8 is automatic.
>


*UNICODE* or *UTF-8*?

If what chr() returns is a Unicode string, for which UTF-8 is merely
one possible encoding, that is one thing.  If:

print FILE chr(0xc0);

... prints a naked 0xc0 byte (instead of 0xc3 0x80) but

print FILE chr(0x1c0);

... prints 0xc7 0x80, then that is a serious case of braindamage.
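
For reference, the byte sequences quoted above can be checked with a
short sketch.  It assumes a Perl that ships the core Encode module
(5.8 or later), and utf8_hex is a hypothetical helper name used only
for illustration.  It shows the UTF-8 encoding of the two code points
and nothing more; it says nothing about what print on a given
filehandle actually writes.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: return the UTF-8 bytes of one code point as hex.
sub utf8_hex {
    my ($codepoint) = @_;
    # encode() turns the one-character string into a UTF-8 byte string.
    my $bytes = encode('UTF-8', chr($codepoint));
    return join ' ', map { sprintf '0x%02x', ord($_) } split //, $bytes;
}

# U+00C0 and U+01C0 are the two code points from the message.
printf "U+%04X -> %s\n", $_, utf8_hex($_) for 0xc0, 0x1c0;
```

If both code points were written as UTF-8, this is the output one
would expect to see in the file: 0xc3 0x80 for U+00C0 and 0xc7 0x80
for U+01C0.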

        -hpa



-- 
<[EMAIL PROTECTED]> at work, <[EMAIL PROTECTED]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt    <[EMAIL PROTECTED]>
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
