Re: perl unicode support

Rich Felker Mon, 26 Mar 2007 14:06:36 -0800

On Mon, Mar 26, 2007 at 05:28:43PM -0400, ＳｒｉｎＴｕａｒ wrote:
> I frequently run into problems with utf-8 in perl, and I was wondering
> if anyone else
> had encountered similar things.
> 
> One thing I've noticed is that when processing characters, I often get
> warnings about
> "Wide character in print", or have input/output get horribly mangled.
> 
> I've been trying to work around it in various ways, commonly doing things
> such as:
> binmode STDIN,":utf8";
> binmode STDOUT,":utf8";
> 
> or using functions such as:
> sub unfunge_string
> {
>    foreach my $ref (@_)
>    {
>        $$ref = Encode::decode("utf8",$$ref,Encode::FB_CROAK);
>    }
> }
> 
> 
> but this feels wrong to me.
> 
> For a language that really goes out of its way to support encodings, I
> wonder if it
> wouldn't have been better off if it just ignored the entire concept
> altogether and treated
> strings as arrays of bytes...
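
To make the byte/character distinction behind that warning concrete, here is a
minimal sketch. It assumes the stock Encode module is available; the sample
string is made up for illustration. It uses the same decode call as the quoted
function, with LEAVE_SRC added so the input is not consumed.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Raw UTF-8 bytes for U+00E9 followed by U+20AC, as they would arrive from a file or pipe.
my $bytes = "\xC3\xA9\xE2\x82\xAC";

# decode() turns bytes into a character string. FB_CROAK dies on malformed
# input; LEAVE_SRC keeps decode() from modifying $bytes in place.
my $chars = decode('UTF-8', $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);

# length() counts bytes before decoding and characters after.
my $byte_len = length $bytes;
my $char_len = length $chars;
print "bytes: $byte_len, characters: $char_len\n";

# Encoding back to bytes before output gives print nothing wide to complain about.
my $out = encode('UTF-8', $chars);
print $out eq $bytes ? "round trip ok\n" : "round trip failed\n";
```

Printing $chars directly, with no output layer and no explicit encode, is what
triggers "Wide character in print".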


Read the ancient linux-utf8 list archives and you should get a good
feel for Larry Wall's views on the matter.

> I've found pages wherein people complain of similar problems, such as:
> http://ahinea.com/en/tech/perl-unicode-struggle.html
> 
> And I'm wondering if in its attempt to be a good i18n citizen, perl
> hasn't gone overboard and made a mess of things instead.

I agree, but maybe there are workarounds. I have a system that's
completely UTF-8-only. I don't have or want support for any legacy
encodings except in a few isolated tools (certainly nothing related to
perl) for converting legacy data I receive from outside.

With that in mind, I built perl without PerlIO, wanting to take
advantage of my much smaller and faster stdio implementation. But now,
binmode doesn't work, so the only way I can get rid of the nasty
warning is by disabling it explicitly.
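
For reference, a minimal sketch of the two options that do not rely on binmode
layers. The helper name print_utf8 is hypothetical, and it is an assumption
(not verified here) that Encode's encode() behaves the same on a perl built
without PerlIO.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: encode character strings to UTF-8 bytes before printing,
# so print only ever sees bytes and has nothing to warn about.
sub print_utf8 {
    my ($fh, @strings) = @_;
    print {$fh} map { encode('UTF-8', $_) } @strings;
}

my $text = "caf\x{e9} \x{263A}\n";   # character string containing a wide character
print_utf8(\*STDOUT, $text);

# Alternative: silence the warning lexically and let perl write the string as is.
{
    no warnings 'utf8';
    print $text;
}
```

The second form only hides the warning. A string whose code points are all
below 256 may still be written as Latin-1 bytes rather than UTF-8, so the
explicit encode is the more predictable of the two.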

Is there any way to get perl to behave sanely in this regard? I don't
really use perl much (mainly for irssi) so if not, I guess I'll just
leave it how it is and hope nothing seriously breaks..

Rich

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
