Re: [Haskell-cafe] Re: Strings and utf-8

2007-11-30 Thread Johan Tibell
Am I wrong to think that UTF-8 should be THE standard? I believe it can encode anything encoded by other encodings. All the UTF-* encodings can encode the same code points. There are different trade-offs, though. Can't we consider non-UTF-8 text as legacy? I don't like that word, but I do
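
To illustrate the trade-offs mentioned above, here is a minimal Haskell sketch that counts how many bytes one code point occupies in UTF-8, UTF-16 and UTF-32. The function names (utf8Bytes, utf16Bytes, utf32Bytes, sizes) are made up for this example and are not part of any library; the sketch ignores byte-order marks and does not reject surrogate code points.

```haskell
import Data.Char (ord)

-- Bytes needed to store a single code point in each encoding form.
-- All three can represent every code point; only the size differs.
utf8Bytes, utf16Bytes, utf32Bytes :: Char -> Int
utf8Bytes c
  | n < 0x80    = 1   -- ASCII range
  | n < 0x800   = 2
  | n < 0x10000 = 3   -- rest of the Basic Multilingual Plane
  | otherwise   = 4   -- supplementary planes
  where n = ord c
utf16Bytes c = if ord c < 0x10000 then 2 else 4  -- one unit or a surrogate pair
utf32Bytes _ = 4                                 -- always one 32-bit unit

-- (UTF-8, UTF-16, UTF-32) sizes for one character.
sizes :: Char -> (Int, Int, Int)
sizes c = (utf8Bytes c, utf16Bytes c, utf32Bytes c)

main :: IO ()
main = mapM_ (print . sizes) "A\8364\119070"  -- 'A', U+20AC, U+1D11E
```

For these three characters the sizes come out as (1,2,4), (3,2,4) and (4,4,4): UTF-8 is the most compact for ASCII, UTF-16 is more compact for much of the Basic Multilingual Plane, and UTF-32 is fixed-width.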

Re: [Haskell-cafe] Re: Strings and utf-8

2007-11-29 Thread Duncan Coutts
On Wed, 2007-11-28 at 17:38 -0200, Maurício wrote: (...) When it's phrased as "truncates to 8 bits" it sounds so simple; surely all we need to do is not truncate to 8 bits, right? The problem is, what encoding should it pick? UTF-8, 16, 32, EBCDIC? (...) One sensible suggestion

Re: [Haskell-cafe] Re: Strings and utf-8

2007-11-29 Thread Jules Bean
Duncan Coutts wrote: On Wed, 2007-11-28 at 17:38 -0200, Maurício wrote: (...) When it's phrased as "truncates to 8 bits" it sounds so simple; surely all we need to do is not truncate to 8 bits, right? The problem is, what encoding should it pick? UTF-8, 16, 32, EBCDIC? (...) One

Re: [Haskell-cafe] Re: Strings and utf-8

2007-11-29 Thread Duncan Coutts
On Thu, 2007-11-29 at 13:05 +, Jules Bean wrote: The language of messages is quite different from the language of a file you read. Suppose I am English, and I have a Russian friend, Vlad. My default locale is, say, Latin-1, and his is something Cyrillic. I might well open files including my

Re: [Haskell-cafe] Re: Strings and utf-8

2007-11-29 Thread Thomas Hartman
by: [EMAIL PROTECTED] 11/29/2007 07:44 AM To Maurício [EMAIL PROTECTED] cc haskell-cafe@haskell.org Subject Re: [Haskell-cafe] Re: Strings and utf-8 On Wed, 2007-11-28 at 17:38 -0200, Maurício wrote: (...) When it's phrased as "truncates to 8 bits" it sounds so simple; surely all we need

Re: [Haskell-cafe] Re: Strings and utf-8

2007-11-29 Thread Reinier Lamers
Thomas Hartman wrote: A translation of http://www.ahinea.com/en/tech/perl-unicode-struggle.html from Perl to Haskell would be a very useful piece of documentation, I think. Perl encodes both Unicode and binary data as the same (dynamic) data type. Haskell - at least in theory - has two
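
The difference between Unicode text and binary data, and the "truncates to 8 bits" behaviour discussed earlier in the thread, can be sketched in a few lines of Haskell. This is only an illustration under stated assumptions: it uses String for text and [Word8] for bytes, and the names encodeChar, encodeUtf8 and truncate8 are invented here rather than taken from any library discussed in the thread. The encoder is hand-rolled from the UTF-8 bit layout and does not reject surrogate code points.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Encode one Unicode code point as its UTF-8 byte sequence.
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6,  cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6,  cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    hi s   = fromIntegral (n `shiftR` s)                    -- payload bits of the lead byte
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F) -- continuation byte 10xxxxxx

-- Text in, bytes out: the encoding is an explicit step.
encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap encodeChar

-- What "truncating to 8 bits" does: keep only the low byte of each code point.
truncate8 :: String -> [Word8]
truncate8 = map (fromIntegral . ord)

main :: IO ()
main = do
  print (encodeUtf8 "\233")  -- U+00E9, e with acute accent
  print (encodeUtf8 "\955")  -- U+03BB, Greek small lambda
  print (truncate8  "\955")  -- information is lost here
```

The first two lines print [195,169] and [206,187]; the last prints [187], which is no longer enough to recover the original character. Keeping text and bytes as separate types makes the choice of encoding visible at the point where one is converted to the other.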