Am I wrong to think that UTF8 should be THE
standard? I believe it can encode anything
encoded by other encodings.
All the UTF-* encodings can encode the same code points. There are
different trade-offs, though.
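To make the trade-offs concrete, here is a small sketch (in Python, purely for illustration; the sample string and the helper name `encoded_sizes` are my own, not from the thread). It encodes one string with three UTF encodings: each round-trips to the same code points, but the byte counts differ.

```python
# Sample text is an arbitrary choice: ASCII, Cyrillic, and the euro sign.
sample = "Vlad Влад €"


def encoded_sizes(text):
    """Return {encoding name: byte count}; each encoding must round-trip the text."""
    sizes = {}
    # The -le variants are used so that no byte-order mark is added to the count.
    for name in ("utf-8", "utf-16-le", "utf-32-le"):
        data = text.encode(name)
        # Every UTF encoding represents exactly the same code points.
        assert data.decode(name) == text
        sizes[name] = len(data)
    return sizes


sizes = encoded_sizes(sample)
for name, size in sizes.items():
    print(f"{name}: {size} bytes for {len(sample)} code points")
```

Roughly speaking, UTF-8 is compact for ASCII-heavy text and variable-width elsewhere, while UTF-32 spends four bytes on every code point in exchange for a fixed width.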
Can't we consider non-utf8 text as legacy?
I don't like that word, but I do
Language of messages is quite different
from language of a file you read. (...)
Yes, it's a fundamental limitation of the
Unix locale system and multi-user
systems. However, it's no less wrong than
just picking UTF8 all the time. (...)
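A rough sketch of why both guesses can be wrong (Python, for illustration only; the KOI8-R file and the helper `try_decode` are hypothetical, not something from the thread). Bytes written under one encoding are read back under a Latin-1 locale guess, under a fixed UTF-8 guess, and under the real encoding.

```python
# Hypothetical scenario: a file written under a KOI8-R (Cyrillic) locale.
file_bytes = "Влад".encode("koi8-r")


def try_decode(data, encoding):
    """Decode data, returning None when the bytes are invalid for that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Latin-1 maps every byte to some character, so this succeeds with the wrong text.
as_latin1 = try_decode(file_bytes, "latin-1")
# A fixed UTF-8 guess fails outright: these bytes are not valid UTF-8.
as_utf8 = try_decode(file_bytes, "utf-8")
# Only the encoding the file was actually written in recovers the text.
as_koi8r = try_decode(file_bytes, "koi8-r")

print(repr(as_latin1), repr(as_utf8), repr(as_koi8r))
```

Neither the locale nor a hard-wired UTF-8 default knows what the file really contains; that information has to come from somewhere else.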
On Wed, 2007-11-28 at 17:38 -0200, Maurício wrote:
(...) When it's phrased as "truncates to 8
bits" it sounds so simple, surely all we need
to do is not truncate to 8 bits, right?
The problem is, what encoding should it pick?
UTF8, 16, 32, EBCDIC? (...)
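For illustration, a minimal sketch in Python, assuming "truncates to 8 bits" means keeping only the low 8 bits of each code point (the helper `truncate_to_8_bits` is a made-up name). It shows that truncation loses information, whereas a real encoding keeps it but has to be chosen.

```python
def truncate_to_8_bits(text):
    """Keep only the low 8 bits of every code point (a lossy mapping)."""
    return "".join(chr(ord(ch) & 0xFF) for ch in text)


original = "Влад"
truncated = truncate_to_8_bits(original)
print(repr(truncated))

# Distinct characters collide, so the original cannot be recovered.
assert truncate_to_8_bits("л") == truncate_to_8_bits(";")

# A real encoding keeps the information, at the price of having to pick one.
assert original.encode("utf-8").decode("utf-8") == original
```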
One sensible suggestion
Duncan Coutts wrote:
On Wed, 2007-11-28 at 17:38 -0200, Maurício wrote:
(...) When it's phrased as "truncates to 8
bits" it sounds so simple, surely all we need
to do is not truncate to 8 bits, right?
The problem is, what encoding should it pick?
UTF8, 16, 32, EBCDIC? (...)
One
On Thu, 2007-11-29 at 13:05 +, Jules Bean wrote:
Language of messages is quite different from language of a file you read.
Suppose I am English, and I have a Russian friend, Vlad.
My default locale is, say, Latin-1, and his is something Cyrillic.
I might well open files including my
by: [EMAIL PROTECTED], 11/29/2007 07:44 AM
To: Maurício [EMAIL PROTECTED]
cc: haskell-cafe@haskell.org
Subject: Re: [Haskell-cafe] Re: Strings and utf-8
On Wed, 2007-11-28 at 17:38 -0200, Maurício wrote:
(...) When it's phrased as "truncates to 8
bits it sounds so simple, surely all we need
Thomas Hartman wrote:
A translation of
http://www.ahinea.com/en/tech/perl-unicode-struggle.html
from Perl to Haskell would be a very useful piece of documentation, I
think.
Perl encodes both Unicode and binary data as the same (dynamic) data
type. Haskell - at least in theory - has two
(...) When it's phrased as "truncates to 8
bits" it sounds so simple, surely all we need
to do is not truncate to 8 bits, right?
The problem is, what encoding should it pick?
UTF8, 16, 32, EBCDIC? (...)
One sensible suggestion many people have made
is that H98 file IO should use the locale