Don Stewart <[EMAIL PROTECTED]> writes:

> You can use either bytestrings, which will ignore any encoding, 

Uh, I am hesitant to voice my protest here, but I think this bears
some elaboration:

Bytestrings are exactly that: strings of bytes.
There are basically two interfaces: one (Data.ByteString[.Lazy]),
which operates on raw bytes (and gives you Word8s), and another
(Data.ByteString[.Lazy].Char8), which treats the contents as Chars.
The latter only handles Unicode code points 0..255 (that is,
ISO-8859-1). Packing a Char with a higher code point silently
truncates it to its low 8 bits.
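
To make the truncation concrete, here is a minimal sketch using only the
bytestring package. The names `sample` and `packed` are mine, chosen for
illustration. The string holds 'a', U+03BB (code point 955) and U+20AC
(code point 8364), so two of its three characters lie above 255.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C
import Data.Char (ord)

-- 'a' (97), U+03BB (955) and U+20AC (8364); the last two do not fit in a byte.
sample :: String
sample = "a\955\8364"

-- The Char8 interface keeps only the low 8 bits of each code point.
packed :: B.ByteString
packed = C.pack sample

main :: IO ()
main = do
  -- Word8 view: 955 = 0x3BB becomes 0xBB, 8364 = 0x20AC becomes 0xAC.
  print (B.unpack packed)            -- [97,187,172]
  -- Char view of the same bytes: code points 0..255 only.
  print (map ord (C.unpack packed))  -- [97,187,172]
  -- The round trip does not give the original string back.
  print (C.unpack packed == sample)  -- False
```

Both modules share one ByteString type, so the Word8 and Char8 functions
can be mixed on the same value, as above.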

Basically, bytestrings are the wrong tool for the job if you need more
than 8 bits per character.  I think the predecessors of bytestring
(FPS?) had support for other fixed-size encodings, that is, two-byte
and four-byte characters. Perhaps writing a bytestring-like
Data.Word16String using UCS-2 would be an option?
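
As a rough sketch of what I mean, and nothing more: no such module exists
as far as I know, and the type `Word16String` and the functions `pack16`
and `unpack16` below are hypothetical names. It assumes an unboxed array
from the array package as storage, with one Word16 per character, and it
rejects anything outside the 16-bit range instead of truncating.

```haskell
import Data.Array.Unboxed (UArray, elems, listArray)
import Data.Char (chr, ord)
import Data.Word (Word16)

-- Hypothetical fixed-width string: one 16-bit code unit per character
-- (UCS-2, so only code points up to U+FFFF can be stored).
newtype Word16String = Word16String (UArray Int Word16)

-- Returns Nothing if any character needs more than 16 bits,
-- rather than silently truncating it.
pack16 :: String -> Maybe Word16String
pack16 s
  | all ((<= 0xFFFF) . ord) s =
      Just (Word16String (listArray (0, length s - 1) (map (fromIntegral . ord) s)))
  | otherwise = Nothing

-- Every stored unit is a valid code point, so chr cannot fail here.
unpack16 :: Word16String -> String
unpack16 (Word16String arr) = map (chr . fromIntegral) (elems arr)

main :: IO ()
main = do
  print (fmap unpack16 (pack16 "a\955\8364"))  -- fits in 16 bits per character
  print (fmap unpack16 (pack16 "\119070"))     -- U+1D11E does not fit
```

A real implementation would want the packed, strict and lazy
representations that bytestring has. This only shows the fixed-width idea.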

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants
_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe
