John Goerzen wrote:
> Char in Haskell represents a Unicode character. I don't know exactly
> what its size is, but it must be at least 16 bits and maybe more.
> String would then share those properties.
>
> However, usually I'm accustomed to dealing with data in 8-bit words.
> So I have some questions:
Char and String handling in Haskell is deeply broken. There's a discussion ongoing on this very list about fixing it (in the context of pathnames).
But for now, Haskell's Char behaves like C's char with respect to I/O. This is unlikely ever to change (in the existing I/O interface) because it would break too much code. So the answers to your questions are:
> * If I use hPutStr on a string, is it guaranteed that the number of
>   8-bit bytes written equals (length stringWritten)?
Yes, if the handle is opened in binary mode; otherwise no.
>   + If yes, what happens to the upper 8 bits? Are they simply
>     stripped off?
Yes.
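Both points are easy to check. Here is a minimal sketch (the file name "eightbit.bin" is just an example) showing that in a modern GHC, binary-mode hPutStr emits exactly one byte per Char and keeps only the low 8 bits:

```haskell
import System.IO
import Data.Char (ord)

main :: IO ()
main = do
  let s = "caf\xE9\x263A"          -- 5 Chars; '\x263A' doesn't fit in 8 bits
  h <- openBinaryFile "eightbit.bin" WriteMode
  hPutStr h s
  hClose h
  h' <- openBinaryFile "eightbit.bin" ReadMode
  bytes <- hGetContents h'
  print (length bytes)             -- 5: one byte per Char
  print (map ord bytes)            -- [99,97,102,233,58]: 0x263A cut down to 0x3A
  hClose h'
```

In GHC, binary mode behaves like the char8 encoding, which takes each code point modulo 256, so the upper bits simply vanish rather than raising an error.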
> * If I run hGetChar, is it possible that it would consume more than
>   one byte of input?
No in binary mode, yes in text mode.
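A sketch of both cases (the file name "e-acute.bin" is an assumption, and the explicit hSetEncoding is a modern-GHC way to pin down the text-mode behavior): the two bytes of the UTF-8 encoding of U+00E9 come back as two Chars in binary mode, but as a single Char once a text encoding is in effect:

```haskell
import System.IO
import Data.Char (chr, ord)

main :: IO ()
main = do
  h <- openBinaryFile "e-acute.bin" WriteMode
  hPutStr h (map chr [0xC3, 0xA9])     -- the UTF-8 bytes of U+00E9
  hClose h

  hb <- openBinaryFile "e-acute.bin" ReadMode
  b1 <- hGetChar hb                    -- exactly one byte each
  b2 <- hGetChar hb
  print (ord b1, ord b2)               -- (195,169)
  hClose hb

  ht <- openFile "e-acute.bin" ReadMode
  hSetEncoding ht utf8                 -- text mode with a known encoding
  c <- hGetChar ht                     -- consumes both bytes at once
  print (ord c)                        -- 233, i.e. U+00E9
  hClose ht
```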
> * Does Haskell treat the "this is a Unicode file" marker special in
>   any way?
No.
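In binary mode a UTF-8 byte order mark (EF BB BF) is just three more bytes of data. A minimal sketch (file name "bom.txt" is an assumption):

```haskell
import System.IO
import Data.Char (chr, ord)

main :: IO ()
main = do
  h <- openBinaryFile "bom.txt" WriteMode
  hPutStr h (map chr [0xEF, 0xBB, 0xBF] ++ "hi")   -- BOM, then payload
  hClose h
  h' <- openBinaryFile "bom.txt" ReadMode
  s <- hGetContents h'
  print (map ord s)   -- [239,187,191,104,105]: the BOM is plain data
  hClose h'
```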
> * Same questions on withCString and related String<->CString
>   conversions.
They all behave as if reading/writing a file in binary mode.
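That is, Foreign.C.String marshals a String byte-for-byte (Latin-1 style, via castCharToCChar), so each Char becomes one CChar with the upper bits dropped. A sketch, reading the marshalled buffer back as raw bytes:

```haskell
import Data.Word (Word8)
import Foreign.C.String (withCStringLen)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (castPtr)

main :: IO ()
main =
  withCStringLen "A\xE9\x263A" $ \(ptr, len) -> do
    bytes <- peekArray len (castPtr ptr) :: IO [Word8]
    print (len, bytes)   -- (3,[65,233,58]): 0x263A truncated to 0x3A
```

(GHC does ship locale-aware variants in GHC.Foreign that take an explicit TextEncoding, but the portable Foreign.C.String functions behave as described above.)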
-- Ben
_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe
