Re: Text in Haskell: a second proposal

2002-08-15 Thread Stefan Karrmann
Marcin 'Qrczak' Kowalczyk Sat, Aug 10, 2002 at 09:02:30AM +: Thu, 8 Aug 2002 23:40:42 -0700, Ashley Yakeley writes: 1. Octets. 2. C char. 3. Unicode code points. 4. Unicode code values, useful only for UTF-16, which is seldom used. 5. What handles handle. I

RE: Text in Haskell: a second proposal

2002-08-13 Thread Simon Marlow
At 2002-08-09 03:26, Simon Marlow wrote: Why combine I/O and {en,de}coding? Firstly, efficiency. Hmm... surely the encoding functions can be defined efficiently? decodeISO88591 :: [Word8] -> [Char]; encodeISO88591 :: [Char] -> [Word8]; -- uses low octet of codepoint You

RE: Text in Haskell: a second proposal

2002-08-13 Thread Ashley Yakeley
At 2002-08-13 04:13, Simon Marlow wrote: That depends what you mean by efficient: these functions represent an extra layer of intermediate list between the handle buffer and the final [Char], and furthermore they don't work with partial reads - the input has to be a lazy stream gotten from

RE: Text in Haskell: a second proposal

2002-08-10 Thread Ashley Yakeley
At 2002-08-09 03:26, Simon Marlow wrote: Why combine I/O and {en,de}coding? Firstly, efficiency. Hmm... surely the encoding functions can be defined efficiently? decodeISO88591 :: [Word8] -> [Char]; encodeISO88591 :: [Char] -> [Word8]; -- uses low octet of codepoint You could surely
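
The two signatures quoted above can be sketched as follows. This is only a minimal list-based illustration, assuming the "low octet of codepoint" behaviour mentioned in the quoted comment; it is not code from the thread, and it does nothing about the efficiency concern under discussion.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- | Decode ISO-8859-1: each octet maps to the code point with the same value.
decodeISO88591 :: [Word8] -> [Char]
decodeISO88591 = map (chr . fromIntegral)

-- | Encode as ISO-8859-1, keeping only the low octet of each code point.
-- Assumption: truncation (not an error) for code points above U+00FF,
-- following the "uses low octet of codepoint" comment.
encodeISO88591 :: [Char] -> [Word8]
encodeISO88591 = map (fromIntegral . ord)
```

Because the encoder truncates, decoding after encoding only round-trips for text that stays within U+0000..U+00FF.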

Re: Text in Haskell: a second proposal

2002-08-10 Thread Marcin 'Qrczak' Kowalczyk
Thu, 8 Aug 2002 23:40:42 -0700, Ashley Yakeley writes: 1. Octets. 2. C char. 3. Unicode code points. 4. Unicode code values, useful only for UTF-16, which is seldom used. 5. What handles handle. I disagree, they should be: 1. Word8 2. CChar 3. Char 4. Word16 5.
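
To make the difference between items 3 and 4 concrete (code points as Char, UTF-16 code values as Word16), here is a small sketch that maps one code point to its UTF-16 code units. The name toUTF16 is hypothetical and does not come from the thread; the surrogate-pair arithmetic is the standard UTF-16 scheme.

```haskell
import Data.Char (ord)
import Data.Word (Word16)

-- | UTF-16 code units for a single code point (hypothetical helper).
-- Assumes the Char is not itself in the surrogate range U+D800..U+DFFF.
toUTF16 :: Char -> [Word16]
toUTF16 c
  | n < 0x10000 = [fromIntegral n]               -- BMP: one code unit
  | otherwise   = [ fromIntegral (0xD800 + hi)   -- high (lead) surrogate
                  , fromIntegral (0xDC00 + lo) ] -- low (trail) surrogate
  where
    n        = ord c
    (hi, lo) = (n - 0x10000) `divMod` 0x400      -- split the 20-bit offset into 10 + 10 bits
```

A code point outside the Basic Multilingual Plane therefore occupies two Word16 values, which is why a Word16 is a code value rather than a character.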

Re: Text in Haskell: a second proposal

2002-08-10 Thread Wolfgang Jeltsch
On Friday, 2002-08-09, 08:40, CEST, Ashley Yakeley wrote: At 2002-08-08 23:10, Ken Shan wrote: 1. Octets. 2. C char. 3. Unicode code points. 4. Unicode code values, useful only for UTF-16, which is seldom used. 5. What handles handle. ... I suggest that the following Haskell types

Re: Text in Haskell: a second proposal

2002-08-09 Thread Ashley Yakeley
At 2002-08-08 23:10, Ken Shan wrote: 1. Octets. 2. C char. 3. Unicode code points. 4. Unicode code values, useful only for UTF-16, which is seldom used. 5. What handles handle. ... I suggest that the following Haskell types be used for the five items above: 1. Word8 2. CChar 3.

Re: Text in Haskell: a second proposal

2002-08-09 Thread Ketil Z. Malde
Ken Shan writes: I suggest that the following Haskell types be used for the five items above: 1. Word8 2. CChar 3. CodePoint 4. Word16 5. Char On most machines, Char will be a wrapper around Word8. (This contradicts the present language standard.) Can you

Re: Text in Haskell: a second proposal

2002-08-09 Thread Sven Moritz Hallberg
On Fri, 2002-08-09 at 08:40, Ashley Yakeley wrote: At 2002-08-08 23:10, Ken Shan wrote: 1. Octets. 2. C char. 3. Unicode code points. 4. Unicode code values, useful only for UTF-16, which is seldom used. 5. What handles handle. ... I suggest that the following Haskell types be used

Re: Text in Haskell: a second proposal

2002-08-09 Thread Ashley Yakeley
At 2002-08-09 01:19, Sven Moritz Hallberg wrote: Whether or not the old Char-based ones should be deprecated, or whatever, I don't know. I think any notion of treating the _raw_ contents of a file as Chars must go, because it is simply incorrect. Right. Certainly we need to come up with

RE: Text in Haskell: a second proposal

2002-08-09 Thread Simon Marlow
Here's my take on the Unicode issue. Summary: unless there's a very good reason, I don't think we should decouple encoding/decoding from I/O, at least for the standard I/O library. Firstly, types. We already have all the necessary types: - Char, a Unicode code point - Word8, an octet -