Re: [Haskell-cafe] surrogate code points in a Char

2009-11-25 Thread Colin Adams
2009/11/25 Mark Lentczner ma...@glyphic.com:
> The current version of Unicode is 5.1. This text is now in D90, though otherwise the same. My references below are to the 5.1 documents (freely available online at: http://www.unicode.org/versions/Unicode5.1.0/ )

It's been 5.2 for over a month

Re: [Haskell-cafe] surrogate code points in a Char

2009-11-24 Thread Mark Lentczner
On Nov 18, 2009, at 7:28 AM, Manlio Perillo wrote:
> The Unicode Standard (version 4.0, section 3.9, D31 - page 76) says: Because surrogate code points are not included in the set of Unicode scalar values, UTF-32 code units in the range D800 .. DFFF are ill-formed

The current version

[Haskell-cafe] surrogate code points in a Char

2009-11-18 Thread Manlio Perillo
Hi. The Unicode Standard (version 4.0, section 3.9, D31 - page 76) says: "Because surrogate code points are not included in the set of Unicode scalar values, UTF-32 code units in the range D800 .. DFFF are ill-formed." However, GHC does not reject these code units: Prelude> print '\xD800'
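To make the observation above concrete, here is a minimal sketch of how a program could detect surrogate code points itself, given that Char accepts them. The names isSurrogate and scalarValue are hypothetical helpers invented for this illustration; they are not part of base or of any proposal in this thread.

```haskell
import Data.Char (chr)

-- | True when the code point lies in the surrogate range U+D800..U+DFFF.
-- (Hypothetical helper, not a base function.)
isSurrogate :: Char -> Bool
isSurrogate c = c >= '\xD800' && c <= '\xDFFF'

-- | Hypothetical checked constructor: returns Nothing for surrogates and
-- for values outside U+0000..U+10FFFF, so only Unicode scalar values
-- come back wrapped in Just.
scalarValue :: Int -> Maybe Char
scalarValue n
  | n < 0 || n > 0x10FFFF      = Nothing
  | n >= 0xD800 && n <= 0xDFFF = Nothing
  | otherwise                  = Just (chr n)
```

With these definitions, isSurrogate '\xD800' is True, and scalarValue 0xD800 is Nothing while scalarValue 0x41 is Just 'A'. The check lives outside the Char type, so existing code is unaffected.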

Re: [Haskell-cafe] surrogate code points in a Char

2009-11-18 Thread Edward Kmett
Enforcing a gap in the middle of the range of Char would be exceedingly awkward to propagate through all of the libraries. Off the top of my head: 1.) Functions like succ and pred which currently work on Char as an enumeration would have to jump over the gap, to be truly anal retentive about the
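As a rough illustration of point 1, this is what gap-skipping versions of succ and pred might look like if written as ordinary functions. The names succScalar and predScalar are hypothetical; the sketch leaves the Enum instance for Char alone and only shows the extra case each function would need.

```haskell
-- | Hypothetical successor over Unicode scalar values: steps from U+D7FF
-- straight to U+E000, and also moves any surrogate input up to U+E000.
succScalar :: Char -> Maybe Char
succScalar c
  | c == maxBound                = Nothing
  | c >= '\xD7FF' && c <= '\xDFFF' = Just '\xE000'
  | otherwise                    = Just (succ c)

-- | Hypothetical predecessor over Unicode scalar values: steps from U+E000
-- straight to U+D7FF, and also moves any surrogate input down to U+D7FF.
predScalar :: Char -> Maybe Char
predScalar c
  | c == minBound                = Nothing
  | c >= '\xD800' && c <= '\xE000' = Just '\xD7FF'
  | otherwise                    = Just (pred c)
```

Returning Maybe at the ends of the range is a choice made for this sketch; the Prelude's succ and pred raise an error there instead.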