Ian Lynagh wrote:
On Tue, Jan 22, 2008 at 03:16:15PM +0000, Magnus Therning wrote:
On 1/22/08, Duncan Coutts <[EMAIL PROTECTED]> wrote:
On Tue, 2008-01-22 at 09:29 +0000, Magnus Therning wrote:
I vaguely remember that in GHC 6.6 code like this

  length $ map ord "a string"

being able to generate a different answer than

  length "a string"
That seems unlikely.
Unlikely yes, yet I get the following in GHCi (ghc 6.6.1, the version
currently in Debian Sid):

map ord "a"
[97]
map ord "ö"
[195,182]

In 6.6.1:

Prelude Data.Char> map ord "ö"
[195,182]
Prelude Data.Char> length "ö"
2

there are actually 2 bytes there, but your terminal is showing them as
one character.
Still, that seems weird to me. A Haskell Char is a Unicode character. An "ö" is either one character (Unicode code point 0xF6, which UTF-8 encodes as two bytes) or an "o" followed by a combining umlaut (Unicode code point 776, i.e. 0x308). But because the last character is not 776, the "ö" here should just be one character. I suspect the two-character string comes from the terminal sending UTF-8 to a GHC that expects Latin-1, so each of the two bytes is read as a separate Char. GHC 6.8 expects UTF-8, so all is fine.
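To make that suspicion concrete, here is a minimal sketch of what I think happens. The helper names (utf8Bytes, readAsLatin1) are made up for illustration, and the encoder is hand-rolled rather than taken from any library; it only models the guess that the terminal sends UTF-8 and the reader treats every byte as one Latin-1 character.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)

-- Hypothetical helper: encode one Char as its UTF-8 byte values.
utf8Bytes :: Char -> [Int]
utf8Bytes c
  | n < 0x80    = [n]
  | n < 0x800   = [0xC0 .|. (n `shiftR` 6), cont n]
  | n < 0x10000 = [0xE0 .|. (n `shiftR` 12), cont (n `shiftR` 6), cont n]
  | otherwise   = [ 0xF0 .|. (n `shiftR` 18), cont (n `shiftR` 12)
                  , cont (n `shiftR` 6), cont n ]
  where
    n = ord c
    -- Continuation byte: 10xxxxxx carrying the low six bits of x.
    cont x = 0x80 .|. (x .&. 0x3F)

-- Hypothetical helper: read each byte as its own Char, which is what
-- a reader assuming Latin-1 input would do.
readAsLatin1 :: [Int] -> String
readAsLatin1 = map chr

main :: IO ()
main = do
  let typed = "\246"  -- "ö" as the single code point 0xF6
      seen  = readAsLatin1 (concatMap utf8Bytes typed)
  print (map ord typed)              -- [246]
  print (map ord seen)               -- [195,182]
  print (length typed, length seen)  -- (1,2)
```

The [195,182] and the length of 2 match the GHCi 6.6.1 output quoted above, which is at least consistent with the Latin-1 explanation.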

On my MacBook (OS X 10.4), 'ö' also immediately expands to "\303\266" when I type it in my terminal, even outside GHCi. That suggests that the terminal program doesn't handle Unicode and immediately escapes non-ASCII bytes as octal.
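For what it's worth, "\303\266" is just the same two bytes written in octal. A small sketch to check this (octalEscape is a made-up name; showOct is from the standard Numeric module):

```haskell
import Numeric (showOct)

-- Hypothetical helper: render a byte value the way the terminal
-- appears to, as a backslash followed by its octal digits.
octalEscape :: Int -> String
octalEscape b = '\\' : showOct b ""

main :: IO ()
main = putStrLn (concatMap octalEscape [195, 182])  -- prints \303\266
```

So the terminal is showing the UTF-8 bytes 195 and 182 of "ö", the same values GHCi 6.6.1 reports.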

Regards,
Reinier
_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe