On Tue, 2008-01-22 at 09:29 +0000, Magnus Therning wrote:
> I vaguely remember that in GHC 6.6 code like this
>
>   length $ map ord "a string"
>
> being able to generate a different answer than
>
>   length "a string"

That seems unlikely.

> At the time I thought that the encoding (in my case UTF-8) was “leaking
> through”. After switching to GHC 6.8 the behaviour seems to have
> changed, and mapping 'ord' on a string results in a list of ints
> representing the Unicode code point rather than the encoding:

Yes. GHC 6.8 treats .hs files as UTF-8 where it previously treated them
as Latin-1.

>   > map ord "åäö"
>   [229,228,246]
>
> Is this the case, or is there something strange going on with character
> encodings?

That's what we'd expect. Note that GHCi still uses Latin-1. This will
change in GHC-6.10.

> I was hoping that this would mean that 'chr . ord' would basically be a
> no-op, but no such luck:
>
>   > chr . ord $ 'å'
>   '\229'
>
> What would I have to do to get an 'å' from '229'?

Easy!

Prelude> 'å' == '\229'
True
Prelude> 'å' == Char.chr 229
True

Remember, when you type:

Prelude> 'å'

what you really get is:

Prelude> putStrLn (show 'å')

So perhaps what is confusing you is the Show instance for Char, which
converts a Char into a String using a portable ASCII representation, so
a non-ASCII character is shown as a numeric escape such as '\229'.

Duncan
_______________________________________________
Haskell-Cafe mailing list
http://www.haskell.org/mailman/listinfo/haskell-cafe
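
The points made in the reply can be checked in a small standalone program. This is only a sketch: the names codePoints, roundTrip and shown are made up for illustration, and the characters are written as numeric escapes so that the result does not depend on how the source file is encoded.

```haskell
import Data.Char (chr, ord)

-- Code points of "åäö", written with escapes so the source encoding is irrelevant.
codePoints :: [Int]
codePoints = map ord "\229\228\246"

-- chr . ord gives back the same Char it was given.
roundTrip :: Char
roundTrip = (chr . ord) '\229'

-- show renders a non-ASCII Char as an ASCII escape sequence.
shown :: String
shown = show '\229'

main :: IO ()
main = do
  print codePoints                       -- the list of code points
  print (roundTrip == '\229')            -- the round trip is a no-op
  putStrLn shown                         -- the escaped form, quotes included
  -- map never changes the length of a list
  print (length (map ord "a string") == length "a string")
```

So 'chr . ord' really is a no-op here; only the Show rendering differs from what was typed. To see the character itself rather than its escape, something like putStrLn [chr 229] bypasses show, though what then appears on the terminal depends on the GHC version and the encoding used for output.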
