#5559: heap profile character encoding confusion
---------------------------------+------------------------------------------
Reporter: guest | Owner:
Type: bug | Status: new
Priority: normal | Component: Documentation
Version: 7.0.3 | Keywords: heap profile, character encoding
Testcase: | Blockedby:
Os: Unknown/Multiple | Blocking:
Architecture: Unknown/Multiple | Failure: None/Unknown
---------------------------------+------------------------------------------
Heap profiling the following UTF-8 source file (where ø is encoded as C3 B8)
with ghc-7.0.3 on GNU/Linux with LANG=en_GB.utf8 produces a .hp file that
appears to be ISO-8859-1 encoded (where ø is encoded as the single byte F8).
{{{
føb :: Integer -> Integer
føb n
| n == 0 = 0
| n == 1 = 1
| n >= 2 = føb (n - 1) + føb (n - 2)
main :: IO ()
main = print (føb 100)
}}}
hexdump extract from .hp file:
{{{
00000000  28 32 39 33 29 66 f8 62  2f 43 41 46 3a 6c 76 6c  |(293)f.b/CAF:lvl|
00000010  31 5f 72 50 70 09 34 30  0a                       |1_rPp.40.|
00000019
}}}
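To make the two byte sequences concrete, here is a minimal sketch that computes both forms of ø using only `base`. The names `latin1Byte`, `utf8Bytes` and `hex` are my own, for illustration only; this is not taken from the GHC runtime, and I am only assuming that the profiler output corresponds to keeping the low 8 bits of each character.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)
import Numeric (showHex)

-- Single-byte form: keep only the low 8 bits of the code point.
-- For characters below U+0100 this coincides with ISO-8859-1.
-- (Assumption: this is what ends up in the .hp file.)
latin1Byte :: Char -> Word8
latin1Byte = fromIntegral . ord

-- UTF-8 encoding of one code point (no special handling of surrogates).
utf8Bytes :: Char -> [Word8]
utf8Bytes c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    -- Leading bits of the code point, shifted down by s.
    hi s   = fromIntegral (n `shiftR` s)
    -- Continuation byte: 10xxxxxx carrying six bits of the code point.
    cont s = 0x80 .|. (fromIntegral (n `shiftR` s) .&. 0x3F)

hex :: Word8 -> String
hex b = showHex b ""

main :: IO ()
main = do
  let c = '\xF8'  -- ø, U+00F8
  putStrLn ("single byte: " ++ hex (latin1Byte c))
  putStrLn ("UTF-8:       " ++ unwords (map hex (utf8Bytes c)))
```

In the hexdump above, the byte following `66` (`f`) is `f8`, i.e. the single-byte form rather than the two-byte `c3 b8` sequence found in the source file.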
This causes problems for some heap profile visualization programs:
* hp2ps: viewing the .ps in evince shows the wrong character (a slashed l
instead of ø)
* hp2pretty: viewing the .svg with rsvg aborts with an invalid UTF-8
error
hp2any-core seemed to handle the character encoding correctly in this test
(the character was displayed as "\248"), and it appeared correctly in
hp2any-graph's OpenGL window.
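As a possible stopgap for tools that expect UTF-8 (hp2pretty, for example), the .hp file could be re-encoded before it is viewed. The following is only a sketch under the assumption that the whole file really is ISO-8859-1; the function name `convertLatin1ToUtf8` is hypothetical, and any character above U+00FF that was already mangled when the profile was written could not be recovered this way.

```haskell
import System.Environment (getArgs)
import System.IO

-- Re-encode a file from ISO-8859-1 to UTF-8.
-- Assumption: every byte of the input is a Latin-1 character.
convertLatin1ToUtf8 :: FilePath -> FilePath -> IO ()
convertLatin1ToUtf8 src dst =
  withFile src ReadMode $ \hIn ->
    withFile dst WriteMode $ \hOut -> do
      hSetEncoding hIn latin1   -- decode each input byte as one character
      hSetEncoding hOut utf8    -- encode characters as UTF-8 on output
      -- hPutStr consumes the lazy input fully before the handles are closed.
      hGetContents hIn >>= hPutStr hOut

main :: IO ()
main = do
  args <- getArgs
  case args of
    [src, dst] -> convertLatin1ToUtf8 src dst
    _          -> hPutStrLn stderr "usage: hp-to-utf8 INPUT.hp OUTPUT.hp"
```

The converted copy could then be passed to the visualizer in place of the original .hp file. I have not checked how each tool treats the result.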
I'd like to know whether ISO-8859-1 will always be used for .hp files,
whether it is a misfeature and UTF-8 will be used in future, or whether the
output will eventually follow the current locale settings.
I couldn't find any documentation on the character encoding of .hp files here:
http://www.haskell.org/ghc/docs/latest/html/users_guide/prof-heap.html
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/5559>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler