Duncan Coutts:
> Found another bug that surfaces when we compile c2hs with ghc-6.12.
> 
> By default text files are now read in the locale encoding rather than
> just ASCII. This means we can (and do) get characters over 255. The
> behaviour is that c2hs goes into an infinite loop and consumes all the
> memory on your machine (in particular this happens with some files in
> gtk2hs).
> 
> Unfortunately the 255 assumption is pretty strongly wired into the c2hs
> lexer. From Lexer.hs:
> 
> -- * Unicode poses a problem as the character domain becomes too big
> -- for using arrays to represent transition tables and even sparse
> -- structures will pose a significant overhead when character ranges
> -- are naively represented. So, it might be time for finite maps again.
> 
> The short term solution is to set the text mode to be ASCII. In the
> longer term we might want to replace the .chs lexer and parser, like we
> did already for the C parser.
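
For reference, the short-term fix Duncan describes comes down to forcing the input handle to a single-byte encoding before reading. A minimal sketch using System.IO's hSetEncoding (latin1 stands in for ASCII here, since it guarantees code points of at most 255; the function name and file handling are illustrative, not c2hs's actual code):

import System.IO

-- Read a .chs file so that no decoded character exceeds 255,
-- matching the lexer's 8-bit assumption (illustrative sketch;
-- c2hs's real file handling may differ).
readChs8Bit :: FilePath -> IO String
readChs8Bit path = do
  h <- openFile path ReadMode
  hSetEncoding h latin1  -- one byte per Char, code points 0..255
  hGetContents h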
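As for the longer-term concern in the quoted Lexer.hs comment, a finite-map transition table would look roughly like the sketch below; the State type, the table layout, and the names are hypothetical rather than taken from c2hs's actual lexer:

import qualified Data.Map as Map

-- Hypothetical automaton state; the real c2hs lexer differs.
type State = Int

-- Sparse transition table: only the characters that actually cause
-- a transition are stored, so the table stays small even though
-- Char now ranges over all of Unicode.
type Transitions = Map.Map (State, Char) State

-- Look up the next state, if any, for the current state and input.
step :: Transitions -> State -> Char -> Maybe State
step table s c = Map.lookup (s, c) table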

Yes, that makes sense.  At the time, Unicode support in GHC was still far away.

Manuel
