Hello,

I have modified the Alex lexer generator to support Unicode.

The general idea is that the state machine works on the UTF-8
representation of the text. I am submitting my work here for review
in order to take as much load as possible off the maintainer
(Simon Marlow).
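
To illustrate what "working on the UTF-8 representation" means, here is
a minimal sketch. It is not code from the patch: the names utf8Encode
and utf8EncodeString are hypothetical, and it uses only the base
library. It shows the byte sequences that a byte-level automaton would
see in place of each character.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- | Encode one character as its UTF-8 byte sequence.
-- Hypothetical helper for illustration; it does not reject
-- surrogate code points, which a strict encoder would.
utf8Encode :: Char -> [Word8]
utf8Encode c
  | n < 0x80    = [fromIntegral n]                            -- 1 byte (ASCII)
  | n < 0x800   = [0xC0 .|. hi 6,  cont 0]                    -- 2 bytes
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6,  cont 0]           -- 3 bytes
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]   -- 4 bytes
  where
    n = ord c

    -- Leading-byte payload: the bits above position s.
    hi :: Int -> Word8
    hi s = fromIntegral (n `shiftR` s)

    -- Continuation byte: 10xxxxxx carrying six bits taken from position s.
    cont :: Int -> Word8
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- | Encode a whole string; this is the input a byte-based
-- state machine would consume.
utf8EncodeString :: String -> [Word8]
utf8EncodeString = concatMap utf8Encode
```

For example, utf8Encode '\x00E9' yields the two bytes 0xC3 0xA9, so a
rule matching that character becomes a two-transition path in a
byte-level automaton rather than a single transition.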

The prototype is available on GitHub:

git://github.com/jyp/Alex.git

Be sure to:
 * check out the "utf8" branch (so "git diff master" shows the changes);
 * do a 2-stage bootstrap before testing.


Caveats:
 * The generated code depends on some UTF-8 packages;
 * There is no attempt to fix the bytestring-based wrappers;
 * Left-context recognition is no longer table-based;
 * Debug code is still present.

Bug reports, comments, and especially patches are welcome :)

Thanks,
-- JP

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
