I would like to mention that we have been using Quex, a tool for
generating tokenizers:

     http://quex.sourceforge.net/

Quex is similar to Flex++ and generates C++ tokenizers, but it can
handle text in various encodings, including UTF-8, and its regular
expressions can use Unicode properties.
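
To give an idea of what tokenizing by Unicode properties buys you, here
is a small sketch in plain Python. It is not Quex syntax and not what
Quex generates; it only uses the standard `unicodedata` module, and the
names `char_class` and `tokenize` are made up for this illustration.

```python
import unicodedata


def char_class(ch):
    """Map a character to a coarse class via its Unicode general category."""
    cat = unicodedata.category(ch)
    if cat[0] in ("L", "M"):  # letters and combining marks, in any script
        return "word"
    if cat[0] == "N":         # decimal digits and other numeric characters
        return "number"
    if ch.isspace():
        return "space"
    return "other"            # punctuation, symbols, everything else


def tokenize(text):
    """Split text into runs of word characters, runs of number characters,
    and single 'other' characters; whitespace only separates tokens."""
    tokens = []
    current = ""
    current_class = None
    for ch in text:
        cls = char_class(ch)
        if cls == current_class and cls in ("word", "number"):
            current += ch  # extend the current run
            continue
        if current:
            tokens.append(current)  # close the previous run
        if cls in ("word", "number"):
            current, current_class = ch, cls
        else:
            if cls == "other":
                tokens.append(ch)   # each such character is its own token
            current, current_class = "", None
    if current:
        tokens.append(current)
    return tokens


print(tokenize("Università di Pisa, aula 3."))
```

Because the classes come from Unicode categories rather than ASCII
ranges, accented letters such as "à" stay inside the word instead of
splitting it.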

-- Beppe Attardi
Università di Pisa

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
