Hello Gregg,

Friday, July 25, 2008, 11:36:28 AM, you wrote:

> Hi,

> I'd like to use re2c to parse some natural language text.  I think I
> can make it work as it is now with a little hackery, since it
> recognizes utf-8 byte sequences, but it would be better if it actually
> understood unicode better, naturally.  Any plans to improve Unicode
> support?  I haven't looked at the internals yet, but an obvious
> approach would be to add a byte->char conversion layer and modify the
> regex machinery to work on chars instead of bytes.  Any idea how much
> work that would be?

> I'd also like to be able to generate javascript code so I can embed a
> parser in a webpage.  Any idea how much work that would be?

The testing would be the harder part. What is required is for re2c to read
characters from the input stream into ints rather than into chars (bytes, as
you called them). However, you can already do so if you provide the layer that
does the conversion and just pass along the int array. Anyway, at this point
re2c development is bound to PHP 5.3 progress. Once the PHP 5.3 alpha release
phase comes to a halt, re2c 0.14 will be released and a new development cycle
can be started. For that cycle, native fast UTF-8 support is high on my
priority list.
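
As a rough illustration of the "layer" mentioned above, here is a minimal
sketch in C of a UTF-8 to code point decoder. The function name
utf8_to_codepoints and its signature are hypothetical and not part of re2c;
the idea is only that the resulting int array could be handed to a scanner
whose character type (YYCTYPE in re2c) is wide enough to hold a code point.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of re2c): decode a UTF-8 byte buffer into
 * an array of 32-bit code points.
 *
 * Returns the number of code points written, or (size_t)-1 if the input is
 * malformed (bad lead or continuation byte, truncated sequence, overlong
 * encoding, surrogate, value above U+10FFFF) or if `out` is too small.
 */
size_t utf8_to_codepoints(const unsigned char *in, size_t in_len,
                          uint32_t *out, size_t out_cap)
{
    /* Smallest code point that legitimately needs 1, 2, 3 or 4 bytes. */
    static const uint32_t min_cp[4] = { 0x0, 0x80, 0x800, 0x10000 };
    size_t i = 0; /* index into the byte input */
    size_t n = 0; /* number of code points produced */

    while (i < in_len) {
        unsigned char b = in[i];
        uint32_t cp;
        size_t extra; /* number of continuation bytes expected */
        size_t k;

        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return (size_t)-1; /* stray continuation or invalid lead byte */

        if (extra > in_len - i - 1)
            return (size_t)-1; /* sequence runs past the end of the input */

        for (k = 1; k <= extra; k++) {
            unsigned char c = in[i + k];
            if ((c & 0xC0) != 0x80)
                return (size_t)-1; /* not a continuation byte */
            cp = (cp << 6) | (uint32_t)(c & 0x3F);
        }

        if (cp < min_cp[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            return (size_t)-1; /* overlong, out of range, or surrogate */

        if (n >= out_cap)
            return (size_t)-1; /* output buffer too small */

        out[n++] = cp;
        i += extra + 1;
    }
    return n;
}
```

The whole input is decoded up front here for simplicity. A real scanner
would more likely decode on refill, but that depends on how the generated
code is driven.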

marcus

> Thanks,

> Gregg

> -------------------------------------------------------------------------
> This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
> Build the coolest Linux based applications with Moblin SDK & win great prizes
> Grand prize is a trip for two to an Open Source event anywhere in the world
> http://moblin-contest.org/redirect.php?banner_id=100&url=/
> _______________________________________________
> Re2c-general mailing list
> Re2c-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/re2c-general
-- 
Best regards,
 Marcus                            mailto:[EMAIL PROTECTED]

