Re: [ragel-users] ragel and encodings

Adrian Thurston Mon, 25 May 2009 19:47:44 -0700

Some people express multibyte sequences directly in ragel with a char or 
unsigned char alphtype. There is a contributed script in examples called 
unicode2ragel.rb that generates ragel definitions for ranges of Unicode 
code points in UTF-8 or UCS-4.
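
To illustrate what expressing multibyte sequences directly means, here is a minimal Python sketch that splits a range of code points into alternatives of per-byte ranges and prints them as a ragel-style expression for a byte-wide alphtype. It is not the algorithm or the output format of unicode2ragel.rb; the names utf8_byte_ranges and ragel_expr are made up for this example, and the splitting strategy is one common approach, assumed here for illustration.

```python
# Sketch only: function names and output layout are assumptions, not
# taken from unicode2ragel.rb.

# Largest code point that fits in a 1-, 2-, and 3-byte UTF-8 encoding.
MAX_BY_LENGTH = (0x7F, 0x7FF, 0xFFFF)


def utf8_byte_ranges(lo, hi):
    """Split code points lo..hi into UTF-8 byte-range sequences.

    Returns a list of alternatives; each alternative is a list of
    (min_byte, max_byte) pairs, one pair per encoded byte.
    """
    if lo > hi:
        return []
    # Surrogates (U+D800..U+DFFF) have no UTF-8 encoding, so cut them out.
    if lo <= 0xDFFF and hi >= 0xD800:
        return utf8_byte_ranges(lo, 0xD7FF) + utf8_byte_ranges(0xE000, hi)
    # Split wherever the encoded length changes.
    for limit in MAX_BY_LENGTH:
        if lo <= limit < hi:
            return utf8_byte_ranges(lo, limit) + utf8_byte_ranges(limit + 1, hi)
    # A pure ASCII range is a single one-byte range.
    if hi <= 0x7F:
        return [[(lo, hi)]]
    # Split until every continuation byte covers a full, aligned block,
    # so that the range is a simple product of per-byte ranges.
    for i in range(1, 4):
        mask = (1 << (6 * i)) - 1
        if (lo & ~mask) != (hi & ~mask):
            if (lo & mask) != 0:
                return (utf8_byte_ranges(lo, lo | mask)
                        + utf8_byte_ranges((lo | mask) + 1, hi))
            if (hi & mask) != mask:
                return (utf8_byte_ranges(lo, (hi & ~mask) - 1)
                        + utf8_byte_ranges(hi & ~mask, hi))
    first = chr(lo).encode("utf-8")
    last = chr(hi).encode("utf-8")
    return [list(zip(first, last))]


def ragel_expr(lo, hi):
    """Format the byte ranges as a ragel-style union of concatenations."""
    alternatives = []
    for seq in utf8_byte_ranges(lo, hi):
        parts = [f"0x{a:02X}" if a == b else f"0x{a:02X}..0x{b:02X}"
                 for a, b in seq]
        alternatives.append(" . ".join(parts))
    return " | ".join(alternatives)


if __name__ == "__main__":
    # All two-byte code points, U+0080..U+07FF.
    print(ragel_expr(0x80, 0x7FF))
```

The resulting expression could be pasted into a machine definition that uses a char or unsigned char alphtype, so that the scanner matches the encoded bytes rather than decoded code points.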


As a side note, it should probably be in contrib. I'm going to move it 
now, for anyone following SVN directly.

-Adrian

Robert Lemmen wrote:
> On Thu, May 21, 2009 at 11:34:35AM -0400, Wil Macaulay wrote:
>> Depends on your platform, but my approach to this problem (on the Mac)
>> was to detect the encoding, and convert to UTF-8 before parsing. I also
>> converted line-endings (\r\n -> \n) and ensured a newline at the end of
>> the data at the same time.
> 
> how do you handle utf-8 in your ragel code? do you use a single-byte
> alphtype and then handle the utf-8 sequences manually?
> 
> cu  robert
> 
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> ragel-users mailing list
> ragel-users@complang.org
> http://www.complang.org/mailman/listinfo/ragel-users
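
The preprocessing Wil describes in the quoted message (convert to UTF-8, turn \r\n into \n, make sure the data ends with a newline) could look roughly like the Python sketch below. Encoding detection is left out: the source encoding is assumed to be already known and is passed in as a parameter. The function name normalize_for_parser is hypothetical.

```python
def normalize_for_parser(raw: bytes, source_encoding: str = "utf-8") -> bytes:
    """Return UTF-8 bytes with \\n line endings and a trailing newline.

    Assumption: the caller has already determined source_encoding;
    this sketch does not try to detect it.
    """
    text = raw.decode(source_encoding)
    # Convert Windows line endings to plain newlines.
    text = text.replace("\r\n", "\n")
    # Guarantee that the last line is terminated.
    if not text.endswith("\n"):
        text += "\n"
    return text.encode("utf-8")


if __name__ == "__main__":
    print(normalize_for_parser(b"first\r\nsecond"))
```

Doing this before the scanner runs means the ragel machine only ever sees one encoding and one line-ending convention.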
