I've added support for user-specified character encodings to Rats! (with one option each for input and output). The development tree is currently undergoing some substantial changes, but the parser generator hasn't changed much, so I have put the current snapshot up as an alpha release at:

http://cs.nyu.edu/rgrimm/xtc/xtc-1.11.0-a1.zip

Enjoy! And let me know if there are bugs in the character encoding support.

Robert

On Feb 5, 2007, at 7:45 PM, Steven Foster wrote:

Thanks, Robert. I am puzzled because I thought that UTF-8 IS the
default charset for Java. Is that incorrect?

How difficult would it be to change Rats! to read UTF-8 files?

For our purposes, it will be much easier to include UTF-8 text directly
in the .rats file than to represent it as escaped hex.

- Steven
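
On the default-charset question above: at the time of this thread, Java's default charset depended on the platform and locale rather than being fixed to UTF-8. The following is a minimal sketch, not anything from Rats! itself, of how the same bytes decode differently depending on the charset used. The class and method names (CharsetDemo, decode) are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    // Hypothetical helper: decodes raw bytes with an explicitly named charset
    // instead of relying on the platform default.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The letter e-acute, encoded as two bytes in UTF-8.
        byte[] utf8Bytes = "\u00e9".getBytes(StandardCharsets.UTF_8);

        // The default varies by JVM version, operating system, and locale.
        System.out.println("platform default: " + Charset.defaultCharset());

        // Decoding with the matching charset recovers the original character.
        System.out.println("as UTF-8:      " + decode(utf8Bytes, StandardCharsets.UTF_8));

        // Decoding with a mismatched charset yields two unrelated characters.
        System.out.println("as ISO-8859-1: " + decode(utf8Bytes, StandardCharsets.ISO_8859_1));
    }
}
```

A tool that reads files with the platform default behaves like the last call whenever the file's encoding and the default disagree.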

Robert Grimm wrote:

Right now, Rats! just uses Java's default character encoding for
reading and writing files. I guess I might/should change that to
UTF-8. You can, however, use Unicode escapes ('\\' 'u' hex hex hex
hex) in character and string literals to denote non-ASCII characters.

Robert
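
As an illustration of the escape workaround described above, here is a small sketch that rewrites non-ASCII characters as backslash-u escapes with four hex digits, so the text can be pasted into an ASCII-only grammar file. The class and method names (EscapeSketch, toUnicodeEscapes) are hypothetical, and how Rats! treats the surrogate pairs produced for characters outside the Basic Multilingual Plane is not stated in this thread.

```java
public class EscapeSketch {
    // Hypothetical helper: keeps ASCII characters as they are and replaces every
    // other UTF-16 code unit with a backslash-u escape of four hex digits.
    static String toUnicodeEscapes(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints the word with its last letter replaced by an escape sequence.
        System.out.println(toUnicodeEscapes("caf\u00e9"));
    }
}
```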

On Feb 5, 2007, at 7:11 PM, Steven Foster wrote:

Hi everyone,

It seems I can't write *.rats source files in Unicode encodings
(neither UTF-8, UTF-16, nor UCS-2). Rats! immediately gives an
error message about invalid characters at the beginning of the file.

Do I need to recompile Rats! to read Unicode source files?

Or have I edited the *.rats file with the wrong editor? (I tried
several.)

I have tried both big-endian and little-endian UTF-16.

Thank you very much!
Steven
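
An error at the very start of a file is consistent with a byte order mark (BOM) that the editor wrote and the reader did not expect, although nothing in this thread confirms that as the cause. Assuming that is what happened, a rough sketch for checking the first bytes of a file could look like this. The class and method names (BomSketch, detectBom, startsWith) are made up for illustration.

```java
public class BomSketch {
    // Hypothetical helper: names the BOM found at the start of the data, or
    // "none". A UTF-32LE BOM also begins with FF FE and would be reported as
    // UTF-16LE by this simplified check.
    static String detectBom(byte[] head) {
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(head, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(head, 0xFF, 0xFE)) return "UTF-16LE";
        return "none";
    }

    // Compares the leading bytes of data, read as unsigned values, to prefix.
    static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Sample bytes: a UTF-8 BOM followed by the ASCII letter m.
        byte[] sample = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'm'};
        System.out.println(detectBom(sample));
    }
}
```

If a BOM does turn up, saving the file without one (many editors offer this for UTF-8) is a cheap thing to try.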


_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg






