2009/2/20 Robert Fischer <[email protected]>:
> If my parser reads through a file and encounters the string sequence
> backslash, n or the string sequence backslash, t, then I (as a parser
> for a language with Java-style string semantics) would like to convert
> that to newline and tab, respectively. Similarly, I'd like to convert
> \u0017 into the appropriate unicode character. What's the easiest way
> to go about doing that on the JVM? Any handy method out there to
> accomplish that, either in the standard library or some other library?
As Paul says, handling \t etc. by hand is trivial. The code just goes in the switch you use to handle end-of-string/end-of-line characters.

\u... can be more problematic. If you are implementing them as Java does, they must be processed before lexing takes place. They are not String escapes: they can appear anywhere in the program text and must be replaced with the character they represent before the text is processed further. There's also an interesting little wrinkle with multiple "u" characters (see section 3.3 of the JLS).

I implemented this for Groovy and it was slightly tricky. I believe Groovy later moved away from the strict JLS implementation because it caused problems with the Regex implementation; I'm afraid I can't remember the details.

John Wilson
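To make the two phases above concrete, here is a minimal Java sketch. It is not the Groovy code, and the names (`EscapeSketch`, `translateUnicodeEscapes`, `unescapeStringBody`) are made up for illustration. It assumes the JLS 3.3 reading: a backslash can start a \u escape only when an even number of backslashes precede it, any number of "u" characters may follow, and exactly four ASCII hex digits are required. Octal escapes are deliberately left out of the second phase.

```java
final class EscapeSketch {

    // Phase 1: runs over the raw program text before any lexing.
    // Replaces backslash-u escapes (with one or more 'u') by the char they name.
    static String translateUnicodeEscapes(CharSequence src) {
        int n = src.length();
        StringBuilder out = new StringBuilder(n);
        int i = 0;
        while (i < n) {
            char c = src.charAt(i);
            if (c != '\\' || i + 1 >= n) {
                out.append(c);
                i++;
            } else if (src.charAt(i + 1) == '\\') {
                // Copy backslashes in pairs so the second one is never
                // treated as the start of a unicode escape.
                out.append("\\\\");
                i += 2;
            } else if (src.charAt(i + 1) == 'u') {
                int j = i + 1;
                while (j < n && src.charAt(j) == 'u') {
                    j++; // the "multiple u" wrinkle: skip every marker
                }
                if (j + 4 > n) {
                    throw new IllegalArgumentException("truncated unicode escape at index " + i);
                }
                int value = 0;
                for (int k = j; k < j + 4; k++) {
                    int d = hexValue(src.charAt(k));
                    if (d < 0) {
                        throw new IllegalArgumentException("malformed unicode escape at index " + i);
                    }
                    value = value * 16 + d;
                }
                // The produced char is appended, not rescanned, so it cannot
                // take part in a further escape.
                out.append((char) value);
                i = j + 4;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    // Phase 2: runs in the lexer, on the text between the string quotes.
    static String unescapeStringBody(CharSequence body) {
        StringBuilder out = new StringBuilder(body.length());
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c != '\\') {
                out.append(c);
                continue;
            }
            if (++i >= body.length()) {
                throw new IllegalArgumentException("dangling backslash");
            }
            char e = body.charAt(i);
            switch (e) {
                case 'n': out.append('\n'); break;
                case 't': out.append('\t'); break;
                case 'r': out.append('\r'); break;
                case 'b': out.append('\b'); break;
                case 'f': out.append('\f'); break;
                case '"': out.append('"'); break;
                case '\'': out.append('\''); break;
                case '\\': out.append('\\'); break;
                default:
                    // Octal escapes are not handled in this sketch.
                    throw new IllegalArgumentException("unsupported escape character: " + e);
            }
        }
        return out.toString();
    }

    // ASCII-only hex digit value, or -1 if c is not a hex digit.
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }
}
```

In use, `translateUnicodeEscapes` would be applied once to the whole source text, and `unescapeStringBody` to each string literal the lexer finds afterwards. If you only need the second phase and can run on JDK 15 or later, `String.translateEscapes()` in the standard library covers the simple escapes (it does not translate \u escapes).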
