[jvm-l] Re: How to convert a Strings into Control Characters

John Wilson Fri, 20 Feb 2009 02:56:16 -0800

2009/2/20 Robert Fischer <[email protected]>:
>
> If my parser reads through a file and encounters the string sequence
> backslash, n or the string sequence backslash, t, then I (as a parser
> for a language with Java-style string semantics) would like to convert
> that to newline and tab, respectively.  Similarly, I'd like to convert
> \u0017 into the appropriate unicode character.  What's the easiest way
> to go about doing that on the JVM?  Any handy method out there to
> accomplish that, either in the standard library or some other library?


As Paul says, handling \t etc. by hand is trivial. The code just goes
in the switch you use to handle end-of-string/end-of-line characters.
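
A minimal sketch of such a switch in Java, assuming the lexer has
already isolated the body of a string literal. The names SimpleEscapes
and unescape are hypothetical, and octal escapes are left out for
brevity:

```java
final class SimpleEscapes {
    private SimpleEscapes() {}

    // Replaces the single-character backslash escapes found in the body of a
    // string literal. Octal escapes are intentionally not handled in this sketch.
    static String unescape(String body) {
        StringBuilder out = new StringBuilder(body.length());
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c != '\\') {
                out.append(c);
                continue;
            }
            if (++i == body.length()) {
                throw new IllegalArgumentException("dangling backslash at end of input");
            }
            char e = body.charAt(i);
            switch (e) {
                case 'n':  out.append('\n'); break;
                case 't':  out.append('\t'); break;
                case 'r':  out.append('\r'); break;
                case 'b':  out.append('\b'); break;
                case 'f':  out.append('\f'); break;
                case '\\': out.append('\\'); break;
                case '"':  out.append('"');  break;
                case '\'': out.append('\''); break;
                default:
                    throw new IllegalArgumentException("unknown escape character: " + e);
            }
        }
        return out.toString();
    }
}
```

In a real lexer the same cases would sit inside the loop that scans for
the closing quote, rather than in a separate pass.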

\u... can be more problematic. If you are implementing them as Java
does, they must be processed before lexing takes place (i.e. they are
not String escapes: they can appear anywhere in the program text and
must be replaced with the character they represent before the text is
processed further). There's also an interesting little wrinkle with
multiple "u" characters (see section 3.3 of the JLS). I implemented
this for Groovy and it was slightly tricky. I believe Groovy later
moved away from the strict JLS implementation because it caused
problems with the Regex implementation. I'm afraid I can't remember
the details.
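
For illustration, a rough sketch of a JLS 3.3-style pre-pass in Java,
run over the raw source text before lexing. This is not the Groovy
code; the names UnicodeEscapes and translate are hypothetical. It
assumes the rules as I read them in JLS 3.3: a backslash starts an
escape only when an even number of raw backslashes immediately precede
it, any number of "u" characters may follow it, exactly four hex digits
are required, and the character produced is not rescanned.

```java
final class UnicodeEscapes {
    private UnicodeEscapes() {}

    // Translates unicode escapes in raw source text, before any lexing happens.
    static String translate(String src) {
        StringBuilder out = new StringBuilder(src.length());
        int n = src.length();
        int i = 0;
        int run = 0; // raw backslashes immediately preceding position i
        while (i < n) {
            char c = src.charAt(i);
            if (c != '\\') {
                out.append(c);
                run = 0;
                i++;
                continue;
            }
            boolean eligible = run % 2 == 0;
            if (eligible && i + 1 < n && src.charAt(i + 1) == 'u') {
                int j = i + 1;
                while (j < n && src.charAt(j) == 'u') {
                    j++; // the "multiple u" wrinkle: skip every u marker
                }
                if (j + 4 > n) {
                    throw new IllegalArgumentException("truncated unicode escape at " + i);
                }
                int value = 0;
                for (int k = j; k < j + 4; k++) {
                    int d = hex(src.charAt(k));
                    if (d < 0) {
                        throw new IllegalArgumentException("malformed unicode escape at " + i);
                    }
                    value = value * 16 + d;
                }
                out.append((char) value); // one UTF-16 code unit, never rescanned
                run = 0;                  // the last raw character was a hex digit
                i = j + 4;
            } else {
                out.append('\\');
                run++;
                i++;
            }
        }
        return out.toString();
    }

    // Returns the value of an ASCII hex digit, or -1 if c is not one.
    private static int hex(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }
}
```

Because this runs before the lexer, an escape that yields a quote or a
line terminator changes how the following text is tokenised, which is
one reason a language might choose to treat \u as an ordinary string
escape instead.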

John Wilson

