Kenneth McDonald wrote:
Are there any articles detailing the current state of Unicode and JRuby,
and in particular detailing gotchas, including with the use of the
regular expression engine?
Not really, though we'd love for someone to write one up on
wiki.jruby.org. As it stands, there are a few fairly simple rules:
- UTF-8 strings work fine throughout JRuby, including in regex
- Strings passed to Java String-receiving methods are expected to be
UTF-8, and we try to decode them as such
- Strings returned from Java are encoded as UTF-8
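
A minimal sketch of those three rules, not an official example. It assumes a
runtime with Ruby 1.9-style string encodings; the variable names are made up
for illustration, and the Java round trip through java.lang.String is just one
simple way to cross the boundary. That part only runs under JRuby and is
skipped elsewhere.

```ruby
# UTF-8 text with non-ASCII characters; the escapes spell "Grüße, 世界".
text = "Gr\u00FC\u00DFe, \u4E16\u754C"

# Rule 1: UTF-8 strings work in regular expressions.
# \p{Han} matches Han characters, so this pulls out the last two characters.
han = text[/\p{Han}+/]

# Character count and byte count differ for multi-byte UTF-8 data.
chars = text.length
bytes = text.bytesize

# Rules 2 and 3: hand the string to Java and take it back.
# Assumption: java.lang.String.new accepts a Ruby String and to_s returns
# a Ruby String again. On other Ruby implementations this step is skipped.
round_trip =
  if defined?(JRUBY_VERSION)
    require 'java'
    java.lang.String.new(text).to_s
  else
    text
  end

puts han
puts "#{chars} characters, #{bytes} bytes"
puts round_trip == text
```

If the round trip does not compare equal under JRuby, the string was probably
not valid UTF-8 to begin with.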
In general, if you stick to UTF-8, things work fine (or else we'll fix
them). Other encodings, not so much. Ruby's Unicode story is still
a tangled web, and in our case we're trying to bridge two worlds.
- Charlie