On 26 Sep 2006, at 17:09 , Charles O Nutter wrote:
[...] we're probably just going to create some incompatibilities to solve the Unicode issue on our end. It's likely that in the future all strings in JRuby will be UTF-16 strings as in Java, and all operations will deal in characters instead of bytes wherever possible. We'll deal with issues that arise as they come up, such as for handling IO that wants byte counts when we're providing character counts.
Early versions of the unicode_hacks plugin redefined string methods to work on codepoints instead of bytes. This turned out to break a lot of libraries and applications in sometimes subtle but very nasty ways. Patching up IO might work, but suppose you have something like this:
header('Content-Length', body.length)
Here, length must return the number of bytes and not the number of
characters. How can you ever know what to return in this case?
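
To make the mismatch concrete, here is a minimal sketch. It assumes a Ruby where String#length counts characters and String#bytesize counts bytes (Ruby 1.9 and later); the helper name content_length_header is hypothetical and only there for illustration.

```ruby
# Assumption: Ruby 1.9+, where strings carry an encoding,
# String#length counts characters and String#bytesize counts bytes.

body = "h\u00e9llo"             # "héllo" as UTF-8; the é takes two bytes

char_count = body.length       # number of characters
byte_count = body.bytesize     # number of bytes in the encoded string

# Hypothetical helper: builds the header pair from the byte count,
# because Content-Length is defined in bytes, not characters.
def content_length_header(body)
  ['Content-Length', body.bytesize.to_s]
end

puts "characters: #{char_count}, bytes: #{byte_count}"
puts content_length_header(body).inspect
```

With a character-based length, the caller has to ask for the byte count explicitly; code that was written against a byte-based length has no way to signal which of the two it meant.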
Kind regards, Thijs
