Setting $KCODE = "U" doesn't actually affect the encoding of literals in the
same compilation unit. It only affects literals that are parsed after $KCODE
is set:
$KCODE = "U"

x = "日本語"
p x.Encoding   # => ASCII-8BIT, since the current compilation unit (a file)
               #    was parsed using BINARY encoding
p x.size       # => 9 (bytes)

y = eval('"日本語"')
p y.Encoding   # => KCODE: UTF8
p y.size       # => 9, since String#size in MRI 1.8.6 doesn't understand
               #    encodings; it counts bytes

c = x.to_clr_string  # this essentially creates a string whose non-ASCII
                     # characters are not correctly encoded in UTF-8
                     # (they are UTF-8 bytes widened to 16 bits)
p c.size       # => 9 characters
p c.Encoding   # => UTF-8, since a CLR string doesn't hold on to an encoding.
               #    When you ask for its bytes we need to use some encoding.
               #    Maybe we could choose UTF-16, but MRI 1.8.6 has at least
               #    some support for UTF-8.

d = y.to_clr_string  # correctly encoded string
p d.Encoding   # => UTF-8
p d.size       # => 3 characters
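
To make the difference between the two conversions concrete, here is a rough sketch in plain Ruby 1.9 or later. It does not use IronRuby or to_clr_string; it only imitates the two outcomes described above. The helper names widen_bytes and decode_utf8 are made up for this illustration.

```ruby
# -*- coding: utf-8 -*-
# Sketch only: plain Ruby 1.9+ standing in for the two conversion results.
# widen_bytes and decode_utf8 are hypothetical helpers, not IronRuby APIs.

TEXT = "日本語"

# Treat every byte as its own code point, which is roughly what happens
# when the literal is considered binary: each UTF-8 byte is widened to
# one character.
def widen_bytes(str)
  str.bytes.map { |b| [b].pack("U") }.join
end

# Interpret the same bytes as UTF-8, which is what happens when the
# encoding of the literal is known.
def decode_utf8(str)
  str.dup.force_encoding("UTF-8")
end

raw = TEXT.dup.force_encoding("BINARY")

puts raw.bytesize              # 9 bytes
puts widen_bytes(raw).length   # 9 characters (one per byte)
puts decode_utf8(raw).length   # 3 characters
```

The first length corresponds to c.size above and the second to d.size: the bytes are identical in both cases, only their interpretation differs.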
Encodings are not very well supported in 1.8.6, which makes it difficult to
implement good interop between CLR and MRI strings. This will get better in the
next version of IronRuby, which will target compatibility with 1.9.
Tomas
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Daniele Alessandri
Sent: Monday, March 15, 2010 1:48 PM
To: [email protected]
Subject: [Ironruby-core] $KCODE, -KU and CLR strings
Hi everyone,
please consider this snippet:
$KCODE = "U"
puts "日本語".to_clr_string.length
When I run it by launching ir.exe without any options I get 9 as output (each
character in that string is actually made up of 3 bytes in UTF-8 encoding),
and when I do the same with the -KU option passed to ir.exe I get 3.
Aside from the fact that I think 3 should be considered the right behaviour
here, shouldn't setting $KCODE = "U" alone have the same effect as starting
ir.exe with the -KU option?
Thanks,
Daniele
--
Daniele Alessandri
http://www.clorophilla.net/
http://twitter.com/JoL1hAHN
_______________________________________________
Ironruby-core mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ironruby-core