If I run this in Ruby 1.8.6:
> ruby -Ku uni.rb
And uni.rb is UTF-8 encoded w/o BOM:
puts $KCODE
puts 'hèllo'.size
I’ll get output:
UTF-8
6
So that clearly doesn’t work as one might expect. String literals in MRI 1.8
are always binary (i.e. the accented character is stored as 2 ordinary bytes in
the string, so String#size counts bytes rather than characters).
AFAIK $KCODE only affects some built-in and library methods, for example
String#inspect, regular expressions, conversion libraries, etc.
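To make the byte/character difference concrete, here is a minimal sketch that counts both explicitly instead of relying on String#size. The helper names byte_count and codepoint_count are made up for illustration, and the string is spelled out as UTF-8 byte escapes so the result does not depend on how the source file is encoded.

```ruby
# Sketch only: byte_count and codepoint_count are hypothetical helper names.

# Number of raw bytes in the string ('C' = unsigned 8-bit value).
def byte_count(str)
  str.unpack('C*').size
end

# Number of Unicode code points, decoding the bytes as UTF-8 ('U' directive).
def codepoint_count(str)
  str.unpack('U*').size
end

hello = "h\xC3\xA8llo"   # 'hèllo' spelled out as UTF-8 bytes

puts byte_count(hello)       # 6
puts codepoint_count(hello)  # 5
```

Under 1.8 semantics String#size corresponds to the first number; unpack('U*') is one way to get the second without changing the interpreter mode.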
Although IronRuby stores string literals in UTF-16 .NET strings, to be fully
compatible with MRI 1.8 we use a custom BinaryEncoding for these strings. When
a string is converted to an array of bytes using this encoding, only the low
8 bits of each character are used (the other bits are required to be 0). This
works fine for encodings that use a single byte per character. It’s broken for
multi-byte encodings, but that’s a problem with Ruby 1.8 in general.
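The one-byte-per-character idea can be sketched in plain Ruby as follows. This is only an illustration of the mapping described above, not IronRuby's actual BinaryEncoding code; widen_bytes and narrow_chars are hypothetical names.

```ruby
# Sketch only: widen_bytes and narrow_chars are hypothetical names, and this
# is not IronRuby's actual BinaryEncoding implementation.

# Treat every byte as its own character (code points 0..255), the way a
# binary string looks when each byte is stored in one UTF-16 char.
def widen_bytes(bytes)
  bytes.unpack('C*').pack('U*')
end

# Reverse direction: one byte per character; anything above 0xFF is rejected
# because the upper bits are required to be 0.
def narrow_chars(chars)
  codes = chars.unpack('U*')
  bad = codes.find { |c| c > 0xFF }
  raise ArgumentError, "character U+%04X does not fit in one byte" % bad if bad
  codes.pack('C*')
end

name = "Barr\xC3\xA8re"            # 'Barrère' as UTF-8 bytes
widened = widen_bytes(name)

puts widened.unpack('U*').size     # 8 characters, although the name has 7 letters
puts narrow_chars(widened).unpack('C*') == name.unpack('C*')  # true
```

Displayed as text, the widened string reads "BarrÃ¨re": the two UTF-8 bytes of the accented letter show up as two separate characters. Re-encoding those two characters as UTF-8 gives the bytes C3 83 C2 A8, which appear to match the low bytes of the four octal escapes in the Mono warning quoted further down in this thread.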
If you want to use Unicode you should not use 1.8 semantics. You should use the
-19 switch to run your script in 1.9 mode and add either a UTF-8 BOM preamble
or a Ruby encoding magic comment:
#encoding: UTF-8
puts 'hèllo'.size
> ruby19 uni.rb
5
> ir.exe -19 uni.rb
5
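To see what 1.9 semantics change, a small sketch like the one below reports the encoding, character count and byte count of a string. It assumes Ruby 1.9 or later (String#encoding, #bytesize and #force_encoding do not exist in 1.8); describe is a hypothetical helper name, and the \u escape is used so the literal is UTF-8 regardless of the file's own encoding.

```ruby
# Sketch only: describe is a hypothetical helper; needs Ruby 1.9 or later.

# Returns [encoding name, character count, byte count] for a string.
def describe(str)
  [str.encoding.name, str.size, str.bytesize]
end

p describe("h\u00E8llo")                                   # ["UTF-8", 5, 6]
p describe("h\u00E8llo".dup.force_encoding('ASCII-8BIT'))  # binary view: 6 and 6
```

The second call shows the same bytes viewed as a binary string, which is effectively what 1.8 does with every literal.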
In a hosted app you can set 1.9 compat mode when creating the
ScriptEngine/Runtime:
var ruby = IronRuby.Ruby.CreateEngine((setup) => {
    setup.Options["Compatibility"] = RubyCompatibility.Ruby19;
});
Tomas
From: [email protected]
[mailto:[email protected]] On Behalf Of Tomas Matousek
Sent: Tuesday, March 03, 2009 9:56 AM
To: [email protected]
Subject: Re: [Ironruby-core] Issue with accents (UTF-8) - is it supposed to
work ?
I’ll take a look.
Tomas
From: [email protected]
[mailto:[email protected]] On Behalf Of Ivan Porto Carrero
Sent: Tuesday, March 03, 2009 6:58 AM
To: [email protected]
Subject: Re: [Ironruby-core] Issue with accents (UTF-8) - is it supposed to
work ?
No, it's not a Mono-related issue. I get the same results when I run your
sample on Windows with MS.NET.
It must be an encoding thing. When I set $KCODE to "UTF-8" it still has the
same behavior, which is weird I guess :)
On Tue, Mar 3, 2009 at 3:35 PM, Thibaut Barrère
<[email protected]> wrote:
Hi,
> not sure if it's an oddity in my code, a bug or non-implemented feature in
> IronRuby or Mono - so I'm reporting it here. When using accents inside
> strings ("Barrère") that I pass to either buttons or datagridviews, they
> translate into "BarrA¨re". Here's a sample (also available on github):
Bumping this one - do you have some idea of what's happening there ?
Is it a mono related issue ?
-- Thibaut
> Hi,
> not sure if it's an oddity in my code, a bug or non-implemented feature in
> IronRuby or Mono - so I'm reporting it here. When using accents inside
> strings ("Barrère") that I pass to either buttons or datagridviews, they
> translate into "BarrA¨re". Here's a sample (also available on github):
>
> form = Magic.build do
>   form(:text => "DataGridView sample", :width => 800, :height => 600) do
>     # nifty - current Magic.build makes it possible to reuse the control
>     # that has been added
>     @grid = data_grid_view :dock => DockStyle.fill
>     @grid.column_count = 2
>     @grid.columns[0].name = "First name"
>     @grid.columns[1].name = "Last name"
>
>     # using my name with its nasty accent - utf-8 ?
>     @grid.rows.add("Thibaut","Barrère")
>   end
> end
>
> After editing the datagridview, I noticed a log on stdout from mono:
> 009-03-01 11:48:36.927 mono[5512:10b] WARNING:
> CFSTR("Barr\37777777703\37777777603\37777777702\37777777650re") has non-7
> bit chars, interpreting using MacOS Roman encoding for now, but this will
> change. Please eliminate usages of non-7 bit chars (including escaped
> characters above \177 octal) in CFSTR().
> So I guess the issue probably boils down to non-MacOS Roman support in Mono.
> What do you think ?
> -- Thibaut
_______________________________________________
Ironruby-core mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ironruby-core