#741: The character encoding of every string becomes UTF-8 when force_encoding is used.
----------------------------------+-----------------------------------------
 Reporter:  watson1...@…          |       Owner:  lsansone...@…
     Type:  defect                |      Status:  new
 Priority:  blocker               |   Milestone:
Component:  MacRuby               |    Keywords:
----------------------------------+-----------------------------------------
Test code:
{{{
$ cat t.rb
def escape(string)
  # original : 'CGI::escape'
  p string.encoding
  p string
  string.gsub(/([^ a-zA-Z0-9_.-]+)/) do
    puts "Bytesize: #{$1.bytesize}"
    '%' + $1.unpack('H2' * $1.bytesize).join('%').upcase
  end.tr(' ', '+')
end

value = "\xe3\x82\x86\xe3\x81\x8d\xe3\x81\xb2\xe3\x82\x8d"
value.force_encoding("utf-8")
p escape(value)
puts "----"

value = "\xa4\xe6\xa4\xad\xa4\xd2\xa4\xed"
value.force_encoding("EUC-JP")
p escape(value)
puts "----"

value = "\x82\xe4\x82\xab\x82\xd0\x82\xeb"
value.force_encoding("Shift_JIS")
p escape(value)
}}}

Result on Ruby 1.9.2 preview3:
{{{
$ ruby -v t.rb
ruby 1.9.2dev (2010-05-31 revision 28117) [x86_64-darwin10.3.0]
#<Encoding:UTF-8>
"ゆきひろ"
Bytesize: 12
"%E3%82%86%E3%81%8D%E3%81%B2%E3%82%8D"
----
#<Encoding:EUC-JP>
"\x{A4E6}\x{A4AD}\x{A4D2}\x{A4ED}"
Bytesize: 8
"%A4%E6%A4%AD%A4%D2%A4%ED"
----
#<Encoding:Shift_JIS>
"\x{82E4}\x{82AB}\x{82D0}\x{82EB}"
Bytesize: 8
"%82%E4%82%AB%82%D0%82%EB"
}}}

Result on MacRuby SVN trunk HEAD:
{{{
$ macruby -v t.rb
MacRuby 0.7 (ruby 1.9.2) [universal-darwin10.0, x86_64]
#<Encoding:UTF-8>
"ゆきひろ"
Bytesize: 12
"%E3%82%86%E3%81%8D%E3%81%B2%E3%82%8D"
----
#<Encoding:EUC-JP>
"ゆきひろ"
Bytesize: 12
"%E3%82%86%E3%81%8D%E3%81%B2%E3%82%8D"
----
#<Encoding:Shift_JIS>
"ゆきひろ"
Bytesize: 12
"%E3%82%86%E3%81%8D%E3%81%B2%E3%82%8D"
}}}

-- 
Ticket URL: <http://www.macruby.org/trac/ticket/741>
MacRuby <http://macruby.org/>

_______________________________________________
MacRuby-devel mailing list
MacRuby-devel@lists.macosforge.org
http://lists.macosforge.org/mailman/listinfo.cgi/macruby-devel