PROTON-736: Only encode Ruby strings as UTF-8 if it says it's UTF-8

For Ruby 1.9 and later, the String class has methods for checking the encoding of a string, changing that encoding, and validating it. So for those versions of Ruby the code checks whether a string is UTF-8, or whether it can be "forced" to be UTF-8, and, if so, sends it as a String.
Otherwise, it considers the string to be only binary data and sends it as Binary.

On Ruby 1.8 these methods don't exist, so the code checks the string for non-Unicode data by attempting an Iconv conversion. If none is found, it sends it as a String; otherwise it treats it as Binary.

Project: http://git-wip-us.apache.org/repos/asf/qpid-proton/repo
Commit: http://git-wip-us.apache.org/repos/asf/qpid-proton/commit/8a042a22
Tree: http://git-wip-us.apache.org/repos/asf/qpid-proton/tree/8a042a22
Diff: http://git-wip-us.apache.org/repos/asf/qpid-proton/diff/8a042a22

Branch: refs/heads/examples
Commit: 8a042a22f2e2a2b40f9507d1b18ea74c918d8e13
Parents: 24cda63
Author: Darryl L. Pierce <mcpie...@gmail.com>
Authored: Wed Nov 5 10:15:54 2014 -0500
Committer: Darryl L. Pierce <mcpie...@apache.org>
Committed: Thu Nov 6 13:05:03 2014 -0500

----------------------------------------------------------------------
 .../bindings/ruby/lib/qpid_proton/mapping.rb | 22 +++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/qpid-proton/blob/8a042a22/proton-c/bindings/ruby/lib/qpid_proton/mapping.rb
----------------------------------------------------------------------
diff --git a/proton-c/bindings/ruby/lib/qpid_proton/mapping.rb b/proton-c/bindings/ruby/lib/qpid_proton/mapping.rb
index 4cc25ce..841156c 100644
--- a/proton-c/bindings/ruby/lib/qpid_proton/mapping.rb
+++ b/proton-c/bindings/ruby/lib/qpid_proton/mapping.rb
@@ -110,7 +110,27 @@ module Qpid # :nodoc:
     class << STRING
       def put(data, value)
-        data.string = value.to_s
+        # In Ruby 1.9+ we have encoding methods that can check the content of
+        # the string, so use them to see if what we have is unicode. If so,
+        # good! If not, then just treat is as binary.
+        #
+        # No such thing in Ruby 1.8. So there we need to use Iconv to try and
+        # convert it to unicode. If it works, good! But if it raises an
+        # exception then we'll treat it as binary.
+        if RUBY_VERSION >= "1.9"
+          if value.encoding == "UTF-8" || value.force_encoding("UTF-8").valid_encoding?
+            data.string = value.to_s
+          else
+            data.binary = value.to_s
+          end
+        else
+          begin
+            newval = Iconv.new("UTF8//TRANSLIT//IGNORE", "UTF8").iconv(value.to_s)
+            data.string = newval
+          rescue
+            data.binary = value
+          end
+        end
       end
     end
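
The Ruby 1.9+ decision described in the commit message can be sketched in isolation as follows. This is only an illustration, not the code in mapping.rb: the helper name amqp_kind and its :string / :binary return values are hypothetical. It differs from the patch in two ways. String#encoding returns an Encoding object, which never equals the string "UTF-8", so the sketch compares against Encoding::UTF_8. And force_encoding changes its receiver in place, so the sketch probes a copy rather than the caller's string.

```ruby
# Hypothetical helper, not part of qpid_proton. It reports whether a value
# would be sent as an AMQP string or as AMQP binary under the rule in the
# commit message (Ruby 1.9+ branch only).
def amqp_kind(value)
  str = value.to_s

  # Already tagged as UTF-8: send as a string, as the commit message says.
  # Compare against the Encoding constant, not the string "UTF-8".
  return :string if str.encoding == Encoding::UTF_8

  # Otherwise see whether the bytes are valid when read as UTF-8.
  # dup first, because force_encoding changes its receiver in place.
  probe = str.dup.force_encoding(Encoding::UTF_8)
  probe.valid_encoding? ? :string : :binary
end
```

With this sketch, a string tagged as binary whose bytes happen to be valid UTF-8 (for example "abc".b) is classified as :string, while "\xFF\xFE".b is classified as :binary, and the caller's string keeps its original encoding in both cases.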