PROTON-736: Only encode Ruby strings as UTF-8 if it says it's UTF-8

For Ruby 1.9 and later, the String class has methods for checking the
encoding of a string, for changing that encoding, and for validating
it. So for those versions of Ruby the code checks whether a string is
UTF-8, or whether it can be "forced" to be UTF-8, and, if so, sends it
as a String.

Otherwise, it considers the string to be only binary data and sends it
as Binary.
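
A minimal sketch of the Ruby 1.9+ decision described above, not the code
from this commit. The helper name amqp_kind and the :string/:binary
return values are hypothetical and only illustrate the branch taken. The
sketch compares against the Encoding::UTF_8 constant because
String#encoding returns an Encoding object rather than a String, and it
works on a copy because force_encoding changes the receiver in place.

```ruby
# Hypothetical helper (not part of the commit): report whether a value
# would be sent as an AMQP String or as Binary on Ruby 1.9+.
def amqp_kind(value)
  # Copy first so the caller's string keeps its original encoding tag.
  candidate = value.to_s.dup

  # Relabel the bytes as UTF-8 if they are tagged with another encoding.
  unless candidate.encoding == Encoding::UTF_8
    candidate.force_encoding(Encoding::UTF_8)
  end

  # Only a valid UTF-8 byte sequence is treated as a String.
  candidate.valid_encoding? ? :string : :binary
end
```

Validating even when the string is already tagged as UTF-8 is a choice
made for this sketch: a string can carry the UTF-8 tag and still contain
invalid bytes.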

On Ruby 1.8 these methods don't exist. So what the code does there is
run the string through Iconv to check for non-Unicode data. If none is
found then it sends it as a String, otherwise it treats it as a Binary.
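
The Ruby 1.8 idea (try to read the bytes as Unicode, and fall back to
Binary on failure) can be sketched without Iconv. The diff itself uses
Iconv; the stand-in below uses String#unpack("U*") instead, which raises
ArgumentError on malformed UTF-8. It is an assumption that this matches
Iconv's behaviour closely enough for illustration, and edge cases may
differ. The helper names utf8_bytes? and amqp_kind_legacy are
hypothetical.

```ruby
# Hypothetical helper (not part of the commit): byte-level UTF-8 check
# that does not depend on String#encoding.
def utf8_bytes?(value)
  # unpack("U*") decodes the bytes as UTF-8 code points and raises
  # ArgumentError when it meets a malformed sequence.
  value.to_s.unpack("U*")
  true
rescue ArgumentError
  false
end

# Hypothetical helper: pick String or Binary from the byte-level check.
def amqp_kind_legacy(value)
  utf8_bytes?(value) ? :string : :binary
end
```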


Project: http://git-wip-us.apache.org/repos/asf/qpid-proton/repo
Commit: http://git-wip-us.apache.org/repos/asf/qpid-proton/commit/8a042a22
Tree: http://git-wip-us.apache.org/repos/asf/qpid-proton/tree/8a042a22
Diff: http://git-wip-us.apache.org/repos/asf/qpid-proton/diff/8a042a22

Branch: refs/heads/examples
Commit: 8a042a22f2e2a2b40f9507d1b18ea74c918d8e13
Parents: 24cda63
Author: Darryl L. Pierce <mcpie...@gmail.com>
Authored: Wed Nov 5 10:15:54 2014 -0500
Committer: Darryl L. Pierce <mcpie...@apache.org>
Committed: Thu Nov 6 13:05:03 2014 -0500

----------------------------------------------------------------------
 .../bindings/ruby/lib/qpid_proton/mapping.rb    | 22 +++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/qpid-proton/blob/8a042a22/proton-c/bindings/ruby/lib/qpid_proton/mapping.rb
----------------------------------------------------------------------
diff --git a/proton-c/bindings/ruby/lib/qpid_proton/mapping.rb b/proton-c/bindings/ruby/lib/qpid_proton/mapping.rb
index 4cc25ce..841156c 100644
--- a/proton-c/bindings/ruby/lib/qpid_proton/mapping.rb
+++ b/proton-c/bindings/ruby/lib/qpid_proton/mapping.rb
@@ -110,7 +110,27 @@ module Qpid # :nodoc:
 
     class << STRING
       def put(data, value)
-        data.string = value.to_s
+        # In Ruby 1.9+ we have encoding methods that can check the content of
+        # the string, so use them to see if what we have is unicode. If so,
+        # good! If not, then just treat it as binary.
+        #
+        # No such thing in Ruby 1.8. So there we need to use Iconv to try and
+        # convert it to unicode. If it works, good! But if it raises an
+        # exception then we'll treat it as binary.
+        if RUBY_VERSION >= "1.9"
+          if value.encoding == "UTF-8" || value.force_encoding("UTF-8").valid_encoding?
+            data.string = value.to_s
+          else
+            data.binary = value.to_s
+          end
+        else
+          begin
+            newval = Iconv.new("UTF8//TRANSLIT//IGNORE", "UTF8").iconv(value.to_s)
+            data.string = newval
+          rescue
+            data.binary = value
+          end
+        end
       end
     end
 

