Hi Steve,

On Dec 5, 2009, at 1:45 PM, s.ross wrote:

My code receives UTF-8 encoded XML data from a Web Service API call. This winds up in a string.

    return_data = NSURLConnection.sendSynchronousRequest(@request, returningResponse: response, error: error)
    str = NSString.alloc.initWithData(return_data, encoding: NSUTF8StringEncoding)
    puts "******* response encoding is #{str.encoding}"

The result of the puts above is 'MACINTOSH'.

I suspect the encoding of the string is not UTF-8, because when I try to parse the XML using REXML, I get:

RegexpError: too short multibyte code

This occurs way down in REXML:

/Library/Frameworks/MacRuby.framework/Versions/0.5/usr/lib/ruby/1.9.0/rexml/text.rb:132:in `check:'

In any case, my questions are:

1) If anyone has run across this, what did you do?

I don't believe REXML works. In any case, I would recommend not using it. Since you're already using Cocoa, why not give NSXMLDocument a try?
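
A minimal sketch of the NSXMLDocument route, assuming the response body is already in an NSData object (such as return_data above). The helper name root_element_name and the sample XML are hypothetical, and error: nil is passed only to keep the sketch short; real code would likely want to inspect the error. The usage part runs only under MacRuby, where the Foundation classes are available.

```ruby
# Sketch only: `root_element_name` is a hypothetical helper, not part of
# MacRuby or Cocoa. It expects an NSData holding the XML bytes.
def root_element_name(data)
  # options: 0 uses the default parsing behaviour; error: nil skips
  # error reporting (an assumption made to keep the example small).
  doc = NSXMLDocument.alloc.initWithData(data, options: 0, error: nil)
  return nil if doc.nil?

  # rootElement returns an NSXMLElement; name is its tag name.
  doc.rootElement.name
end

# Only exercise the helper when the Cocoa class exists (i.e. under MacRuby).
if defined?(NSXMLDocument)
  xml  = '<feed><title>example</title></feed>'
  data = xml.dataUsingEncoding(NSUTF8StringEncoding)
  puts root_element_name(data)
end
```

Because NSXMLDocument parses the raw bytes itself, the string returned by NSString#encoding never enters the picture.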

2) Why might the encoding be MACINTOSH and not UTF-8, as specified in the initWithData method call?

#encoding returns the fastest encoding available for the receiver. You may specify UTF-8 during the string creation, but if Cocoa can pick a smaller encoding at runtime (like ASCII) it will.

This is different from the Ruby 1.9 semantics and we have a plan to fix that in 0.6.
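
For comparison, here is a small sketch of the Ruby 1.9 semantics mentioned above, written for a stock Ruby 1.9 or later interpreter rather than MacRuby 0.5. The helper tag_as_utf8 and the sample bytes are made up for illustration. The point is that the encoding reported is the one the string was tagged with, even when a smaller encoding such as ASCII could hold the same content.

```ruby
# Hypothetical helper: label raw bytes as UTF-8 without transcoding them.
def tag_as_utf8(bytes)
  bytes.dup.force_encoding(Encoding::UTF_8)
end

# "café" as UTF-8 bytes, first treated as untagged binary data,
# the way bytes might arrive from a network response.
raw = "caf\xC3\xA9".dup.force_encoding(Encoding::BINARY)
str = tag_as_utf8(raw)

puts str.encoding          # UTF-8
puts str.valid_encoding?   # true
puts str.length            # 4 (characters)
puts str.bytesize          # 5 (bytes)

# An ASCII-only string still reports the encoding it was tagged with.
plain = tag_as_utf8("cafe")
puts plain.encoding        # UTF-8
puts plain.ascii_only?     # true
```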

3) Suggestions?

See my comment in 1) :)

Laurent
_______________________________________________
MacRuby-devel mailing list
MacRuby-devel@lists.macosforge.org
http://lists.macosforge.org/mailman/listinfo.cgi/macruby-devel
