Laurent--

Thanks for the quick reply. See comments below:


On Dec 5, 2009, at 4:22 PM, Laurent Sansonetti wrote:

> Hi Steve,
> 
> On Dec 5, 2009, at 1:45 PM, s.ross wrote:
> 
>> My code receives UTF-8-encoded XML data from a Web Service API call. This 
>> winds up in a string.
>> 
>>    return_data = NSURLConnection.sendSynchronousRequest(@request,
>>      returningResponse: response, error: error)
>>    str = NSString.alloc.initWithData(return_data,
>>      encoding: NSUTF8StringEncoding)
>>    puts "******* response encoding is #{str.encoding}"
>> 
>> The result of the puts above is 'MACINTOSH'.
>> 
>> I suspect the encoding of the string is not UTF-8, because when I try to 
>> parse the XML using REXML, I get:
>> 
>> RegexpError: too short multibyte code
>> 
>> This occurs deep inside REXML:
>> 
>> /Library/Frameworks/MacRuby.framework/Versions/0.5/usr/lib/ruby/1.9.0/rexml/text.rb:132:in `check:'
>> 
>> In any case, my questions are:
>> 
>> 1) If anyone has run across this, what did you do?
> 
> I don't believe REXML works. In any case, I would recommend not using it. 
> Since you're already using Cocoa, why not give NSXMLDocument a try?

What I really want to use is Nokogiri. My main issue is that I'm having to 
reimplement XML-RPC because the Ruby standard library version is broken over 
SSL. Even if it weren't, it has never been thread-safe and thus can't operate 
asynchronously. As a result, what I have is an XML document inside an XML-RPC 
response envelope. That means I have to parse the document once to get the 
contents of the envelope (which are HTML-escaped), then parse those contents to 
get an XML document I can work with. I've been using XPath for that, and that's 
why I haven't moved over to NSXMLDocument.
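
To make the two-pass parse concrete, here is a minimal sketch. It assumes MRI, 
where REXML loads cleanly, rather than MacRuby 0.5. The envelope contents, the 
order/item element names, and the unwrap_payload helper are all hypothetical 
and only there for illustration; this is not the code from the application.

```ruby
require 'rexml/document'

# Hypothetical XML-RPC response whose string payload is an escaped XML document.
ENVELOPE = <<XML
<?xml version="1.0"?>
<methodResponse>
  <params>
    <param>
      <value><string>&lt;order id="42"&gt;&lt;item&gt;widget&lt;/item&gt;&lt;/order&gt;</string></value>
    </param>
  </params>
</methodResponse>
XML

# Pass 1: parse the envelope and pull out the string payload with XPath.
# Pass 2: parse the payload itself. Element#text returns the text with
# entities such as &lt; already decoded, so no separate unescape step is needed.
def unwrap_payload(envelope_xml)
  envelope = REXML::Document.new(envelope_xml)
  node = REXML::XPath.first(envelope, '/methodResponse/params/param/value/string')
  raise ArgumentError, 'no string payload in envelope' if node.nil?
  REXML::Document.new(node.text)
end

payload  = unwrap_payload(ENVELOPE)
item     = REXML::XPath.first(payload, '/order/item').text
order_id = payload.root.attributes['id']
```

The same shape should carry over to another parser: one XPath query against 
the envelope, then a second parse of the decoded string.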

Maybe I'm missing a bet here and should shift my strategy. I'll do some more 
reading...

>> 2) Why might the encoding be MACINTOSH and not UTF-8, as specified in the 
>> initWithData method call?
> 
> #encoding returns the fastest encoding available for the receiver. You may 
> specify UTF-8 during the string creation, but if Cocoa can pick a smaller 
> encoding at runtime (like ASCII) it will.
> 
> This is different from the Ruby 1.9 semantics and we have a plan to fix that 
> in 0.6.

This is kind of surprising behavior. The 1.9 semantics are sufficiently 
different from 1.8.x that code that works correctly on 1.8.7 breaks awkwardly on 
1.9. OK, I fixed that for MRI, but then the gotcha above broke my MacRuby 
version. Now that I know this, I guess I can deal with it.
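
For comparison, here is a minimal sketch of the Ruby 1.9 semantics on MRI, 
where a string keeps whatever encoding it has been tagged with. The sample 
bytes and the as_utf8 helper are hypothetical, and this says nothing about 
what MacRuby 0.5 does with the same calls.

```ruby
# Hypothetical sample: the UTF-8 bytes for "café", tagged as raw binary the
# way data often arrives off the wire.
raw = "caf\xC3\xA9".force_encoding(Encoding::BINARY)

# Retag a copy of the bytes as UTF-8 without transcoding them, and fail
# loudly if the bytes are not actually valid UTF-8.
def as_utf8(bytes)
  text = bytes.dup.force_encoding(Encoding::UTF_8)
  raise ArgumentError, 'bytes are not valid UTF-8' unless text.valid_encoding?
  text
end

text = as_utf8(raw)

raw_encoding  = raw.encoding   # the binary tag is kept on the original
text_encoding = text.encoding  # the copy reports the tag it was given
raw_length    = raw.length     # counted in bytes under the binary tag
text_length   = text.length    # counted in characters under UTF-8
```

Checking valid_encoding? up front moves the failure to the point where the 
data comes in, instead of leaving it to surface later inside a parser.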

> 
>> 3) Suggestions?
> 
> See my comment in 1) :)
> 
> Laurent
> _______________________________________________
> MacRuby-devel mailing list
> MacRuby-devel@lists.macosforge.org
> http://lists.macosforge.org/mailman/listinfo.cgi/macruby-devel
