Bugs item #21658, was opened at 2008-08-24 11:01
You can respond by visiting:
http://rubyforge.org/tracker/?func=detail&atid=1971&aid=21658&group_id=494
Category: None
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 3
Submitted By: Nobody (None)
>Assigned to: Charlie Savage (cfis)
Summary: failure to parse and obey encoding when creating document

Initial Comment:
The following appeared on comp.lang.ruby:

===== quoted material follows

I have an XML request, using the following code as an example:

    require "rubygems"
    require "xml/libxml"

    movie = "sin+city"
    search_url = 'http://www.movie-xml.com/interfaces/getmovie.php?moviename='
    url = search_url + movie
    doc = XML::Document.file(url)

Here's the response I get:

    Input is not proper UTF-8, indicate encoding !

The source XML has an encoding declared as such:

    <?xml version="1.0" encoding="ISO-8859-1"?>

===== end quoted material

Tested and confirmed. I also tried the same operation with REXML and there was no problem. It looks like we are not examining the encoding attribute up front and obeying it when parsing the body of the doc.

----------------------------------------------------------------------

>Comment By: Charlie Savage (cfis)
Date: 2008-11-24 12:12

Message:
No response - closing the issue.

----------------------------------------------------------------------

Comment By: Charlie Savage (cfis)
Date: 2008-11-15 17:47

Message:
This url is no longer valid. Do you have another test case?

----------------------------------------------------------------------

Comment By: Eric Ivancich (ivancich)
Date: 2008-08-24 17:29

Message:
Twice in the XML data retrieved from the URL generated in the detailed description, the word "verg?enza" appears, where the "?" has hex code 0xFC, which encodes a lower case "u" with umlaut in ISO-8859-1. Under RFC 3629, the byte 0xFC can never appear in UTF-8 data. So that adds further evidence that it's trying to parse the file as UTF-8 rather than ISO-8859-1.
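
The byte-level observation above can be checked without the original URL, which is no longer available. The following is only a sketch using Ruby's built-in String encoding support (Ruby 2.0 or later); it does not use libxml-ruby, and the helper names valid_as? and latin1_to_utf8 are hypothetical, introduced here for illustration.

```ruby
# Sketch: the byte 0xFC is valid ISO-8859-1 ("u" with umlaut) but can
# never occur in well-formed UTF-8. Helper names are hypothetical.

# Returns true if the raw bytes are well-formed in the given encoding.
def valid_as?(raw_bytes, encoding)
  raw_bytes.dup.force_encoding(encoding).valid_encoding?
end

# Reinterprets raw bytes as ISO-8859-1 and transcodes them to UTF-8.
def latin1_to_utf8(raw_bytes)
  raw_bytes.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# The word from the report, as raw bytes the way they arrive over the wire.
raw = "verg\xFCenza".b

puts valid_as?(raw, Encoding::UTF_8)       # false: 0xFC is not legal UTF-8
puts valid_as?(raw, Encoding::ISO_8859_1)  # true

converted = latin1_to_utf8(raw)
# In UTF-8 the same character is the two-byte sequence C3 BC.
puts converted.bytes.map { |b| format("%02X", b) }.join(" ")
```

A parser that assumes UTF-8 for these bytes has to reject them, which is consistent with the "Input is not proper UTF-8" message in the report.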

----------------------------------------------------------------------

Comment By: Erik Hollensbe (erikh)
Date: 2008-08-24 14:24

Message:
From this thread on ruby-talk:

http://www.ruby-forum.com/topic/163524

----------------------------------------------------------------------

_______________________________________________
libxml-devel mailing list
libxml-devel@rubyforge.org
http://rubyforge.org/mailman/listinfo/libxml-devel