Bugs item #22956, was opened at 2008-11-23 17:54
You can respond by visiting:
http://rubyforge.org/tracker/?func=detail&atid=1971&aid=22956&group_id=494
Category: None
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 3
Submitted By: Pavel Valodzka (valodzka)
>Assigned to: Charlie Savage (cfis)
Summary: Libxml HTML parser fails on very simple html pages

Initial Comment:
Please remove the check "htmlParseDocument(ctxt) == -1", because it makes the HTML parser impossible to use: it raises an exception on every page. For example, for google.com:

Error: Tag nobr invalid at :3.
Error: htmlParseEntityRef: expecting ';' at :3.
Error: htmlParseEntityRef: expecting ';' at :3.
Error: htmlParseEntityRef: expecting ';' at :3.
Error: Tag nobr invalid at :3.
Error: htmlParseEntityRef: expecting ';' at :3.
Error: htmlParseEntityRef: expecting ';' at :3.
LibXML::XML::Error: Error: htmlParseEntityRef: expecting ';' at :3.

htmlParseDocument(ctxt) returns -1 very often; that does not mean the document cannot be used.

----------------------------------------------------------------------

>Comment By: Charlie Savage (cfis)
Date: 2008-11-23 18:40

Message:
Yeah, that code has been removed in trunk.

----------------------------------------------------------------------

_______________________________________________
libxml-devel mailing list
libxml-devel@rubyforge.org
http://rubyforge.org/mailman/listinfo/libxml-devel
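
The point raised in the report is that a -1 return from htmlParseDocument(ctxt) does not by itself make the document unusable. A minimal sketch of that decision in C follows. It is not the code from libxml-ruby trunk: the names parse_outcome, parse_status and classify_parse are hypothetical, and the sketch assumes the parser still leaves a tree behind after recoverable markup errors (ctxt->myDoc in libxml2), which the caller passes in alongside the return code.

```c
#include <stddef.h>

/* Hypothetical summary of one HTML parse.
 * rc  stands in for the return value of htmlParseDocument(ctxt).
 * doc stands in for the tree left by the parser (ctxt->myDoc in libxml2);
 *     it is NULL when no tree was built at all. */
typedef struct {
    int rc;
    const void *doc;
} parse_outcome;

typedef enum {
    PARSE_FAILED = 0,    /* no tree: nothing to hand back to the caller */
    PARSE_OK = 1,        /* tree built, no errors reported */
    PARSE_RECOVERED = 2  /* tree built despite reported errors */
} parse_status;

/* Treat the parse as fatal only when there is no tree.
 * A non-zero rc with a tree present is reported as "recovered",
 * so the caller can warn instead of raising an exception. */
static parse_status classify_parse(parse_outcome o)
{
    if (o.doc == NULL)
        return PARSE_FAILED;
    return (o.rc == 0) ? PARSE_OK : PARSE_RECOVERED;
}
```

With a check like this, the google.com page from the report would be classified as PARSE_RECOVERED: the "Tag nobr invalid" and "expecting ';'" messages could still be surfaced as warnings while the parsed document is returned.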