Bugs item #22956 was opened at 2008-11-24 02:54
You can respond by visiting:
http://rubyforge.org/tracker/?func=detail&atid=1971&aid=22956&group_id=494
Category: None
Group: None
Status: Open
Resolution: None
Priority: 3
Submitted By: Pavel Valodzka (valodzka)
Assigned to: Nobody (None)
Summary: Libxml HTML parser fails on very simple html pages

Initial Comment:
Please remove the check "htmlParseDocument(ctxt) == -1". With it in place the HTML parser is impossible to use, because it raises an exception on every page. For example, for google.com:

Error: Tag nobr invalid at :3.
Error: htmlParseEntityRef: expecting ';' at :3.
Error: htmlParseEntityRef: expecting ';' at :3.
Error: htmlParseEntityRef: expecting ';' at :3.
Error: Tag nobr invalid at :3.
Error: htmlParseEntityRef: expecting ';' at :3.
Error: htmlParseEntityRef: expecting ';' at :3.
LibXML::XML::Error: Error: htmlParseEntityRef: expecting ';' at :3.

htmlParseDocument(ctxt) returns -1 very often; that does not mean the document cannot be used.

_______________________________________________
libxml-devel mailing list
libxml-devel@rubyforge.org
http://rubyforge.org/mailman/listinfo/libxml-devel
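
The behaviour requested in the report can be sketched roughly as follows. This is only an illustration of the decision logic, not the binding's actual code: ParseOutcome, FatalParseError and usable_document are hypothetical names, and the outcome is simulated rather than produced by libxml. The assumption is that a -1 status from htmlParseDocument means "errors were reported", and that the parser may still have built a document tree.

```ruby
# Hypothetical stand-in for what a binding sees after calling
# htmlParseDocument: the integer status, the document tree (or nil),
# and the error messages collected along the way.
ParseOutcome = Struct.new(:status, :document, :errors)

# Hypothetical error class, raised only when nothing usable was produced.
class FatalParseError < StandardError; end

# Returns the document whenever a tree was built, even if status is -1.
# Recoverable errors are reported as warnings instead of exceptions.
def usable_document(outcome)
  if outcome.document.nil?
    raise FatalParseError, outcome.errors.last || "no document produced"
  end
  outcome.errors.each { |msg| warn(msg) } unless outcome.status.zero?
  outcome.document
end

# Simulated outcome for a page like the one in the report:
# status -1, but a tree is present.
outcome = ParseOutcome.new(-1, "<simulated tree>",
                           ["Error: Tag nobr invalid at :3."])
doc = usable_document(outcome)
```

With this policy the "Tag nobr invalid" messages would show up as warnings on stderr, and the caller would still get the document back.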