Bugs item #22956, was opened at 2008-11-24 02:54
You can respond by visiting: 
http://rubyforge.org/tracker/?func=detail&atid=1971&aid=22956&group_id=494

Category: None
Group: None
Status: Open
Resolution: None
Priority: 3
Submitted By: Pavel Valodzka (valodzka)
Assigned to: Nobody (None)
Summary: Libxml HTML parser fails on very simple html pages

Initial Comment:
Please remove the check "htmlParseDocument(ctxt) == -1", because it makes the
HTML parser impossible to use: it raises an exception on every page, for example for google.com:

Error: Tag nobr invalid at :3.
Error: htmlParseEntityRef: expecting ';' at :3.
Error: htmlParseEntityRef: expecting ';' at :3.
Error: htmlParseEntityRef: expecting ';' at :3.
Error: Tag nobr invalid at :3.
Error: htmlParseEntityRef: expecting ';' at :3.
Error: htmlParseEntityRef: expecting ';' at :3.
LibXML::XML::Error: Error: htmlParseEntityRef: expecting ';' at :3.

htmlParseDocument(ctxt) returns -1 very often; that doesn't mean that the document
can't be used.
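
A minimal sketch of the check being asked for, assuming the binding can see both the return value of htmlParseDocument(ctxt) and the tree the parser built (ctxt->myDoc in libxml2). The helper name html_result_usable is hypothetical; it is not part of libxml2 or libxml-ruby, and this is only one way the check could be relaxed:

```c
#include <stddef.h>

/*
 * Hypothetical helper (an assumption for illustration, not an existing API).
 *
 * parse_rc: the value returned by htmlParseDocument(ctxt).
 * doc:      the document tree the parser produced (ctxt->myDoc), or NULL.
 *
 * Returns 1 if the caller can go on and use the document,
 * 0 if there is nothing to use and an exception is justified.
 */
int html_result_usable(int parse_rc, const void *doc)
{
    /* The return code is deliberately ignored: -1 only says that errors
       were reported while parsing, not that no tree was built. */
    (void)parse_rc;
    return doc != NULL;
}
```

Under this assumption a page like the google.com one above would be handed back to the caller, and the collected "Error:" messages could still be reported as warnings instead of raising.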


----------------------------------------------------------------------

_______________________________________________
libxml-devel mailing list
libxml-devel@rubyforge.org
http://rubyforge.org/mailman/listinfo/libxml-devel
