Yeah, the error is thrown by HTMLParser; TAG is built on top of it.
I will try some other tools, like Beautiful Soup.
Thanks.
2014-05-20 10:04 GMT-03:00 Anthony abasta...@gmail.com:
No, TAG is only a basic parser and not robust against errors in the HTML.
You should probably use a more sophisticated tool, such as Beautiful Soup
(which can run on top of the lxml or html5lib parsers). The standard
library also includes the HTMLParser module, but you may run into similar
problems with that.
Anthony
On Tuesday, May 20, 2014 8:14:37 AM UTC-4, yamandu wrote:
I am trying to parse HTML fetched from a URL with urllib, using the TAG
helper.
The HTML is broken in some parts: it has closing span tags without
matching opening span tags.
The TAG helper raises an error: unable to balance span tag.
I tested it. Opening tags that are never closed are parsed fine, but closing
tags without a matching opening tag are not.
Is there a workaround for this?
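
One possible workaround, offered only as a rough sketch: pre-clean the HTML by dropping every closing tag that has no matching opening tag, and only then hand the result to a strict parser such as TAG. The sketch below uses Python 3's standard-library html.parser (the module was called HTMLParser in Python 2). The names StrayEndTagStripper and strip_stray_end_tags are hypothetical, made up for this illustration; they are not part of web2py or of the standard library. It leaves unclosed opening tags alone, since those reportedly parse fine, and it has not been tried against TAG itself.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they are not tracked as "open".
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class StrayEndTagStripper(HTMLParser):
    """Hypothetical helper: re-emit HTML, skipping unmatched end tags."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; untouched.
        super().__init__(convert_charrefs=False)
        self.open_tags = []  # stack of tag names opened so far
        self.parts = []      # pieces of the cleaned document

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)
        self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: copy it through unchanged.
        self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag with no opening tag: drop it
        # Forget everything opened after the matching tag, then close it.
        while self.open_tags.pop() != tag:
            pass
        self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append("&%s;" % name)

    def handle_charref(self, name):
        self.parts.append("&#%s;" % name)

    def handle_comment(self, data):
        self.parts.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.parts.append("<!%s>" % decl)


def strip_stray_end_tags(html):
    """Return html with unmatched closing tags removed."""
    parser = StrayEndTagStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


broken = "<div>price</span> <span>10</span></span></div>"
print(strip_stray_end_tags(broken))
```

Both stray closing span tags are dropped here, while the balanced span and the div pass through unchanged. Closing tags come back lowercased, and processing instructions and CDATA sections are not handled, so treat this as a starting point rather than a complete cleaner.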
--
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
---
You received this message because you are subscribed to the Google Groups
web2py-users group.
To unsubscribe from this group and stop receiving emails from it, send an
email to web2py+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
Regards,
Carlos J. Costa
Computer Scientist
Specialist in Telecom Management
EL MELECH NEEMAN!
אָמֵן