Dinesh B Vadhia wrote:
I'm processing tens of thousands of html files and a few of them contain
mismatched tags and ElementTree throws the error:
"Unexpected error opening J:/F2/663/blahblah.html: mismatched tag: line 124, column 8"
I now want to scan each file and simply identify each mismatched or unpaired
tag (by line number). I've read the ElementTree docs and cannot see an
obvious way to do this. I know this is a common problem, but I'm feeling a
bit clueless here. Any ideas?
Don't use ElementTree; use BeautifulSoup instead.
ElementTree is an XML parser and expects well-formed input, typically
generated by another program.
BeautifulSoup is designed to handle everyday HTML pages, which are full of
errors of all kinds.
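If you do want the line numbers of the offending tags, here is a rough sketch of one way to get them. Note that it swaps in the standard library's html.parser module (a tolerant parser that tracks positions) rather than BeautifulSoup or ElementTree. The names TagChecker, VOID_TAGS and find_tag_problems are made up for this example, and the void-tag list is only my assumption about which elements never take a closing tag.

```python
from html.parser import HTMLParser

# Assumption: elements that never take a closing tag, so they are not tracked.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagChecker(HTMLParser):
    """Hypothetical helper: records unpaired tags with their line numbers."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of (tag, line) for tags still open
        self.problems = []    # list of (line, message)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            # getpos() returns (line, column); the line number is 1-based.
            self.open_tags.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        line = self.getpos()[0]
        # Search the stack from the top for the nearest matching open tag.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i][0] == tag:
                # Anything opened after the match was never closed.
                for unclosed, opened_at in self.open_tags[i + 1:]:
                    self.problems.append(
                        (opened_at, f"<{unclosed}> is never closed"))
                del self.open_tags[i:]
                return
        self.problems.append((line, f"</{tag}> has no matching opening tag"))

    def close(self):
        super().close()
        # Whatever is still on the stack at end of input was never closed.
        for tag, opened_at in self.open_tags:
            self.problems.append((opened_at, f"<{tag}> is never closed"))
        self.open_tags = []


def find_tag_problems(html_text):
    """Return a sorted list of (line, message) pairs for unpaired tags."""
    checker = TagChecker()
    checker.feed(html_text)
    checker.close()
    return sorted(checker.problems)
```

To use it, read each file into a string, pass it to find_tag_problems(), and print whatever comes back. Be aware that HTML allows some closing tags to be omitted (for example </p> and </li>), and this sketch reports those as unclosed too, so expect some noise on ordinary pages.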
Sincerely,
Albert
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor