This is the error and traceback:

Unexpected error opening J:/F2/....html: mismatched tag: line 124, column 8
Traceback (most recent call last):
  File "C:\....py", line 492, in <module>
    raw = extractText(xhtmlfile)
  File "C:\....py", line 334, in extractText
    tree = make_tree(xhtmlfile)
  File "....py", line 169, in make_tree
    return tree
UnboundLocalError: local variable 'tree' referenced before assignment

Here is line 124, col 8 and I cannot see any obvious missing/mismatched tags:

"<p>As to the present time I am unable physical and mentally to secure all this information at present.</p>"

Dinesh

From: Kent Johnson
Sent: Tuesday, April 28, 2009 7:13 AM
To: Dinesh B Vadhia
Cc: tutor@python.org
Subject: Re: [Tutor] finding mismatched or unpaired html tags

On Tue, Apr 28, 2009 at 8:54 AM, Dinesh B Vadhia <dineshbvad...@hotmail.com> wrote:
> I'm processing tens of thousands of html files and a few of them contain
> mismatched tags and ElementTree throws the error:
>
> "Unexpected error opening J:/F2/663/blahblah.html: mismatched tag: line 124,
> column 8"
>
> I now want to scan each file and simply identify each mismatched or unpaired
> tags (by line number) in each file. I've read the ElementTree docs and
> cannot see anything obvious how to do this. I know this is a common problem
> but feeling a bit clueless here - any ideas?

It seems like the exception gives you the line number. What kind of
exception is raised? The exception object may contain the line and column
in a more accessible form, so you could catch the exception, get the line
number, then read that line out of the file and show it.

Kent
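
A minimal sketch of the suggestion above (catch the exception, take the line number from it, then read that line from the file). It assumes a current Python 3, where xml.etree.ElementTree raises ParseError with a `position` attribute holding (line, column); older versions raised xml.parsers.expat.ExpatError with `lineno` and `offset` instead. The function name `find_parse_error` is hypothetical and is not taken from the original script.

```python
import xml.etree.ElementTree as ET


def find_parse_error(path):
    """Return None if the file parses cleanly.

    Otherwise return (line, column, message, source_line) for the first
    error the parser reports. Hypothetical helper, for illustration only.
    """
    try:
        ET.parse(path)
    except ET.ParseError as err:
        line, column = err.position  # line is 1-based
        source_line = ""
        # Re-read the file and pick out the reported line.
        with open(path, encoding="utf-8", errors="replace") as f:
            for number, text in enumerate(f, 1):
                if number == line:
                    source_line = text.rstrip("\n")
                    break
        return line, column, str(err), source_line
    return None
```

Called in a loop over the files, this gives one (line, column, message, text) tuple per bad file. Note that the parser reports the place where it noticed the mismatch, which is usually a closing tag; the tag actually at fault (for example an unclosed <br> or <p>) is often on an earlier line, which may be why line 124 looks fine on its own. The UnboundLocalError in the traceback also suggests that make_tree handles the parse error and then still reaches `return tree`; returning None or re-raising there would probably avoid that second error.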