This is the error and traceback:

Unexpected error opening J:/F2/....html: mismatched tag: line 124, column 8

Traceback (most recent call last):
  File "C:\....py", line 492, in <module>
    raw = extractText(xhtmlfile)
  File "C:\....py", line 334, in extractText
    tree = make_tree(xhtmlfile)
  File "....py", line 169, in make_tree
    return tree
UnboundLocalError: local variable 'tree' referenced before assignment
 

Here is line 124, col 8 and I cannot see any obvious missing/mismatched tags:

"<p>As to the present time I am unable physical and mentally to secure all this 
information at present.</p>"

Dinesh




From: Kent Johnson 
Sent: Tuesday, April 28, 2009 7:13 AM
To: Dinesh B Vadhia 
Cc: tutor@python.org 
Subject: Re: [Tutor] finding mismatched or unpaired html tags


On Tue, Apr 28, 2009 at 8:54 AM, Dinesh B Vadhia
<dineshbvad...@hotmail.com> wrote:
> I'm processing tens of thousands of html files and a few of them contain
> mismatched tags and ElementTree throws the error:
>
> "Unexpected error opening J:/F2/663/blahblah.html: mismatched tag: line 124,
> column 8"
>
> I now want to scan each file and simply identify each mismatched or unpaired
> tags (by line number) in each file.  I've read the ElementTree docs and
> cannot see anything obvious how to do this.  I know this is a common problem
> but feeling a bit clueless here - any ideas?

It seems like the exception gives you the line number. What kind of
exception is raised? The exception object may contain the line and
column in a more accessible form, so you could catch the exception,
get the line number, then read that line out of the file and show it.

Kent
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor

Reply via email to