Ezio Melotti added the comment:

Sorry, I misread your code; it looks like you want the href *without* 'cve'.
In that case, change my code to use "'cve' not in attrs['href']". Also avoid
s.find('cve') == -1 and use the more readable and idiomatic 'cve' not in s
instead.
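
As a small illustration of the two spellings, here is a sketch using made-up href values (the strings below are assumptions, not taken from the page in question). Both tests give the same answer; the membership test simply states the intent more directly.

```python
# Hypothetical href values, for illustration only.
hrefs = ["/advisories/1234", "/cve/example"]

# str.find() returns -1 when the substring is absent, so the two tests agree.
for s in hrefs:
    assert (s.find('cve') == -1) == ('cve' not in s)

# Keep only the hrefs that do not contain 'cve'.
without_cve = [s for s in hrefs if 'cve' not in s]
print(without_cve)
```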

I think your original script doesn't work for two reasons:
1) you are looking for a table with class="tablesorter", but in the HTML the 
table doesn't have that class, so self.is_table is never set to True;
2) you are finding the href of the <a> with a "style" attribute and correctly 
setting it to self.href_name, but the value is then replaced by "" when the 
following <a> without "style" is found.
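
The second problem can be reproduced without any parser. The sketch below is not the original script; the attribute lists and the variable names are assumptions chosen to mimic what a parser would pass for two consecutive <a> tags.

```python
# Hypothetical attribute lists for two consecutive <a> tags.
anchors = [
    [('href', '/advisory/1'), ('style', 'color:red')],
    [('href', '/other')],
]

# Problematic pattern: the value is reset on every <a>, so the last tag wins.
href_name = ""
for attrs in anchors:
    attrs = dict(attrs)
    href_name = attrs['href'] if 'style' in attrs else ""
buggy = href_name

# Safer pattern: assign only when the tag matches and never reset otherwise.
href_name = ""
for attrs in anchors:
    attrs = dict(attrs)
    if 'style' in attrs:
        href_name = attrs['href']
fixed = href_name

print(repr(buggy), repr(fixed))
```

With the first loop the stored href is lost as soon as the next <a> is seen; with the second it survives.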

That said, I still suggest that you abandon sgmllib and use HTMLParser, or 
possibly an external module like BeautifulSoup or LXML.
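
A minimal sketch of the HTMLParser approach follows. It assumes Python 3, where the class lives in html.parser (in Python 2 it was in the HTMLParser module). The class name HrefCollector and the sample markup are hypothetical, and the table check is left out on purpose since the table in the HTML has no class to match on.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href of every <a> tag whose href does not contain 'cve'."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase and attrs as (name, value) pairs.
        if tag != 'a':
            return
        href = dict(attrs).get('href')
        # href is None when the attribute is missing or has no value.
        if href is not None and 'cve' not in href:
            self.hrefs.append(href)


# Hypothetical markup, for illustration only.
sample = (
    '<table><tr><td>'
    '<a href="/cve/1">x</a>'
    '<a style="color:red" href="/adv/2">y</a>'
    '</td></tr></table>'
)

parser = HrefCollector()
parser.feed(sample)
parser.close()
print(parser.hrefs)
```

Appending to a list instead of overwriting a single attribute avoids losing a value when a later <a> is parsed.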

resolution:  -> invalid
stage:  -> committed/rejected
status: open -> closed

Python tracker <rep...@bugs.python.org>
Python-bugs-list mailing list
