Re: [Tutor] HTML Parsing

Andreas Kostyrka Mon, 21 Apr 2008 07:22:18 -0700

Just from memory: you need to subclass the HTMLParser class and provide
start_dt and end_dt methods, plus one to capture the text in between.
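
A rough sketch of that idea follows. Note the assumptions: htmllib only
exists in Python 2, so this uses html.parser.HTMLParser from the Python 3
standard library instead. There the hooks are the generic handle_starttag,
handle_endtag and handle_data rather than per-tag start_dt/end_dt methods.
The names DtCollector and collect_dt are made up for illustration.

```python
from html.parser import HTMLParser


class DtCollector(HTMLParser):
    """Collect the text content of every <dt> element."""

    def __init__(self):
        super().__init__()
        self.in_dt = False  # True while we are inside a <dt>
        self.buffer = []    # text chunks of the current <dt>
        self.terms = []     # finished <dt> texts, in document order

    def _flush(self):
        # Store the text gathered for the current <dt> and reset state.
        self.terms.append("".join(self.buffer).strip())
        self.buffer = []
        self.in_dt = False

    def handle_starttag(self, tag, attrs):
        # </dt> is optional in HTML, so a new <dt> or <dd> also ends
        # the one we are currently in.
        if self.in_dt and tag in ("dt", "dd"):
            self._flush()
        if tag == "dt":
            self.in_dt = True

    def handle_endtag(self, tag):
        if self.in_dt and tag in ("dt", "dl"):
            self._flush()

    def handle_data(self, data):
        # Only keep text that appears inside a <dt>.
        if self.in_dt:
            self.buffer.append(data)


def collect_dt(html_text):
    """Return a list with the text of each <dt> in html_text."""
    parser = DtCollector()
    parser.feed(html_text)
    parser.close()
    return parser.terms
```

Calling collect_dt() on the page source should give one string per <dt>.
Text inside nested tags such as <b> is kept, because handle_data only
checks whether we are currently inside a <dt>.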


Read the docs on htmllib (www.python.org | Documentation | module docs)
and see if you can manage. If not, come back with questions ;)

Andreas

On Monday, 21.04.2008, at 14:40 +0100, Stephen Nelson-Smith wrote:
> On 4/21/08, Andreas Kostyrka <[EMAIL PROTECTED]> wrote:
> > As usual there are a number of ways.
> >
> >  But I basically see two steps here:
> >
> >  1.) capture all dt elements. If you want to stick with the standard
> >  library, htmllib would be the module. Else you can use e.g.
> >  BeautifulSoup or something comparable.
> 
> I want to stick with standard library.
> 
> How do you capture <dt> elements?
> 
> S.


_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
