On Apr 1, 2:52 pm, [EMAIL PROTECTED] wrote:
> On Apr 1, 3:13 pm, "Ulysse" <[EMAIL PROTECTED]> wrote:
>
> > Hello,
>
> > I'm trying to extract the data from an HTML table. Here is the part of
> > the HTML source :
>
> > ....
>
> > Do you know the way to do it ?
>
> Beautiful Soup is an easy way to parse HTML (that may be
> broken). http://www.crummy.com/software/BeautifulSoup/
>
> Here's a start of a parser for your HTML:
>
> from BeautifulSoup import BeautifulSoup  # Beautiful Soup 3 import
>
> soup = BeautifulSoup(txt)
> for tr in soup('tr'):
>     # skip the first cell; the next two hold the date and the text
>     dateTd, textTd = tr('td')[1:]
>     print 'Date :', dateTd.contents[0].strip()
>     print textTd  # element still needs parsing
>
> where txt is the string in your message.

I have looked at the Beautiful Soup online documentation and tried to apply it to my problem, but it seems a little bit hard. I would rather try to do this with regular expressions...
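
For what it's worth, here is a rough sketch of what the regular-expression route might look like. The sample HTML is made up, since the real source is elided above, and the names extract_rows, ROW_RE, CELL_RE and TAG_RE are only illustrative. It assumes each row has at least three <td> cells, with the date in the second and the text in the third, which is what the Beautiful Soup snippet implies. Regular expressions break easily on nested tables or malformed markup, so treat this as a starting point only.

```python
import re

# Hypothetical sample: the real HTML from the original message is not shown.
txt = """
<table>
<tr><td>1</td><td> 2007-03-30 </td><td>First <b>entry</b></td></tr>
<tr><td>2</td><td> 2007-04-01 </td><td>Second entry</td></tr>
</table>
"""

# Non-greedy matches so each pattern stops at the first closing tag.
ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.DOTALL | re.IGNORECASE)
CELL_RE = re.compile(r"<td\b[^>]*>(.*?)</td>", re.DOTALL | re.IGNORECASE)
TAG_RE = re.compile(r"<[^>]+>")


def extract_rows(html):
    """Return (date, text) pairs for every row with at least three cells."""
    rows = []
    for row in ROW_RE.findall(html):
        cells = CELL_RE.findall(row)
        if len(cells) < 3:
            continue  # skip header rows or rows with too few cells
        date, text = cells[1], cells[2]
        # Strip surrounding whitespace and any inline tags left in the text cell.
        rows.append((date.strip(), TAG_RE.sub("", text).strip()))
    return rows


for date, text in extract_rows(txt):
    print("Date :", date)
    print("Text :", text)
```

If the real table nests other tables inside its cells, the non-greedy patterns above will cut rows short, and a real parser such as Beautiful Soup is probably the safer choice.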