I don't know the cause of the error here, but I will say that parsing HTML with regular expressions is fraught with difficulty unless you know in advance that the HTML will be suitably formatted.
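
To make the difficulty concrete, here is a minimal sketch of the parser-based alternative, collecting the text of every <a class="al4"> link. It assumes Python 3, where the HTMLParser module lives at html.parser. The names LinkCollector, extract_links and the sample string are made up for illustration and are not from the original code.

```python
from html.parser import HTMLParser  # Python 3 name; in Python 2 the module was HTMLParser


class LinkCollector(HTMLParser):
    """Collect the text inside every <a> tag whose class attribute matches exactly."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.links = []       # text of each matching link, in document order
        self._buffer = None   # list of text chunks while inside a matching <a>, else None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "a" and dict(attrs).get("class") == self.css_class:
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._buffer is not None:
            self.links.append("".join(self._buffer))
            self._buffer = None


def extract_links(src, css_class="al4"):
    """Hypothetical helper: return the text of each matching link in src."""
    parser = LinkCollector(css_class)
    parser.feed(src)
    parser.close()
    return parser.links


# Made-up sample: the first link contains a nested <b> tag and a non-ASCII character.
sample = ('<p><a class="al4" href="/x">Acme\xae <b>Pro</b></a> '
          '<a class="other" href="/y">skip</a></p>')
links = extract_links(sample)
```

On this sample the pattern r'<a class="al4"[^<]*</a>' finds nothing, because [^<]* stops at the nested <b> tag, while the parser still returns the link text. The sketch only matches a class attribute that is exactly "al4" and does not handle nested <a> tags.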
You may be better off using one of the HTML parsing modules such as HTMLParser or even the more powerful BeautifulSoup.

--
Alan Gauld
Author of the Learn to Program web site
http://www.freenetpages.co.uk/hp/alan.gauld

"Oleg Oltar" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> I am trying to parse an html page. Have following error while doing that
>
> src = sel.get_html_source()
> links = re.findall(r'<a class="al4"[^<]*</a>', src)
> for link in links:
>     print link
>
> ======================================================================
> ERROR: test_new (__main__.NewTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "<stdin>", line 19, in test_new
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xae' in position 90: ordinal not in range(128)
>
> ----------------------------------------------------------------------
> Ran 1 test in 6.345s
>
> _______________________________________________
> Tutor maillist - Tutor@python.org
> http://mail.python.org/mailman/listinfo/tutor