KB SU wrote:
> Hi,
>
> I have opened a URL and read it like the following:
>
> $import urllib
> $txt = urllib.urlopen("http://www.terme-catez.si").read()
> $txt
>
> If you see above, in the junk of HTML, there is text like 'Terme
> \xc4\x8cate\xc5\xbe' (original is 'Terme Čatež'). Now, I want to convert
> codes like '\xc4\x8c' or '\xc5\xbe' to unaccented chars so that 'Terme
> \xc4\x8cate\xc5\xbe' becomes 'Terme Catez'. Is there any way to convert
> the whole HTML?

First convert to unicode with

txt = txt.decode("utf-8")

and then follow

http://effbot.org/zone/unicode-convert.htm

Peter
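
For illustration, here is a minimal sketch of the two steps (decode, then strip accents) using only the standard library's unicodedata module. It is one common approach and not necessarily the technique described on the linked page. The helper name strip_accents and the sample byte string are assumptions made up for this example; in practice the bytes would be the txt read from the URL.

```python
import unicodedata


def strip_accents(text):
    """Return text with combining accent marks removed (hypothetical helper)."""
    # NFKD splits an accented letter into its base letter plus combining mark(s),
    # e.g. U+010C (C with caron) becomes 'C' followed by U+030C.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep every character that is not a combining mark.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Sample UTF-8 bytes standing in for the downloaded page content.
raw = b"Terme \xc4\x8cate\xc5\xbe"
text = raw.decode("utf-8")      # step 1: bytes -> unicode
print(strip_accents(text))      # step 2: drop the accents
```

Letters that have no decomposition (for example 'đ' or 'ł') pass through unchanged with this approach, so a small hand-written replacement table may still be needed for those.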