On Tue, 05 Aug 2008 12:07:39 +0000, Duncan Booth wrote:

> Whenever you put a URL into an HTML file you need to escape it, so
> naturally you will also need to unescape it when it is retrieved from
> the file. However, whatever you use to parse the HTML ought to be
> unescaping text and attributes as part of the parsing process, so you
> shouldn't need a separate function for this.
...
> Even Python's builtin HTMLParser class will do this for you. What parser
> are you using?

A regex. I know, I know, now I have two problems :-)

It's a quick and dirty hack, not a production piece of code, and I have a quick and dirty fix by just using url.replace('&amp;', '&').

Thanks to everybody who replied. I guess I really have to bite the bullet and learn how to use a proper HTML parser.

--
Steven
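
For reference, a minimal sketch of the parser-based approach suggested in the quoted text. It assumes Python 3, where the class lives in html.parser (in 2008 it was the HTMLParser module). The names LinkCollector and extract_links and the sample URL are hypothetical, made up for illustration. It relies on the parser handing attribute values to handle_starttag with character references already unescaped, so no separate replace step is needed.

```python
from html import unescape
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; values arrive with
        # references such as &amp; already replaced by the parser.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return the unescaped href of every <a> tag in html_text."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Made-up sample: the URL is escaped as it would be inside an HTML file.
sample = '<a href="http://example.com/search?q=python&amp;page=2">next</a>'
links = extract_links(sample)
```

If the URLs really do have to come from a regex, html.unescape() applied to each match covers the other character references as well as &amp;, which the one-off replace does not.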