Re: URLs and ampersands

Steven D'Aprano Tue, 05 Aug 2008 16:26:20 -0700

On Tue, 05 Aug 2008 12:07:39 +0000, Duncan Booth wrote:

> Whenever you put a URL into an HTML file you need to escape it, so
> naturally you will also need to unescape it when it is retrieved from
> the file. However, whatever you use to parse the HTML ought to be
> unescaping text and attributes as part of the parsing process, so you
> shouldn't need a separate function for this.


...

> Even Python's builtin HTMLParser class will do this for you. What parser
> are you using?

A regex.

I know, I know, now I have two problems :-)

It's a quick and dirty hack, not a production piece of code, and I have a 
quick and dirty fix by just using url.replace('&amp;', '&').
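
For the record, here is a minimal sketch of that fix next to the standard
library's general-purpose unescaper. It assumes a modern Python 3 (3.4 or
later, where html.unescape exists), and the sample URL is made up purely
for illustration:

```python
from html import unescape

# Hypothetical URL as it might appear inside an href attribute in HTML source.
escaped = "http://example.com/search?q=python&amp;page=2&amp;sort=date"

# The quick and dirty fix: handles the &amp; entity and nothing else.
quick = escaped.replace("&amp;", "&")

# html.unescape converts all named and numeric character references,
# so it also copes with forms such as &#38; that replace() would miss.
general = unescape(escaped)

print(quick)
print(general)
```

For this input the two results are the same; they only diverge when the
page uses some other spelling of the ampersand or other entities.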

Thanks to everybody who replied. I guess I really have to bite the bullet 
and learn how to use a proper HTML parser.
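
As a starting point, something like the following should do. It is only a
sketch, assuming Python 3's html.parser module; the LinkCollector class
name and the sample page are my own inventions, not anything from the
script under discussion. The point is that the parser hands over attribute
values already unescaped, as Duncan described:

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser has already
        # unescaped character references in each value.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


# Hypothetical page fragment with an escaped ampersand in the URL.
page = '<p><a href="http://example.com/?a=1&amp;b=2">link</a></p>'

collector = LinkCollector()
collector.feed(page)
collector.close()

print(collector.links)
```

No separate unescaping step is needed, which is the main advantage over
the regex approach.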



-- 
Steven
--
http://mail.python.org/mailman/listinfo/python-list
