Here's a URL from a link on the home page of a major company:

<a href="/adsk/servlet/index?siteID=123112&amp;id=1860142">About Us</a>

Yes, that "&amp;" is in the source text of the page. This is, in fact, correct HTML. See http://www.htmlhelp.com/tools/validator/problems.html#amp

What's the appropriate Python function to call to unescape a URL which might contain things like that? Will this interfere with the usual "%" type escapes in URLs?

What's actually needed to get this right is something that goes from HTML escaped form to URL escaped form, because, in general, there is no unescaped form that will work for all URLs.

There's "htmldecode" at "http://zesty.ca/python/scrape.py", which works, but this should be a standard library function.

John Nagle
--
http://mail.python.org/mailman/listinfo/python-list
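For illustration, here is a minimal sketch of the "HTML escaped form to URL escaped form" conversion described above. It assumes a modern Python 3, where html.unescape and urllib.parse.quote are in the standard library. The helper name html_attr_to_url and the choice of "safe" characters are assumptions made for this example, not an established API.

```python
from html import unescape
from urllib.parse import quote


def html_attr_to_url(href: str) -> str:
    """Hypothetical helper: HTML-escaped attribute value -> percent-escaped URL."""
    # Step 1: undo HTML entity escaping only ("&amp;" -> "&").
    # Existing "%xx" escapes are not entities, so they pass through untouched.
    url = unescape(href)
    # Step 2: percent-escape anything not legal in a URL. Reserved delimiters
    # and "%" are listed as safe so that the URL structure and any "%xx"
    # escapes already present are preserved rather than double-escaped.
    return quote(url, safe="/:?#[]@!$&'()*+,;=%~")


print(html_attr_to_url("/adsk/servlet/index?siteID=123112&amp;id=1860142"))
```

Because "%" is treated as safe, a stray "%" that is not part of a valid "%xx" escape would be passed through unchanged, so this sketch trusts the page author on that point. Note also that the result is never fully unescaped, which matches the observation that no single unescaped form works for all URLs.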