In message <[EMAIL PROTECTED]>, John Nagle
wrote:

> Here's a URL from a link on the home page of a major company.
> 
> <a href="/adsk/servlet/index?siteID=123112&amp;id=1860142">About Us</a>
> 
> What's the appropriate Python function to call to unescape a URL
> which might contain things like that?

Just use any HTML-parsing library. I think the standard Python HTMLParser
will do the trick: it decodes entity references such as &amp; in attribute
values for you, provided there aren't any errors in the HTML.
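
A minimal sketch of that approach, assuming Python 3 (where the module
lives at html.parser). The LinkCollector class name is my own invention,
not something the library provides; the markup is the link quoted above.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect href values from <a> start tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser has already
        # replaced entity references such as &amp; in each value.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


page = '<a href="/adsk/servlet/index?siteID=123112&amp;id=1860142">About Us</a>'

collector = LinkCollector()
collector.feed(page)
collector.close()

print(collector.hrefs)
# ['/adsk/servlet/index?siteID=123112&id=1860142']
```

If you already have the raw attribute text in hand and don't need a full
parse, html.unescape() from the standard library does the same decoding on
a single string.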

> Will this interfere with the usual "%" type escapes in URLs?

No. Just think of it as an HTML attribute value: decoding the HTML entities
leaves any "%" escapes untouched. The fact that it's a URL is a question of
later interpretation, and has nothing to do with the fact that it comes
from an HTML attribute.
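
To illustrate the two layers, here is a rough sketch. The attribute value
is made up for the example (the original link has no "%" escapes), and
urlsplit/parse_qs are just one way to do the later URL interpretation.

```python
from html import unescape
from urllib.parse import urlsplit, parse_qs

# Hypothetical attribute text containing both kinds of escape.
attr_text = "/adsk/servlet/index?name=About%20Us&amp;id=1860142"

# Step 1: HTML layer. Only the entity reference is decoded;
# the %20 passes through unchanged.
url = unescape(attr_text)
print(url)
# /adsk/servlet/index?name=About%20Us&id=1860142

# Step 2: URL layer. Percent-escapes are decoded only when the
# string is interpreted as a URL.
query = parse_qs(urlsplit(url).query)
print(query)
# {'name': ['About Us'], 'id': ['1860142']}
```

The order matters: if you split the query string before undoing the HTML
escaping, the "&amp;" would be mistaken for a parameter separator followed
by a parameter named "amp;id".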

-- 
http://mail.python.org/mailman/listinfo/python-list
