----- Original Message -----
From: "Marsh, Drew" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, April 29, 2002 1:42 PM
Subject: Re: [DOTNET] UrlDecode & escape characters question
> Erick Thompson [mailto:[EMAIL PROTECTED]] wrote:
>
> > I've noticed that HttpUtility.UrlDecode doesn't handle
> > some of the old-style character escapes, such as "&amp;" and
> > "&nbsp;". While I understand that the %xx format should be
> > used, a lot of sites still use the older style of escapes. Is
> > there some way to convert these escapes, or should I manually
> > replace "&amp;" with %26 and "&nbsp;" with +?
> >
> > If it's a manual process, does anyone know if there is a
> > definitive list of these non-standard escapes, or is that an oxymoron?
>
> It's because &amp; and &nbsp; are not, and never were, valid URL-encoded
> sequences. They're SGML-encoded entities... which are totally different. If
> someone puts &amp; into a URI it should be encoded as %26amp%3B. If you need
> to escape or decode SGML (i.e. HTML/XML) entities then use
> HttpUtility::HtmlEncode/Decode.

I know that they are not valid, but they are heavily used. I believe that %23 and so on are valid in HTML. If so, can I just use HtmlDecode and then UrlDecode to get a correct URL out of a given URL, even an incorrect one?

Thanks,
Erick

You can read messages from the DOTNET archive, unsubscribe from DOTNET, or subscribe to other DevelopMentor lists at http://discuss.develop.com.
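
For illustration only, here is a minimal sketch of the "entity decode first, then URL decode" order asked about above. It is written in Python with the standard library's html.unescape and urllib.parse.unquote_plus as stand-ins for HtmlDecode and UrlDecode; the .NET methods may differ in edge cases, and the helper name decode_sloppy_url is hypothetical.

```python
from html import unescape
from urllib.parse import unquote_plus


def decode_sloppy_url(raw: str) -> str:
    """Decode HTML entities first, then URL percent-escapes.

    Hypothetical helper: a Python analogue of calling HtmlDecode
    and then UrlDecode, not the .NET implementation itself.
    """
    # Step 1: entities such as &amp; become literal characters.
    without_entities = unescape(raw)
    # Step 2: %xx sequences and '+' become the characters they encode.
    return unquote_plus(without_entities)


if __name__ == "__main__":
    # A query string that mixes an HTML entity with a percent-escape.
    print(decode_sloppy_url("x=1&amp;y=a%20b"))
    # A properly URL-encoded literal "&amp;" comes back as literal text.
    print(decode_sloppy_url("%26amp%3B"))
```

Doing it in this order means a correctly encoded %26amp%3B comes back as the literal text "&amp;" instead of being collapsed a second time. Note that &nbsp; decodes to a non-breaking space (U+00A0), not an ordinary space, so it is not equivalent to "+".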