----- Original Message -----
From: "Marsh, Drew" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, April 29, 2002 1:42 PM
Subject: Re: [DOTNET] UrlDecode & escape characters question


> Erick Thompson [mailto:[EMAIL PROTECTED]] wrote:
>
> > I've noticed that the HttpUtility.UrlDecode doesn't handle
> > some of the old style character escapes, such as "&amp;" and
> > "&nbsp;". While I understand that the %xx format should be
> > used, a lot of sites still use the older style of escapes. Is
> > there some way to convert these escapes, or should I manually
> > replace "&amp;" with %26 and "&nbsp;" with +?
> >
> > If it's a manual process, does anyone know if there is a
> > definitive list of these non-standard escapes, or is that an oxymoron?
>
> It's because &amp; and &nbsp; are not, and never were, valid URL-encoded
> sequences. They're SGML entities, which are something else entirely. If
> someone puts a literal &amp; into a URI, it should be encoded as %26amp%3B.
> If you need to encode or decode SGML (i.e. HTML/XML) entities, use
> HttpUtility.HtmlEncode and HttpUtility.HtmlDecode.

I know that they are not valid, but they are heavily used. I believe that
%23 and so on are valid in HTML. If so, can I just call HtmlDecode and then
UrlDecode to get a correct URL out of a given URL, even an incorrectly
encoded one?
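
To make the order concrete, here is a rough sketch of what I have in mind.
It is written in Python only because it is quick to run: html.unescape and
urllib.parse.unquote_plus are stand-ins I am assuming behave roughly like
HttpUtility.HtmlDecode and HttpUtility.UrlDecode, and decode_sloppy_url is
just a name I made up for the example. The .NET methods may differ in the
details.

```python
from html import unescape
from urllib.parse import quote, unquote_plus


def decode_sloppy_url(raw: str) -> str:
    """Undo HTML entities first, then percent-escapes and '+'.

    Hypothetical helper: html.unescape stands in for HtmlDecode and
    unquote_plus stands in for UrlDecode.
    """
    return unquote_plus(unescape(raw))


# A query string lifted from HTML, so "&" arrived as "&amp;".
print(decode_sloppy_url("a=1&amp;b=hello+world%21"))  # a=1&b=hello world!

# A literal "&amp;" that was URL-encoded correctly survives as "&amp;".
print(quote("&amp;", safe=""))                         # %26amp%3B
print(decode_sloppy_url("%26amp%3B"))                  # &amp;

# Reversing the order would collapse that literal to a bare "&".
print(unescape(unquote_plus("%26amp%3B")))             # &
```

The last two lines are why I would entity-decode before percent-decoding:
the other order changes a value that was encoded correctly.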

Thanks,
Erick

You can read messages from the DOTNET archive, unsubscribe from DOTNET, or
subscribe to other DevelopMentor lists at http://discuss.develop.com.
