> Unfortunately we also need to translate URLs in an HTML context. It has
> become a "standard" to include escapes such as &amp; and &copy; in the
> URL text itself. This is not forbidden in the RFC on URIs, but for
> obvious reasons it's not always supported by the webserver. Furthermore
> we need to normalize URLs anyway.

The HTML4 standard specifies that HTML entities *must* be translated in
every context except the content of SCRIPT and STYLE elements. Attribute
values such as HREF are not exempt. Therefore, the following link is
"wrong":

<a href="mypage.cgi?ID=1&location=3">

And this is "right":

<a href="mypage.cgi?ID=1&amp;location=3">

When the client actually processes the link, however, the URI that it
requests is

mypage.cgi?ID=1&location=3
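
To illustrate the decoding step described above, here is a minimal sketch
in Python. It is not ht://Dig code: the names HrefCollector and
extract_hrefs are made up for this example, and it relies on the standard
library's html.parser, which replaces entity references in attribute
values before handing them to handle_starttag.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Hypothetical helper: collects the decoded HREF value of every A tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs holds (name, value) pairs; entity references in the value
        # have already been replaced by the parser at this point.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def extract_hrefs(markup):
    """Return the HREF values found in markup, as a client would request them."""
    parser = HrefCollector()
    parser.feed(markup)
    parser.close()
    return parser.hrefs


print(extract_hrefs('<a href="mypage.cgi?ID=1&amp;location=3">'))
```

For the "right" link, the collected value is the URI shown above, with a
bare & between the two parameters.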

However, most older HTML does not follow this standard (I know mine does
not). We therefore have to check whether an & is followed by a valid HTML
entity code, and translate it only if it is; a bare & that merely
separates query parameters must be left alone. I believe that this is how
Netscape and IE currently function.
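
The check described above could look roughly like the following Python
sketch. It is only an illustration under stated assumptions, not the
ht://Dig implementation: the function name decode_valid_entities is
hypothetical, the set of valid names is taken from Python's HTML 4 table
(html.entities.name2codepoint), and a trailing semicolon is required,
which may be stricter than what the browsers actually do.

```python
import re
from html.entities import name2codepoint

# Candidate entities: "&name;", "&#123;" or "&#x7B;".
# Assumption of this sketch: the terminating semicolon is mandatory.
_ENTITY_RE = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def decode_valid_entities(url):
    """Translate only valid entities; leave every other '&' untouched."""

    def replace(match):
        body = match.group(1)
        if body.startswith("#"):
            digits = body[1:]
            if digits[0] in "xX":
                code = int(digits[1:], 16)
            else:
                code = int(digits)
            # Out-of-range code points are not valid: keep the original text.
            if 0 < code <= 0x10FFFF:
                return chr(code)
            return match.group(0)
        if body in name2codepoint:
            return chr(name2codepoint[body])
        # Unknown name, so not an entity: keep the original text.
        return match.group(0)

    return _ENTITY_RE.sub(replace, url)


print(decode_valid_entities("mypage.cgi?ID=1&amp;location=3"))
print(decode_valid_entities("mypage.cgi?ID=1&location=3"))
```

Both calls yield the same URI: the first has its &amp; translated, while
the second passes through unchanged because "&location=3" is not a valid
entity.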

+============================================
+ Benjamin Smedberg
+ CUA Asst. Webmaster
+ [EMAIL PROTECTED]
+============================================
+ http://computing.cua.edu/as/bds/
+ How to make God laugh: tell Him YOUR plans!
+============================================

