Gilles Detillieux wrote:
> OK, we do clearly have a problem with SGML entities in 3.1.2, as well
> as 3.2. (3.2 has some more serious problems, which I was hoping to
> tackle, but that's another story.) So, right now, it only translates
> &foo; entities outside of any HTML tags. I think there are reasons
Unfortunately we also need to translate entities inside URLs, which occur within HTML tags. It has
become a "standard" to include escapes such as &amp; and &copy; in the
URL text itself. This is not forbidden in the RFC on URIs, but for
obvious reasons it's not always supported by the webserver. Furthermore
we need to normalize URLs anyway.
I was initially thinking this would need to be placed in the URL code,
but it strikes me that this really only needs to happen in the HTML
parser itself.
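
To make the idea concrete, here is a rough sketch in Python rather than the ht://Dig C++ code. It assumes the parser hands over the raw, undecoded attribute value; decode_href is a hypothetical helper name, not an existing ht://Dig function, and the normalization shown (lowercasing the host, supplying an empty path) is only a small illustrative subset.

```python
from html import unescape
from urllib.parse import urlsplit, urlunsplit


def decode_href(raw_href: str) -> str:
    """Decode SGML/HTML entities in a raw attribute value, then lightly
    normalize the resulting URL. Hypothetical helper for illustration."""
    # Step 1: translate entities such as &amp; and &copy; to characters.
    url = unescape(raw_href.strip())

    # Step 2: minimal normalization (an assumption, not a full canonical form).
    parts = urlsplit(url)
    netloc = parts.netloc
    if "@" not in netloc:
        # Host names are case-insensitive; leave userinfo untouched.
        netloc = netloc.lower()
    path = parts.path
    if netloc and not path:
        # An absolute URL with an empty path refers to the root.
        path = "/"
    return urlunsplit((parts.scheme, netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(decode_href("http://Example.com/cgi?a=1&amp;b=2"))
```

One caveat: html.unescape follows the HTML 5 rules, so it may also decode some legacy entity names that lack the trailing semicolon, which may or may not be what we want for query strings.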
(And yes, I'm aware the 3.2 HtSGMLCodec has problems, but I've been a
bit pre-occupied.)
--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/