Geoff Hutchison wrote:
>
> I finally had some time to sit down and do some ht://Dig work. I've been
> swamped with getting our server back to speed after several accounts were
> compromised. :-( It turns out someone managed to slip a packet sniffer onto
> our network. <ugh>
>
> Anyway, I cleaned out the incoming folder of the bug system. This was one
> of the messages posted.
>
> Is this correct? If so, what should we do about it? We can't use | because
> that already has a meaning for ht://Dig. Furthermore, we'll still have to
> parse & separators because many browsers (and a *lot* of URLs) still use it.
>
> Anyone have a good suggestion for a separator? I'd go for * offhand, but I
> might be missing some horrible consequence (I was going to suggest # first
> and realized the error of my ways).
>
> -Geoff
>
> Date: Sat, 3 Apr 1999 20:15:05 -0800
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: PRIVATE: Use of & as CGI variable separator vs. HTML 4.0
>
> Full_Name: Fred Condo
> Version: 3.1.1
> OS: FreeBSD 2.2.8
> Submission from: pm3dyn102.dip.csuchico.edu (132.241.249.102)
>
> HTML 4.0 strict does not permit the & character in the URLs generated as CGI
> variable separators for the page list. This is because the & introduces a
> general entity.
>
> W3C recommend rewriting the code to use a different separator, such as | or ;
> that does not have special meaning in HTML.
>
> Until this is done, ht://dig cannot emit valid HTML 4.0 (strict).
I'm the originator of this bug report, and it occurs to me there are two
separate problems.
First is the one I had in mind when reporting it: the page list of links shown
when there are more results than fit on one ht://dig results page. Those links
look like this:
http://webclass.csuchico.edu/cgi-bin/htsearch?restrict=&exclude=&config=webclass&method=and&format=builtin%2Dlong&words=HTML&page=7
Htsearch both generates and consumes this URL, so it shouldn't be a big matter
to change the separator. A quick check with the W3C validator shows that
encoding the ampersands as &amp; validates under HTML 4.0 (strict) and works
with Netscape 4.51.
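To make the escaping concrete, here is a minimal Python sketch. It is only an
illustration, not htsearch's actual code, and the helper name page_link is
hypothetical. The URL is the page-list link quoted above; each & becomes &amp;
when the URL is written into the href attribute.

```python
from html import escape


def page_link(url: str, label: str) -> str:
    """Return an <a> element with the URL escaped for use in an attribute.

    page_link is a hypothetical helper used only for this illustration.
    """
    # escape() rewrites & as &amp; (and also escapes <, >, and quotes),
    # so the separators no longer look like the start of an entity.
    return '<a href="{}">{}</a>'.format(escape(url, quote=True), escape(label))


url = ("http://webclass.csuchico.edu/cgi-bin/htsearch"
       "?restrict=&exclude=&config=webclass&method=and"
       "&format=builtin%2Dlong&words=HTML&page=7")

print(page_link(url, "7"))
```

The query string itself is unchanged; only its representation inside the HTML
source differs, so htsearch would still receive plain & separators.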
The second class of URL emitted by htsearch is a link to a page in the search
database. The default exclusion list in the sample configuration file disallows
CGI scripts, which are, I would guess, the principal users of the & separator.
But it's conceivable that there are still URLs that have & in them. I don't
know that there is any easy answer for this, unless the &amp; solution noted
above is generally good.
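One way to probe whether the &amp; approach holds for arbitrary document URLs
is a round trip: escape the URL, place it in an anchor, and check that an HTML
parser hands back the original string. The Python sketch below does that with
the standard library parser. The example.com URL and the names HrefCollector
and roundtrip are made up for illustration, and a check like this is no
substitute for testing in real browsers.

```python
from html import escape
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # The parser passes attribute values with entities already decoded.
        if tag == "a":
            self.hrefs.extend(value for name, value in attrs if name == "href")


def roundtrip(url: str) -> str:
    """Escape url, embed it in an anchor, and return the href the parser sees."""
    parser = HrefCollector()
    parser.feed('<a href="{}">result</a>'.format(escape(url, quote=True)))
    parser.close()
    return parser.hrefs[0]


# Hypothetical document URL containing a naked & separator.
url = "http://www.example.com/cgi-bin/show?item=12&view=full"
print(roundtrip(url) == url)
```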
The reference for the invalidity of the naked & in URLs is at
http://www.cs.duke.edu/~dsb/kgv-faq/errors.html#bad-entity
The W3C Validator is at http://validator.w3.org/
--
Fred Condo + [EMAIL PROTECTED] + http://webclass.csuchico.edu/
[EMAIL PROTECTED] + fredcondo on Yahoo Pager