Re: [htdig3-dev] Use of & as CGI variable separator vs. HTML 4.0

Fred Condo Tue, 13 Apr 1999 21:09:59 -0700

Geoff Hutchison wrote:
> 
> I finally had some time to sit down and do some ht://Dig work. I've been
> swamped with getting our server back to speed after several accounts were
> compromised. :-( It turns out someone managed to slip a packet sniffer onto
> our network. <ugh>
> 
> Anyway, I cleaned out the incoming folder of the bug system. This was one
> of the messages posted.
> 
> Is this correct? If so, what should we do about it? We can't use | because
> that already has a meaning for ht://Dig. Furthermore, we'll still have to
> parse & separators because many browsers (and a *lot* of URLs) still use it.
> 
> Anyone have a good suggestion for a separator? I'd go for * offhand, but I
> might be missing some horrible consequence (I was going to suggest # first
> and realized the error of my ways).
> 
> -Geoff
> 
> Date: Sat, 3 Apr 1999 20:15:05 -0800
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: PRIVATE: Use of & as CGI variable separator vs. HTML 4.0
> 
> Full_Name: Fred Condo
> Version: 3.1.1
> OS: FreeBSD 2.2.8
> Submission from: pm3dyn102.dip.csuchico.edu (132.241.249.102)
> 
> HTML 4.0 strict does not permit the & character in the URLs generated as CGI
> variable separators for the page list. This is because the & introduces a
> general entity.
> 
> W3C recommend rewriting the code to use a different separator, such as | or ;
> that does not have special meaning in HTML.
> 
> Until this is done, ht://dig cannot emit valid HTML 4.0 (strict).

I'm the originator of this bug report, and it occurs to me there are two
separate problems.

First is the one I had in mind when reporting it: the page list of links shown
when there are more results than fit on one ht://Dig results page. Those links
look like this:

http://webclass.csuchico.edu/cgi-bin/htsearch?restrict=&exclude=&config=webclass&method=and&format=builtin%2Dlong&words=HTML&page=7

htsearch both generates and consumes these links, so it shouldn't be a big
matter to change the separator. A quick check with the W3C validator shows that
encoding the ampersands as &amp; in the emitted HTML validates under HTML 4.0
(strict) and works with Netscape 4.51.
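
To make the &amp; encoding concrete, here is a minimal sketch in Python (not
htsearch's own C++). The function name page_link is hypothetical and is not
part of ht://Dig; the only point is that the URL is escaped at the moment it
is written into an href attribute.

```python
from html import escape


def page_link(url: str, label: str) -> str:
    """Build an anchor element, escaping the URL for use inside href="..."."""
    # escape() rewrites & as &amp; (and <, >, and quotes when quote=True),
    # so the query-string separators no longer look like entity references.
    return '<a href="{}">{}</a>'.format(escape(url, quote=True), escape(label))


url = (
    "http://webclass.csuchico.edu/cgi-bin/htsearch?restrict=&exclude="
    "&config=webclass&method=and&format=builtin%2Dlong&words=HTML&page=7"
)
print(page_link(url, "7"))
```

The URL itself is unchanged; only its representation in the HTML source
differs, so htsearch would not need to parse a new separator for this case.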

The second class of URL emitted by htsearch is a link to a page in the search
database. The default exclusion list in the sample configuration file disallows
CGI scripts, which are, I would guess, the principal users of the & separator.
But it's conceivable that some indexed URLs still have & in them. I don't know
of any easy answer for this, unless the &amp; solution noted above turns out to
be generally good.
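
One way to sanity-check whether the &amp; approach would also cover indexed
URLs is a round trip: escape the URL as it would be written into the HTML,
then decode it the way an HTML parser decodes attribute values. This is only a
sketch under that assumption; the example.com URL is made up for illustration
and Python's html module stands in for a real browser.

```python
from html import escape, unescape

# Hypothetical indexed URL that itself uses & as a separator.
indexed = "http://www.example.com/cgi-bin/show?doc=12&view=full"

in_markup = escape(indexed, quote=True)  # text written into the href attribute
recovered = unescape(in_markup)          # what an HTML parser reads back out

print(in_markup)
print(recovered == indexed)
```

If the recovered string matches the original, the link target is unaffected
by the encoding, which suggests the same fix could serve both classes of URL.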

The reference for the invalidity of the naked & in URLs is at
http://www.cs.duke.edu/~dsb/kgv-faq/errors.html#bad-entity

The W3C Validator is at http://validator.w3.org/
-- 
Fred Condo + [EMAIL PROTECTED] + http://webclass.csuchico.edu/
[EMAIL PROTECTED]                +      fredcondo on Yahoo Pager