According to Jerry Asher:
> >Well, I think it would take a good quality patch with documentation
> >changes to describe the new config attribute, plus a good argument
> >for why the change should be incorporated. I have yet to see one.
> >Making htsearch generate bad URIs for the sake of not having to make a
> >trivial change to your wrapper script just doesn't seem that wise to me.
>
> Before I develop a patch, I'll try to make the argument....
>
> The argument is that what "site developers" want is "search", and they
> already have a variety of hoops to jump through: configure a webserver,
> develop scripts, and implement the search engine. We're not all using
> Apache, and we're not all using PHP. But we are using a variety of
> frameworks that we may or may not have the ability to change.
...
> So why patch htDig and not AOLserver/OpenACS/OpenNSD?
>
> 1. Weak answer: Convenience and stability: the patch (appears to be)
> pretty local to one function within Display.cc, within AOLserver there is
> one function and within OpenACS there are about ten functions that would
> require changing to understand the use of semicolons.
Yikes. Ten functions to do the one simple task of parsing a query string?
They're not into reusing code, are they?
> 2. Stronger answer: Standards. As you mention, "there's no question that
> the ampersand is still the standard...". (I understand the importance of
> the word "still" suggesting that that may not be true in the future.)
Well, now you're quoting me out of context. It's the standard
separator for the CGI interface. But W3C recommends ';', or at the
very least '&amp;', within URIs. The use of the semicolon may be a mere
recommendation, but the HTML 4.0 standard (and the more recent standards
derived from it, i.e. 4.01 and XHTML) is pretty clear on the point
that bare ampersands in URIs are a no-no. So we have a case of two
conflicting standards, leaving us two options for resolving the conflict:
1) use the separator that W3C suggests for URIs, while still recognizing
the ampersand in query strings passed by CGI, or 2) use &amp; in URIs,
which the browser will convert to a simple ampersand when it passes it
back to the server when following the link.
W3C isn't suggesting we change the CGI standard, at least I don't think
they are, and neither am I. What I am suggesting is that CGI programs
accept a dual-standard and recognize both separators. This strikes me
as the ideal solution.
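
To make the dual-standard idea concrete, here is a minimal sketch in
Python of a CGI-side parser that treats both separators the same way.
This is not htsearch code; the function name parse_query and the sample
parameter names are made up for illustration.

```python
import re
from urllib.parse import unquote_plus


def parse_query(query_string):
    """Split a raw CGI query string on either '&' or ';'.

    Hypothetical helper, shown only to illustrate accepting both
    separators; it is not taken from htsearch.
    """
    pairs = []
    for field in re.split(r"[&;]", query_string):
        if not field:
            continue  # skip empty fields, e.g. from a trailing separator
        name, _, value = field.partition("=")
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


# Both forms yield the same list of (name, value) pairs.
print(parse_query("words=spinal+cord;page=2"))
print(parse_query("words=spinal+cord&page=2"))
```

A program that parses its input this way does not care which separator
the page that linked to it happened to use.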
What's wrong with the 2nd approach, i.e. using &amp;? Well, for
one thing it's cumbersome and ugly (minor point, I know), but it also
doesn't address what, up until now, had been the biggest beef people
had with the change from '&' to ';'. The main complaint, as far as I
can recall, was with PHP wrappers that directly parse the URIs put out
by htsearch, and not a problem with parsing the CGI input. PHP wrappers
will still see the unprocessed '&amp;', and not a bare ampersand, so
they still would need to be changed.
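
The difference between what a browser sends back and what a wrapper
sees can be sketched in a few lines of Python. The link below is a
made-up example, and the "naive wrapper" split is only an assumption
about how such a wrapper might cut up the URI, not a description of
any particular PHP script.

```python
import html

# Hypothetical link as a program would build it internally.
link = "/cgi-bin/htsearch?words=cord&page=2"

# What goes into the HTML page: the ampersand is written as an entity.
in_html = html.escape(link)

# What a browser sends when the link is followed: the entity is decoded
# back to a plain ampersand before the request is made.
followed = html.unescape(in_html)

# What a wrapper sees if it splits the raw HTML text on '&' without
# decoding entities first: the second field is mangled.
naive_fields = in_html.split("?", 1)[1].split("&")

print(in_html)
print(followed)
print(naive_fields)
```

So the server side never notices the entity, but anything that reads
the generated HTML directly has to decode it, or be changed to cope.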
> I will suggest that AOLserver/OpenNSD is not the only webserver that
> understands ampersands at the moment but that does not understand
> semicolons. The question becomes: must all webservers come up to the level
> of the protocol where not only the minimal standard is supported, but all
> recommendations are supported to use htDig, or is there some way that htDig
> can be made to support all webservers easily and still support the highest
> conforming webservers?
This is the part I'm still having a bit of difficulty understanding.
Not being familiar at all with AOLserver/OpenACS/OpenNSD, I don't
know what parts of it need to look at query strings in URIs at all.
I've never seen it as a web server problem per se, but rather a problem
with CGI programs and wrapper scripts. I do have an Apache-centric
world view, I admit, but in the scheme of things as I see it, the web
server passes unprocessed query strings to the individual CGI programs.
Does AOLserver process query strings itself before passing them on?
Are various CGI programs integrated into the server in a monolithic sort
of manner? How does htsearch fit into this picture?
> I suggest there is, and that a patch to htDig to use ampersand separators
> depending on a configuration item would be useful to the general population.
>
> Okay then, that's the argument....
Well, you do make a case that hasn't been made before, namely that this
does seem to go beyond the reluctance to bring a few wrapper scripts
or CGIs in line with the times. I guess I'm just a little surprised,
given that HTML 4.0 has been around for well over 2 years now, at the
inertia involved in conforming to it. Is AOLserver never going to
adopt W3C's recommendations? Given AOL's size, I imagine that's a
distinct possibility. They may be more inclined to follow the MS route
of defining their own versions of existing standards.
I'm not unsympathetic to the hoops that web developers have to jump
through. I certainly have to jump through my share of them. I don't
want htsearch to make things impossible for developers, but I don't want
to make it easier to do the wrong thing than to do the right thing.
I think the semicolon separator should remain the default, but I wouldn't
oppose a config attribute to change it. However, I think it would be
wrong to make it a simple choice between ';' and '&' (i.e. as a boolean
attribute), because it closes the door to the better choice of '&amp;'
when that would work. So, maybe there should be a string attribute
that defines the separator, with ';' being the default, and '&amp;'
being the recommended alternative.
If I'm not mistaken, using '&amp;' would still meet your requirements,
as for you it seems to be a server issue, and the server should only see
the simple ampersand decoded by the client and passed back to the server.
This wouldn't cause htsearch to violate the HTML 4.0 standard.
The bare '&' as separator should be a last resort, only for cases where
the htsearch output must be processed directly by a wrapper program that
can't be fixed to allow ';' or '&amp;'.
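
As a rough sketch of how a string-valued separator setting could behave,
here it is in Python rather than in htsearch's own code. The function
name build_query and its separator argument are hypothetical stand-ins
for whatever attribute name might eventually be chosen.

```python
from urllib.parse import quote_plus


def build_query(params, separator=";"):
    """Join (name, value) pairs into a query string for use in HTML output.

    'separator' plays the role of the proposed string attribute; the
    default of ';' matches the suggestion above. Hypothetical helper.
    """
    return separator.join(
        f"{quote_plus(name)}={quote_plus(value)}" for name, value in params
    )


params = [("words", "spinal cord"), ("page", "2")]

print(build_query(params))            # default: semicolon
print(build_query(params, "&amp;"))   # recommended alternative
print(build_query(params, "&"))       # last resort: bare ampersand
```

A string value covers all three cases with one setting, which a boolean
could not do.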
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930