Hi,
I have a pretty nifty patch that allows you to retrieve
your templates (wrapper.html, etc.) through HTTP. This may
seem a bit convoluted at first, but it makes sense if you
generate your web pages dynamically, such as with PHP,
mod_perl, Mason, AxKit, or any other application framework.
Benefits of dynamically retrieving your web templates rather
than using static templates:
1) You get *automatically* generated html templates with the
proper decoration for your site. You never have to worry
about your templates being out of date with your current
web site design.
2) Rather than using a static search.html with a static form
and form values, you can have the htsearch CGI executable
retrieve your search.html *as a template*, which then
gets filled in by htsearch, just as with wrapper.html.
This way, the forms in search.html, wrapper.html,
nomatch.html, and syntax.html are guaranteed to stay
consistent.
3) You can modularly create your templates using your normal
site software -- and you can keep them modular. For example,
search.html has some header/footer stuff, plus a fill-in form.
Well, so does wrapper.html, and the others. If you break out
the form bit (with htsearch variables) into a component (Mason
terminology) you can then simply include the simple form
component into wrapper.html, et al.
4) You keep all your htdig templates in one place, i.e. search.html,
wrapper.html, nomatch.html, etc. are all part of your
document root, though they are *never* served directly
to a client. They are always filtered through htsearch.
Here's how to use the patch:
0) Apply the attached patch to htdig-3.1.5 and recompile:
    tar zxf htdig-3.1.5.tar.gz
    gzip -dc htdig-url-for-template-patch.gz | patch -p0
    cd htdig-3.1.5
    ./configure
    make
1) Set up your templates in your document root, including
a templatized version of search.html.
2) Set the following variables in your htdig.conf file:
    search_results_wrapper: http://yoursite.com/.../wrapper.html
    nothing_found_file: http://yoursite.com/.../nomatch.html
    syntax_error_file: http://yoursite.com/.../syntax.html
    search_form_only: http://yoursite.com/.../search.html
3) Instead of linking to /search.html directly from your site, point
users at your search engine through
    http://yoursite.com/cgi-bin/htsearch?search_form_only=1
which causes htsearch to use the newly created
'search_form_only' configuration file directive to grab your
search.html and do the variable substitution in it. As people
use the htdig search, it dynamically grabs the templates
off your site. If you change the look and feel of your site,
your search engine results will all automatically change
for you!
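
To make the mechanism concrete, here is a minimal Python sketch of the idea.
It is not the patch's code (that is C++, mostly in Display.cc). It assumes
that a template setting counts as a URL when it starts with http:// and is
otherwise a local file path, and that templates use $(NAME)-style variables
such as $(HTSEARCH_RESULTS). The names is_url, http_fetch, load_template and
substitute, and the injectable fetch parameter, are made up for illustration.

```python
import re
from urllib.request import urlopen


def is_url(value):
    """Treat a config value as a URL if it starts with http://."""
    return value.lower().startswith("http://")


def http_fetch(url):
    """Fetch a URL and return its body as text (assumes Latin-1 pages)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("latin-1")


def load_template(value, fetch=http_fetch):
    """Load a template from a URL or, if it is not a URL, from a local file."""
    if is_url(value):
        return fetch(value)
    with open(value, encoding="latin-1") as handle:
        return handle.read()


def substitute(template, variables):
    """Replace $(NAME) placeholders; unknown names are left untouched."""
    def replace(match):
        return variables.get(match.group(1), match.group(0))
    return re.sub(r"\$\(([A-Za-z_]+)\)", replace, template)


if __name__ == "__main__":
    # Stand-in for the web server, so the demo needs no network access.
    def fake_fetch(url):
        return "<h1>Search</h1>\n$(HTSEARCH_RESULTS)\n"

    page = substitute(
        load_template("http://yoursite.example/wrapper.html", fetch=fake_fetch),
        {"HTSEARCH_RESULTS": "<p>3 matches</p>"},
    )
    print(page)
```

Because old-style file paths still go through the same load_template call,
a setup like this would keep working with static templates too.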
The drawback to this method, of course, is that *every* search request
requires an extra call to your web server. It's probably not a *terrible*
performance hit, because the search itself is likely to be the real
CPU/time hog, not one extra HTTP request. I have not measured
performance with this patch, though, so buyer beware.
This drawback can be alleviated with some intelligent caching, which
I have been too lazy to think about.
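
One possible shape for such caching, as a rough Python sketch rather than
anything the patch implements: keep each fetched template for a fixed number
of seconds and only go back to the web server once it has expired. The class
name TemplateCache, the 300-second default, and the fetch and clock
parameters are all assumptions made for this example.

```python
import time
from urllib.request import urlopen


def http_fetch(url):
    """Fetch a URL and return its body as bytes."""
    with urlopen(url, timeout=10) as response:
        return response.read()


class TemplateCache:
    """Keep fetched templates for ttl seconds before refetching."""

    def __init__(self, fetch=http_fetch, ttl=300.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # url -> (expiry time, body)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh: no HTTP request
        body = self._fetch(url)
        self._entries[url] = (now + self._ttl, body)
        return body
```

Since htsearch runs as a CGI, each search is a fresh process, so a real
version would have to keep the cached copies on disk or in some shared
store. The in-memory dictionary here only shows the expiry logic.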
Let me know if you think the patch is useful/functional, etc.
-Caleb Crome
(Note: The business end of this patch is really quite small, mostly
confined to Display.cc, but I had to create a copy of
htdig/Document.[h,cc] in htsearch and cut some stuff out, which
is why the patch is so big. If you want to see what the patch does,
simply ignore DocumentReader.cc and DocumentReader.h -- they are
the trimmed copies of Document.cc and Document.h.)
P.S. htdig is some very nicely written code -- easy to understand.
htdig-url-for-template-patch.gz