We selected ht://Dig for a client's internal site, and have been happy enough with it
so far, with the exception of logging. I ended up modifying the code to get around what
were, for us, limitations. I'd like to describe the changes I made, and I would be
curious to get some feedback.

What we wanted to do was keep a record of what terms people were searching on, and how
many hits they got as a result. I got standard logging working, but I had the following
issues:

1) The syslog system that managed the log files was archiving old files (which was a
    pain but livable) and changing permissions so that they were only readable by root.

2) The format in which the logged information is written out is a total pain to pick
    information out of.

3) The logging information is written out every time htsearch is called.

This last one is a killer. The problem is that htsearch is called every time a prev, next
or page number button is pressed, which means that the search is logged every time, when
in reality the search was only actually entered once.

The first part of the solution was to add the following to the configuration file:

   # If this is set to the name of a file, that becomes the name of the
   # file to log data to. If it is set to the default of "none", it logs using
   # syslog.

   logging_file: <absolute path to a log file>

   # If this is set to true, it will only log the results when the cgi parameter
   # "init=Y" is set. Default is false.

   logging_initonly: true

and then adjust the code in defaults.cc and Display.cc to use these settings.
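
To make the intended behaviour of the two settings concrete, here is a rough sketch of the decision logic. It is written in Python purely for illustration and is not the actual change to Display.cc; the function names should_log and log_destination are made up for this example, and it assumes logging is already switched on.

```python
# Illustrative sketch only: NOT the ht://Dig source. The function names are
# hypothetical; the setting names and the init=Y parameter come from the post.

def should_log(config, cgi_params):
    """Return True if this htsearch request should be logged.

    Assumes logging is already enabled in the configuration.
    """
    init_only = config.get("logging_initonly", "false").lower() == "true"
    if not init_only:
        # Default behaviour: every call to htsearch is logged.
        return True
    # With logging_initonly set, log only requests that carry init=Y.
    return cgi_params.get("init") == "Y"


def log_destination(config):
    """Return ("syslog", None) or ("file", path) depending on logging_file."""
    path = config.get("logging_file", "none")
    if path == "none":
        return ("syslog", None)
    return ("file", path)
```

With logging_initonly set to true, a request carrying init=Y is logged, while a request for page 2 of the same search, which has no init parameter, is skipped.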

I then added
    <input type=hidden name=init value="Y" >

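to the search forms on the initial search page and the "nomatch" page, and to the search
form at the bottom of the results page. The prev, next and page number links do not carry
this parameter, so only a search that was actually entered gets logged.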

The formats are currently done as a hack. If logging_file is set to "none", it does
exactly what it used to do, e.g.:

   Apr  9 09:38:43 myhost htsearch[11228]: 192.168.1.10 [myconfig] (and) [car]
     [(car or auto or automobile)] (98/10) - 1 -- http://mywebhost.com.au/search/search.html

Otherwise it outputs exactly the same information, except that the fields are delimited
by "|" and the number of seconds since the epoch is at the front:

   986844235|192.168.1.10|myconfig|and|car|(car or auto or automobile)|98|10|1|http://mywebhost.com.au/search/search.html
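
As an idea of how the delimited format could be consumed, here is a minimal Python sketch that splits such a line and tallies search terms. The field names (timestamp, client, config, method, words, logical_words, matches, per_page, page, referer) are my own labels, guessed from the order of the fields in the example, and the names SearchRecord, parse_line and summarise are made up for this example.

```python
from collections import Counter, namedtuple

# Field names are assumptions based on the order of the sample line above.
SearchRecord = namedtuple(
    "SearchRecord",
    "timestamp client config method words logical_words "
    "matches per_page page referer",
)


def parse_line(line):
    """Split one pipe-delimited log line into a SearchRecord."""
    fields = line.rstrip("\n").split("|")
    if len(fields) != 10:
        # A search term containing "|" would land here, since the
        # format has no escaping for the delimiter.
        raise ValueError("expected 10 fields, got %d" % len(fields))
    (ts, client, config, method, words, logical,
     matches, per_page, page, referer) = fields
    return SearchRecord(int(ts), client, config, method, words, logical,
                        int(matches), int(per_page), int(page), referer)


def summarise(lines):
    """Count how often each term was searched; keep the latest hit count."""
    counts = Counter()
    hits = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        rec = parse_line(line)
        counts[rec.words] += 1
        hits[rec.words] = rec.matches
    return counts, hits
```

One thing this shows up: because the search words are written out unescaped, a query that contains "|" cannot be split unambiguously, so the delimiter may need escaping or the free-text fields may need to go last.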

Any thoughts on this approach?

Regs

Brian

-------------------------
Brian White
Step Two Designs Pty Ltd - SGML, XML & HTML Consultancy
Phone: +612-93197901
Web:   http://www.steptwo.com.au/
Email: [EMAIL PROTECTED]


