Hey, guys.  I ran into something weird when I was testing out the
allow_numbers changes last week, which I haven't been quite able to
explain or track down in the code.  Of the pages on my site that I was
indexing, about a dozen of them were from a CGI script that puts out a
Last-Modified header to set the date appropriately in search results.
Because of a recent bug in the script, which I just fixed last week,
it turns out that the Last-Modified headers were coming out with no
date on them, so htdig was giving them a modtime of 0 (i.e. the epoch).
This is different behaviour than htdig 3.1.6, which gave them the current
time instead.  It may be that the 3.2 code should be fixed to do likewise,
as it seems the more sensible behaviour.
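
To illustrate the fallback I have in mind, here is a minimal sketch.  It is
not the actual htdig code: parse_http_date() and effective_modtime() are
hypothetical names, and the parser only handles the RFC 1123 date form.
A header that fails to parse yields 0, and the caller then substitutes the
current time rather than storing the epoch.

```cpp
#include <cstdio>
#include <cstring>
#include <ctime>

// Days since 1970-01-01 for a Gregorian date (month is 1..12).
static long days_from_civil(int y, int m, int d)
{
    y -= m <= 2;
    const long era = (y >= 0 ? y : y - 399) / 400;
    const long yoe = y - era * 400;
    const long doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;
    const long doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

// Hypothetical helper: parse "Sun, 06 Nov 1994 08:49:37 GMT".
// Returns 0 when the value is empty or cannot be parsed.
std::time_t parse_http_date(const char *value)
{
    static const char months[] = "JanFebMarAprMayJunJulAugSepOctNovDec";
    char mon[4] = "";
    int day, year, hh, mm, ss;
    if (std::sscanf(value, " %*3s, %d %3s %d %d:%d:%d",
                    &day, mon, &year, &hh, &mm, &ss) != 6)
        return 0;
    const char *p = std::strstr(months, mon);
    if (std::strlen(mon) != 3 || !p || (p - months) % 3 != 0)
        return 0;
    const int month = static_cast<int>((p - months) / 3) + 1;
    return static_cast<std::time_t>(
        days_from_civil(year, month, day) * 86400L
        + hh * 3600L + mm * 60L + ss);
}

// Hypothetical helper: fall back to "now" when the header is unusable,
// which is the 3.1.6 behaviour described above.
std::time_t effective_modtime(const char *header, std::time_t now)
{
    const std::time_t parsed = parse_http_date(header);
    return parsed > 0 ? parsed : now;
}
```

With this, an empty or dateless Last-Modified value passed along with
std::time(nullptr) would come back as the current time, not 0.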

However, that's not the weird thing.  What was odd is that even though
these dozen or so web pages were definitely in the database, and came
out into db.docs after an htdump (with a m:0 field), htsearch would not
show these in search results.  I looked at the code, and the only thing
that I can see that would cause this is if the startyear, startmonth or
startday input parameters were set, causing the timet_startdate value
in Display.cc to be greater than 0.  But I didn't set these!  I ran
htsearch from the command line, so I know I wasn't passing it these
values as input parameters, and the config file I used didn't define
these as attributes either.
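
For reference, this is roughly the behaviour I would expect from the date
filter, written as a standalone sketch.  It is not the code in Display.cc:
the DateRange struct and passes_date_filter() are hypothetical, and only
the timet_startdate idea is taken from the real source.  The point is that
a bound of 0 means "not requested", so a document with a modtime of 0
should still pass when no bounds are set.

```cpp
#include <ctime>

// Hypothetical stand-in for the start/end limits computed in htsearch.
struct DateRange {
    std::time_t start = 0; // 0 means no lower bound was requested
    std::time_t end = 0;   // 0 means no upper bound was requested
};

// Returns true when the document should be kept in the results.
bool passes_date_filter(std::time_t modtime, const DateRange &range)
{
    if (range.start > 0 && modtime < range.start)
        return false; // older than the requested start date
    if (range.end > 0 && modtime > range.end)
        return false; // newer than the requested end date
    return true;      // no bounds set, or modtime is inside them
}
```

Under these assumptions, passes_date_filter(0, DateRange{}) is true, which
is the opposite of what I actually saw from htsearch.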

I know the problem was the 0 modtime, because when I fixed the CGI script
to return a proper Last-Modified header, the pages showed up in htsearch,
with no other changes being made.

Does anyone know of anything else that might explain this behaviour?
I'd start putting trace prints in htsearch to track this down, but I have
too many high-priority things right now to spend much time on ht://Dig
right away.  htsearch -vvvv didn't give any indication of what might be
going on - the URLs in question never even showed up at all in the output.

I don't think I'd consider this a showstopper, but it does seem odd that
htsearch rejects any modtime value at all when none of those parameters
have been specified.  This, coupled with the fact that htdig will assign
a 0 modtime if it can't parse the Last-Modified header (as opposed to a
missing Last-Modified header, which should be taken as the current time
if I'm not mistaken), could lead to others having similar problems.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)


_______________________________________________
ht://Dig Developer mailing list:
[EMAIL PROTECTED]
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-dev
