According to Adam Rice:
> I'm having a problem where my search results are out-of-date with
> respect to the site, even though htdig is definitely running, and
> definitely fetching the files from the web server, and not giving
> errors. Perhaps I am misunderstanding what an update dig does? I thought
> that it checked every document in its database, and rescanned it if it
> was new, as well as following any links to new documents, and removing
> it if it gets a 404.
> 
> I run htdig and htmerge with the -a commandline options. I then move the
> *.docdb.work, *.docs.index.work and *.words.db.work files to *.docdb,
> *.docs.index.work and *.words.db respectively. I don't actually use
> wildcards, the *s are just there because I have different databases for
> different sites. I then copy the *.docdb file back to *.docdb.work so
> that it is there for the next update dig. The *.wordlist.work file is
> left alone ready for the next update.
> 
> Does that procedure sound correct? All the pages on the sites use
> server-side includes, and hence don't have Last-Modified: headers, could
> that be confusing matters?

The procedure above sounds correct to me, but for dynamic content with no
Last-Modified headers, you need to set

 modification_time_is_now: true

in your configuration file for 3.1.x.  In 3.2.0, this attribute is gone,
and htdig always assumes the current time for any missing Last-Modified
header.
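
For reference, the rotation step described in the quoted message could be
sketched roughly as below. This is only an illustration, not a tested
production script: the helper name promote_work_files, the directory and
the base name "site" are made up for the example, and it assumes the
second destination in the quoted message was meant to be *.docs.index
rather than *.docs.index.work. The -a and -c options are the usual
"alternate work files" and "config file" options of htdig and htmerge.

```shell
#!/bin/sh
# Sketch of the update-dig file rotation from the quoted message.
# Assumption: database files are named <base>.docdb, <base>.docs.index
# and <base>.words.db, with ".work" variants written by "htdig -a" and
# "htmerge -a".

# promote_work_files DIR BASE   (hypothetical helper name)
promote_work_files() {
    dir=$1
    base=$2

    # Move the freshly built work files into place for searching.
    mv "$dir/$base.docdb.work"      "$dir/$base.docdb"      || return 1
    mv "$dir/$base.docs.index.work" "$dir/$base.docs.index" || return 1
    mv "$dir/$base.words.db.work"   "$dir/$base.words.db"   || return 1

    # Copy the document database back so the next update dig starts
    # from the current state instead of from scratch.
    cp -p "$dir/$base.docdb" "$dir/$base.docdb.work"        || return 1

    # <base>.wordlist.work is deliberately left alone for the next update.
}

# Example run (commented out; the paths and config name are placeholders):
#   htdig   -a -c /path/to/site.conf &&
#   htmerge -a -c /path/to/site.conf &&
#   promote_work_files /path/to/db site
```

Chaining the commands with && means the live databases are only replaced
if both htdig and htmerge exit successfully.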

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
