> 
> On Tue, 3 Oct 2000, David Adams wrote:
> 
> > When, during an update run, htdig says of a page: "retrieved but not
> > changed", how does htdig decide that the page is the same as the last time?
> 
> It checks the date it received from the server (if present) against the
> date in the database. If they're the same, it ignores the file.
> 
> > An author is maintaining that she added a link to a page and that an update
> > run of htdig failed to follow the new link(s) she had added.
> 
> Are these static or dynamic pages? If the server is not returning
> Last-Modified headers, then this could be the problem.
> 
> --
> -Geoff Hutchison
> Williams Students Online
> http://wso.williams.edu/

Not dynamic in the true sense, but SSI on an Apache server.  Another
reply to this list gave the vital information:

> No, the XBitHack turns .html files with execute permission into SSI
> files (equivalent to .shtml), and for SSI files, Apache does NOT put
> out a Last-Modified header because SSI generates dynamic content. 

It had not occurred to me that an SSI file counts as "dynamic". I live and learn!

This explains why a significant fraction of pages on our principal server
generate the "retrieved but not changed" message.  Just as well we re-index
completely once a week!
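
For anyone who wants to check which pages are affected, here is a rough
Python sketch that asks a server for a page's headers and reports whether a
Last-Modified date comes back. It is not part of htdig; the function names
(last_modified, check_url) and the 10-second timeout are my own choices for
illustration.

```python
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen


def last_modified(headers):
    """Return the Last-Modified header as a datetime, or None if it is
    missing or cannot be parsed."""
    value = headers.get("Last-Modified")
    if not value:
        return None
    try:
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Pythons raise TypeError, newer ones ValueError, on bad dates.
        return None


def check_url(url):
    """Send a HEAD request and return the parsed Last-Modified date (or None)."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=10) as response:
        return last_modified(response.headers)
```

Calling check_url() on one of the SSI pages and getting None back would be
consistent with the explanation quoted above.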

I will add

modification_time_is_now: true

to the configuration file and that should fix the problem.
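
As I understand it, the effect can be pictured with the toy model below. This
is only a sketch of the date comparison Geoff described, not htdig's actual
code: the function names are made up, and using 0 to stand for "no date
stored" is my assumption.

```python
import time


def stored_time(server_time, modification_time_is_now, now=None):
    """Timestamp recorded for a page. None means the server sent no
    Last-Modified header; 0 stands in for "no date" (an assumption)."""
    if server_time is not None:
        return server_time
    if modification_time_is_now:
        # No date from the server: fall back to the current time.
        return now if now is not None else int(time.time())
    return 0


def looks_unchanged(db_time, server_time, modification_time_is_now, now=None):
    """True when the new timestamp equals the one already in the database,
    which is the case reported as "retrieved but not changed"."""
    return stored_time(server_time, modification_time_is_now, now) == db_time
```

In this model an SSI page with no date always compares equal to its stored
"no date" value, so it looks unchanged on every update run; with the option
set, the current time is used instead and the page is treated as modified.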

Thanks again to both you and Gilles for your replies.

-- 
 
David J Adams
<[EMAIL PROTECTED]>
Computing Services
University of Southampton

------------------------------------
To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED]
You will receive a message to confirm this.
List archives:  <http://www.htdig.org/mail/menu.html>
FAQ:            <http://www.htdig.org/FAQ.html>
