On Mon, 9 Sep 2002, Tod Thomas wrote:

> I have a document from last year that is getting indexed with the
> current date as the modified date. I've checked the file myself and its
> a year old and hasn't been touched since then.

Right, but are you sure the server is actually sending a date in the Last-Modified: header? If the file is sent as dynamic content, e.g. .shtml, .cgi, .php, .asp, .jsp, etc. then the server will not send a date by default and the only logical thing for htdig to do is pick the current time.

> I used htdig -t to get an ASCII dump to look at the modification date
> and it looks like this - m:1031587378. Could somebody help me out
> with the format of this date?

This is the number of seconds since 1970-01-01, a commonly used UNIX date format. (If you have GNU date, you can get this with date +"%s" from a command-line.) Since I got 1031609537 just now, this was about 6 hours ago, give or take.

> I imagine htdig uses a number of different dates in priority order -
> maybe a META date followed by an internally stored binary date, or

If you have a META date in the document, you should take a look at the use_doc_date attribute in 3.1.6:
<http://www.htdig.org/attrs.html#use_doc_date>

But remember, if it's using the current date on an old file, that's because the server is not sending a correct Last-Modified: header. You can see the headers returned from a server if you run "htdig -vvv" (which also outputs a variety of additional debugging information as the indexing proceeds).

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
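
For anyone who wants to check the timestamp arithmetic without GNU date, here is a minimal Python sketch. Python is just a convenient choice for the illustration and has nothing to do with htdig itself, and the helper name epoch_to_utc is made up for this example. It decodes the m: value from the htdig -t dump as seconds since 1970-01-01 UTC and compares it with the value that date +"%s" returned when this reply was written.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds: int) -> datetime:
    """Interpret a UNIX timestamp (seconds since 1970-01-01) as a UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Value taken from the "m:" field of the htdig -t dump quoted above.
indexed = 1031587378
# Value returned by date +"%s" at the time of the reply.
at_reply = 1031609537

print(epoch_to_utc(indexed).isoformat())   # 2002-09-09T16:02:58+00:00
print(f"{(at_reply - indexed) / 3600:.1f} hours earlier")   # 6.2 hours earlier
```

The decoded time falls on the day of the indexing run rather than a year earlier, which is consistent with htdig having fallen back to the current time.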

