According to [EMAIL PROTECTED]:
> Is there some documentation on the format/content of the databases, as 
> produced by htdig and htmerge?  
> 
> What I'd like to be able to do, if feasible, is to tell, from the databases 
> themselves, which URLs have been indexed, and ideally the date on which this 
> was done.  

I don't think there's much documentation on the specific format of the
databases, other than the source code.  I don't think the date on which
a document was last indexed is stored, but the last modified date is
stored in db.docdb.  This date will be the date indexed for documents
where the server doesn't return a last modified date, e.g. for dynamic
content.
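
To make that fallback concrete, here is a minimal sketch in Python of
the rule as described above.  It is not htdig's actual code: the
function name stored_doc_time and the plain dict of headers are
assumptions made up for illustration, and the lookup only matches the
exact header spelling "Last-Modified".

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def stored_doc_time(headers, indexed_at):
    """Return the date that would be recorded for a document.

    headers    -- dict of HTTP response headers (hypothetical input)
    indexed_at -- datetime at which the document was fetched
    """
    value = headers.get("Last-Modified")
    if value is None:
        # No Last-Modified header (e.g. dynamic content): the stored
        # date falls back to the time of indexing.
        return indexed_at
    # Otherwise the server-supplied last-modified date is stored.
    return parsedate_to_datetime(value)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(stored_doc_time({"Last-Modified": "Wed, 21 Oct 2015 07:28:00 GMT"}, now))
    print(stored_doc_time({}, now))
```

So for static pages the stored date tells you when the file changed,
and only for pages without the header does it tell you when the dig
ran.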

It would probably be pretty easy to build a simple docdb dumping tool
out of htnotify, which does a simple traversal through the database.
You could get it to output any field you want from the "DocumentRef"
object.  Apart from that, I don't think any such tool exists yet, though
it's on the to-do list for 3.2.
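
The shape of such a dumping tool would be roughly the following.  This
is only a sketch in Python of the traversal idea, not something that
reads a real db.docdb: the DocRecord class, its url and mtime fields,
and the dump_docdb function are hypothetical stand-ins for the
"DocumentRef" objects, and an in-memory dict stands in for the database.
A real tool would be C++ built against the ht://Dig sources.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class DocRecord:
    """Hypothetical stand-in for one document record."""
    url: str
    mtime: int  # stored date, as seconds since the Unix epoch


def dump_docdb(records):
    """Walk every record and return one 'date<TAB>url' line per document."""
    lines = []
    for key in sorted(records):
        rec = records[key]
        stamp = datetime.fromtimestamp(rec.mtime, tz=timezone.utc)
        lines.append(f"{stamp:%Y-%m-%d}\t{rec.url}")
    return lines


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = {
        "http://example.org/b.html": DocRecord("http://example.org/b.html", 86400),
        "http://example.org/a.html": DocRecord("http://example.org/a.html", 0),
    }
    for line in dump_docdb(sample):
        print(line)
```

Any other field could be added to the output line in the same way.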

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
