Re: [htdig] Caching and DocID

Gilles Detillieux Fri, 30 Mar 2001 15:15:11 -0800
According to Edmond Abrahamian:
> I'm looking for ways to cache documents that may have been
> removed from indexed sites after the indexing. -- Something
> akin to Google's "cached" version of an indexed document.
> 
> Is it possible to retrieve these cached versions from the
> doc db by DocID? If so, is there a tool in place to do so?

No, the docdb only stores the excerpt data, with all HTML tags stripped
out, and not a complete copy of the original document.  You'd need to
customize the code to do caching, and store the documents in another
database.
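
As a rough illustration of what such a separate database might look
like, here is a minimal sketch in Python. None of this is part of
ht://Dig: the table name doc_cache, the function names, and the idea
of keying the store on an integer DocID are all assumptions made for
the example, and the hook that would hand each fetched document to
store_document() during indexing is not shown.

```python
import sqlite3
import zlib


def open_cache(path=":memory:"):
    """Open (or create) the side database that holds full document copies."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS doc_cache ("
        "doc_id INTEGER PRIMARY KEY, "
        "url TEXT NOT NULL, "
        "body BLOB NOT NULL)"
    )
    return conn


def store_document(conn, doc_id, url, raw_bytes):
    """Save the unmodified document bytes, compressed, under its DocID."""
    conn.execute(
        "INSERT OR REPLACE INTO doc_cache (doc_id, url, body) VALUES (?, ?, ?)",
        (doc_id, url, zlib.compress(raw_bytes)),
    )
    conn.commit()


def load_document(conn, doc_id):
    """Return the cached bytes for a DocID, or None if nothing was stored."""
    row = conn.execute(
        "SELECT body FROM doc_cache WHERE doc_id = ?", (doc_id,)
    ).fetchone()
    return None if row is None else zlib.decompress(row[0])
```

A CGI script could then call load_document() with the DocID from a
search result and serve the stored copy, tags and all.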
 
> Any other ideas would be appreciated.

I suppose you could index through a caching proxy server, and
configure it to hold on to all documents even after the original
is removed.  This is way out of my area of experience, though.
Any other ideas?
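
To make the intended behaviour concrete, here is a small Python
sketch of the "keep the last good copy" logic such a cache would need.
It is not a configuration for any real proxy server: the function
name fetch_with_fallback, the fetch callable, and the use of a plain
dictionary as the cache are all hypothetical stand-ins.

```python
def fetch_with_fallback(url, fetch, cache):
    """Fetch url, remembering the last good copy; fall back to it on failure.

    fetch is any callable that returns the document bytes or raises OSError
    (for example when the page has been removed). cache is a dict-like store.
    """
    try:
        body = fetch(url)
    except OSError:
        # Original is unavailable: serve the stored copy, if there is one.
        return cache.get(url)
    # Fetch succeeded: refresh the stored copy and return the live document.
    cache[url] = body
    return body
```

The point of the design is that a failed fetch never evicts an entry,
so a document that disappears after indexing can still be served.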

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html