I guess you can run segmentMergeTool to merge the new
segments with the previous ones (documents with a duplicate
URL and content MD5 will be discarded) and then run the
indexer on the merged result.

I'm not sure this is the best approach for daily
refetching; it's just my thought based on the code I dug
through.
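
To make the deduplication idea concrete, here is a rough sketch in Python. It is not Nutch code and does not call any Nutch API; merge_segments and the (url, content) record layout are hypothetical names made up for illustration, and it assumes a record counts as a duplicate only when both its URL and its content MD5 match one already seen.

```python
import hashlib


def merge_segments(*segments):
    """Merge lists of (url, content) records, oldest segment first.

    Hypothetical helper: a record is dropped only if the same URL with
    the same content MD5 has already been kept.
    """
    seen = set()
    merged = []
    for segment in segments:
        for url, content in segment:
            digest = hashlib.md5(content.encode("utf-8")).hexdigest()
            key = (url, digest)
            if key in seen:
                continue  # same URL and same content: discard
            seen.add(key)
            merged.append((url, content))
    return merged


# Illustrative data: last night's fetch and tonight's fetch.
old_segment = [
    ("http://example.com/a", "hello"),
    ("http://example.com/b", "world"),
]
new_segment = [
    ("http://example.com/a", "hello"),           # unchanged page
    ("http://example.com/b", "world, updated"),  # changed page
]

merged = merge_segments(old_segment, new_segment)
```

Under that assumption the unchanged page is kept once, while the changed page appears twice (old and new content), so a real nightly update would still need some rule for preferring the newest copy of a URL.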

Michael Ji,

--- Lokkju <[EMAIL PROTECTED]> wrote:

> I have searched through the mail archives and seen
> this question asked a lot, but no answer ever seems
> to come back. I am going to be using Nutch against
> 5 sites, and I want to update the index on a nightly
> basis. Besides deleting the previous crawl and then
> running it again, what method of doing nightly
> updates is recommended?
> 
> Thanks,
> Nick
> 



        
                