Hi, I have the same question as Joshua. I don't understand how to do a recrawl on an existing crawldb without deleting the crawldb. Can someone explain how to do this? I'm a newbie with Nutch.
Thanks, and sorry for my poor English.

2010/5/18 Joshua J Pavel <[email protected]>

> I would like to recrawl a certain site I admin at a much smaller interval
> - say, every hour.
>
> I've specified my db.fetch.interval.default to be 1200, but if I attempt
> to recrawl using the crawl directory and segments from last time, I can't
> seem to get it to work. I think the problem is adddays... is there any
> way to reuse my crawl segments with a fetch segment less than 1 day? I
> currently delete my old crawl directory. While that works, I would like
> to preserve the entries and timestamps and such in there between crawls.
> Is this possible?

