Re: is nutch recrawl possible?

Stefan Groschupf Mon, 19 Dec 2005 05:58:40 -0800

It is difficult to answer your question, since the vocabulary used may be wrong. You can refetch pages, no problem. But you cannot continue a crashed fetch process. Nutch provides a tool that runs a set of steps such as segment generation, fetching, db updating, etc. So you may first want to try running these steps manually instead of using the crawl command. Then you may already get an idea of where you can jump in to grab the data you need.
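As a rough sketch of what "running the steps manually" could look like, the snippet below only builds and prints the command lines for one generate/fetch/update round; it does not execute anything. The command names and argument order (`generate`, `fetch`, `updatedb`) are an assumption based on the Nutch 0.7-era whole-web crawling tutorial and may differ in your version. The directory names `db` and `segments`, the placeholder segment path, and the helper `recrawl_round` are hypothetical.

```python
def recrawl_round(db_dir, segments_dir, segment):
    """Return the command lines for one manual crawl round.

    Assumption: Nutch 0.7-style commands. Check `bin/nutch` without
    arguments to see what your installed version actually offers.
    """
    return [
        # 1. Generate a new segment (fetch list) from the existing web db.
        ["bin/nutch", "generate", db_dir, segments_dir],
        # 2. Fetch the pages listed in that segment.
        ["bin/nutch", "fetch", segment],
        # 3. Update the web db with the results of the fetch.
        ["bin/nutch", "updatedb", db_dir, segment],
    ]


if __name__ == "__main__":
    # The real segment name is created by the generate step; the path
    # below is only a placeholder for illustration.
    for cmd in recrawl_round("db", "segments", "segments/SEGMENT_NAME"):
        print(" ".join(cmd))
```

Because each round starts from the existing web db, repeating these steps reuses what earlier rounds stored there. A fetch that crashed halfway is not resumed; a new segment is generated and fetched instead.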


Stefan


On 19.12.2005 at 14:46, Pushpesh Kr. Rajwanshi wrote:

Hi,
I am crawling some sites using Nutch. My requirement is that when I run a Nutch crawl, it should somehow be able to reuse the data in the webdb populated
in the previous crawl.
In other words, my question is: suppose my crawl is running and I cancel it
somewhere in the middle; is there some way I can resume the crawl?
I don't even know if I can do this at all, but if there is some way, then please
throw some light on this.

TIA

Regards,
Pushpesh
