Hi,

For part one: is your depth parameter the same when you re-crawl?
Part two: to get an idea of which URLs were fetched and which were not,
Nutch provides a tool that generates statistics for a crawl. You can check
the stats after each crawl and identify which URLs were fetched and which
were left unfetched.
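
In Nutch 1.x the stats tool I mean is the CrawlDb reader, e.g.
`bin/nutch readdb crawl/crawldb -stats` (the `crawl/crawldb` path is just an
assumption about where your crawl data lives). If you save that output after
each run, a small script can show how the per-status counts moved between two
crawls. Below is only a rough sketch: the two sample reports are made-up
numbers for illustration, and the helper names are my own, not part of Nutch.

```python
import re

# Matches report lines such as "status 2 (db_fetched):	163".
# Using findall on the whole text also tolerates log prefixes before "status".
STATUS_LINE = re.compile(r"status\s+\d+\s+\((\w+)\):\s*(\d+)")


def parse_status_counts(stats_text):
    """Return {status_name: count} from the text of a `readdb -stats` report."""
    return {name: int(count) for name, count in STATUS_LINE.findall(stats_text)}


def diff_counts(before, after):
    """Return {status_name: after - before} for statuses whose count changed."""
    changes = {}
    for name in sorted(set(before) | set(after)):
        delta = after.get(name, 0) - before.get(name, 0)
        if delta != 0:
            changes[name] = delta
    return changes


# Illustrative, made-up reports standing in for two saved `-stats` outputs.
RUN_A = """\
TOTAL urls:	559
status 1 (db_unfetched):	391
status 2 (db_fetched):	163
status 3 (db_gone):	5
"""

RUN_B = """\
TOTAL urls:	559
status 1 (db_unfetched):	380
status 2 (db_fetched):	174
status 3 (db_gone):	5
"""

if __name__ == "__main__":
    changes = diff_counts(parse_status_counts(RUN_A), parse_status_counts(RUN_B))
    for name, delta in changes.items():
        print(f"{name}: {delta:+d}")
```

The counts only tell you how many URLs ended up in each state. To see which
individual URLs are fetched or unfetched, `bin/nutch readdb crawl/crawldb
-dump <output dir>` writes out one record per URL with its status, and you
can compare those dumps between runs.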

Regards,
Avi Sanadhya
University Of Southern California

On Thu, Feb 12, 2015 at 10:39 AM, Nagarjun Pola <[email protected]> wrote:

> Hi Everyone,
>
> I started to use Nutch 1.10 for my homework and I see that every time I
> perform a crawl using the same configuration and same seed urls I get a
> different number of fetched urls. This occurs even when the old crawl data
> is deleted.
>
> This way I would not be able to identify which URLs had a problem being
> fetched and if it was resolved later or not.
>
> Any suggestions on how to solve this issue would be of great help.
>
> Thank You.
>
> Best,
> Nagarjun Pola
> University of Southern California
>
>
