Hi apachenutch,

This is something of a wild guess. Given that you are using the same seed file
as I am, I would have expected to see a single URL in the index at the end of
the first iteration, not 10. The only time I have observed similar behavior was
when the fetcher truncated the page because of the http.content.limit setting.
You may want to set it to -1 (no truncation) and see if that fixes the problem.

You can verify whether this is needed by looking at the cnt column for the seed
URL and checking whether the content of the page is the same as what you get
from a view-source of the seed URL page in your browser.
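
If you would rather not compare by eye, the check can be sketched in a few
lines of Python. This is only an illustration: looks_truncated is a made-up
helper, not part of Nutch, and the 65536 default is my assumption about the
shipped http.content.limit value, so check your own nutch-default.xml. You
would need to pull the stored bytes out of the cnt column and the live bytes
from the site yourself.

```python
# Assumed default for http.content.limit in bytes; verify in nutch-default.xml.
ASSUMED_CONTENT_LIMIT = 65536


def looks_truncated(stored: bytes, live: bytes,
                    limit: int = ASSUMED_CONTENT_LIMIT) -> bool:
    """Return True if `stored` is exactly `live` cut off at `limit` bytes.

    `stored` is what ended up in the cnt column, `live` is what the server
    returns now. A negative limit means truncation is disabled.
    """
    if limit < 0:
        return False  # no truncation configured, so nothing to detect
    # Truncation only happens when the live page is longer than the limit,
    # and it leaves exactly the first `limit` bytes behind.
    return len(live) > limit and stored == live[:limit]
```

A dynamic page may differ between the crawl and your browser for other
reasons, so treat a False result as "not obviously truncated" rather than
proof that the stored copy is complete.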

Also, to answer your original question: the depth is the iteration number. Each
iteration goes one level deeper, because you are putting the outlinks generated
by the previous iteration back into the fetch list and fetching/parsing them.
You can of course script it and specify a depth parameter that controls the
number of iterations...
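
To make the depth idea concrete, here is a toy model in Python. It is not
Nutch code and does not call Nutch: crawl, the seeds list and the outlinks
dictionary are all hypothetical stand-ins for the generate, fetch, parse and
updatedb steps, just to show how each iteration feeds the next.

```python
def crawl(seeds, outlinks, depth):
    """Toy crawl loop: return the set of URLs fetched in each iteration.

    `outlinks` maps a URL to the URLs its page links to (a stand-in for what
    the parse step would extract). `depth` caps the number of iterations.
    """
    known = set(seeds)        # every URL seen so far (the "crawl db")
    fetch_list = set(seeds)   # URLs selected for the current iteration
    fetched_per_iteration = []

    for _ in range(depth):
        if not fetch_list:
            break  # nothing new was discovered, so stop early
        fetched_per_iteration.append(set(fetch_list))

        # "Parse": collect the outlinks of everything fetched this round.
        discovered = set()
        for url in fetch_list:
            discovered.update(outlinks.get(url, ()))

        # "Update": only URLs not seen before go into the next fetch list.
        fetch_list = discovered - known
        known |= fetch_list

    return fetched_per_iteration
```

With a single seed, the first iteration fetches only that seed, which is why
I would expect one URL after the first pass; the seed's outlinks only show up
in the second iteration.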

-sujit

On Feb 21, 2012, at 2:16 PM, apachenutch wrote:

> Update DB was done, after inject, generate, fetch and parse.
> Tried iterating after doing the update.
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Please-help-Nutch-fetch-command-not-fetching-data-tp3764751p3764994.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
