On Thu, Mar 14, 2013 at 7:36 PM, Dat Tran <[email protected]> wrote:
> Update: When I remove or add a new URL in urls.txt (the seed list), it is
> strange that the crawling result does not change. It means Nutch always
> crawls the first seed list.

You mean that the older copy of the seeds file is effectively used and the updates you made are not reflected? What command are you using to run the crawler?

> Is this problem caused by a temporary file? How can I resolve it? Where can
> I find the temporary file of Nutch?

In a shell, you can use "ls -a" to see hidden files. That way you can find out whether there are hidden backup files in your seeds directory.

> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Iterative-Crawling-tp4046501p4047572.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
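As a small illustration of the "ls -a" suggestion above, the session below creates a seeds directory with a hidden leftover file and shows how to spot it (the directory name, file names, and URLs here are hypothetical, not taken from the thread):

```shell
# Hypothetical seeds directory for illustration.
mkdir -p urls
echo "http://example.com/" > urls/urls.txt

# Simulate a hidden backup file an editor might leave behind.
echo "http://old-seed.example.com/" > urls/.urls.txt.swp

# Plain "ls" hides dotfiles; "ls -a" also lists hidden entries,
# so a stale backup of the seed list becomes visible.
ls -a urls
```

If a hidden backup like this exists in the directory passed to the crawler, removing it ensures only the current urls.txt is used as the seed list.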

