The most common cause is not setting the agent name in the
nutch-site.xml file. First, check the log files for the task and see
whether any errors are occurring. It would also help to see more of your
configuration for crawl-urlfilter and nutch-site.
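
For reference, the agent name is set through the http.agent.name
property. A minimal sketch of the relevant part of nutch-site.xml follows;
the value MyTestCrawler is only a placeholder I made up, so substitute
your own name, and keep whatever other properties you already have in
the file:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Name the crawler identifies itself with when fetching pages.
       "MyTestCrawler" is a hypothetical placeholder value. -->
  <property>
    <name>http.agent.name</name>
    <value>MyTestCrawler</value>
  </property>
</configuration>
```

If that property is already set to a non-empty value, the agent name is
probably not the cause and the logs are the next place to look.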
Dennis
Volkan Ebil wrote:
Hi,
I have set up Nutch and Hadoop successfully.
There were no problems with start.sh and stop.sh.
I created a directory named urls containing a text file as the seed list.
Then I ran the command
bin/hadoop dfs -put urls urls
and it worked. I checked the listing with the command
bin/hadoop dfs -ls
After that I edited crawl-urlfilter.txt, nutch-site.xml,
hadoop-site.xml and the other configuration files.
Finally I ran the bin/nutch crawl command, but it gives the error
"No urls to fetch check your filter and seed list".
I inspected the contents of the webdb with the readdb -stats command.
There is no problem at the generate or inject steps.
I am sure there is no problem in crawl-urlfilter or the other
configuration XML files.
Does anyone know what the problem might be?
Thanks in advance.