The second call of bin/crawl seems to have swapped seed and crawl directory:
> root@Walleye:~/nutch# bin/crawl crawl urls -dir crawl -depth 3 -topN 5

The first call

  root@myserver:~/nutch# bin/crawl urls/seed.txt testcrawl -dir crawl -depth 3 -topN 50

seems to have the order correct, but it also does not entirely follow the scheme

  % bin/crawl <seedDir> <crawlDir> <solrURL> <numberOfRounds>

The status message of the Injector also indicates that the seed directory
does not contain URLs:

(first call)
Injector: total number of urls rejected by filters: 0
Injector: total number of urls injected after normalization and filtering: 1
=> one URL injected

(second call)
> Injector: total number of urls rejected by filters: 0
> Injector: total number of urls injected after normalization and filtering: 0
=> zero URLs

Sebastian

On 07/05/2014 02:57 AM, CdnGuy wrote:
> Wiped and rebuilt the server from scratch.
> Followed tutorial again.
> Here's the results:
> root@Walleye:~/nutch# bin/crawl crawl urls -dir crawl -depth 3 -topN 5
> Injector: starting at 2014-07-04 20:50:27
> Injector: crawlDb: urls/crawldb
> Injector: urlDir: crawl
> Injector: Converting injected urls to crawl db entries.
> Injector: total number of urls rejected by filters: 0
> Injector: total number of urls injected after normalization and filtering: 0
> Injector: Merging injected urls into crawl db.
> Injector: overwrite: false
> Injector: update: false
> Injector: finished at 2014-07-04 20:50:31, elapsed: 00:00:03
>
> Still no files in the crawl directory.
> What am I missing?
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/NutchTutorial-Followed-Crawldb-Not-Created-tp4145668p4145686.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>

