Please try this command:

bin/nutch crawl search -dir /usr/data/crawl -depth 2 &> crawl.log &

Here "search" is the folder holding the files that list the URLs, so the first argument is a directory rather than a single URL file. The crawler will write its output under /usr/data/crawl, including the crawldb folder, and crawl.log is the log file.

Hope this helps.

Thanks,
Sudhi
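In case it is useful, here is a minimal sketch of the setup that command assumes. The seed URL http://example.com/ and the file name urls.txt are placeholders of mine, not taken from your setup, and the crawl only starts if bin/nutch is present in the current directory.

```shell
# Create the seed directory that is passed as the first argument to "crawl".
mkdir -p search

# Put one URL per line in a file inside that directory.
# (http://example.com/ and urls.txt are placeholders; use your own.)
printf '%s\n' 'http://example.com/' > search/urls.txt

# Start the crawl in the background, sending stdout and stderr to crawl.log.
# "> crawl.log 2>&1" is the portable spelling of bash's "&> crawl.log".
if [ -x bin/nutch ]; then
  bin/nutch crawl search -dir /usr/data/crawl -depth 2 > crawl.log 2>&1 &
fi
```

The point is that the argument after "crawl" names a directory; the URL list lives in a file inside it.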
BDalton <[EMAIL PROTECTED]> wrote:

I get this error:

bin/nutch crawl url.txt -dir newcrawled -depth 2 >& crawl.log

Exception in thread "main" java.io.IOException: Input directory d:/nutch3/urls/urls.txt in local is invalid.
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:274)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:327)
    at org.apache.nutch.crawl.Injector.inject(Injector.java:138)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:105)

--
View this message in context: http://www.nabble.com/0.8--Will-not-accept-url-list-file-on-Windows-tf1962714.html#a5385356
Sent from the Nutch - User forum at Nabble.com.
