Nutch expects "urls" to be a directory, not a file. Create a directory named "urls" inside your Nutch directory, create a file in it with any name you like, and add the URLs you want to crawl to that file, one per line.
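Something along these lines should work from the Cygwin shell. It is only a sketch: it assumes you run it from the nutch-0.9 directory shown in your log, and the file name seed.txt is just an example I picked, any name will do.

```shell
# Run from the Nutch directory (nutch-0.9 in the log above).
# If an earlier `echo ... > urls` left a plain file named "urls",
# move it aside so the directory can be created.
if [ -f urls ]; then
  mv urls urls.old
fi

# Create the directory that the Injector reads seed URLs from.
mkdir -p urls

# seed.txt is an arbitrary example name; put one URL per line.
echo 'http://dawahweb.net' > urls/seed.txt
```

After that, rerun the same crawl command you used before; the Injector should now find the "urls" path.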
Injector: urlDir: urls
Input path doesnt exist : C:/cygwin/home/Mouad&Sibel/nutch-0.9/urls

Mouad wrote:
> Hello,
> i installed Nutch on windows and everything went well until I wanted to
> crawl a website.
> I typed this line on the urls file that I created on nutch directory : echo
> 'http://dawahweb.net' > urls
> I could not create a WebDB trying to type admin db -create
> I received this log :
> crawl started in: crawl-tinysite
> rootUrlDir = urls
> threads = 10
> depth = 1
> Injector: starting
> Injector: crawlDb: crawl-tinysite/crawldb
> Injector: urlDir: urls
> Injector: Converting injected urls to crawl db entries.
> Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException:
> Input path doesnt exist : C:/cygwin/home/Mouad&Sibel/nutch-0.9/urls
> at org.apache.hadoop.mapred.InputFormatBase.validateInput(InputFormatBase.java:138)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:326)
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:543)
> at org.apache.nutch.crawl.Injector.inject(Injector.java:162)
> at org.apache.nutch.crawl.Crawl.main(Crawl.java:115)
>
> can anyone help please?
>
> Mouad
>