The syntax for the crawl command is:

  Crawl <urlDir> [-dir d] [-threads n] [-depth i] [-topN N]

So your first parameter should point to the *directory* containing the file with seed URLs, not the file itself. Please fix your syntax and try again.

Rgrds, Thomas

On 6/3/06, Teruhiko Kurosaka <[EMAIL PROTECTED]> wrote:
> I tried to run the May 31 version of the nightly build but it failed.
> It has something to do with the "job", which I thought would not be
> needed if I just need to run on a regular file system. Why does Nutch
> try to use Hadoop in the default configuration? Is it necessary?
>
> -kuro
>
> $ ./bin/nutch crawl test/thoreau-url.txt -dir test/thoreau-index -depth 2
> 060602 170942 parsing jar:file:/C:/opt/nutch-060531/lib/hadoop-0.2.1.jar!/hadoop-default.xml
> 060602 170942 parsing file:/C:/opt/nutch-060531/conf/nutch-default.xml
> 060602 170942 parsing file:/C:/opt/nutch-060531/conf/crawl-tool.xml
> 060602 170942 parsing jar:file:/C:/opt/nutch-060531/lib/hadoop-0.2.1.jar!/mapred-default.xml
> 060602 170942 parsing jar:file:/C:/opt/nutch-060531/lib/hadoop-0.2.1.jar!/mapred-default.xml
> 060602 170942 parsing file:/C:/opt/nutch-060531/conf/nutch-site.xml
> 060602 170942 parsing file:/C:/opt/nutch-060531/conf/hadoop-site.xml
> 060602 170942 crawl started in: test/thoreau-index
> 060602 170942 rootUrlDir = test/thoreau-url.txt
> 060602 170942 threads = 10
> 060602 170942 depth = 2
> 060602 170942 Injector: starting
> 060602 170942 Injector: crawlDb: test/thoreau-index/crawldb
> 060602 170942 Injector: urlDir: test/thoreau-url.txt
> 060602 170942 Injector: Converting injected urls to crawl db entries.
> 060602 170942 parsing jar:file:/C:/opt/nutch-060531/lib/hadoop-0.2.1.jar!/hadoop-default.xml
> 060602 170942 parsing file:/C:/opt/nutch-060531/conf/nutch-default.xml
> 060602 170942 parsing file:/C:/opt/nutch-060531/conf/crawl-tool.xml
> 060602 170942 parsing jar:file:/C:/opt/nutch-060531/lib/hadoop-0.2.1.jar!/mapred-default.xml
> 060602 170942 parsing jar:file:/C:/opt/nutch-060531/lib/hadoop-0.2.1.jar!/mapred-default.xml
> 060602 170942 parsing jar:file:/C:/opt/nutch-060531/lib/hadoop-0.2.1.jar!/mapred-default.xml
> 060602 170942 parsing jar:file:/C:/opt/nutch-060531/lib/hadoop-0.2.1.jar!/mapred-default.xml
> 060602 170942 parsing file:/C:/opt/nutch-060531/conf/nutch-site.xml
> 060602 170943 parsing file:/C:/opt/nutch-060531/conf/hadoop-site.xml
> 060602 170948 parsing jar:file:/C:/opt/nutch-060531/lib/hadoop-0.2.1.jar!/hadoop-default.xml
> 060602 170948 parsing file:/C:/opt/nutch-060531/conf/nutch-default.xml
> 060602 170948 parsing file:/C:/opt/nutch-060531/conf/crawl-tool.xml
> 060602 170948 parsing jar:file:/C:/opt/nutch-060531/lib/hadoop-0.2.1.jar!/mapred-default.xml
> 060602 170948 parsing jar:file:/C:/opt/nutch-060531/lib/hadoop-0.2.1.jar!/mapred-default.xml
> 060602 170948 parsing jar:file:/C:/opt/nutch-060531/lib/hadoop-0.2.1.jar!/mapred-default.xml
> 060602 170948 parsing jar:file:/C:/opt/nutch-060531/lib/hadoop-0.2.1.jar!/mapred-default.xml
> 060602 170948 parsing jar:file:/C:/opt/nutch-060531/lib/hadoop-0.2.1.jar!/mapred-default.xml
> 060602 170948 parsing file:/C:/opt/nutch-060531/conf/nutch-site.xml
> 060602 170948 parsing file:/C:/opt/nutch-060531/conf/hadoop-site.xml
> 060602 170948 parsing jar:file:/C:/opt/nutch-060531/lib/hadoop-0.2.1.jar!/hadoop-default.xml
> 060602 170948 parsing jar:file:/C:/opt/nutch-060531/lib/hadoop-0.2.1.jar!/mapred-default.xml
> 060602 170948 parsing /tmp/hadoop/mapred/local/localRunner/job_7rvt51.xml
> 060602 170948 parsing file:/C:/opt/nutch-060531/conf/hadoop-site.xml
> 060602 170948 Running job: job_7rvt51
> 060602 170948 job_7rvt51
> java.io.IOException: No input directories specified in: Configuration: defaults: hadoop-default.xml , mapred-default.xml , /tmp/hadoop/mapred/local/localRunner/job_7rvt51.xmlfinal: hadoop-site.xml
>         at org.apache.hadoop.mapred.InputFormatBase.listPaths(InputFormatBase.java:96)
>         at org.apache.hadoop.mapred.InputFormatBase.getSplits(InputFormatBase.java:106)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:80)
> java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:341)
>         at org.apache.nutch.crawl.Injector.inject(Injector.java:130)
>         at org.apache.nutch.crawl.Crawl.main(Crawl.java:104)
> Exception in thread "main"
> _______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
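
For what it's worth, here is a rough sketch of the fix described in the reply at the top: put the seed file in a directory of its own and pass that directory as the first argument. The directory name test/urls is just a name picked for illustration, and the placeholder seed URL is only written if the seed file does not already exist. The crawl itself only runs if ./bin/nutch is present; otherwise the command is printed.

```shell
#!/bin/sh
# Sketch only: SEED_DIR is a hypothetical name, not something Nutch requires.
SEED_FILE=test/thoreau-url.txt
SEED_DIR=test/urls
CRAWL_DIR=test/thoreau-index

# Create the seed directory (this also creates test/ if needed).
mkdir -p "$SEED_DIR"

# Illustration only: write a placeholder seed file if none exists yet.
[ -f "$SEED_FILE" ] || echo "http://example.com/" > "$SEED_FILE"

# Move the seed file into the directory that will be passed to crawl.
mv "$SEED_FILE" "$SEED_DIR"/

# First argument is now the directory, not the file.
if [ -x ./bin/nutch ]; then
  ./bin/nutch crawl "$SEED_DIR" -dir "$CRAWL_DIR" -depth 2
else
  echo "would run: ./bin/nutch crawl $SEED_DIR -dir $CRAWL_DIR -depth 2"
fi
```

With this layout the Injector should be handed test/urls as its urlDir instead of test/thoreau-url.txt, which is what the "No input directories specified" error appears to be complaining about.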
