I installed Nutch on Windows following the tutorial.

When I try to crawl, I run into the following problem.

I really don't know what needs to be done now.

-----------------------------------------------------------------------------------------------------

$ ./nutch crawl urls -dir crawl -depth 1 -topN 10 -threads 1
CrawlDb update: done
LinkDb: starting
LinkDb: linkdb: crawl/linkdb
LinkDb: URL normalize: true
LinkDb: URL filter: true
LinkDb: adding segment: file:/C:/apache-nutch-1.1-bin/apache-nutch-1.1-bin/bin/crawl/segments/20100724175909
LinkDb: adding segment: file:/C:/apache-nutch-1.1-bin/apache-nutch-1.1-bin/bin/crawl/segments/20100724180451
LinkDb: adding segment: file:/C:/apache-nutch-1.1-bin/apache-nutch-1.1-bin/bin/crawl/segments/20100724180657
LinkDb: adding segment: file:/C:/apache-nutch-1.1-bin/apache-nutch-1.1-bin/bin/crawl/segments/20100724180758
LinkDb: adding segment: file:/C:/apache-nutch-1.1-bin/apache-nutch-1.1-bin/bin/crawl/segments/20100724180914
LinkDb: adding segment: file:/C:/apache-nutch-1.1-bin/apache-nutch-1.1-bin/bin/crawl/segments/20100724183632
LinkDb: adding segment: file:/C:/apache-nutch-1.1-bin/apache-nutch-1.1-bin/bin/crawl/segments/20100724183845
LinkDb: adding segment: file:/C:/apache-nutch-1.1-bin/apache-nutch-1.1-bin/bin/crawl/segments/20100725000142
Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/C:/apache-nutch-1.1-bin/apache-nutch-1.1-bin/bin/crawl/segments/20100724175909/parse_data
Input path does not exist: file:/C:/apache-nutch-1.1-bin/apache-nutch-1.1-bin/bin/crawl/segments/20100724180451/parse_data
Input path does not exist: file:/C:/apache-nutch-1.1-bin/apache-nutch-1.1-bin/bin/crawl/segments/20100724183632/parse_data
        at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:190)
        at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:44)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:201)
        at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
        at org.apache.nutch.crawl.LinkDb.invert(LinkDb.java:170)
        at org.apache.nutch.crawl.LinkDb.invert(LinkDb.java:147)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:140)

 

-------------------------------------------------------------------

Nutch is deployed on Windows XP + Cygwin 2.67 + JRE 1.6.
                                          