Were there errors during parsing of that last segment?
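
The quoted trace reports parse_data missing from three of the six segments, so it may help to list every segment in that state before rerunning anything. Below is a minimal sketch, not part of Nutch: the function name `segments_missing_parse_data` is hypothetical, and it only assumes each segment is a directory that should contain a `parse_data` subdirectory once it has been parsed.

```python
from pathlib import Path


def segments_missing_parse_data(segments_dir):
    """Return the names of segment directories that have no parse_data subdirectory."""
    root = Path(segments_dir)
    return sorted(
        seg.name
        for seg in root.iterdir()
        # Only look at directories; a segment counts as unparsed if parse_data is absent.
        if seg.is_dir() and not (seg / "parse_data").is_dir()
    )


# Example call (the path is taken from the quoted log; adjust it to your setup):
# segments_missing_parse_data("/Users/toom/Downloads/nutch-1.3/sites/segments")
```

Any segment it reports was probably fetched without being parsed, so the parse log for those segments is the place to look first.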
> I'm starting with nutch and I ran a simple job as described in the
> nutch tutorial. After a while I get the following error:
>
> CrawlDb update: URL filtering: true
> CrawlDb update: Merging segment data into db.
> CrawlDb update: finished at 2011-07-12 12:32:03, elapsed: 00:00:03
> LinkDb: starting at 2011-07-12 12:32:03
> LinkDb: linkdb: /Users/toom/Downloads/nutch-1.3/sites/linkdb
> LinkDb: URL normalize: true
> LinkDb: URL filter: true
> LinkDb: adding segment: file:/Users/toom/Downloads/nutch-1.3/sites/segments/20110707140238
> LinkDb: adding segment: file:/Users/toom/Downloads/nutch-1.3/sites/segments/20110712113732
> LinkDb: adding segment: file:/Users/toom/Downloads/nutch-1.3/sites/segments/20110712114256
> LinkDb: adding segment: file:/Users/toom/Downloads/nutch-1.3/sites/segments/20110712122856
> LinkDb: adding segment: file:/Users/toom/Downloads/nutch-1.3/sites/segments/20110712122908
> LinkDb: adding segment: file:/Users/toom/Downloads/nutch-1.3/sites/segments/20110712123051
> Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/Users/toom/Downloads/nutch-1.3/sites/segments/20110707140238/parse_data
> Input path does not exist: file:/Users/toom/Downloads/nutch-1.3/sites/segments/20110712113732/parse_data
> Input path does not exist: file:/Users/toom/Downloads/nutch-1.3/sites/segments/20110712114256/parse_data
>     at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:190)
>     at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:44)
>     at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:201)
>     at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
>     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
>     at org.apache.nutch.crawl.LinkDb.invert(LinkDb.java:175)
>     at org.apache.nutch.crawl.LinkDb.invert(LinkDb.java:149)
>     at org.apache.nutch.crawl.Crawl.run(Crawl.java:142)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.nutch.crawl.Crawl.main(Crawl.java:54)

