On 2/21/07, Oleg V. Konovalov <[EMAIL PROTECTED]> wrote:
Thanx, but... As I wrote earlier, - I've tried MANY WAYS, including recommended.
For example:

bin/nutch generate /nutch/filesystem/crawl/crawldb /nutch/filesystem/crawl/segments
Generator: starting
Generator: segment: /nutch/filesystem/crawl/segments/20070221175753
Generator: Selecting best-scoring urls due for fetch.
Exception in thread "main" java.io.IOException: Input directory /nutch/filesystem/crawl/crawldb/current in localhost:9000 is invalid.
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:274)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:327)
        at org.apache.nutch.crawl.Generator.generate(Generator.java:319)
        at org.apache.nutch.crawl.Generator.main(Generator.java:395)

/nutch/filesystem/crawl/crawldb/current EXISTS!
Very strange. I am not sure what the problem is then. Can you include the output of the following commands:

hadoop dfs -ls /nutch/filesystem/crawl/
hadoop dfs -ls /nutch/filesystem/crawl/crawldb
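
If it helps, here is a minimal sketch for collecting both listings in one go. It only wraps the two `hadoop dfs -ls` calls on the crawl directory and its crawldb. The function name `list_crawl_dirs` and the variables `HADOOP_BIN` and `CRAWL_DIR` are hypothetical, introduced here for illustration; adjust them to your installation.

```shell
#!/bin/sh
# Sketch only: HADOOP_BIN and CRAWL_DIR are assumed names, not Nutch or Hadoop settings.
# HADOOP_BIN lets you point at a specific launcher, e.g. bin/hadoop in the Nutch directory.
HADOOP_BIN="${HADOOP_BIN:-hadoop}"
CRAWL_DIR="${CRAWL_DIR:-/nutch/filesystem/crawl}"

# List the crawl directory and its crawldb as the DFS sees them.
# stderr is merged into stdout so any error text ends up in the same report.
list_crawl_dirs() {
    for dir in "$CRAWL_DIR" "$CRAWL_DIR/crawldb"; do
        echo "== $dir"
        "$HADOOP_BIN" dfs -ls "$dir" 2>&1
    done
}
```

After sourcing the script, `list_crawl_dirs > listing.txt` should leave both listings in one file that can be pasted into a reply.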
Any other ideas?

-- Oleg.
-- Doğacan Güney
