Re: Nutch 0.8.1 problems

Doğacan Güney Wed, 21 Feb 2007 07:38:26 -0800

On 2/21/07, Oleg V. Konovalov <[EMAIL PROTECTED]> wrote:

Thanks, but as I wrote earlier, I've tried many ways, including the recommended one.


For example:

bin/nutch generate /nutch/filesystem/crawl/crawldb /nutch/filesystem/crawl/segments
Generator: starting
Generator: segment: /nutch/filesystem/crawl/segments/20070221175753
Generator: Selecting best-scoring urls due for fetch.
Exception in thread "main" java.io.IOException: Input directory /nutch/filesystem/crawl/crawldb/current in localhost:9000 is invalid.
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:274)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:327)
        at org.apache.nutch.crawl.Generator.generate(Generator.java:319)
        at org.apache.nutch.crawl.Generator.main(Generator.java:395)

/nutch/filesystem/crawl/crawldb/current EXISTS!


Very strange. I am not sure what the problem is then. Can you include
the output of these commands:

hadoop dfs -ls /nutch/filesystem/crawl/
hadoop dfs -ls /nutch/filesystem/crawl/crawldb
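
The error mentions localhost:9000, so the job appears to be resolving the
input path against DFS rather than the local disk. Whether the directory
that exists is in DFS or only on the local filesystem is a guess worth
checking, not a confirmed diagnosis. A minimal sketch for listing the same
path in both places side by side (compare_path is a made-up helper name,
and it assumes the hadoop script is on the PATH):

```shell
# Sketch only: show how one path looks locally and in DFS.
# compare_path is a hypothetical helper, not part of Nutch or Hadoop.
compare_path() {
    path="$1"

    echo "== local filesystem: $path"
    # Plain local listing; print a note instead of failing if it is absent.
    ls -l "$path" 2>&1 || echo "(no local listing for $path)"

    echo "== DFS: $path"
    # Same 'hadoop dfs -ls' command as requested above.
    hadoop dfs -ls "$path" 2>&1 || echo "(no DFS listing for $path)"
}
```

For example, compare_path /nutch/filesystem/crawl/crawldb would print both
listings, which makes it easy to see if "current" shows up in only one of
them.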


Any other ideas?

--
Oleg.



--
Doğacan Güney
