Hi,
I am getting the following exception when indexing (right after adding
segments):
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory /home/user/nutch/crawl/indexes already exists
        at org.apache.hadoop.mapred.OutputFormatBase.checkOutputSpecs(OutputFormatBase.java:96)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:329)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:543)
        at org.apache.nutch.indexer.Indexer.index(Indexer.java:273)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:134)

The strange thing is that it only happens sometimes (roughly every second
run), even though I delete the folder /home/user/nutch/crawl before starting
the crawler.

Does anyone know what might be happening here and how I can fix it?
I am on a one-year-old Nutch 0.9 setup, and the problem only started recently.

Best regards,
Magnus
