I'm hoping that my emails actually reach other people, as they've been ignored so far.

I ran a recrawl today to fetch a few URLs that I had injected. At the end of the recrawl I received the following error:

060623 122916 merging segment indexes to: /home/honda/nutch-0.7.2/crawl/index
Exception in thread "main" java.io.IOException: /home/honda/nutch-0.7.2/crawl/segments/20060619230003/index not a directory
        at org.apache.lucene.store.FSDirectory.init(FSDirectory.java:180)
        at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:141)
        at org.apache.nutch.indexer.IndexMerger.merge(IndexMerger.java:80)
        at org.apache.nutch.indexer.IndexMerger.main(IndexMerger.java:160)

Of course all of the crawled segments are not in the index.
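
In case it helps anyone narrow this down: the trace shows the merge failing because the path segments/20060619230003/index is not a directory. Below is a rough sketch, not part of Nutch, that lists every segment directory lacking an index subdirectory. The class name SegmentIndexCheck and its method are made up for illustration, and it assumes the crawl/segments/<timestamp>/index layout seen in the error message.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Nutch class: reports segments without an "index" directory.
public class SegmentIndexCheck {

    // Returns the segment directories under segmentsDir that have no "index" subdirectory.
    public static List<File> segmentsWithoutIndex(File segmentsDir) {
        List<File> missing = new ArrayList<>();
        File[] segments = segmentsDir.listFiles(File::isDirectory);
        if (segments == null) {
            return missing; // segmentsDir does not exist or is not readable
        }
        for (File segment : segments) {
            // Mirrors the condition in the error: the path must exist and be a directory.
            if (!new File(segment, "index").isDirectory()) {
                missing.add(segment);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Default path is an assumption; pass the real segments directory as the first argument.
        File segmentsDir = new File(args.length > 0 ? args[0] : "crawl/segments");
        for (File segment : segmentsWithoutIndex(segmentsDir)) {
            System.out.println("no index directory: " + segment);
        }
    }
}
```

Running it against /home/honda/nutch-0.7.2/crawl/segments should at least show whether 20060619230003 is the only segment affected.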

Can ANYONE tell me how to fix this? I'm getting a bit discouraged with Nutch due to the large number of errors I keep receiving during crawls. I do not want to have to recrawl my entire site list AGAIN just to fix this.

Anyone?

Matt
