I'm hoping that my emails actually reach other people, as they've been
ignored so far.
I just ran a recrawl today to crawl a few injected URLs that I have. At the
end of the recrawl I received the following error:
060623 122916 merging segment indexes to: /home/honda/nutch-0.7.2/crawl/index
Exception in thread "main" java.io.IOException: /home/honda/nutch-0.7.2/crawl/segments/20060619230003/index not a directory
    at org.apache.lucene.store.FSDirectory.init(FSDirectory.java:180)
    at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:141)
    at org.apache.nutch.indexer.IndexMerger.merge(IndexMerger.java:80)
    at org.apache.nutch.indexer.IndexMerger.main(IndexMerger.java:160)
Of course, this means that not all of the crawled segments are in the index.
Can ANYONE tell me how to fix this? I'm getting a bit discouraged with Nutch
due to the large number of errors I keep receiving during crawls. I do not
want to have to recrawl my entire sitelist AGAIN just to fix this.
Anyone?
Matt