We use Solr to inject Lucene documents that were not crawled by Nutch into the main Nutch index. This works fine: we can search both the Nutch docs and the Solr-injected docs with one query (not using the Nutch searcher).

However, I would like to use the IndexMerger for merging successive Nutch crawls. If one of the index directories we give bin/nutch merge has Solr-generated Lucene docs in it, we get:

2007-01-26 02:49:34,093 INFO indexer.IndexMerger - merging indexes to: crawl/index
2007-01-26 02:49:34,094 INFO indexer.IndexMerger - Adding crawl/index_07_01_25_20_33_15/part-00000
2007-01-26 02:49:34,102 INFO indexer.IndexMerger - Adding crawl/index/_0.fnm
2007-01-26 02:49:34,106 FATAL indexer.IndexMerger - IndexMerger: java.io.IOException: crawl/index/_0.fnm not a directory
        at org.apache.nutch.indexer.FsDirectory.<init>(FsDirectory.java:44)
        at org.apache.nutch.indexer.IndexMerger.merge(IndexMerger.java:82)
        at org.apache.nutch.indexer.IndexMerger.run(IndexMerger.java:150)
        at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:189)
        at org.apache.nutch.indexer.IndexMerger.main(IndexMerger.java:113)
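
From the log, it looks as if the merger treats every entry inside a directory we pass as a separate index: it adds crawl/index/_0.fnm, which is a Lucene segment file, and then fails because that entry is not a directory. A minimal sketch of a pre-check under that assumption follows. IndexLayoutCheck and nonDirectoryChildren are hypothetical names, not part of Nutch or Lucene, and the sketch only inspects the local filesystem:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Nutch or Lucene.
class IndexLayoutCheck {

    // Returns the children of indexesDir that are not directories.
    // Assumption: the merger opens every child of the given directory as
    // an index directory, so any entry returned here (for example a
    // segment file such as _0.fnm) would trigger "not a directory".
    static List<Path> nonDirectoryChildren(Path indexesDir) throws IOException {
        List<Path> offenders = new ArrayList<>();
        try (DirectoryStream<Path> children = Files.newDirectoryStream(indexesDir)) {
            for (Path child : children) {
                if (!Files.isDirectory(child)) {
                    offenders.add(child);
                }
            }
        }
        return offenders;
    }
}
```

If that reading is right, a directory holding only part-NNNNN subdirectories would return an empty list, while a flat index such as crawl/index would return its segment files.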


Any way around this?



