We use Solr to inject Lucene documents that were not crawled by Nutch
into the main Nutch index. This works fine: we can search both the
Nutch docs and the Solr-injected docs with one query (not using the
Nutch searcher).
However, I would like to use the IndexMerger to merge successive
Nutch crawls. If one of the index directories we give bin/nutch merge
contains Solr-generated Lucene docs, we get:
2007-01-26 02:49:34,093 INFO indexer.IndexMerger - merging indexes to: crawl/index
2007-01-26 02:49:34,094 INFO indexer.IndexMerger - Adding crawl/index_07_01_25_20_33_15/part-00000
2007-01-26 02:49:34,102 INFO indexer.IndexMerger - Adding crawl/index/_0.fnm
2007-01-26 02:49:34,106 FATAL indexer.IndexMerger - IndexMerger: java.io.IOException: crawl/index/_0.fnm not a directory
    at org.apache.nutch.indexer.FsDirectory.<init>(FsDirectory.java:44)
    at org.apache.nutch.indexer.IndexMerger.merge(IndexMerger.java:82)
    at org.apache.nutch.indexer.IndexMerger.run(IndexMerger.java:150)
    at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:189)
    at org.apache.nutch.indexer.IndexMerger.main(IndexMerger.java:113)
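Going by the "Adding ..." lines in the log, the merger seems to treat every entry inside a directory we pass as its own index. That works for crawl/index_07_01_25_20_33_15 (which holds part-00000) but not for crawl/index, where the Lucene files such as _0.fnm sit directly in the directory. Assuming that reading is right, here is a rough sketch of a pre-flight check I could run on a directory before handing it to bin/nutch merge. The function name classify_index_dir is made up for illustration and is not part of Nutch; the demo builds throwaway directories that only mimic the two layouts from the log.

```python
import os
import tempfile


def classify_index_dir(path):
    """Describe the layout of a candidate index directory.

    Returns "parts" if every entry is a subdirectory (part-00000 style),
    "flat" if the directory only holds plain files (e.g. _0.fnm),
    "mixed" if it holds both, and "empty" if it holds nothing.
    Hypothetical helper, not a Nutch API.
    """
    entries = [os.path.join(path, name) for name in sorted(os.listdir(path))]
    if not entries:
        return "empty"
    subdirs = [entry for entry in entries if os.path.isdir(entry)]
    if len(subdirs) == len(entries):
        return "parts"
    if not subdirs:
        return "flat"
    return "mixed"


# Demo on temporary directories that imitate the layouts seen in the log.
with tempfile.TemporaryDirectory() as root:
    parts_dir = os.path.join(root, "index_07_01_25_20_33_15")
    os.makedirs(os.path.join(parts_dir, "part-00000"))

    flat_dir = os.path.join(root, "index")
    os.makedirs(flat_dir)
    open(os.path.join(flat_dir, "_0.fnm"), "w").close()

    for candidate in (parts_dir, flat_dir):
        print(os.path.basename(candidate), "->", classify_index_dir(candidate))
```

A directory reported as "flat" or "mixed" is the kind that, on my reading of the trace, would make the merger try to open a single file as an index directory.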
Any way around this?