> From: Winton Davies [mailto:[EMAIL PROTECTED]]
>
> (2) How can I avoid the FD problem? I know about parallelizing the
> indexing, but I'd like to get an efficient single index before doing
> that? If I could set the Merge Factor up real high, then I think I'd
> be able to work
Assume that you can comfortably hold a 100,000 document index in RAM.
You might try something like:

  IndexWriter writer = new IndexWriter(...);
  writer.mergeFactor = 100000;
  writer.maxMergeDocs = 100000;
  ... add all your documents ...
  writer.mergeFactor = 100;
  writer.maxMergeDocs = Integer.MAX_VALUE;
  writer.optimize();
  writer.close();

The initial single-document indexes are created in a RAMDirectory.
Setting mergeFactor == maxMergeDocs means that only RAM->FS merging is
done, never FS->FS merging, so very few file handles are used.

A more efficient and slightly more complex approach would be to build
large indexes in RAM and copy them to disk with IndexWriter.addIndexes:

  IndexWriter fsWriter = new IndexWriter(new File(...), analyzer, true);
  while (... more docs to index ...) {
    RAMDirectory ramDir = new RAMDirectory();
    IndexWriter ramWriter = new IndexWriter(ramDir, analyzer, true);
    ... add 100,000 docs to ramWriter ...
    ramWriter.optimize();
    ramWriter.close();
    fsWriter.addIndexes(new Directory[] { ramDir });
  }
  fsWriter.optimize();
  fsWriter.close();

This is broken in the current release; use the nightly build instead to
try it.

If you try these, please report back on how well they work.

Doug
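[Editor's note: for readers who want a self-contained version of the second
(RAM-batching) approach, here is a minimal sketch assuming the Lucene 1.x API
of the time (StandardAnalyzer, Field.Text, public mergeFactor/maxMergeDocs
fields, IndexWriter.addIndexes). The index path, the BATCH_SIZE constant, and
the loadTexts helper are hypothetical placeholders, not part of the original
post, and as Doug notes above, addIndexes requires a nightly build at the time
of writing.]

  import java.io.File;
  import java.io.IOException;

  import org.apache.lucene.analysis.Analyzer;
  import org.apache.lucene.analysis.standard.StandardAnalyzer;
  import org.apache.lucene.document.Document;
  import org.apache.lucene.document.Field;
  import org.apache.lucene.index.IndexWriter;
  import org.apache.lucene.store.Directory;
  import org.apache.lucene.store.RAMDirectory;

  public class BatchIndexer {

      // Hypothetical batch size; tune to however many documents fit in RAM.
      private static final int BATCH_SIZE = 100000;

      public static void main(String[] args) throws IOException {
          Analyzer analyzer = new StandardAnalyzer();

          // File-system index that each completed RAM batch is merged into.
          IndexWriter fsWriter =
              new IndexWriter(new File("/tmp/index"), analyzer, true);

          String[] texts = loadTexts();   // hypothetical document source
          int added = 0;

          while (added < texts.length) {
              // Build one batch entirely in RAM.
              RAMDirectory ramDir = new RAMDirectory();
              IndexWriter ramWriter = new IndexWriter(ramDir, analyzer, true);

              for (int i = 0; i < BATCH_SIZE && added < texts.length; i++, added++) {
                  Document doc = new Document();
                  doc.add(Field.Text("contents", texts[added]));
                  ramWriter.addDocument(doc);
              }

              ramWriter.optimize();
              ramWriter.close();

              // Copy the completed RAM index onto disk in one pass.
              fsWriter.addIndexes(new Directory[] { ramDir });
          }

          fsWriter.optimize();
          fsWriter.close();
      }

      // Hypothetical stand-in for however documents are actually fetched.
      private static String[] loadTexts() {
          return new String[] { "first document", "second document" };
      }
  }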