Hi folks: I am working on an application that requires real-time indexing, i.e. for every insert, I open the writer, add a document, and then close the writer.
I want to control the number of files created, and according to the documentation, a small mergeFactor is recommended for that. However, I am seeing the opposite. See the following code segment:

    public static void main(String[] args) throws IOException {
        int mfactor = 10;
        int mbuffer = 1000;
        IndexModifier writer = null;
        File dir = new File("/tmp/john/");
        long start = System.currentTimeMillis();
        for (int i = 0; i < 5000; ++i) {
            try {
                // Create the index only if it does not exist yet.
                boolean create = !IndexReader.indexExists(dir);
                // Open a new writer for every single document (real-time indexing).
                writer = new IndexModifier(dir, new StandardAnalyzer(), create);
                writer.setMergeFactor(mfactor);
                writer.setMaxBufferedDocs(mbuffer);
                Document doc = new Document();
                doc.add(new Field("test", "this is a test doc",
                        Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));
                writer.addDocument(doc);
            } finally {
                // Close the writer after each insert.
                if (writer != null) {
                    writer.close();
                }
            }
        }
        long end = System.currentTimeMillis();
        System.out.println("took: " + (end - start));
    }

If I set the mfactor value to a high number, e.g. 1000, indexing takes much longer but the number of files decreases dramatically.

Is this expected, or are there better ways of tuning the indexing parameters so that I limit the number of open files while getting a decent indexing speed?

Thanks

-John