Hello
I have indexed several thousand (52 to be exact) text files and I keep running out of 
disk space to store the indexes.  The documents I have indexed total around 2.5 GB.  
The Lucene indexes total around 287 GB.  Does this seem correct?  I am not storing 
the contents of the files, just indexing and tokenizing them.  I am using Lucene 1.3 
final.  Can you let me know what index sizes you are seeing relative to your source 
data?  I don't want to go into production with something that I should be 
configuring better.  
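
For reference, the figures above work out to an index roughly 115 times the size of 
the source text (287 GB / 2.5 GB).  Below is a minimal sketch for measuring that ratio 
on disk.  It uses only the standard Java library, not Lucene, and the class and method 
names (IndexSizeCheck, sizeOf, ratio) are made up for illustration.

```java
import java.io.File;

class IndexSizeCheck {
    // Recursively sums the sizes, in bytes, of all regular files under a path.
    // A path that does not exist contributes 0.
    static long sizeOf(File f) {
        if (f.isFile()) {
            return f.length();
        }
        long total = 0;
        File[] children = f.listFiles(); // null if f is not a readable directory
        if (children != null) {
            for (File child : children) {
                total += sizeOf(child);
            }
        }
        return total;
    }

    // Index size divided by source size; the units cancel, so GB or bytes both work.
    static double ratio(double indexSize, double sourceSize) {
        return indexSize / sourceSize;
    }

    public static void main(String[] args) {
        if (args.length == 2) {
            // args[0] = directory of source documents, args[1] = index directory
            long source = sizeOf(new File(args[0]));
            long index = sizeOf(new File(args[1]));
            System.out.println("source bytes: " + source);
            System.out.println("index bytes:  " + index);
            System.out.println("ratio:        " + ratio(index, source));
        } else {
            // Fall back to the figures quoted in this message (in GB).
            System.out.println("ratio: " + ratio(287.0, 2.5));
        }
    }
}
```

Running it against the temp index and the real index separately would show which of 
the two directories accounts for most of the space.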

I am not sure if this helps, but I have a temp index and a real index.  I index each 
file into the temp index, and then merge the temp index into the real index using the 
addIndexes method on the IndexWriter.  I have also called setUseCompoundFile(true) on 
the production writer.  I did not set this on the temp index writer.  The last thing 
that I do before closing the production writer is to call the optimize method.  

I would really appreciate any ideas for making the index smaller, if that is at all 
possible.

Thanks
Rob
