From: Doug Cutting
http://www.mail-archive.com/[EMAIL PROTECTED]/msg08757.html

> An index typically requires around 35% of the plain text size.

I think it's a little big.

sv

On Wed, 18 Aug 2004, Rob Jose wrote:

> Hello
> I have indexed several thousand (52 to be exact) text files and I keep 
> running out of disk space to store the indexes.  The size of the 
> documents I have indexed is around 2.5 GB.  The size of the Lucene 
> indexes is around 287 GB.  Does this seem correct?  I am not storing the 
> contents of the file, just indexing and tokenizing.  I am using Lucene 
> 1.3 final.  Can you guys let me know what you are experiencing?  I don't 
> want to go into production with something that I should be configuring 
> better.  
> 
> I am not sure if this helps, but I have a temp index and a real index.  I index the 
> file into the temp index, and then merge the temp index into the real index using 
> the addIndexes method on the IndexWriter.  I have also set the production writer 
> setUseCompoundFile to true.  I did not set this on the temp index.  The last thing 
> that I do before closing the production writer is to call the optimize method.  
> 
> I would really appreciate any ideas to get the index size smaller if it is at all 
> possible.
> 
> Thanks
> Rob


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to