On Tue, 2011-07-05 at 17:50 +0200, Hiller, Dean x66079 wrote:
> We are using a sort of nosql environment and deleting 200 gig on one machine 
> from the database is fast, but then we go and delete 5 gigs of indexes that 
> were created and it takes forever!!!!

8 million indexes means, at a minimum, 16 (24?) million files. If you are
using a conventional hard disk for that, then yes, it takes forever.
SSD is the answer to that problem, but then again, SSD is the answer to
most I/O performance problems.

Just a quick sanity check: I hope you are not storing the individual
index folders under the same root folder? If you have
indexes/index0000001/
indexes/index0000002/
indexes/index0000003/
...
indexes/index8000000/
in the same folder, you are asking for trouble, since most file systems
don't perform well with folders containing millions of entries. If that is
the case, split them into subfolders, one level per X orders of magnitude,
such as
indexes/000/000/index0000001/
indexes/000/000/index0000002/
indexes/000/000/index0000003/
...
indexes/000/500/index0500001/
...
indexes/008/000/index8000000/
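
In case it helps, a minimal sketch of that mapping in Java. The pathFor
helper and the three-digits-per-level layout are just my assumptions,
adjust to whatever grouping fits your numbering:

import java.io.File;

public class IndexPathLayout {
    // Map an index number like 8000000 to a sharded path such as
    // indexes/008/000/index8000000, so no single folder ever holds
    // millions of entries. Three digits per level is arbitrary.
    static File pathFor(long indexNumber, File root) {
        long millions  = indexNumber / 1000000;         // top-level shard
        long thousands = (indexNumber / 1000) % 1000;   // second-level shard
        return new File(root, String.format("%03d/%03d/index%07d",
                millions, thousands, indexNumber));
    }

    public static void main(String[] args) {
        System.out.println(pathFor(8000000L, new File("indexes")));
        // prints indexes/008/000/index8000000
    }
}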


Having that many tiny indexes sets off an alarm bell for me. That's
quite a special use of Lucene you've got going.

> Is there any option in lucene to make it so it uses LARGER files and less 
> count of files so it is easier to maintain and wipe out an index much faster?

Use the compound file format and optimize each index down to a single
segment. If I understand correctly, your indexes are tiny, so this should
not cause any noticeable performance hit.
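
Roughly something like the sketch below, assuming a Lucene 3.3-style
IndexWriterConfig API (the compound-file setter lives on the merge policy
there; older or newer versions differ, so check the one you are on):

import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.LogByteSizeMergePolicy;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class CompoundTinyIndex {
    public static void main(String[] args) throws Exception {
        // Write segments as compound (.cfs) files so each segment is one
        // file on disk instead of a dozen.
        LogByteSizeMergePolicy mergePolicy = new LogByteSizeMergePolicy();
        mergePolicy.setUseCompoundFile(true);

        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_33, new StandardAnalyzer(Version.LUCENE_33));
        config.setMergePolicy(mergePolicy);

        IndexWriter writer = new IndexWriter(
                FSDirectory.open(new File("indexes/000/000/index0000001")),
                config);

        // ... add the handful of documents for this index ...

        // Collapse everything into a single segment, so each tiny index
        // ends up as a handful of files that can be deleted cheaply.
        writer.optimize();
        writer.close();
    }
}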


