It may take some time before resources are released and garbage collected, which may be part of the reason old files hang around and du doesn't report much of a drop.

On Oct 1, 2009, at 8:54 AM, Phillip Farber wrote:

I am trying to automate a build process that adds documents to 10 shards across 5 machines, and I need to limit the size of a shard to no more than 200GB because I only have 400GB of disk available to optimize a given shard.

Why does the size (du) of an index typically decrease after a commit? I've observed a decrease in size of as much as from 296GB down to 151GB or as little as from 183GB to 182GB. Is that size after a commit close to the size the index would be after an optimize? For that matter, are there cases where optimization can take more than 2x? I've heard of cases but have not observed them in my system.

I seem to recall a case where it can be 3x, but I don't know that it has been observed much.

I only do adds to the shards, never query them. An LVM snapshot of the shard receives the queries.

Is doing a commit before I take a du a reliable way to gauge the size of the shard? It is really bad news to allow a shard to go over 200GB in my use case. How do others manage this problem of 2x space needed to optimize with "limited" disk space?

Do you need to optimize at all?


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search
