Thanks Erick. This helps.

On 3/16/16 10:11 AM, Erick Erickson wrote:
First of all, "optimize-like" does _not_ happen
"every time a commit happens". What _does_ happen
is the current state of the index is examined and if
certain conditions are met _then_ segment
merges happen. Think of these as "partial optimizes".

This is under control of the TieredMergePolicy by
default.
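
For reference, a minimal sketch of where this is configured, assuming
the Solr 5.x solrconfig.xml syntax. The two values shown are, as far as
I know, the TieredMergePolicy defaults; they are here for illustration,
not as a tuning recommendation.

```xml
<!-- Sketch only: goes inside <config> in solrconfig.xml (Solr 5.x syntax assumed). -->
<indexConfig>
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <!-- Maximum number of segments merged in a single merge. -->
    <int name="maxMergeAtOnce">10</int>
    <!-- Segments allowed per tier before a merge is considered. -->
    <int name="segmentsPerTier">10</int>
  </mergePolicy>
</indexConfig>
```

If you leave this block out entirely, you get the default policy anyway.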

There are limits placed on the number of simultaneous
merges that can happen, and they're all done in
background threads so you should see lots of I/O,
but the priority of those threads is low so it shouldn't
have much impact on query perf.
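
Those limits belong to the merge scheduler. A rough sketch of how they
can be set in solrconfig.xml, assuming the ConcurrentMergeScheduler
(the usual default). The numbers are made up for illustration and are
not taken from this thread.

```xml
<!-- Sketch only: illustrative values, not recommendations. -->
<indexConfig>
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <!-- Maximum number of merges that may be pending or running at once. -->
    <int name="maxMergeCount">4</int>
    <!-- Maximum number of background threads doing merges at the same time. -->
    <int name="maxThreadCount">2</int>
  </mergeScheduler>
</indexConfig>
```

Most installations never need to touch these.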

It's theoretically possible that the background merge
will merge down to one segment, so you still need at
least as much free space on your disk as your index
occupies.

Best,
Erick


On Wed, Mar 16, 2016 at 10:07 AM, Rallavagu <rallav...@gmail.com> wrote:
Erick, Thanks for the response. Comments in line...

On 3/16/16 9:56 AM, Erick Erickson wrote:

In general, don't bother with optimize unless the index is quite static,
i.e. there are very few adds/updates or those updates are done in
batches and rarely (i.e. once a day or less frequently).

As far as space goes, this will require that you have at _least_ as much
free space on your disks as your index occupies. Shouldn't require
much in the way of RAM though.

Optimize, also referred to as "Force Merge", will merge all the segments
down to one, and in the process reclaim data from deleted (or updated)
documents.
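
If you do decide to run it explicitly, one way is to send an XML update
message to the core's /update handler. A minimal sketch follows; the
attribute values are only an example.

```xml
<!-- Sketch only: XML update message sent to the core's /update handler. -->
<!-- maxSegments="1" asks for a merge down to a single segment;          -->
<!-- waitSearcher="true" blocks until the new searcher is open.          -->
<optimize maxSegments="1" waitSearcher="true"/>
```

Run it during a quiet period, since it rewrites the whole index.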

The thing is, this is also accomplished by "background merging" which
happens automatically. Every time you do a hard commit, Lucene
figures out if any segments need to be merged and does that automatically.
During that process, any information associated with deleted docs is
reclaimed.

If an "optimize"-like operation happens automatically every time a hard
commit happens, then with the following settings (15 seconds for hard
commit), what would be the impact on performance, and particularly on
disk space?

<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

Thanks.


The third video down here:

http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
is Mike's visualization of the automatic merging process.

Best,
Erick

On Wed, Mar 16, 2016 at 9:40 AM, Rallavagu <rallav...@gmail.com> wrote:

All,

Solr 5.4 with embedded Jetty (4G heap)

Trying to understand the behavior of the "optimize" operation if not run
explicitly. What is the frequency at which this operation is run, what
are the storage requirements, and how do we schedule it? Any
comments/pointers would greatly help.

Thanks in advance
