Hi Himanshu,

Lowering maxWarmingSearchers should break nothing in production. Whenever
you ask Solr to open a new searcher, it autowarms the new searcher's caches
first; only after autowarming completes does the new searcher start serving
requests, and until then the old searcher keeps serving.
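
For reference, the cap lives in solrconfig.xml; a minimal sketch (the value
2 is just an illustration, not a recommendation for your setup):

```xml
<!-- solrconfig.xml: limit how many searchers may be warming at once.
     A commit that would exceed this limit fails with the
     "exceeded limit of maxWarmingSearchers" error rather than
     piling up more warming searchers in memory. -->
<maxWarmingSearchers>2</maxWarmingSearchers>
```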

The questions you need to address here are:

1. Are you using soft commits or hard commits? If you are using hard
commits and your update frequency is high, you should switch to soft commits.

2. You are dealing with only 2.1 million documents, which is a small set,
yet you are still facing issues. Why are you indexing all the fields in Solr?
You should change your schema to index only the fields you actually query
on, rather than indexing every field.

3. Check your segment count configuration in solrconfig.xml. It should be
neither too high nor too low, as it affects indexing speed: a high segment
count gives good indexing throughput but slower searches.
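
For points 1 and 2, the relevant settings are in solrconfig.xml and the
schema; a hedged sketch (the interval values and the field name
`description` are placeholders to adapt, not recommendations):

```xml
<!-- solrconfig.xml: hard-commit rarely for durability (without opening
     a searcher), soft-commit often for visibility of new documents -->
<autoCommit>
  <maxTime>60000</maxTime>            <!-- flush to disk every 60 s -->
  <openSearcher>false</openSearcher>  <!-- no new searcher on hard commit -->
</autoCommit>
<autoSoftCommit>
  <maxTime>5000</maxTime>             <!-- new docs searchable within 5 s -->
</autoSoftCommit>

<!-- schema: store but do not index fields you never query on -->
<field name="description" type="string" indexed="false" stored="true"/>
```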

Hope these things help you tune better.

Regards,
Kshitij

On Mon, Apr 10, 2017 at 1:27 PM, Himanshu Sachdeva <himan...@limeroad.com>
wrote:

> Hi Toke,
>
> Thanks for your time and quick response. As you said, I changed our logging
> level from SEVERE to INFO and indeed found the performance warning
> *Overlapping
> onDeckSearchers=2* in the logs. I am considering limiting the
> *maxWarmingSearchers* count in configuration but want to be sure that
> nothing breaks in production in case simultaneous commits do happen
> afterwards.
>
> What would happen if we set *maxWarmingSearchers* count to 1 and make
> simultaneous commit from different endpoints? I understand that solr will
> prevent opening a new searcher for the second commit but is that all there
> is to it? Does it mean solr will serve stale data( i.e. send stale data to
> the slaves) ignoring the changes from the second commit? Will these changes
> reflect only when a new searcher is initialized and will they be ignored
> till
> then? Do we even need searchers on the master as we will be querying only
> the slaves? What purpose do the searchers serve exactly? Your time and
> guidance will be very much appreciated. Thank you.
>
> On Thu, Apr 6, 2017 at 6:12 PM, Toke Eskildsen <t...@kb.dk> wrote:
>
> > On Thu, 2017-04-06 at 16:30 +0530, Himanshu Sachdeva wrote:
> > > We monitored the index size for a few days and found that it varies
> > > widely from 11GB to 43GB.
> >
> > Lucene/Solr indexes consists of segments, each holding a number of
> > documents. When a document is deleted, its bytes are not removed
> > immediately, only marked. When a document is updated, it is effectively
> > a delete and an add.
> >
> > If you have an index with 3 documents
> >   segment-0 (live docs [0, 1, 2], deleted docs [])
> > and update documents 0 and 1, you will have
> >   segment-0 (live docs [2], deleted docs [0, 1])
> >   segment-1 (live docs [0, 1], deleted docs [])
> > if you then update document 1 again, you will have
> >   segment-0 (live docs [2], deleted docs [0, 1])
> >   segment-1 (live docs [0], deleted docs [1])
> >   segment-2 (live docs [1], deleted docs [])
> >
> > for a total of ([2] + [0, 1]) + ([0] + [1]) + ([1] + []) = 6 documents.
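
The bookkeeping in the quoted example can be sketched in a few lines of
Python (an illustration of the principle, not Solr's actual implementation):

```python
# Minimal sketch (not Solr code) of how updates accumulate deleted
# documents across segments until a merge reclaims the space.

def apply_updates(segments, doc_ids):
    """Mark each updated doc as deleted in whichever segment holds it
    live, then append a new segment containing the updated versions."""
    for seg in segments:
        for doc in doc_ids:
            if doc in seg["live"]:
                seg["live"].remove(doc)
                seg["deleted"].add(doc)
    segments.append({"live": set(doc_ids), "deleted": set()})

segments = [{"live": {0, 1, 2}, "deleted": set()}]  # segment-0
apply_updates(segments, [0, 1])  # update documents 0 and 1
apply_updates(segments, [1])     # update document 1 again

total = sum(len(s["live"]) + len(s["deleted"]) for s in segments)
print(total)  # 3 live documents, but 6 stored on disk
```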
> >
> > The space is reclaimed when segments are merged, but depending on your
> > setup and update pattern that may take some time. Furthermore there is a
> > temporary overhead of merging, when the merged segment is being written
> and
> > the old segments are still available. 4x the minimum size is fairly
> large,
> > but not unrealistic, with enough index-updates.
> >
> > > Recently, we started getting a lot of out of memory errors on the
> > > master. Everytime, solr becomes unresponsive and we need to restart
> > > jetty to bring it back up. At the same we observed the variation in
> > > index size. We are suspecting that these two problems may be linked.
> >
> > Quick sanity check: Look for "Overlapping onDeckSearchers" in your
> > solr.log to see if your memory problems are caused by multiple open
> > searchers:
> > https://wiki.apache.org/solr/FAQ#What_does_.22exceeded_limit_of_maxWarmingSearchers.3DX.22_mean.3F
> > --
> > Toke Eskildsen, Royal Danish Library
> >
>
>
>
> --
> Himanshu Sachdeva
>
