Re: Real Time Search and External File Fields

Shawn Heisey Sun, 09 Oct 2016 07:28:37 -0700

On 10/8/2016 1:18 PM, Mike Lissner wrote:
> I want to make sure I understand this properly and document this for
> future people that may find this thread. Here's what I interpret your
> advice to be:
> 0. Slacken my auto soft commit interval to something more like a minute.


Yes, I would do this.  I would also increase autoCommit to something
between one and five minutes, with openSearcher set to false.  There's
nothing *wrong* with 15 seconds for autoCommit, but I want my server to
be doing less work during normal operation.

To answer a question you posed in a later message: Yes, it's common for
users to have a longer interval on autoSoftCommit than autoCommit. 
Remember the mantra in the URL about understanding commits:  Hard
commits are about durability, soft commits are about visibility.  Hard
commits when openSearcher is false are almost always *very* fast, so
it's typically not much of a burden to have them happen more frequently,
and thus have a better data durability guarantee.  Like I said above, I
generally use an autoCommit value between one and five minutes.
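
As a rough sketch of the commit settings described above, the relevant
part of solrconfig.xml might look something like the fragment below.
The maxTime values are illustrative picks from the ranges mentioned
(about a minute for soft commits, one to five minutes for hard
commits), not tested recommendations, and the surrounding
updateHandler element is assumed to match a stock config.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: durability. Flushes to disk but does not open a
       new searcher, so it stays cheap. 120000 ms = 2 minutes
       (assumed value within the one-to-five-minute range). -->
  <autoCommit>
    <maxTime>120000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit: visibility. Opens a new searcher so recent
       changes become searchable. 60000 ms = 1 minute (assumed). -->
  <autoSoftCommit>
    <maxTime>60000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

With settings along these lines, new documents become visible within
roughly a minute, and hard commits happen independently of visibility.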

> I'm a bit confused about the example autowarmCount for the caches, which is
> 0. Why not set this to something higher? I guess it's a RAM utilization vs.
> speed tradeoff? A low number like 16 seems like it'd have minimal impact on
> RAM?

A low autowarmCount is generally chosen for one reason: commit speed. 
If the example configs have it set to zero, I'm sure this was done so
commits would proceed as fast as possible.  Large values can turn
opening a new searcher into a process that can take *minutes*.

On my index shards, the autowarmCount on my filterCache is *four*. 
That's it -- execute only four of the most recent filters in the cache
when a new searcher opens.  That warming *still* sometimes takes as long
as 20 seconds on the larger shards.  The filters used in queries on my
indexes are very large and very complex, and can match millions of
documents.  Pleading with the dev team to decrease query complexity
doesn't help.
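
For illustration, a filterCache definition with a small autowarmCount
might look like the fragment below. Only the autowarmCount of four
comes from the description above; the cache class and the size values
are assumptions borrowed from typical example configs and should be
adjusted for the actual index.

```xml
<!-- Lives inside the <query> section of solrconfig.xml.
     class, size, and initialSize are assumed values. -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="4"/>
```

When a new searcher opens, up to autowarmCount entries from the old
cache are re-executed against the new searcher, so the warming time
grows with both that number and the cost of each filter.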

On the idea of reusing the external file data when it doesn't change:  I
do not know if this is possible.  I have no idea how Solr and Lucene use
the data found in the external file, so it might be completely necessary
to re-load it every time.  You can open an issue in Jira to explore the
idea, but don't be too surprised if it doesn't go anywhere.

Thanks,
Shawn
