If you are on Linux, I would recommend two tools you can use to track what is going on on the machine: atop ( http://freshmeat.net/projects/atop/ ) and dstat ( http://freshmeat.net/projects/dstat/ ).
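As a rough starting point, typical invocations might look like the following (the flags shown are illustrative, not the only useful ones; check each tool's man page):

```shell
# Interactive atop refreshing every 2 seconds; inside it, press 'd' for the
# disk view and 'm' for memory. A saturated disk shows up as a high DSK busy%.
atop 2

# Record a raw log (one sample every 10 seconds) for later replay with
# `atop -r <file>` -- handy when the machine slows down at random intervals.
atop -w /var/log/atop_today.raw 10

# dstat: one line per second of CPU, disk, network, paging and system stats.
dstat -cdngy 1
```

The recording mode is what makes atop useful for after-the-fact analysis: you can replay the raw file and step through the samples around the time the slowdown occurred.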
atop in particular has been very useful to me in tracking down performance issues in real time (when I am running a process) or at random intervals (when the machine slows down for no apparent reason). From the little you have told us, my hunch is that you are saturating a disk somewhere, either the index disk or swap (as pointed out by Mike).

Cheers,
François

On May 1, 2011, at 9:54 AM, Michael McCandless wrote:

> Committing too frequently is very costly, since this calls fsync on
> numerous files under the hood, which strains the IO system and can cut
> into queries. If you really want to commit frequently, turning on compound
> file format could help things, since that's 1 file to fsync instead of N,
> per segment.
>
> Also, if you have a large merge running (turning on IW's infoStream
> will tell you), this can cause the OS to swap pages out, unless you
> set swappiness (if you're on Linux) to 0.
>
> Finally, beware of having too large a JVM max heap; you may accumulate
> long-lived, uncollected garbage, which the OS may happily swap out
> (since the pages are never touched), which then kills performance when
> GC finally runs. I describe this here:
> http://blog.mikemccandless.com/2011/04/just-say-no-to-swapping.html
> It's good to leave some RAM for the OS to use as IO cache.
>
> Ideally, merging should not evict pages from the OS's buffer cache,
> but unfortunately the low-level IO flags to control this (eg
> fadvise/madvise) are not available in Java (I wrote about that here:
> http://blog.mikemccandless.com/2010/06/lucene-and-fadvisemadvise.html).
>
> However, we have a GSoC student this summer working on the problem
> (see https://issues.apache.org/jira/browse/LUCENE-2795), so after this
> is done we'll have a NativeUnixDirectory impl that hopefully prevents
> buffer cache eviction due to merging, without you having to tweak
> swappiness settings.
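Mike's swappiness suggestion can be applied like this on most Linux distributions (the sysctl name and /proc path are standard, but verify on your kernel, and note that 0 is an aggressive setting):

```shell
# Check the current value (the usual default is 60).
cat /proc/sys/vm/swappiness

# Lower it for the running system (requires root).
sysctl -w vm.swappiness=0

# Make it persistent across reboots.
echo 'vm.swappiness = 0' >> /etc/sysctl.conf
```

A lower value tells the kernel to prefer dropping page cache over swapping out anonymous process pages, which is exactly the trade-off Mike describes for a JVM holding long-lived but rarely-touched heap.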
>
> Mike
>
> http://blog.mikemccandless.com
>
> On Sat, Apr 30, 2011 at 9:23 PM, Craig Stires <craig.sti...@gmail.com> wrote:
>>
>> Daniel,
>>
>> I've been able to post documents to Solr without degrading the performance
>> of search. But I did have to make some changes to the solrconfig.xml
>> (ramBufferSizeMB, mergeFactor, autoCommit, etc.).
>>
>> What I found to be helpful was having a look at what was causing the OS
>> to grind. If your system is swapping too much to disk, you can check
>> whether bumping up the RAM (-Xms512m -Xmx1024m) alleviates it. Even if this
>> isn't the fix, you can at least isolate whether it's a memory issue or a
>> disk I/O issue (e.g. running an optimize on every commit).
>>
>> It's also worth having a look in your logs to see if the server is
>> complaining about memory, issues with your schema, or some other
>> unexpected problem.
>>
>> A resource that has been helpful for me:
>> http://wiki.apache.org/solr/SolrPerformanceFactors
>>
>> -----Original Message-----
>> From: Daniel Huss [mailto:hussdl1985-solrus...@yahoo.de]
>> Sent: Sunday, 1 May 2011 5:35 AM
>> To: solr-user@lucene.apache.org
>> Subject: Searching performance suffers tremendously during indexing
>>
>> Hi everyone,
>>
>> our Solr-based search is unresponsive while documents are being indexed.
>> The documents to index (results of a DB query) are sent to Solr by a
>> daemon in batches of varying size. The number of documents per batch may
>> vary between one and several hundred thousand.
>>
>> Before investigating any further, I would like to ask whether this can be
>> considered an issue at all. I was expecting Solr to handle concurrent
>> indexing/searching quite well; in fact this was one of the main reasons
>> for choosing Solr over the searching capabilities of our RDBMS.
>>
>> Is searching performance *supposed* to drop while documents are being
>> indexed?
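For reference, the solrconfig.xml knobs Craig mentions (and Mike's compound-file suggestion) look roughly like this in a Solr 3.x-era config; the values below are illustrative starting points, not recommendations:

```xml
<!-- solrconfig.xml (Solr 1.4/3.x layout); tune values for your workload. -->
<indexDefaults>
  <!-- Buffer more docs in RAM before flushing a segment: fewer, larger flushes. -->
  <ramBufferSizeMB>64</ramBufferSizeMB>
  <!-- Higher mergeFactor = fewer merges while indexing, more segments to search. -->
  <mergeFactor>10</mergeFactor>
  <!-- One compound file per segment to fsync instead of many (Mike's point). -->
  <useCompoundFile>true</useCompoundFile>
</indexDefaults>

<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Avoid a hard commit per batch; let Solr commit on a doc/time threshold. -->
  <autoCommit>
    <maxDocs>10000</maxDocs>
    <maxTime>60000</maxTime> <!-- milliseconds -->
  </autoCommit>
</updateHandler>
```

The autoCommit block is usually the biggest lever for Daniel's symptom: committing once per threshold instead of once per batch keeps the fsync storms (and their impact on concurrent searches) to a predictable cadence.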