On 3/19/2013 2:31 PM, Brian Hurt wrote:
> Which is the problem- you might think that 60ms unique key accesses
> (what I'm seeing) is more than good enough- and for most use cases,
> you'd be right.  But it's not unusual for a single web-page hit to
> generate many dozens, if not low hundreds, of calls to get document by
> id.  At which point, 60ms hits pile up fast.

I have to concur with Jack's assessment that 60ms may indicate a general performance issue, possibly caused by not having enough memory in your server.

I've got a distributed index with 77 million documents in it, seven shards, total index size about 85GB. It's running 4.2.

I tried some uncached unique id queries on it. This search kicks off seven shard searches against two servers, collates the results, then returns them to the browser. The results came back with a QTime of 7-8 milliseconds. When I try a different uncached query against one of the shard servers directly (14GB index size), the QTime value is zero.
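If you want to compare numbers on your own install, here is a rough Python sketch of how one might pull the QTime out of a unique id lookup. It assumes the unique key field is named "id", that the JSON response writer (wt=json) is available, and that the core lives at a URL like http://localhost:8983/solr/collection1. All of those are placeholders for illustration, not details of my setup. It also does not escape quote characters inside the id value.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_id_query_url(base_url, doc_id, id_field="id"):
    """Build a select URL that looks up one document by its unique key.

    base_url and id_field are assumptions; adjust them to your core and schema.
    """
    params = urlencode({"q": f'{id_field}:"{doc_id}"', "wt": "json", "rows": 1})
    return f"{base_url}/select?{params}"


def extract_qtime(response_body):
    """Return the QTime (in milliseconds) from a Solr JSON response body."""
    return json.loads(response_body)["responseHeader"]["QTime"]


def measure_qtime(base_url, doc_id):
    """Run one lookup against a live server and return the reported QTime."""
    with urlopen(build_id_query_url(base_url, doc_id)) as resp:
        return extract_qtime(resp.read().decode("utf-8"))
```

Use a different id on every call if you want uncached numbers, since repeating the same query will be answered from Solr's caches. Keep in mind that QTime only covers the time Solr spent on the query, not the network transfer or response parsing on the client side.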

I have this performance level because I have plenty of extra RAM, which lets the OS cache the index files effectively. Each server has half the index (over 40GB on disk) and 64GB of RAM. Of that 64GB, 6GB is allocated to Solr. If we say the OS takes up 1GB (which it most likely does not), that leaves 57GB for the OS disk cache. Java's garbage collector is highly tuned in my setup, because without that tuning, I experience very long GC pauses.
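The memory arithmetic above can be written out as a small Python sketch, in case you want to plug in your own server's numbers. The function name is made up for illustration, and the 42.5GB per-server index size is my assumption (half of the 85GB total), since the only figure given above is "over 40GB".

```python
def os_cache_headroom_gb(total_ram_gb, jvm_heap_gb, os_overhead_gb, index_on_disk_gb):
    """Return (RAM left for the OS disk cache, cache left over after the index).

    A negative second value means the index cannot fit entirely in the
    disk cache, so some lookups will have to go to disk.
    """
    cache_gb = total_ram_gb - jvm_heap_gb - os_overhead_gb
    return cache_gb, cache_gb - index_on_disk_gb


# Figures from this message: 64GB RAM, 6GB Java heap, 1GB assumed for the OS.
# 42.5GB is an assumed per-server index size (half of 85GB).
cache, spare = os_cache_headroom_gb(64, 6, 1, 42.5)
print(cache, spare)  # 57 14.5
```

If the second number comes out well below zero on your server, that would be consistent with the kind of latency you are seeing, though it is not proof that memory is the cause.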


Here's some additional info that may or may not be useful to you:

The BloomFilter postings format for Lucene is rumored to have amazing performance improvements for searching unique keys.

An obstacle: Solr does not currently have an out-of-the-box way to actually use it. A high-level solution has been proposed, but no code has been written yet. The following issue describes the current state:

https://issues.apache.org/jira/browse/SOLR-3950

You could always write your own custom postings format instead of waiting for someone (most likely me) to figure out how to go about including it directly in Solr. If you do this, I hope you'll be able to attach your code to the issue so everyone benefits.

Thanks,
Shawn
