Re: SolrCloud - Query performance degrades with multiple servers(Shards)

Erick Erickson Mon, 18 Jul 2016 08:39:31 -0700

+1 to Susheel's question. Sharding inevitably adds
overhead. Roughly, each shard is queried
for its top N docs (10 if, say, rows=10). The
doc ID and sort criteria (score by default) are returned
to the node that originally got the request. That node
then sorts the lists into the real top 10 to return to
the user. Then the node handling the request re-queries
the shards for the contents of those docs.
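
To make the two round trips concrete, here is a rough sketch of that
flow. It is not Solr's actual code: the in-memory "shards", the
function names and the field names are all made up for illustration,
and it assumes sorting by score descending.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical stand-in for a shard: doc id -> (score, stored fields).
Shard = Dict[str, Tuple[float, dict]]


def shard_top_n(shard: Shard, rows: int) -> List[Tuple[float, str]]:
    """Phase 1 on one shard: return only (score, doc_id) for its top `rows` docs."""
    return heapq.nlargest(
        rows, ((score, doc_id) for doc_id, (score, _fields) in shard.items())
    )


def distributed_query(shards: Dict[str, Shard], rows: int) -> List[dict]:
    """Simulate what the node that received the request does."""
    # Phase 1: every shard reports the ids and sort values of its own top N.
    candidates = []
    for name, shard in shards.items():
        for score, doc_id in shard_top_n(shard, rows):
            candidates.append((score, doc_id, name))

    # Merge the per-shard lists into the real top N.
    winners = heapq.nlargest(rows, candidates)

    # Phase 2: go back to the owning shards for the contents of those docs.
    return [
        {"id": doc_id, "score": score, **shards[name][doc_id][1]}
        for score, doc_id, name in winners
    ]
```

With three shards that is up to three requests in the first phase plus
a second round of fetches, where a single-shard query does all of the
work in one place.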


Sharding is a way to handle very large data sets. The
general recommendation is to shard _only_ when you
have too many documents to get good query performance
from a single shard.

If you need to increase QPS, add _replicas_, not shards.
Only go to sharding when you have too many documents
to fit on your hardware.
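
As a sketch of what adding a replica looks like, the Collections API
has an ADDREPLICA action that takes the collection and shard names.
The helper below only builds the request URL; the function name, host,
and collection name are placeholders, so substitute your own.

```python
from urllib.parse import urlencode


def add_replica_url(base_url: str, collection: str, shard: str) -> str:
    """Build a Collections API ADDREPLICA request URL (does not send it)."""
    params = {"action": "ADDREPLICA", "collection": collection, "shard": shard}
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Placeholder host and collection name, for illustration only.
print(add_replica_url("http://localhost:8983/solr", "mycollection", "shard1"))
```

Each extra replica of a shard is another copy that can answer queries
for that shard, which is what raises QPS.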

Best,
Erick

On Mon, Jul 18, 2016 at 6:31 AM, Susheel Kumar <susheel2...@gmail.com> wrote:
> Hello,
>
> Question: Do you really need sharding, or can you live without it, since you
> mentioned only 10K records in one shard? What's your index/document size?
>
> Thanks,
> Susheel
>
> On Mon, Jul 18, 2016 at 2:08 AM, kasimjinwala <jinwala.ka...@gmail.com>
> wrote:
>
>> Currently I am using SolrCloud 5.0 and I am facing a query performance issue
>> while using 3 implicit shards; each shard contains around 10K records.
>> When I specify the shards parameter (*shards=shard1*) in the query it gives
>> 30K-35K qps, but when I remove the shards parameter from the query it gives
>> *1000-1500 qps*. Performance decreases drastically.
>>
>> Please provide comments or suggestions to solve the above issue.
>>
>>
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4287600.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
