On 2/3/2020 5:17 PM, ChienHua wrote:
How should we expect query performance to be affected by splitting one
collection into more shards?

We expected query performance to degrade with more shards because of
the overhead of merging results from several shards.

However, the test results are not what we expected. Any ideas or experience
with the performance impact?

This is an often misunderstood aspect of Solr performance.

In situations with a very high query rate, splitting into shards is generally going to reduce performance. This happens because, as you mentioned, there is overhead from merging the results. A high query rate keeps all the CPUs very busy, so that extra work per query cannot be absorbed by idle capacity.

But in situations with a low query rate, more shards can actually make things faster. This is possible when there is a significant surplus of available CPU capacity: the subqueries for one query can run concurrently, so even with the overhead of merging, the overall result is faster.
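
As a rough illustration of the two regimes described above, here is a toy model in Python. It is only a sketch: the function names, the 80 ms of search work, the 2 ms per-shard merge cost, and the 8 cores are all made-up assumptions, not measurements from Solr. It also optimistically assumes the total search work stays constant when the index is split.

```python
# Toy model only: every number and name here is an illustrative assumption,
# not something measured from or provided by Solr.

def merge_cost_ms(shards, merge_ms_per_shard):
    """Assumed coordinator overhead: nothing to merge with a single shard."""
    return merge_ms_per_shard * shards if shards > 1 else 0.0


def idle_latency_ms(work_ms, shards, merge_ms_per_shard):
    """Latency of one query when spare CPU lets all subqueries run concurrently."""
    # Each shard searches roughly 1/shards of the index, then results are merged.
    return work_ms / shards + merge_cost_ms(shards, merge_ms_per_shard)


def saturated_qps(work_ms, shards, merge_ms_per_shard, cores):
    """Maximum sustainable queries per second when every core is already busy."""
    # Under saturation only total CPU time per query matters, and merging adds to it.
    cpu_ms_per_query = work_ms + merge_cost_ms(shards, merge_ms_per_shard)
    return cores * 1000.0 / cpu_ms_per_query


if __name__ == "__main__":
    for shards in (1, 2, 4, 8):
        latency = idle_latency_ms(80.0, shards, 2.0)
        capacity = saturated_qps(80.0, shards, 2.0, cores=8)
        print(f"shards={shards} idle latency={latency:.1f} ms capacity={capacity:.1f} qps")
```

With these invented inputs, going from 1 shard to 4 cuts the idle-system latency from 80 ms to 28 ms, while the saturated capacity drops from 100 queries per second to about 91. Real numbers depend heavily on the index, the queries, and the hardware.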

The size of the index can also affect this dynamic. If you take an index that is way too big for a single machine and split it so it has shards on multiple machines, that can improve query performance dramatically.

Thanks,
Shawn
