Hello, I am experimenting with Solr distributed search with random sharding (I currently use geo sharding), hoping to gain some performance now and scalability in the future. (The index size keeps growing, and geo shards are hard to scale.)
However, I'm seeing worse performance with distributed search. The test server has 6 shards, a 15-core CPU, and 24G of memory, and the index is about 8G on each shard. With geo sharding it can easily take a 150 QPS load with good response times. With distributed search there are timeouts, and the average response time also increases. This is probably no big surprise, since I'm using the same number of shards plus the overhead of distributed search (merge, HTTP, network, etc.).

When I looked into the details (slow queries), I found some real issues that I need help with. For example, a query that takes 200ms with geo sharding now times out (>2000ms) with distributed search, and each shard query (isShard=true) takes about 1200ms. But if I run the query directly against the shard (without distributed search), it takes <200ms.

So I compared the two query URLs. The only difference is that the shard query issued by distributed search has "fsv=true". I understand field sort values are needed during the merge process, but I didn't expect that to make this much difference in performance, although we do have a lot of sort orders (about 20 different ones).

Any suggestions or comments on the performance problem I'm having with distributed search? Is distributed search the right choice for me? What other setups or ideas can I try?

Thanks,
XJ
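
P.S. In case it helps anyone reproduce the comparison, here is a minimal sketch of how the same shard request could be timed with and without "fsv=true". The host, core path, query, and sort field are placeholders (not my real setup), and the helper names are made up for illustration. It assumes the remaining parameters are copied from the logged shard request.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen


def build_shard_url(base_url, params, with_fsv):
    """Build a /select URL for a single shard, optionally adding fsv=true."""
    query = dict(params)  # copy so the caller's dict is not modified
    if with_fsv:
        query["fsv"] = "true"
    return base_url.rstrip("/") + "/select?" + urlencode(query)


def time_request(url, fetch=None, runs=5):
    """Return the median wall-clock time in milliseconds over several runs."""
    if fetch is None:
        # Default: a plain HTTP GET that reads the whole response body.
        fetch = lambda u: urlopen(u, timeout=10).read()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fetch(url)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return samples[len(samples) // 2]


if __name__ == "__main__":
    # Placeholder shard address and parameters; replace with the logged ones.
    shard = "http://localhost:8983/solr"
    params = {"q": "*:*", "sort": "price asc", "rows": "10"}
    for flag in (False, True):
        url = build_shard_url(shard, params, with_fsv=flag)
        print("fsv=%s: %.1f ms" % (flag, time_request(url)))
```

Running each variant several times and taking the median should keep cache warm-up from skewing the numbers.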