Shard timeouts on large (1B docs) Solr cluster

Jay Hill Thu, 26 Jan 2012 10:28:32 -0800

I'm on a project where we have 1B docs sharded across 20 servers. We're not
in production yet and we're doing load tests now. We're ramping the load up
to 100 qps per server. As the load increases, we're seeing query times
sporadically jump to 10 seconds, 20 seconds, etc. What we're trying to do
is set a shard timeout so that shard responses taking longer than 2 seconds
are discarded. We can live with fewer results in these cases. We're not
replicating yet, as we want to see how the 20 shards perform first (plus
we're waiting on the massive amount of hardware).


I've tried setting the following config in our default req. handler:
<int name="shard-socket-timeout">2000</int>
<int name="shard-connection-timeout">2000</int>

I've just added these, and am testing now, but this doesn't look promising
either:
<int name="timeAllowed">2000</int>
<bool name="partialResults">true</bool>

Couldn't find much on the wiki about these params - I'm looking for more
details about how these work. I'll be happy to update the wiki with more
details based on the discussion here.

Any details about exactly how I can achieve my goal of timing out and
disregarding queries longer than 2 seconds would be greatly appreciated.
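
To make the goal concrete, here is a rough Python sketch of the behavior
I'm after: fan the query out to every shard, wait up to a fixed budget, and
merge only the responses that arrived in time. This is not how Solr does it
internally; the names scatter_gather and fake_query are made up for
illustration, and fake_query just stands in for a real shard request.

```python
import concurrent.futures as cf
import time


def scatter_gather(shards, query_fn, timeout_s=2.0):
    """Query all shards in parallel; keep only answers that arrive in time.

    Returns (results, dropped): results maps shard -> response, dropped is a
    sorted list of shards that timed out or raised an error.
    """
    pool = cf.ThreadPoolExecutor(max_workers=len(shards))
    futures = {pool.submit(query_fn, shard): shard for shard in shards}

    # Block for at most timeout_s, then split into finished / unfinished.
    done, not_done = cf.wait(futures, timeout=timeout_s)

    results = {}
    dropped = [futures[f] for f in not_done]
    for f in done:
        if f.exception() is None:
            results[futures[f]] = f.result()
        else:
            dropped.append(futures[f])  # a failed shard is treated like a slow one

    # Do not wait for the slow shards; their threads finish in the background.
    pool.shutdown(wait=False)
    return results, sorted(dropped)


def fake_query(shard):
    """Hypothetical stand-in for a shard request; 'shard2' is artificially slow."""
    if shard == "shard2":
        time.sleep(1.0)
    return [shard + "-doc"]
```

Calling scatter_gather(["shard1", "shard2", "shard3"], fake_query,
timeout_s=0.2) should hand back results for shard1 and shard3 only and
report shard2 as dropped, which is the "fewer results, but on time" trade
I'm willing to make.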

The index is insanely lean - no stored fields, no norms, no stop words,
etc. RAM buffer is 128 MB, and we're using the standard "search" req.
handler. Essentially we're running Solr as a NoSQL data store, which suits
this project, but we need responses to take no longer than 2 seconds at the
max.

Thanks,
-Jay
