Hello Solr Users,

I just wrote up a piece about some work I did recently to improve the throughput of distributed search.

http://www.zinascii.com/2013/solr-distributed-search-and-the-stale-check.html

The short of it is that the stale check in Apache's HTTP Client, which SolrJ uses, can add a lot of latency to a distributed search request. This is especially true because a distributed search is actually made up of 2 stages, each of which must perform its own stale check. For my particular benchmark setup, I saw a 2-4x increase in throughput and a 100ms+ drop in latency with the check disabled.

All my work has been done in the context of a larger project, Yokozuna [1], and thus the patch is currently local to that project. I would like to see a similar fix made upstream, which is why I am posting here. I was hoping the Solr sages could offer their input.

My fix is very basic: it simply disables the check and adds a sweeper thread to prevent socket reset errors [2]. If I had more time, I think a rewrite using the latest Apache HTTP Components might be in order, but I'm not sure.

I'm happy to answer any questions and give more details on my test setup.

-Z

[1] https://github.com/rzezeski/yokozuna
[2] https://github.com/rzezeski/yokozuna/blob/a731748f07ee2156b5b3eb558e6b8a3efda4bfe4/solr-patches/no-stale-check.patch
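
For readers who want a concrete picture of the "disable the check, add a sweeper thread" idea described above, here is a minimal sketch in plain Java. It is not the patch in [2]. The `ConnectionPool` interface and the `IdleConnectionSweeper` class are hypothetical names made up for illustration; the two pool methods are modeled on what I understand the HttpClient 4.x connection manager to offer (`closeExpiredConnections()` and `closeIdleConnections(long, TimeUnit)`), and the stale check itself would be turned off separately on the client (in HttpClient 4.x, via `HttpConnectionParams.setStaleCheckingEnabled(params, false)`). The period and idle threshold are placeholders, not tuned values.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/**
 * Sketch of a background sweeper that evicts idle pooled connections, so that
 * a client with the stale check disabled is less likely to reuse a socket the
 * server has already closed.
 */
class IdleConnectionSweeper implements AutoCloseable {

    /** Hypothetical stand-in for the HTTP client's connection manager. */
    interface ConnectionPool {
        void closeExpiredConnections();
        void closeIdleConnections(long idleTime, TimeUnit unit);
    }

    private final ScheduledExecutorService scheduler;

    /**
     * @param pool      pool to sweep
     * @param periodMs  delay between sweeps, in milliseconds
     * @param maxIdleMs connections idle longer than this are closed
     */
    IdleConnectionSweeper(ConnectionPool pool, long periodMs, long maxIdleMs) {
        // Daemon thread: the sweeper should never keep the JVM alive.
        scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "idle-connection-sweeper");
            t.setDaemon(true);
            return t;
        });
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                pool.closeExpiredConnections();
                pool.closeIdleConnections(maxIdleMs, TimeUnit.MILLISECONDS);
            } catch (RuntimeException e) {
                // Swallow the error: an uncaught exception would cancel
                // all future runs of this scheduled task.
            }
        }, periodMs, periodMs, TimeUnit.MILLISECONDS);
    }

    /** Stops the sweeper thread. */
    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

The idle threshold should be shorter than the server's keep-alive timeout; otherwise the server can still close a connection before the sweeper evicts it, and the next request on that connection will hit a reset.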