Hi everyone! I posted a similar question earlier, but in a thread about facets in general, so I thought I'd repost it here as a separate thread.
I have a faceted search that is very fast when I execute the query on a single Solr server, but significantly slower when executed in a distributed environment. The bottleneck seems to be in the sharding of our data, and that puzzles me a bit. I can't really see why Solr is so slow at doing this.

The scenario: let's say we have two servers (s1 and s2). If I run the following query directly on either server, the response is lightning fast (<10ms):

q=threadid:33&facet=true&facet.field=author&limit=-1&facet.mincount=0&rows=0

So, in theory, I could query them directly, concatenate the results myself, and get that done pretty fast.

But if I introduce the shards parameter, the response time balloons to between 15000ms and 20000ms!

shards=s1:8983/solr,s2:8983/solr

My initial thought was that I MUST be doing something wrong here. So I tried the following: run the query on server s1, with the shards param pointing only at itself:

shards=s1:8983/solr

The response time goes from sub-10ms to between 5000ms and 10000ms! I get the same results if I run the query on s2, and the same if I use shards=s2:8983/solr. In other words, a facet query on s1 with s1 as its only shard is still as slow as if the shards param pointed to a different server.

Is there really that much overhead in running a distributed facet field query with Solr? Has anyone else experienced this?

On the other hand, running regular distributed queries without faceting is lightning fast, so I can't really see that this is a network problem or anything like that.

Any insight into this would be greatly appreciated! (I would like to avoid having to hack together our own solution for concatenating results...)

Cheers,
Aleks
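
For reference, the kind of manual merge mentioned above (query each server directly, then combine the facet counts) could look roughly like the sketch below. It is Python and only an illustration under stated assumptions: it assumes each server's response was fetched with wt=json and uses Solr's default flat facet layout ([name, count, name, count, ...]). The function names and the sample responses are made up, not taken from a real index.

```python
from collections import Counter


def flat_to_pairs(flat):
    """Turn a flat facet list [name, count, name, count, ...] into (name, count) pairs."""
    return zip(flat[0::2], flat[1::2])


def merge_facet_counts(responses, field):
    """Sum the counts for one facet field across several parsed JSON responses."""
    totals = Counter()
    for resp in responses:
        flat = resp["facet_counts"]["facet_fields"][field]
        for name, count in flat_to_pairs(flat):
            # Adding 0 still records the key, so facet.mincount=0 entries are kept.
            totals[name] += count
    return totals


# Hypothetical responses from s1 and s2 (illustrative data only).
s1_response = {"facet_counts": {"facet_fields": {"author": ["alice", 3, "bob", 1, "carol", 0]}}}
s2_response = {"facet_counts": {"facet_fields": {"author": ["bob", 4, "carol", 0, "dave", 2]}}}

merged = merge_facet_counts([s1_response, s2_response], "author")

# Print authors by descending count, ties broken alphabetically.
for name, count in sorted(merged.items(), key=lambda kv: (-kv[1], kv[0])):
    print(name, count)
```

Summing like this is only exact because every value is requested from every server; with a capped facet limit, per-server top-N lists could miss values and the merged totals would be approximate.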