Hi everyone! I posted a similar question earlier, but in a thread about
facets in general, so I thought I'd repost it here as a separate thread.

I have a faceted search that is very fast when I execute the query on a
single Solr server, but significantly slower when executed in a
distributed environment.
The setback seems to be in the sharding of our data, and that puzzles me a
little bit. I can't really see why Solr is so slow at doing this.

The scenario:
Let's say we have two servers (s1 and s2).
If I run the following query:
q=threadid:33&facet=true&facet.field=author&limit=-1&facet.mincount=0&rows=0
directly on either server, the response is lightning fast (<10ms).

So, in theory I could query them directly, concat the result myself and get
that done pretty fast.
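
A minimal sketch of what such a client-side merge could look like, assuming
the per-shard responses were fetched as JSON and use Solr's default flat
facet layout ([term, count, term, count, ...]). The function name and the
sample data are hypothetical and only illustrate the idea:

```python
from collections import Counter


def merge_facet_counts(responses, field):
    """Sum the facet counts for one field across several shard responses."""
    totals = Counter()
    for response in responses:
        flat = response["facet_counts"]["facet_fields"][field]
        # Pair up the alternating term/count entries of the flat list.
        for term, count in zip(flat[0::2], flat[1::2]):
            totals[term] += count
    return totals


# Made-up responses standing in for what s1 and s2 might return.
s1_response = {"facet_counts": {"facet_fields": {"author": ["alice", 3, "bob", 1]}}}
s2_response = {"facet_counts": {"facet_fields": {"author": ["bob", 2, "carol", 0]}}}

merged = merge_facet_counts([s1_response, s2_response], "author")
```

Terms with a count of 0 are kept in the merged result, which matches the
facet.mincount=0 setting in the query above.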

But if I introduce the shards parameter, the response time balloons to
between 15000ms and 20000ms!
shards=s1:8983/solr,s2:8983/solr

My initial thought is that I MUST be doing something wrong here.

So I tried the following:
Run the query on server s1, with the shards param shards=s1:8983/solr
The response time goes from under 10ms to between 5000ms and 10000ms!
I get the same results if I run the query on s2, and the same if I use
shards=s2:8983/solr
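
The comparison above could be reproduced with a small timing script along
these lines. This is only a sketch: the host names and query parameters are
taken from this post, while the /solr/select path, the wt=json parameter,
and the helper names build_url and time_query are assumptions that may not
match the actual setup:

```python
import time
import urllib.request
from urllib.parse import urlencode

# Query parameters as given in the post; wt=json is an added assumption.
BASE_PARAMS = {
    "q": "threadid:33",
    "facet": "true",
    "facet.field": "author",
    "limit": "-1",
    "facet.mincount": "0",
    "rows": "0",
    "wt": "json",
}


def build_url(host, shards=None):
    """Build a select URL for one host, optionally adding a shards parameter."""
    params = dict(BASE_PARAMS)
    if shards:
        params["shards"] = ",".join(shards)
    # Keep ':', '/' and ',' unescaped so the URL stays readable.
    return "http://%s/solr/select?%s" % (host, urlencode(params, safe=":/,"))


def time_query(url):
    """Return the wall-clock time of one request in milliseconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        resp.read()
    return (time.perf_counter() - start) * 1000.0
```

Calling time_query(build_url("s1:8983")) and then
time_query(build_url("s1:8983", ["s1:8983/solr"])) would time the same query
with and without the shards parameter against the same server.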

Is there really that much overhead in running a distributed facet field
query with Solr? Has anyone else experienced this?

On the other hand, running regular distributed queries without faceting is
lightning fast, so I can't really see that this is a network problem or
anything like that either. As mentioned, a facet query on s1 with s1 as the
shards param is just as slow as when the shards param points to a
different server.

Any insight into this would be greatly appreciated! (Would like to avoid
having to hack together our own solution concatenating results...)

Cheers,
 Aleks
