Hi

* Using Solr 4.4.0
* Up to 1000 shards total - spread across about 20-40 Solr-servers on 20-40 machines

Searching "limited but high rows across many shards all with high hits" is slow
E.g.
* Query from outside client: q=content:something&rows=1000
* This results in sub-requests to each shard, something like this:
** 1) q=content:something&rows=1000&fl=id,score
** 2) Request the full documents for the ids in the global top-1000, found by merging the top-1000 from each shard
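
To make the two steps concrete, here is a minimal sketch of the merge between step 1 and step 2. It is an illustration in Python, not Solr's actual code; the function name `merge_top_k` and the sample ids and scores are made up, and it assumes each shard returns its hits sorted by descending score.

```python
import heapq
from itertools import islice


def merge_top_k(shard_results, k):
    """Merge per-shard hit lists into the global top-k.

    shard_results: one list per shard of (doc_id, score) tuples,
    each already sorted by descending score (an assumption of this sketch).
    """
    # With reverse=True, heapq.merge expects every input sorted largest-first.
    merged = heapq.merge(*shard_results, key=lambda hit: hit[1], reverse=True)
    return list(islice(merged, k))


# Hypothetical step-1 responses from three shards (fl=id,score).
shard_a = [("a1", 9.0), ("a2", 4.0)]
shard_b = [("b1", 7.5), ("b2", 7.0)]
shard_c = [("c1", 8.0), ("c2", 1.0)]

top3 = merge_top_k([shard_a, shard_b, shard_c], k=3)

# Step 2 would request full documents for these ids only.
ids_to_fetch = [doc_id for doc_id, _ in top3]
```

Every shard has to produce and ship its full top-k even though only k hits survive the merge in total.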

Interpretation
* "limited but high rows" means 1000 in the example above
* "many shards" means 200-1000 in our case
* "all with high hits" means that each of the shards has a significant number of hits on the query (q-param)
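
As a rough back-of-the-envelope sketch of what these numbers imply for step 1 (assuming every shard has at least `rows` hits, which is the "all with high hits" case; the helper name is made up for illustration):

```python
def phase_one_candidates(num_shards, rows):
    """Upper bound on (id, score) pairs gathered in step 1 when every
    shard has at least `rows` hits (assumption of this sketch)."""
    return num_shards * rows


rows = 1000
for num_shards in (200, 1000):
    total = phase_one_candidates(num_shards, rows)
    # Only `rows` of the gathered candidates end up in the final result.
    print(f"{num_shards} shards: {total} candidates gathered, {rows} kept")
```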

Doing such a query on our system takes between 5 minutes and 1 hour, depending on a lot of things. We have profiled it and made our own PoC solution that brings the response time down to between 5 seconds and 1 minute (roughly a factor of 60 faster), while requiring far fewer resources from the system during the search. Of course we want a solution that can go into production. We have to either mature our PoC solution and use that, or adopt an existing solution from the newest Solr release. Do any of you know if there is a solution to this "problem" in the newest Solr release?

Regards, Per Steffensen


