Joel,

It needs to perform. Typically users will have 1 - 5 million rows in a
query, returning 10 - 15 fields. Grouping reduces the return by 50% or more
normally. Responses tend be less than a half second.

It sounds like the manipulation of docs at the collector level has been left
to the single solr node implementations, and that your streaming API is the
way forward for cloud implementations. Even if it does have some performance
drawbacks. I can bear slower searches as long as they are not seconds
slower.

I could implement some business strategy that forks searching to either the
AnalyticsQuery or the streaming API based on the shard count in the
collection. Most of my customers will have single shard collections. A goal
of mine is to keep each collection whole as long as possible. If one gets
too big for the pond I'll move it to a bigger pond, until some heap limit is
reached when it will have to be split. 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Merging-documents-from-a-distributed-search-tp4226802p4227595.html
Sent from the Solr - User mailing list archive at Nabble.com.

Reply via email to