Maybe I've got this wrong, but isn't this what MapReduce is meant to deal with?
E.g.:
1) get the job (a query)
2) map it to workers (servers that provide search results from their own
indexes)
3) wait for the results from all workers that reply within an acceptable
timeframe
4) comb through the results from all workers and reduce them according to
your own business rules (e.g., remove dupes, sort them by quality / priority,
possibly relying on the original parameters of the query in 1)
5) return the reduced results to the frontend
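
To make the steps above a bit more concrete, here is a rough single-process sketch in Python. It is only an illustration of the fan-out / merge idea, not how MapReduce or Sphinx actually implement it. The function name scatter_gather, the (doc_id, score) result shape, the 0.5 second default timeout and the "keep the best score per document" rule are all assumptions of mine, and the workers are plain callables standing in for remote search servers.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def scatter_gather(query, workers, timeout=0.5):
    """Send `query` to every worker, merge whatever comes back in time.

    `workers` is a list of callables (hypothetical stand-ins for search
    servers); each takes the query and returns a list of (doc_id, score).
    """
    # Step 2: map the query to all workers in parallel.
    pool = ThreadPoolExecutor(max_workers=max(1, len(workers)))
    futures = [pool.submit(worker, query) for worker in workers]

    # Step 3: wait only for the workers that answer within the timeout.
    done, _late = wait(futures, timeout=timeout)
    pool.shutdown(wait=False)  # do not block on stragglers

    # Step 4: reduce. Assumed rule: drop dupes, keeping the best score.
    best = {}
    for future in done:
        if future.exception() is not None:
            continue  # a failed worker simply contributes nothing
        for doc_id, score in future.result():
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score

    # Step 5: return results sorted by score (highest first), then doc_id.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))
```

A worker that misses the deadline or raises is just left out of the merge, which matches step 3: you answer with whatever arrived in time rather than waiting for everyone.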
That seems to be how Sphinx works:
http://www.sphinxsearch.com/doc.html#distributed
Of course, the details of this are far over my head for either system,
so I don't really know if that's a sensible way of doing things or
not.
Ciao,
--
David N. Welton
http://www.welton.it/davidw/