"...it is a distributed real-time query scheme..."

SolrCloud does this already. It treats all the shards as one big index, and you
can query it normally to get "subset" results from each shard. Why do you have
to rewrite the query for each shard? That seems unnecessary.
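
To make the point concrete, here is a minimal sketch of what "query it normally" means. It only builds request URLs; the host names, ports, and the core name `collection1` are assumptions for illustration, not taken from your setup. The first URL is an ordinary query that SolrCloud fans out to the shards on its own. The second uses the `shards` request parameter from Solr's manual distributed search, which you would only need if you wanted to name the shards yourself.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class DistributedQuerySketch {

    // Builds a /select URL. With an empty shard list this is an ordinary
    // query; a non-empty list adds an explicit "shards" parameter.
    static String buildSelectUrl(String coreUrl, String query, List<String> shards) {
        StringBuilder url = new StringBuilder(coreUrl)
                .append("/select?q=")
                .append(URLEncoder.encode(query, StandardCharsets.UTF_8));
        if (!shards.isEmpty()) {
            url.append("&shards=")
               .append(URLEncoder.encode(String.join(",", shards), StandardCharsets.UTF_8));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Hypothetical core URL and shard addresses, for illustration only.
        String core = "http://localhost:8983/solr/collection1";
        System.out.println(buildSelectUrl(core, "title:solr", List.of()));
        System.out.println(buildSelectUrl(core, "title:solr",
                List.of("localhost:8983/solr", "localhost:7574/solr")));
    }
}
```

Either way the query string itself is unchanged, which is why a per-shard rewrite looks unnecessary to me.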

------- Original Message -------
On 4/9/2012 08:45 AM Benson Margulies wrote:

Jan Høydahl,

My problem is intimately connected to Solr. it is not a batch job for
hadoop, it is a distributed real-time query scheme. I hate to add yet
another complex framework if a Solr RP can do the job simply.

For this problem, I can transform a Solr query into a subset query on
each shard, and then let the SolrCloud mechanism.

I am well aware of the 'zoo' of alternatives, and I will be evaluating
them if I can't get what I want from Solr.

On Mon, Apr 9, 2012 at 9:34 AM, Jan Høydahl <jan....@cominvent.com> wrote:
> Hi,
>
> Instead of using Solr, you may want to have a look at Hadoop or another
> framework for distributed computation, see e.g.
> http://java.dzone.com/articles/comparison-gridcloud-computing
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
>
> On 9. apr. 2012, at 13:41, Benson Margulies wrote:
>
>> I'm working on a prototype of a scheme that uses SolrCloud to, in
>> effect, distribute a computation by running it inside of a request
>> processor.
>>
>> If there are N shards and M operations, I want each node to perform
>> M/N operations. That, of course, implies that I know N.
>>
>> Is that fact available anyplace inside Solr, or do I need to just
>> configure it?
>
