On Mon, Apr 9, 2012 at 9:50 AM, Darren Govoni <ontre...@ontrenet.com> wrote:
> "...it is a distributed real-time query scheme..."
>
> SolrCloud does this already. It treats all the shards like one-big-index,
> and you can query it normally to get "subset" results from each shard. Why
> do you have to re-write the query for each shard? Seems unnecessary.

For reasons described in a previous email that I won't repeat here.

> ------- Original Message -------
> On 4/9/2012 08:45 AM Benson Margulies wrote:
> Jan Høydahl,
>
> My problem is intimately connected to Solr. It is not a batch job for
> hadoop, it is a distributed real-time query scheme. I hate to add yet
> another complex framework if a Solr RP can do the job simply.
>
> For this problem, I can transform a Solr query into a subset query on
> each shard, and then let the SolrCloud mechanism.
>
> I am well aware of the 'zoo' of alternatives, and I will be evaluating
> them if I can't get what I want from Solr.
>
> On Mon, Apr 9, 2012 at 9:34 AM, Jan Høydahl <jan....@cominvent.com> wrote:
> > Hi,
> >
> > Instead of using Solr, you may want to have a look at Hadoop or
> > another framework for distributed computation, see e.g.
> > http://java.dzone.com/articles/comparison-gridcloud-computing
> >
> > --
> > Jan Høydahl, search solution architect
> > Cominvent AS - www.cominvent.com
> > Solr Training - www.solrtraining.com
> >
> > On 9. apr. 2012, at 13:41, Benson Margulies wrote:
> >
> >> I'm working on a prototype of a scheme that uses SolrCloud to, in
> >> effect, distribute a computation by running it inside of a request
> >> processor.
> >>
> >> If there are N shards and M operations, I want each node to perform
> >> M/N operations. That, of course, implies that I know N.
> >>
> >> Is that fact available anyplace inside Solr, or do I need to just
> >> configure it?
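
For readers following the thread, the M/N split from the original question can be sketched roughly as below. This is only an illustration of the arithmetic, not anything Solr provides: the function name `operations_for_shard` and its parameters are hypothetical, and it assumes each node already knows both N (`num_shards`) and its own zero-based position (`shard_index`), for example from configuration, which is exactly the open question in the thread.

```python
def operations_for_shard(num_operations: int, num_shards: int, shard_index: int) -> range:
    """Return the contiguous block of operation ids in [0, M) owned by one shard.

    Hypothetical helper: num_shards (N) and shard_index are assumed to be
    supplied by configuration; nothing here is read from Solr itself.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    if not 0 <= shard_index < num_shards:
        raise ValueError("shard_index must be in [0, num_shards)")

    # Every shard gets `base` operations; the first `extra` shards get one more,
    # so the split still covers all M operations when M is not a multiple of N.
    base, extra = divmod(num_operations, num_shards)
    start = shard_index * base + min(shard_index, extra)
    size = base + (1 if shard_index < extra else 0)
    return range(start, start + size)


if __name__ == "__main__":
    # Example: M = 10 operations spread over N = 3 shards.
    for shard in range(3):
        print(shard, list(operations_for_shard(10, 3, shard)))
```

Contiguous blocks are used here only because they are easy to reason about; any scheme where every operation is owned by exactly one shard would do.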