Hi,
Instead of using Solr, you may want to have a look at Hadoop or another
framework for distributed computation, see e.g.
http://java.dzone.com/articles/comparison-gridcloud-computing
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training -
Jan Høydahl,
My problem is intimately connected to Solr. It is not a batch job for
Hadoop; it is a distributed real-time query scheme. I hate to add yet
another complex framework if a Solr RP can do the job simply.
For this problem, I can transform a Solr query into a subset query on each
...it is a distributed real-time query scheme...
SolrCloud does this already. It treats all the shards like one-big-index, and you can
query it normally to get subset results from each shard. Why do you have to
re-write the query for each shard? Seems unnecessary.
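To make the "one-big-index" point concrete: with SolrCloud the client sends a single query and the cluster fans it out, so there is no per-shard rewriting. A minimal sketch (the shard host names and the helper function below are hypothetical, not from this thread; the `q`, `wt`, and `shards` parameters are standard Solr request parameters):

```python
from urllib.parse import urlencode

# Hypothetical shard addresses -- placeholders for illustration only.
SHARDS = "host1:8983/solr/core1,host2:8983/solr/core2"

def solrcloud_query_params(q, restrict_to_shards=None):
    """Build the parameters for ONE query against the whole collection.

    SolrCloud distributes the same query to every shard itself, so the
    client never rewrites q per shard. Passing an explicit 'shards'
    list merely restricts which shards answer."""
    params = {"q": q, "wt": "json"}
    if restrict_to_shards:
        params["shards"] = restrict_to_shards
    return urlencode(params)

# One query against the whole "one-big-index":
print(solrcloud_query_params("subject:realtime"))
# The same query, restricted to a subset of shards:
print(solrcloud_query_params("subject:realtime", SHARDS))
```

The point of the sketch is that the query string itself is identical in both calls; only the optional `shards` restriction changes.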
--- Original Message ---
On Mon, Apr 9, 2012 at 9:50 AM, Darren Govoni ontre...@ontrenet.com wrote:
...it is a distributed real-time query scheme...
SolrCloud does this already. It treats all the shards like one-big-index,
and you can query it normally to get subset results from each shard. Why
do you have to
I _think_ you need to look at the ZooKeeper information, perhaps
something like ZkController.getCloudState or some such?
Warning: I haven't been in that code, so this is just a guess. But
since the SolrCloud stuff has to know this kind of info in order
to do distributed indexing, it's got to be
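As a rough illustration of the kind of information the cluster state holds (this is a made-up, trimmed fragment in the general shape of the clusterstate.json that SolrCloud keeps in ZooKeeper -- the real structure is read through ZkController/ZkStateReader, and the hosts below are placeholders):

```python
import json

# Hypothetical, simplified cluster-state fragment for illustration.
CLUSTER_STATE = json.loads("""
{
  "collection1": {
    "shards": {
      "shard1": {"replicas": {
        "core_node1": {"base_url": "http://host1:8983/solr", "state": "active"}}},
      "shard2": {"replicas": {
        "core_node2": {"base_url": "http://host2:8983/solr", "state": "active"}}}
    }
  }
}
""")

def shard_urls(state, collection):
    """Map each shard name to the base_url of its active replicas."""
    urls = {}
    for shard, info in state[collection]["shards"].items():
        urls[shard] = [r["base_url"] for r in info["replicas"].values()
                       if r["state"] == "active"]
    return urls

print(shard_urls(CLUSTER_STATE, "collection1"))
```

Since distributed indexing and querying both need this shard-to-node mapping, reading it from the cluster state (rather than hard-coding shard addresses) is the approach the reply above is pointing at.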