I cannot say that I have researched it, but I have always taken it to be random.
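To make "random" concrete, here is a minimal sketch of picking one replica per shard at random. It is only an assumption about the behaviour, not SolrJ code: the class `ReplicaPicker`, the method `pickOnePerShard`, and the host URLs are all hypothetical, and the shard-to-replica map stands in for whatever the client would read from ZooKeeper.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration only: this does not call SolrJ or ZooKeeper.
final class ReplicaPicker {

    // For each shard, choose one replica URL uniformly at random.
    static Map<String, String> pickOnePerShard(Map<String, List<String>> replicasByShard,
                                               Random rng) {
        Map<String, String> chosen = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : replicasByShard.entrySet()) {
            List<String> replicas = entry.getValue();
            if (replicas.isEmpty()) {
                continue; // a shard with no known replicas cannot be queried
            }
            chosen.put(entry.getKey(), replicas.get(rng.nextInt(replicas.size())));
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Made-up cluster layout: two shards, two replicas each.
        Map<String, List<String>> cluster = new LinkedHashMap<>();
        cluster.put("shard1", Arrays.asList(
                "http://host1:8983/solr/core1", "http://host2:8983/solr/core1"));
        cluster.put("shard2", Arrays.asList(
                "http://host3:8983/solr/core2", "http://host4:8983/solr/core2"));

        // One replica per shard; the result can differ from run to run.
        System.out.println(pickOnePerShard(cluster, new Random()));
    }
}
```

Note that nothing in this sketch looks at CPU or RAM on the chosen hosts; it only spreads requests across whichever replicas are listed.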
Upayavira

On Tue, Apr 16, 2013, at 12:23 PM, Furkan KAMACI wrote:
> Thanks for your detailed explanation. However you said:
>
> "It will then choose one of those hosts/cores for each shard, and send a
> request to them as a distributed search request." Is there any document
> that explains of distributed search? What is the criteria for it?
>
>
> 2013/4/16 Upayavira <u...@odoko.co.uk>
>
> > If you are accessing Solr from Java code, you will likely use the SolrJ
> > client to do so. If your users are hitting Solr directly, you should
> > think about whether this is wise - as well as providing them with direct
> > search access, you are also providing them with the ability to delete
> > your entire index with a single command.
> >
> > SolrJ isn't really a load balancer as such. When SolrJ is used to make a
> > request against a collection, it will ask Zookeeper for the names of the
> > shards that make up that collection, and for the hosts/cores that make
> > up the set of replicas for those shards.
> >
> > It will then choose one of those hosts/cores for each shard, and send a
> > request to them as a distributed search request.
> >
> > This has the advantage over traditional load balancing that if you bring
> > up a new node, that node will register itself with ZooKeeper, and thus
> > your SolrJ client(s) will know about it, without any intervention.
> >
> > Upayavira
> >
> > On Tue, Apr 16, 2013, at 08:36 AM, Furkan KAMACI wrote:
> > > Hi Shawn;
> > >
> > > I am sorry but what kind of Load Balancing is that? I mean does it check
> > > whether some leaders are using much CPU or RAM etc.? I think a problem may
> > > occur at such kind of scenario: if some of leaders getting more documents
> > > than other leaders (I don't know how it is decided that into which shard a
> > > document will go) than there will be a bottleneck on that leader?
> > >
> > >
> > > 2013/4/15 Shawn Heisey <s...@elyograg.org>
> > >
> > > > On 4/15/2013 8:05 AM, Furkan KAMACI wrote:
> > > >
> > > >> My system is as follows: I crawl data with Nutch and send them into
> > > >> SolrCloud. Users will search at Solr.
> > > >>
> > > >> What is that CloudSolrServer, should I use it for load balancing or is it
> > > >> something else different?
> > > >>
> > > >
> > > > It appears that the Solr integration in Nutch currently does not use
> > > > CloudSolrServer. There is an issue to add it. The mutual dependency on
> > > > HttpClient is holding it up - Nutch uses HttpClient 3, SolrJ 4.x uses
> > > > HttpClient 4.
> > > >
> > > > https://issues.apache.org/jira/browse/NUTCH-1377
> > > >
> > > > Until that is fixed, a load balancer would be required for full redundancy
> > > > for updates with SolrCloud. You don't have to use a load balancer for it
> > > > to work, but if the Solr server that Nutch is using goes down, then
> > > > indexing will stop unless you reconfigure Nutch or bring the Solr server
> > > > back up.
> > > >
> > > > Thanks,
> > > > Shawn