Re: Usage of CloudSolrServer?

Upayavira Tue, 16 Apr 2013 04:57:13 -0700

I cannot say that I have researched it, but I have always taken it to be
random.
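
To illustrate what "random" would mean here, below is a minimal sketch,
assuming the choice is uniform among the replicas of each shard. It is not
SolrJ code: the ReplicaPicker class, its pickOnePerShard method, and the
host/core URLs are all hypothetical, and the map stands in for the cluster
state that would come from ZooKeeper.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration only; this is not part of SolrJ.
class ReplicaPicker {

    // For each shard, pick one replica URL uniformly at random.
    // Assumption: selection is random, with no CPU/RAM or load awareness.
    static Map<String, String> pickOnePerShard(
            Map<String, List<String>> replicasByShard, Random random) {
        Map<String, String> chosen = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : replicasByShard.entrySet()) {
            List<String> replicas = entry.getValue();
            if (replicas.isEmpty()) {
                throw new IllegalStateException(
                        "No replica available for " + entry.getKey());
            }
            int index = random.nextInt(replicas.size());
            chosen.put(entry.getKey(), replicas.get(index));
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Made-up cluster layout: two shards, two replicas each.
        Map<String, List<String>> replicasByShard = new LinkedHashMap<>();
        replicasByShard.put("shard1", Arrays.asList(
                "http://host1:8983/solr/core1",
                "http://host2:8983/solr/core1"));
        replicasByShard.put("shard2", Arrays.asList(
                "http://host3:8983/solr/core2",
                "http://host4:8983/solr/core2"));

        // Prints one chosen replica per shard; the result varies per run.
        System.out.println(pickOnePerShard(replicasByShard, new Random()));
    }
}
```

With a selection like this, a newly registered replica simply becomes one
more candidate in its shard's list.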


Upayavira

On Tue, Apr 16, 2013, at 12:23 PM, Furkan KAMACI wrote:
> Thanks for your detailed explanation. However you said:
> 
> "It will then choose one of those hosts/cores for each shard, and send a
> request to them as a distributed search request." Is there any document
> that explains distributed search? What are the criteria for it?
> 
> 
> 2013/4/16 Upayavira <u...@odoko.co.uk>
> 
> > If you are accessing Solr from Java code, you will likely use the SolrJ
> > client to do so. If your users are hitting Solr directly, you should
> > think about whether this is wise - as well as providing them with direct
> > search access, you are also providing them with the ability to delete
> > your entire index with a single command.
> >
> > SolrJ isn't really a load balancer as such. When SolrJ is used to make a
> > request against a collection, it will ask ZooKeeper for the names of the
> > shards that make up that collection, and for the hosts/cores that make
> > up the set of replicas for those shards.
> >
> > It will then choose one of those hosts/cores for each shard, and send a
> > request to them as a distributed search request.
> >
> > This has the advantage over traditional load balancing that if you bring
> > up a new node, that node will register itself with ZooKeeper, and thus
> > your SolrJ client(s) will know about it, without any intervention.
> >
> > Upayavira
> >
> > On Tue, Apr 16, 2013, at 08:36 AM, Furkan KAMACI wrote:
> > > Hi Shawn;
> > >
> > > I am sorry, but what kind of load balancing is that? I mean, does it check
> > > whether some leaders are using much CPU or RAM, etc.? I think a problem may
> > > occur in this kind of scenario: if some leaders are getting more documents
> > > than other leaders (I don't know how it is decided which shard a
> > > document will go to), then there will be a bottleneck on that leader?
> > >
> > >
> > > 2013/4/15 Shawn Heisey <s...@elyograg.org>
> > >
> > > > On 4/15/2013 8:05 AM, Furkan KAMACI wrote:
> > > >
> > > >> My system is as follows: I crawl data with Nutch and send them into
> > > >> SolrCloud. Users will search at Solr.
> > > >>
> > > > >> What is CloudSolrServer? Should I use it for load balancing, or is it
> > > > >> something else?
> > > >>
> > > >
> > > > It appears that the Solr integration in Nutch currently does not use
> > > > > CloudSolrServer.  There is an issue to add it.  The mutual dependency on
> > > > HttpClient is holding it up - Nutch uses HttpClient 3, SolrJ 4.x uses
> > > > HttpClient 4.
> > > >
> > > > > https://issues.apache.org/jira/browse/NUTCH-1377
> > > >
> > > > > Until that is fixed, a load balancer would be required for full redundancy
> > > > > for updates with SolrCloud.  You don't have to use a load balancer for it
> > > > > to work, but if the Solr server that Nutch is using goes down, then
> > > > > indexing will stop unless you reconfigure Nutch or bring the Solr server
> > > > > back up.
> > > >
> > > > Thanks,
> > > > Shawn
> > > >
> > > >
> >
