A couple of things to check:

1) How many znodes are under /overseer/queue? (You can see this in the Cloud > Tree panel in the Admin UI.)

2) How often are you committing? The general advice is that your indexing client(s) should not send commits and should instead rely on the auto-commit settings in solrconfig.xml. I usually start with a hard auto-commit every 60 seconds (a minimal example is sketched below).

3) Is there anything in the logs telling you why a replica thinks it needs to recover? Specifically, I'd search for ZooKeeper session expiration messages (grep expired solr.log).
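To illustrate point 2, here is a minimal sketch of what that could look like inside the <updateHandler> section of solrconfig.xml -- the interval is just an illustrative starting point, not something taken from your setup:

  <autoCommit>
    <maxTime>60000</maxTime>            <!-- hard commit roughly every 60 seconds -->
    <openSearcher>false</openSearcher>  <!-- don't open a new searcher on hard commit -->
  </autoCommit>

With openSearcher set to false, the hard commit only flushes the transaction log and writes segments to disk, so it stays cheap; visibility of newly indexed documents is then governed by whatever soft-commit interval you choose.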
On Thu, Oct 16, 2014 at 10:01 PM, Sachin Kale <sachinpk...@gmail.com> wrote:
> Also, the PingRequestHandler is configured as:
>
> <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
>   <str name="healthcheckFile">server-enabled.txt</str>
> </requestHandler>
>
> On Fri, Oct 17, 2014 at 9:07 AM, Sachin Kale <sachinpk...@gmail.com> wrote:
>
>> On the ZooKeeper side, we have the following configuration:
>>
>> tickTime=2000
>> dataDir=/var/lib/zookeeper
>> clientPort=2181
>> initLimit=5
>> syncLimit=2
>> server.1=192.168.70.27:2888:3888
>> server.2=192.168.70.64:2889:3889
>> server.3=192.168.70.26:2889:3889
>>
>> Also, in solr.xml, we have zkClientTimeout set to 30000.
>>
>> On Fri, Oct 17, 2014 at 7:27 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>>
>>> And what is your ZooKeeper timeout? When it's too short, that can lead
>>> to this behavior.
>>>
>>> Best,
>>> Erick
>>>
>>> On Thu, Oct 16, 2014 at 4:34 PM, "Jürgen Wagner (DVT)"
>>> <juergen.wag...@devoteam.com> wrote:
>>>> Hello,
>>>> you have one shard and 11 replicas? Hmm...
>>>>
>>>> - Why do you have to keep two nodes on some machines?
>>>> - Physical hardware or virtual machines?
>>>> - What is the size of this index?
>>>> - Is this all on a local network, or are there links with potential
>>>>   outages or failures in between?
>>>> - What is the query load?
>>>> - Have you had a look at garbage collection?
>>>> - Do you use the internal ZooKeeper?
>>>> - How many nodes?
>>>> - Any observers?
>>>> - What kind of load does ZooKeeper show?
>>>> - How much RAM do these nodes have available?
>>>> - Do some servers get into swapping?
>>>> - ...
>>>>
>>>> How about some more details in terms of sizing and topology?
>>>>
>>>> Cheers,
>>>> --Jürgen
>>>>
>>>> On 16.10.2014 18:41, sachinpkale wrote:
>>>>
>>>> Hi,
>>>>
>>>> Recently we have shifted to SolrCloud (4.10.1) from a traditional
>>>> Master-Slave configuration. We have only one collection and it has only
>>>> one shard. The cloud cluster contains 12 nodes in total (on 8 machines;
>>>> on 4 machines, we run two instances each), out of which one is the leader.
>>>>
>>>> Whenever I check the cluster status using http://<IP>:<PORT>/solr/#/~cloud,
>>>> it shows at least one node (sometimes 2-3) with status "recovering". We
>>>> are using an HAProxy load balancer, and there too it often shows that
>>>> nodes are recovering. This is happening for all nodes in the cluster.
>>>>
>>>> What would be the problem here? How do I check this in the logs?
>>>>
>>>> --
>>>> View this message in context:
>>>> http://lucene.472066.n3.nabble.com/Frequent-recovery-of-nodes-in-SolrCloud-tp4164541.html
>>>> Sent from the Solr - User mailing list archive at Nabble.com.