Our production Solr nodes had a similar issue. With 4 nodes everything is normal, but when we tried to increase the replicas (nodes) to 10, most of them went into recovery.

Our config params:
nodes: 20 (replica in each node)
soft commit: 6 sec
hard commit: 5 min
indexing schedule: every 3 mins, around 5k documents
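For reference, the commit intervals listed above would roughly correspond to the solrconfig.xml fragment below. This is a minimal sketch, not the poster's actual configuration: only the two maxTime values (6 sec soft commit, 5 min hard commit) come from the message. The updateLog element and openSearcher=false are assumptions based on common SolrCloud setups.

```xml
<!-- Sketch only: the two maxTime values mirror the settings listed above;
     everything else is an assumption. -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Transaction log, used by SolrCloud for recovery (assumed default location) -->
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>

  <!-- Hard commit every 5 min (300000 ms): flushes segments to disk
       and rolls over the transaction log -->
  <autoCommit>
    <maxTime>300000</maxTime>
    <!-- Assumption: do not open a new searcher on hard commit -->
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit every 6 sec (6000 ms): makes newly indexed documents
       visible to searches -->
  <autoSoftCommit>
    <maxTime>6000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

With openSearcher set to false, the hard commit only handles durability and transaction log rollover, and the soft commit interval alone controls when new documents become searchable.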
We are now back on 4 nodes in prod, which is working for this season, but we may hit this case again in the near future when we want to expand. I have been going through the blog post below, which suggests soft commit and hard commit settings for near-real-time search instances; maybe you can have a look as well.

http://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

*Rajesh.*

On Mon, Apr 27, 2015 at 11:15 AM, Gopal Jee <gopal....@myntra.com> wrote:

> We have a 26 node solr cloud cluster. During heavy re-indexing, some of
> nodes go into recovering state.
> as per current config, soft commit is set to 15 minute and hard commit to
> 30 sec. Moreover, zkClientTimeout is set to 30 sec in solr nodes.
> Please advise.
>
> Thanks
> Gopal
>