You should probably post your question to the solr-user mailing list, where it will reach a broader audience. It would also help if you could share more details about your cluster and more of the logs.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 6 Mar 2017, at 03:05, Sathyam <sathyam.dorasw...@gmail.com> wrote:
>
> Hi,
>
> I am indexing data into a 64 shard collection created in a SOLR 4.10.3, CDH
> cluster running over HDFS and having 19 nodes.
> The indexing runs very well for the initial few hours (5-6), after which all
> the different nodes of the cluster start showing health issues (varying
> randomly across the nodes) and the indexing speed also reduces a lot.
> I have used the SOLR tuning guidelines specified in -
> https://www.cloudera.com/documentation/enterprise/5-8-x/topics/search_tuning_solr.html#csug_topic_10
>
> and tried them, but it did not work out. I observed that decreasing the
> "solr.hdfs.blockcache.slab.count" to a very low value (32) improves indexing
> a lot, but only for the initial few hours.
>
> Some errors that I get in the server side logs are -
> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
> org.apache.solr.core.SolrCore: org.apache.solr.common.SolrException: Cannot talk to ZooKeeper - Updates are disabled.
> org.apache.solr.update.processor.DistributedUpdateProcessor: ClusterState says we are the leader, but locally we don't think so
>
> The cluster never self-recovers after encountering the errors I mentioned
> above.
> Restarting the cluster does solve the problem, though it starts
> occurring again after a few hours.
>
> I would need some suggestions/guidelines/helpful links on which
> parameters I should consider, and their recommended values, to
> ensure stable and smooth indexing.
>
> --
> Sathyam Doraswamy