[ https://issues.apache.org/jira/browse/SOLR-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311479#comment-14311479 ]
ASF subversion and git services commented on SOLR-5961:
-------------------------------------------------------

Commit 1658236 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1658236 ]

SOLR-7033, SOLR-5961: RecoveryStrategy should not publish any state when closed / cancelled and there should always be a pause between recoveries even when recoveries are rapidly stopped and started as well as when a node attempts to become the leader for a shard.

> Solr gets crazy on /overseer/queue state change
> -----------------------------------------------
>
>                 Key: SOLR-5961
>                 URL: https://issues.apache.org/jira/browse/SOLR-5961
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.7.1
>         Environment: CentOS, 1 shard - 3 replicas, ZK cluster with 3 nodes
> (separate machines)
>            Reporter: Maxim Novikov
>            Assignee: Shalin Shekhar Mangar
>            Priority: Critical
>
> No idea how to reproduce it, but sometimes Solr starts littering the log with
> the following messages:
> 419158 [localhost-startStop-1-EventThread] INFO
> org.apache.solr.cloud.DistributedQueue ? LatchChildWatcher fired on path:
> /overseer/queue state: SyncConnected type NodeChildrenChanged
> 419190 [Thread-3] INFO org.apache.solr.cloud.Overseer ? Update state
> numShards=1 message={
>   "operation":"state",
>   "state":"recovering",
>   "base_url":"http://${IP_ADDRESS}/solr",
>   "core":"${CORE_NAME}",
>   "roles":null,
>   "node_name":"${NODE_NAME}_solr",
>   "shard":"shard1",
>   "collection":"${COLLECTION_NAME}",
>   "numShards":"1",
>   "core_node_name":"core_node2"}
> It continues spamming these messages with no delay, and restarting all
> the nodes does not help. I have even tried to stop all the nodes in the
> cluster first, but then when I start one, the behavior doesn't change, it
> gets crazy nuts with this "/overseer/queue state" again.
> PS The only way to handle this was to stop everything, manually clean up all
> the data in ZooKeeper related to Solr, and then rebuild everything from
> scratch. As you should understand, it is kinda unbearable in the production
> environment.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
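
For readers following the commit message above, here is a minimal Java sketch of the two behaviours it describes: no state is published once a recovery attempt has been closed, and every attempt begins with a pause. The class RecoverySketch, its method names, the state strings it records, and the pause length are all hypothetical, chosen only for illustration. This is not the code from r1658236 and not Solr's actual RecoveryStrategy.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; NOT Solr's RecoveryStrategy.
class RecoverySketch {
    private final long pauseMillis;                       // assumed pause before each attempt
    private final List<String> published = new ArrayList<>();
    private volatile boolean closed = false;

    RecoverySketch(long pauseMillis) {
        this.pauseMillis = pauseMillis;
    }

    // Mark this attempt as closed / cancelled.
    void close() {
        closed = true;
    }

    // Record a state only while the attempt is still live.
    // Returns false (and records nothing) once closed.
    boolean publish(String state) {
        if (closed) {
            return false;
        }
        synchronized (published) {
            published.add(state);
        }
        return true;
    }

    // Snapshot of everything published so far.
    List<String> publishedStates() {
        synchronized (published) {
            return new ArrayList<>(published);
        }
    }

    // Always pause first, so rapidly restarted attempts cannot
    // flood the state queue; then publish only if still live.
    boolean recover() {
        try {
            Thread.sleep(pauseMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();           // treat interruption as cancellation
            return false;
        }
        if (!publish("recovering")) {
            return false;
        }
        // ... the real recovery work would happen here ...
        return publish("active");
    }
}
```

In this sketch an attempt that is closed before or during its pause publishes nothing, which is the property the commit message asks for. The check in publish() is not atomic with respect to close(), so a real implementation would need stronger coordination than a volatile flag.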