We have a collection with 2 shards and 3 nodes per shard, running Solr 4.10.2. Our issue is that cores that go into recovery never recover: they stay in a constant state of recovery unless we restart the node and then reload the core on the leader. Updates seem to reach the server fine, since the transaction log grows over time. When we restart the node, it replays the transaction log successfully, but the core then stays in recovery until we reload the core on the leader. If we hit the maxWarmingSearchers error, would that break something that prevents recovery?
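To see which replicas the cluster state still reports as not active, a minimal sketch like the following could help. It assumes the Collections API CLUSTERSTATUS action is available on this Solr version and that its JSON response uses the cluster/collections/shards/replicas layout; the function names and the base URL in the usage note are placeholders made up for illustration, not anything from our setup.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_cluster_status(base_url, collection):
    """Fetch CLUSTERSTATUS as a dict (assumes the action exists on this version)."""
    query = urlencode({"action": "CLUSTERSTATUS", "collection": collection, "wt": "json"})
    with urlopen(f"{base_url}/admin/collections?{query}", timeout=10) as resp:
        return json.load(resp)


def non_active_replicas(cluster_status, collection):
    """Return (shard, replica, state) for every replica whose state is not 'active'.

    Assumes the response layout cluster -> collections -> <name> -> shards -> replicas.
    """
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    stuck = []
    for shard_name, shard in sorted(shards.items()):
        for replica_name, replica in sorted(shard["replicas"].items()):
            state = replica.get("state", "unknown")
            if state != "active":
                stuck.append((shard_name, replica_name, state))
    return stuck
```

For example, non_active_replicas(fetch_cluster_status("http://localhost:8080/solr", "collection1"), "collection1") would list the replicas still marked as recovering or down, where the localhost URL is only a stand-in for a real node.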
Here is the log I have for the node that is in recovery:

INFO - 2015-09-18 15:10:25.332; org.apache.solr.handler.component.SpellCheckComponent; http://0.0.0.41:8080/solr/collection1/|http://0.0.0.45:8080/solr/collection1/ {suggestions={}}
INFO - 2015-09-18 15:10:25.332; org.apache.solr.handler.component.SpellCheckComponent; http://0.0.0.40:8080/solr/collection1/|http://0.0.0.42:8080/solr/collection1/|http://0.0.0.44:8080/solr/collection1/ {suggestions={}}
INFO - 2015-09-18 15:10:25.609; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
WARN - 2015-09-18 15:10:25.642; org.apache.solr.core.SolrCore; [collection1] Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again later.
ERROR - 2015-09-18 15:10:25.642; org.apache.solr.common.SolrException; auto commit error...:org.apache.solr.common.SolrException: Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again later.
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1663)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1421)
    at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:615)
    at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
INFO - 2015-09-18 15:10:26.429; org.apache.solr.handler.component.SpellCheckComponent; http://0.0.0.40:8080/solr/collection1/|http://0.0.0.42:8080/solr/collection1/|http://0.0.0.44:8080/solr/collection1/ null
INFO - 2015-09-18 15:10:26.429; org.apache.solr.handler.component.SpellCheckComponent; http://0.0.0.41:8080/solr/collection1/|http://0.0.0.45:8080/solr/collection1/ null
INFO - 2015-09-18 15:10:26.430; org.apache.solr.handler.component.SpellCheckComponent; http://0.0.0.40:8080/solr/collection1/|http://0.0.0.42:8080/solr/collection1/|http://0.0.0.44:8080/solr/collection1/ null
INFO - 2015-09-18 15:10:26.430; org.apache.solr.handler.component.SpellCheckComponent; http://0.0.0.41:8080/solr/collection1/|http://0.0.0.45:8080/solr/collection1/ null
INFO - 2015-09-18 15:10:27.359; org.apache.solr.handler.component.SpellCheckComponent; http://0.0.0.40:8080/solr/collection1/|http://0.0.0.42:8080/solr/collection1/|http://0.0.0.44:8080/solr/collection1/ null
INFO - 2015-09-18 15:10:27.359; org.apache.solr.handler.component.SpellCheckComponent; http://0.0.0.41:8080/solr/collection1/|http://0.0.0.45:8080/solr/collection1/ null
INFO - 2015-09-18 15:10:27.710; org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener; Building spell index for spell checker: default
INFO - 2015-09-18 15:10:27.766; org.apache.solr.cloud.RecoveryStrategy; PeerSync Recovery was successful - registering as Active. core=collection1
INFO - 2015-09-18 15:10:27.766; org.apache.solr.cloud.ZkController; publishing core=collection1 state=active collection=collection1
INFO - 2015-09-18 15:10:27.773; org.apache.solr.update.DefaultSolrCoreState; Running recovery - first canceling any ongoing recovery
WARN - 2015-09-18 15:10:27.774; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=collection1 coreNodeName=solrserver4
INFO - 2015-09-18 15:10:27.774; org.apache.solr.cloud.RecoveryStrategy; Starting recovery process. core=collection1 recoveringAfterStartup=false
INFO - 2015-09-18 15:10:27.776; org.apache.solr.cloud.RecoveryStrategy; Finished recovery process. core=collection1
INFO - 2015-09-18 15:10:27.776; org.apache.solr.cloud.RecoveryStrategy; Starting recovery process. core=collection1 recoveringAfterStartup=false
INFO - 2015-09-18 15:10:27.776; org.apache.solr.update.DefaultSolrCoreState; Running recovery - first canceling any ongoing recovery
WARN - 2015-09-18 15:10:27.777; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=collection1 coreNodeName=solrserver4
INFO - 2015-09-18 15:10:27.777; org.apache.solr.cloud.RecoveryStrategy; Finished recovery process. core=collection1
INFO - 2015-09-18 15:10:27.778; org.apache.solr.update.DefaultSolrCoreState; Running recovery - first canceling any ongoing recovery
INFO - 2015-09-18 15:10:27.778; org.apache.solr.cloud.RecoveryStrategy; Starting recovery process. core=collection1 recoveringAfterStartup=false
WARN - 2015-09-18 15:10:27.778; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=collection1 coreNodeName=solrserver4
INFO - 2015-09-18 15:10:27.778; org.apache.solr.cloud.RecoveryStrategy; Finished recovery process. core=collection1
INFO - 2015-09-18 15:10:27.779; org.apache.solr.update.DefaultSolrCoreState; Running recovery - first canceling any ongoing recovery
INFO - 2015-09-18 15:10:27.779; org.apache.solr.cloud.RecoveryStrategy; Starting recovery process. core=collection1 recoveringAfterStartup=false
WARN - 2015-09-18 15:10:27.779; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=collection1 coreNodeName=solrserver4

This cycle of starting and stopping recovery just repeats constantly. Let me know what else is needed to help troubleshoot this issue.

Thanks

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-10-2-Cores-in-Recovery-tp4230598.html
Sent from the Solr - User mailing list archive at Nabble.com.
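For anyone triaging a log like the one above, a rough sketch along these lines could tally how often the recovery restarts and the maxWarmingSearchers warnings occur per second, which may help show whether the two line up in time. It assumes one log entry per line in the "LEVEL - timestamp; logger; message" layout seen above; the event names and the function name are made up for illustration, and stack trace lines are simply skipped.

```python
import re
from collections import Counter

# Layout assumed from the log above: "LEVEL - yyyy-mm-dd hh:mm:ss.SSS; logger; message"
LINE_RE = re.compile(
    r"^(?P<level>[A-Z]+)\s+-\s+"
    r"(?P<second>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d{3};\s+"
    r"(?P<logger>[^;]+);\s+(?P<msg>.*)$"
)

# Hypothetical event names mapped to (logger, message fragment) taken from the log.
EVENTS = {
    "searcher_limit": ("org.apache.solr.core.SolrCore", "exceeded limit of maxWarmingSearchers"),
    "recovery_start": ("org.apache.solr.cloud.RecoveryStrategy", "Starting recovery process"),
    "recovery_stop": ("org.apache.solr.cloud.RecoveryStrategy", "Stopping recovery for"),
    "published_active": ("org.apache.solr.cloud.ZkController", "state=active"),
}


def tally_events(lines):
    """Count known events per (second, event name); unmatched lines are ignored."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # stack trace frames and blank lines land here
        for name, (logger, fragment) in EVENTS.items():
            if match["logger"] == logger and fragment in match["msg"]:
                counts[(match["second"], name)] += 1
    return counts
```

Feeding it the lines of a log file, for instance tally_events(open("solr.log")) with a placeholder file name, gives a Counter that can be sorted by key to read the events in time order.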