This time without formatting ;-)

This is from the leader log. (There are other statements in between, but I
think they are irrelevant.) First it reads the ZooKeeper state, which
happens every now and then. Then, I guess, the replica says "Hey, I'm alive,
please start the recovery process":

[qtp689554095-19] INFO  org.apache.solr.common.cloud.ZkStateReader  - Updating cloud state from ZooKeeper...
[qtp689554095-17] INFO  org.apache.solr.servlet.SolrDispatchFilter  - [admin] webapp=null path=/admin/cores params={coreNodeName=10.231.188.127:8080_solr_swap&state=recovering&nodeName=10.231.188.127:8080_solr&action=PREPRECOVERY&checkLive=true&core=swap&wt=javabin&onlyIfLeader=true&version=2} status=400 QTime=120485
[qtp689554095-15] ERROR org.apache.solr.core.SolrCore  - org.apache.solr.common.SolrException: I was asked to wait on state recovering for 10.231.188.127:8080_solr but I still do not see the requested state. I see state: null live:false

From the replica log, which tries to recover. For some reason I get
"Read timed out", as if Solr were dead (but it clearly isn't). Maybe I could
turn on debug logging to view the actual URL for the query?
Also, I can see the replica telling ZooKeeper it's alive and recovering:

[RecoveryThread] ERROR org.apache.solr.cloud.RecoveryStrategy  - Error while trying to recover. core=swap:org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://10.231.188.126:8080/solr
...Caused by: java.net.SocketTimeoutException: Read timed out
[RecoveryThread] ERROR org.apache.solr.cloud.RecoveryStrategy  - Recovery failed - trying again... (1) core=swap
[RecoveryThread] INFO  org.apache.solr.cloud.RecoveryStrategy  - Wait 4.0 seconds before trying to recover again (2)
[RecoveryThread] INFO  org.apache.solr.cloud.ZkController  - publishing core=swap state=recovering

I left this looping for a while in case it was some bad synchronization, but
it never recovered.

I will use aliases instead of swapping from now on, since swapping could
lead to unstable situations (probably like this one...). Maybe it's
appropriate to show a warning when the user clicks "Swap" in SolrCloud
mode. That would have saved you from my noob messages ;-)
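
For anyone finding this thread later, here is a rough sketch of what I mean by using an alias instead of a swap. It only builds the request URL for the Collections API CREATEALIAS action. The alias name "live" and the collection name "products_v2" are made up for illustration, and the base URL is just the leader address from the logs above. Note that aliases point at collections, not at individual cores, so this assumes the new index lives in its own collection.

```python
from urllib.parse import urlencode


def create_alias_url(base_url, alias, collections):
    """Build a Collections API CREATEALIAS request URL (nothing is sent)."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,                          # hypothetical alias name
        "collections": ",".join(collections),   # hypothetical collection(s)
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Base URL taken from the logs above; "live" and "products_v2" are assumptions.
url = create_alias_url("http://10.231.188.126:8080/solr", "live", ["products_v2"])
print(url)
```

As far as I understand, issuing CREATEALIAS again with the same alias name and a different collection re-points the alias, which would give the same "switch over" effect I was trying to get from the core swap.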