At least it looks like you're hitting that, based on it mentioning no frame of reference to use to sync with. More importantly though, you're also hitting another issue; see my email to the user list:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201212.mbox/%3cd0994a2d-04b0-4a80-af07-9add49b85...@gmail.com%3E

- Mark

On Dec 21, 2012, at 2:10 PM, Mark Miller <markrmil...@gmail.com> wrote:

> You're hitting https://issues.apache.org/jira/browse/SOLR-3939
>
> The luck of hashing must have left the guy trying to become the leader
> without any docs. Due to SOLR-3939, a node with an empty index cannot become
> the leader.
>
> - Mark
>
> On Dec 21, 2012, at 1:41 PM, gumatias <gust...@matias.com> wrote:
>
>> I'm getting the same error. I followed the SolrCloud examples and it didn't
>> work.. here's basically what I've done:
>>
>> EXPERIMENT 1: start shards and index documents, search for documents in all
>> replicas
>>
>> # Starting Shards
>> - Shard1 Leader (with zookeeper)
>> java -Dbootstrap_confdir=./solr/collection1/conf
>> -Dcollection.configName=myconf -DzkRun
>> -DzkHost=localhost:9983,localhost:8574,localhost:9900 -DnumShards=2 -jar
>> start.jar
>> - Shard1 Replica (with zookeeper)
>> java -Djetty.port=7574 -DzkRun
>> -DzkHost=localhost:9983,localhost:8574,localhost:9900 -jar start.jar
>> - Shard2 Leader (with zookeeper)
>> java -Djetty.port=8900 -DzkRun
>> -DzkHost=localhost:9983,localhost:8574,localhost:9900 -jar start.jar
>> - Shard2 Replica
>> java -Djetty.port=7500
>> -DzkHost=localhost:9983,localhost:8574,localhost:9900 -jar start.jar
>>
>> clusterstate.json: http://dl.dropbox.com/u/7570330/clusterstate.txt
>>
>> # Indexing sample document
>> java -jar post.jar hd.xml
>>
>> # search in all Shards: number of results found: 2
>> Note: all shards have the same result
>>
>> EXPERIMENT 2: Kill current Shard1 Leader, expect Shard1 Replica to become
>> leader, search should still work and results return (is that right?)
>>
>> # Killing Shard2 Leader
>>
>> Shard2 Replica logs:
>> ...
>> Dec 21, 2012 11:57:39 AM org.apache.zookeeper.server.PrepRequestProcessor
>> pRequest
>> INFO: Processed session termination for sessionid: 0x3bbe3403c00001
>> Dec 21, 2012 11:57:39 AM org.apache.zookeeper.server.PrepRequestProcessor
>> pRequest
>> INFO: Got user-level KeeperException when processing
>> sessionid:0x3bbe3403c00000 type:delete cxid:0x4dea zxid:0xfffffffffffffffe
>> txntype:unknown reqpath:n/a Error
>> Path:/collections/collection1/leaders/shard1 Error:KeeperErrorCode = NoNode
>> for /collections/collection1/leaders/shard1
>> Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.ShardLeaderElectionContext
>> runLeaderProcess
>> INFO: Running the leader process.
>> Dec 21, 2012 11:57:39 AM org.apache.zookeeper.server.PrepRequestProcessor
>> pRequest
>> INFO: Got user-level KeeperException when processing
>> sessionid:0x3bbe3403c00000 type:create cxid:0x4dec zxid:0xfffffffffffffffe
>> txntype:unknown reqpath:n/a Error Path:/overseer Error:KeeperErrorCode =
>> NodeExists for /overseer
>> Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.ShardLeaderElectionContext
>> shouldIBeLeader
>> INFO: Checking if I should try and be the leader.
>> Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.ShardLeaderElectionContext
>> shouldIBeLeader
>> INFO: My last published State was Active, it's okay to be the leader.
>> Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.ShardLeaderElectionContext
>> runLeaderProcess
>> INFO: I may be the new leader - try and sync
>> Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.SyncStrategy sync
>> INFO: Sync replicas to
>> http://Gustavos-MacBook-Pro.local:8900/solr/collection1/
>> Dec 21, 2012 11:57:39 AM org.apache.solr.update.PeerSync sync
>> INFO: PeerSync: core=collection1
>> url=http://Gustavos-MacBook-Pro.local:8900/solr START
>> replicas=[http://Gustavos-MacBook-Pro.local:8983/solr/collection1/]
>> nUpdates=100
>> Dec 21, 2012 11:57:39 AM org.apache.solr.update.PeerSync sync
>> INFO: PeerSync: core=collection1
>> url=http://Gustavos-MacBook-Pro.local:8900/solr DONE. We have no versions.
>> sync failed.
>> Dec 21, 2012 11:57:39 AM org.apache.solr.common.SolrException log
>> SEVERE: Sync Failed
>> Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.ShardLeaderElectionContext
>> rejoinLeaderElection
>> INFO: There is a better leader candidate than us - going back into recovery
>> Dec 21, 2012 11:57:39 AM org.apache.solr.update.DefaultSolrCoreState
>> doRecovery
>> INFO: Running recovery - first canceling any ongoing recovery
>> Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.RecoveryStrategy run
>> INFO: Starting recovery process. core=collection1
>> recoveringAfterStartup=false
>> Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.RecoveryStrategy doRecovery
>> INFO: Attempting to PeerSync from
>> http://Gustavos-MacBook-Pro.local:8983/solr/collection1/ core=collection1 -
>> recoveringAfterStartup=false
>> Dec 21, 2012 11:57:39 AM org.apache.solr.update.PeerSync sync
>> INFO: PeerSync: core=collection1
>> url=http://Gustavos-MacBook-Pro.local:8900/solr START
>> replicas=[http://Gustavos-MacBook-Pro.local:8983/solr/collection1/]
>> nUpdates=100
>> Dec 21, 2012 11:57:39 AM org.apache.zookeeper.server.PrepRequestProcessor
>> pRequest
>> INFO: Got user-level KeeperException when processing
>> sessionid:0x3bbe3403c00000 type:delete cxid:0x4df3 zxid:0xfffffffffffffffe
>> txntype:unknown reqpath:n/a Error
>> Path:/collections/collection1/leaders/shard1 Error:KeeperErrorCode = NoNode
>> for /collections/collection1/leaders/shard1
>> Dec 21, 2012 11:57:39 AM org.apache.solr.update.PeerSync sync
>> WARNING: no frame of reference to tell of we've missed updates
>> Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.RecoveryStrategy doRecovery
>> INFO: PeerSync Recovery was not successful - trying replication.
>> core=collection1
>> Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.RecoveryStrategy doRecovery
>> INFO: Starting Replication Recovery. core=collection1
>> Dec 21, 2012 11:57:39 AM org.apache.solr.client.solrj.impl.HttpClientUtil
>> createClient
>> INFO: Creating new http client,
>> config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
>> Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.ShardLeaderElectionContext
>> runLeaderProcess
>> INFO: Running the leader process.
>> Dec 21, 2012 11:57:39 AM org.apache.zookeeper.server.PrepRequestProcessor
>> pRequest
>> INFO: Got user-level KeeperException when processing
>> sessionid:0x3bbe3403c00000 type:create cxid:0x4df4 zxid:0xfffffffffffffffe
>> txntype:unknown reqpath:n/a Error Path:/overseer Error:KeeperErrorCode =
>> NodeExists for /overseer
>> Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.ShardLeaderElectionContext
>> waitForReplicasToComeUp
>> INFO: Waiting until we see more replicas up: total=2 found=1
>> timeoutin=180000
>> Dec 21, 2012 11:57:39 AM org.apache.zookeeper.server.quorum.LearnerHandler
>> run
>> SEVERE: Unexpected exception causing shutdown while sock still open
>> java.io.EOFException
>> at java.io.DataInputStream.readInt(DataInputStream.java:375)
>> at
>> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>> at
>> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
>> at
>> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>> at
>> org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:416)
>> Dec 21, 2012 11:57:39 AM org.apache.zookeeper.ClientCnxn$SendThread run
>> INFO: Unable to read additional data from server sessionid 0x3bbe3403c00000,
>> likely server has closed socket, closing socket connection and attempting
>> reconnect
>> Dec 21, 2012 11:57:39 AM
>> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker run
>> WARNING: Connection broken for id 0, my id = 2, error = java.io.IOException:
>> Channel eof
>> Dec 21, 2012 11:57:39 AM org.apache.zookeeper.server.quorum.LearnerHandler
>> run
>> WARNING: ******* GOODBYE /127.0.0.1:58549 ********
>> Dec 21, 2012 11:57:39 AM
>> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker run
>> WARNING: Interrupting SendWorker
>> Dec 21, 2012 11:57:39 AM
>> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker run
>> WARNING: Interrupted while waiting for message on queue
>> java.lang.InterruptedException
>> at
>> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1961)
>> at
>> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2038)
>> at
>> java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:347)
>> at
>> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:622)
>> Dec 21, 2012 11:57:39 AM
>> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker run
>> WARNING: Send worker leaving thread
>> Dec 21, 2012 11:57:40 AM org.apache.zookeeper.ClientCnxn$SendThread
>> startConnect
>> INFO: Opening socket connection to server localhost/127.0.0.1:9900
>> Dec 21, 2012 11:57:40 AM org.apache.zookeeper.ClientCnxn$SendThread
>> primeConnection
>> INFO: Socket connection established to localhost/127.0.0.1:9900, initiating
>> session
>> Dec 21, 2012 11:57:40 AM org.apache.zookeeper.server.NIOServerCnxn$Factory
>> run
>> INFO: Accepted socket connection from /127.0.0.1:58930
>> Dec 21, 2012 11:57:40 AM org.apache.zookeeper.server.NIOServerCnxn
>> readConnectRequest
>> INFO: Client attempting to renew session 0x3bbe3403c00000 at
>> /127.0.0.1:58930
>> Dec 21, 2012 11:57:40 AM org.apache.zookeeper.server.NIOServerCnxn
>> finishSessionInit
>> INFO: Established session 0x3bbe3403c00000 with negotiated timeout 15000 for
>> client /127.0.0.1:58930
>> Dec 21, 2012 11:57:40 AM org.apache.zookeeper.ClientCnxn$SendThread
>> readConnectResult
>> INFO: Session establishment complete on server localhost/127.0.0.1:9900,
>> sessionid = 0x3bbe3403c00000, negotiated timeout = 15000
>> Dec 21, 2012 11:57:40 AM org.apache.solr.cloud.ShardLeaderElectionContext
>> waitForReplicasToComeUp
>> INFO: Waiting until we see more replicas up: total=2 found=1
>> timeoutin=179497
>> Dec 21, 2012 11:57:40 AM org.apache.solr.common.cloud.ZkStateReader
>> updateClusterState
>> INFO: Updating cloud state from ZooKeeper...
>> Dec 21, 2012 11:57:40 AM org.apache.solr.cloud.ShardLeaderElectionContext
>> waitForReplicasToComeUp
>> INFO: Waiting until we see more replicas up: total=2 found=1
>> timeoutin=178996
>> Dec 21, 2012 11:57:41 AM org.apache.solr.cloud.ShardLeaderElectionContext
>> waitForReplicasToComeUp
>> INFO: Waiting until we see more replicas up: total=2 found=1
>> timeoutin=178494
>> Dec 21, 2012 11:57:41 AM org.apache.solr.cloud.ShardLeaderElectionContext
>> waitForReplicasToComeUp
>> INFO: Waiting until we see more replicas up: total=2 found=1
>> timeoutin=177992
>> Dec 21, 2012 11:57:42 AM org.apache.solr.cloud.ShardLeaderElectionContext
>> waitForReplicasToComeUp
>> INFO: Waiting until we see more replicas up: total=2 found=1
>> timeoutin=177491
>> Dec 21, 2012 11:57:42 AM org.apache.solr.cloud.ShardLeaderElectionContext
>> waitForReplicasToComeUp
>> INFO: Waiting until we see more replicas up: total=2 found=1
>> timeoutin=176989
>> Dec 21, 2012 11:57:43 AM org.apache.solr.cloud.ShardLeaderElectionContext
>> waitForReplicasToComeUp
>> INFO: Waiting until we see more replicas up: total=2 found=1
>> timeoutin=176488
>> Dec 21, 2012 11:57:43 AM org.apache.solr.cloud.ShardLeaderElectionContext
>> waitForReplicasToComeUp
>> INFO: Waiting until we see more replicas up: total=2 found=1
>> timeoutin=175986
>> Dec 21, 2012 11:57:44 AM org.apache.solr.cloud.ShardLeaderElectionContext
>> waitForReplicasToComeUp
>> INFO: Waiting until we see more replicas up: total=2 found=1
>> timeoutin=175484
>> ...
>>
>> Notes:
>>
>> - Shard1 replica doesnt become the leader and keeps "waiting until see more
>> replicas up"
>>
>> - Search results for all Shards:
>> <lst name="error">
>> <str name="msg">no servers hosting shard:</str>
>> <int name="code">503</int>
>> </lst>
>>
>> Dump: http://dl.dropbox.com/u/7570330/dump.txt
>>
>> What am I doing wrong?
>>
>>
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Solrcloud-not-reachable-and-after-restart-just-a-no-servers-hosting-shard-tp4009786p4028623.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>
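
For anyone who wants to check their own logs for the same symptoms, here is a rough sketch in Python. It only counts the three messages that appear in the quoted log (the failed PeerSync with no versions, the "no frame of reference" warning, and the leader-election wait loop). The marker strings are copied verbatim from this thread; the function names and the sample text are made up for illustration, and this is not an official Solr diagnostic.

```python
import re

# Marker strings copied verbatim from the log excerpt quoted in this thread.
# The dictionary keys are hypothetical labels chosen for this sketch.
MARKERS = {
    "no_versions": "We have no versions. sync failed.",
    "no_frame_of_reference": "no frame of reference to tell of we've missed updates",
    "waiting_for_replicas": "Waiting until we see more replicas up",
}


def strip_quotes(log_text):
    """Drop leading '>' quote markers and collapse whitespace,
    so messages wrapped across quoted lines still match."""
    lines = [re.sub(r"^\s*(>\s*)+", "", line) for line in log_text.splitlines()]
    return " ".join(" ".join(lines).split())


def find_symptoms(log_text):
    """Return how many times each marker message occurs in the log text."""
    flat = strip_quotes(log_text)
    return {name: flat.count(text) for name, text in MARKERS.items()}


# A few lines lifted from the quoted log, including one wrapped message.
SAMPLE = """\
>> INFO: PeerSync: core=collection1
>> url=http://Gustavos-MacBook-Pro.local:8900/solr DONE. We have no versions.
>> sync failed.
>> WARNING: no frame of reference to tell of we've missed updates
>> INFO: Waiting until we see more replicas up: total=2 found=1
>> timeoutin=180000
"""

if __name__ == "__main__":
    for name, count in find_symptoms(SAMPLE).items():
        print(name, count)
```

A non-zero count for the first two markers together with a growing count for the third matches the pattern in the log above: the only live candidate has nothing to sync from, so it keeps waiting for other replicas instead of taking over as leader.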