Re: Solrcloud not reachable and after restart just a no servers hosting shard
I'm getting the same error. I followed the SolrCloud examples and it didn't work. Here's basically what I've done:

EXPERIMENT 1: start shards and index documents, search for documents in all replicas

# Starting Shards

- Shard1 Leader (with zookeeper)
  java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -DzkHost=localhost:9983,localhost:8574,localhost:9900 -DnumShards=2 -jar start.jar
- Shard1 Replica (with zookeeper)
  java -Djetty.port=7574 -DzkRun -DzkHost=localhost:9983,localhost:8574,localhost:9900 -jar start.jar
- Shard2 Leader (with zookeeper)
  java -Djetty.port=8900 -DzkRun -DzkHost=localhost:9983,localhost:8574,localhost:9900 -jar start.jar
- Shard2 Replica
  java -Djetty.port=7500 -DzkHost=localhost:9983,localhost:8574,localhost:9900 -jar start.jar

clusterstate.json: http://dl.dropbox.com/u/7570330/clusterstate.txt

# Indexing sample document

java -jar post.jar hd.xml

# search in all Shards: number of results found: 2

Note: all shards have the same result

EXPERIMENT 2: Kill current Shard1 Leader, expect Shard1 Replica to become leader, search should still work and results return (is that right?)

# Killing Shard2 Leader

Shard2 Replica logs:

...
Dec 21, 2012 11:57:39 AM org.apache.zookeeper.server.PrepRequestProcessor pRequest
INFO: Processed session termination for sessionid: 0x3bbe3403c1
Dec 21, 2012 11:57:39 AM org.apache.zookeeper.server.PrepRequestProcessor pRequest
INFO: Got user-level KeeperException when processing sessionid:0x3bbe3403c0 type:delete cxid:0x4dea zxid:0xfffe txntype:unknown reqpath:n/a Error Path:/collections/collection1/leaders/shard1 Error:KeeperErrorCode = NoNode for /collections/collection1/leaders/shard1
Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.ShardLeaderElectionContext runLeaderProcess
INFO: Running the leader process.
Dec 21, 2012 11:57:39 AM org.apache.zookeeper.server.PrepRequestProcessor pRequest
INFO: Got user-level KeeperException when processing sessionid:0x3bbe3403c0 type:create cxid:0x4dec zxid:0xfffe txntype:unknown reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.ShardLeaderElectionContext shouldIBeLeader
INFO: Checking if I should try and be the leader.
Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.ShardLeaderElectionContext shouldIBeLeader
INFO: My last published State was Active, it's okay to be the leader.
Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.ShardLeaderElectionContext runLeaderProcess
INFO: I may be the new leader - try and sync
Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.SyncStrategy sync
INFO: Sync replicas to http://Gustavos-MacBook-Pro.local:8900/solr/collection1/
Dec 21, 2012 11:57:39 AM org.apache.solr.update.PeerSync sync
INFO: PeerSync: core=collection1 url=http://Gustavos-MacBook-Pro.local:8900/solr START replicas=[http://Gustavos-MacBook-Pro.local:8983/solr/collection1/] nUpdates=100
Dec 21, 2012 11:57:39 AM org.apache.solr.update.PeerSync sync
INFO: PeerSync: core=collection1 url=http://Gustavos-MacBook-Pro.local:8900/solr DONE. We have no versions. sync failed.
Dec 21, 2012 11:57:39 AM org.apache.solr.common.SolrException log
SEVERE: Sync Failed
Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.ShardLeaderElectionContext rejoinLeaderElection
INFO: There is a better leader candidate than us - going back into recovery
Dec 21, 2012 11:57:39 AM org.apache.solr.update.DefaultSolrCoreState doRecovery
INFO: Running recovery - first canceling any ongoing recovery
Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.RecoveryStrategy run
INFO: Starting recovery process. core=collection1 recoveringAfterStartup=false
Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.RecoveryStrategy doRecovery
INFO: Attempting to PeerSync from http://Gustavos-MacBook-Pro.local:8983/solr/collection1/ core=collection1 - recoveringAfterStartup=false
Dec 21, 2012 11:57:39 AM org.apache.solr.update.PeerSync sync
INFO: PeerSync: core=collection1 url=http://Gustavos-MacBook-Pro.local:8900/solr START replicas=[http://Gustavos-MacBook-Pro.local:8983/solr/collection1/] nUpdates=100
Dec 21, 2012 11:57:39 AM org.apache.zookeeper.server.PrepRequestProcessor pRequest
INFO: Got user-level KeeperException when processing sessionid:0x3bbe3403c0 type:delete cxid:0x4df3 zxid:0xfffe txntype:unknown reqpath:n/a Error Path:/collections/collection1/leaders/shard1 Error:KeeperErrorCode = NoNode for /collections/collection1/leaders/shard1
Dec 21, 2012 11:57:39 AM org.apache.solr.update.PeerSync sync
WARNING: no frame of reference to tell of we've missed updates
Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.RecoveryStrategy doRecovery
INFO: PeerSync Recovery was not successful - trying replication. core=collection1
Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.RecoveryStrategy doRecovery
INFO: Starting Replication Recovery. core=collection1
Dec 21, 2012 11:57:39 AM
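A note on the -DzkHost value used in the commands above: a node started with -DzkRun runs its embedded ZooKeeper on the Jetty port plus 1000, which is where 9983, 8574 and 9900 come from (Jetty ports 8983, 7574 and 8900). A minimal sketch that derives the string, assuming every node runs on localhost; the helper name zk_host is made up for illustration and is not part of Solr:

```shell
#!/bin/sh
# Build a -DzkHost value from the Jetty ports of the nodes started with -DzkRun.
# Assumption: each embedded ZooKeeper listens on (Jetty port + 1000) on localhost.
# zk_host is a hypothetical helper, not a Solr command.
zk_host() {
  result=""
  for jetty_port in "$@"; do
    zk_port=$((jetty_port + 1000))
    # Append with a comma separator once the list is non-empty.
    result="${result:+$result,}localhost:$zk_port"
  done
  printf '%s\n' "$result"
}

# The three -DzkRun nodes from the post: 8983 (default), 7574, 8900.
zk_host 8983 7574 8900
```

The fourth node (port 7500) was started without -DzkRun, so it does not contribute an entry.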
Re: Solrcloud not reachable and after restart just a no servers hosting shard
You're hitting https://issues.apache.org/jira/browse/SOLR-3939

The luck of hashing must have left the guy trying to become the leader without any docs. Due to SOLR-3939, a node with an empty index cannot become the leader.

- Mark

On Dec 21, 2012, at 1:41 PM, gumatias gust...@matias.com wrote:
I'm getting the same error. I followed the SolrCloud examples and it didn't work.. [...]
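One way to check whether a core really is empty, as suspected above, is to query each core directly with distrib=false, so the request is answered from that core's own index instead of being fanned out across the collection. The sketch below only prints the URLs; the ports come from the original post, while localhost, the collection1 core name in the path, and the helper name core_query_url are assumptions for illustration:

```shell
#!/bin/sh
# Print a match-all query URL for the core behind a given Jetty port.
# distrib=false keeps the query on the core that receives it;
# rows=0 asks only for the hit count (numFound), not the documents.
# core_query_url is a hypothetical helper; host and core name are assumptions.
core_query_url() {
  printf 'http://localhost:%s/solr/collection1/select?q=*:*&distrib=false&rows=0\n' "$1"
}

for port in 8983 7574 8900 7500; do
  core_query_url "$port"
done

# Against a live cluster (not executed here):
#   curl -s "$(core_query_url 8900)"
```

A core whose response reports numFound=0 while the others report documents would be consistent with the empty-index case described in SOLR-3939.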
Re: Solrcloud not reachable and after restart just a no servers hosting shard
At least it looks like you're hitting that, based on it mentioning no frame of reference to use to sync with. More importantly though, you're also hitting another issue; see my email to the user list: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201212.mbox/%3cd0994a2d-04b0-4a80-af07-9add49b85...@gmail.com%3E

- Mark

On Dec 21, 2012, at 2:10 PM, Mark Miller markrmil...@gmail.com wrote:
Your hitting https://issues.apache.org/jira/browse/SOLR-3939 [...]
Solrcloud not reachable and after restart just a no servers hosting shard
Hi,

I am running Solrcloud 4.0-BETA and during the weekend it 'crashed' somehow, so that it wasn't reachable. CPU load was 100%. After a restart I couldn't access the data; it just told me:

no servers hosting shard

Is there a way to get the data back?

Thanks, regards
Daniel
Re: Solrcloud not reachable and after restart just a no servers hosting shard
Hi,

Can you share a little bit more about your configuration: how many shards, how many replicas, what does your clusterstate.json look like, is there anything suspicious in the logs?

--
Sami Siren

On Mon, Sep 24, 2012 at 11:13 AM, Daniel Brügge daniel.brue...@gmail.com wrote:
Hi, I am running Solrcloud 4.0-BETA and during the weekend it 'crashed' somehow, so that it wasn't reachable. [...]
Re: Solrcloud not reachable and after restart just a no servers hosting shard
Right - we need logs, the admin Cloud page's dump-to-clipboard info, anything else to go on.

On Mon, Sep 24, 2012 at 4:36 AM, Sami Siren ssi...@gmail.com wrote:
hi, Can you share a little bit more about your configuration [...]

--
- Mark
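When gathering logs to post, it can help to pull out only the WARNING and SEVERE entries first. The sketch below assumes the default java.util.logging two-line layout (a timestamp/source line followed by a "LEVEL: message" line); the helper name suspicious_entries is made up for illustration, and the sample lines are taken from the log excerpt posted in this thread:

```shell
#!/bin/sh
# Print WARNING and SEVERE entries from a java.util.logging style log on stdin,
# each preceded by the timestamp/source line that comes right before it.
# suspicious_entries is a hypothetical helper; the two-line layout is an assumption.
suspicious_entries() {
  awk '/^(WARNING|SEVERE):/ { print prev; print } { prev = $0 }'
}

# Small demonstration on four lines from the excerpt in this thread.
printf '%s\n' \
  'Dec 21, 2012 11:57:39 AM org.apache.solr.cloud.ShardLeaderElectionContext runLeaderProcess' \
  'INFO: Running the leader process.' \
  'Dec 21, 2012 11:57:39 AM org.apache.solr.common.SolrException log' \
  'SEVERE: Sync Failed' | suspicious_entries
```

On a real log file the call would be along the lines of: suspicious_entries < your-solr-log-file, where the file name depends on how the node's output was captured.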