On Aug 14, 2013, at 9:01 AM, Manuel Le Normand <manuel.lenorm...@gmail.com> 
wrote:

> Does this sound like the scenario that happened:
> By removing the index dir from replica 2 I also removed the tlog

Did you also remove the tlog dir?

It's normally:

data/index
data/tlog 
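
If you want to sanity-check a node before restarting it, something like the
sketch below works - just an illustration, and the data-dir path is made up,
so point it at your actual core's data directory:

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    // Sketch: confirm a core's data dir still holds both the index and the tlog.
    // The path is hypothetical - substitute your core's real data directory.
    public class CheckCoreDirs {
        public static void main(String[] args) {
            Path dataDir = Paths.get("/opt/solr/cores/raw_shard1_replica1/data");
            System.out.println("index present: " + Files.isDirectory(dataDir.resolve("index")));
            System.out.println("tlog present:  " + Files.isDirectory(dataDir.resolve("tlog")));
        }
    }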

> from which
> the zookeeper extracts the versions of the two replicas and decides which
> one should be elected leader. As replica 2 had no tlog, zk
> didn't have any way to compare the 2 registered replicas, so it just picked
> one of the replicas arbitrarily to lead, resulting in electing empty
> replicas.

If one had no tlog, it should have recovered from the one that still had a tlog.
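
If it doesn't recover on its own, one thing that's sometimes done is to kick
off recovery by hand with the CoreAdmin REQUESTRECOVERY action (assuming it's
available in your 4.3 build). Rough sketch only - note the core will pull from
whoever is the shard's *current* leader, so only do this once the good replica
is leading. Host, port and core name below are just examples taken from your
logs:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    // Sketch: ask one core to re-enter recovery via the CoreAdmin API.
    // It will sync/replicate from the shard's current leader.
    public class RequestRecovery {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://solr-prod02:8080/solr/admin/cores"
                    + "?action=REQUESTRECOVERY&core=raw_shard1_replica2");
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream(), "UTF-8"));
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);  // echo the admin response
            }
            in.close();
        }
    }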

> 
> How does the zookeeper compare the 2 tlogs to know which one is more
> recent? Does it not rely on the version number shown in the admin UI?

It looks at the recent IDs in the tlogs of both and compares them.
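
Very roughly, the idea is something like the sketch below - an illustration
only, not the actual PeerSync code, and the helper name is made up:

    import java.util.Arrays;
    import java.util.Collections;
    import java.util.List;

    // Illustration only - not the real PeerSync implementation. Each replica
    // reports the version numbers of its most recent updates (nUpdates=100 in
    // the logs); a candidate passes the check only if its newest version is at
    // least as new as the peer's newest, and an empty list can never pass.
    public class VersionCompareSketch {
        static boolean atLeastAsCurrent(List<Long> myVersions, List<Long> peerVersions) {
            if (peerVersions.isEmpty()) return true;   // peer has nothing newer to offer
            if (myVersions.isEmpty())   return false;  // "We have no versions. sync failed."
            return Collections.max(myVersions) >= Collections.max(peerVersions);
        }

        public static void main(String[] args) {
            List<Long> good  = Arrays.asList(101L, 102L, 103L);
            List<Long> empty = Collections.emptyList();
            System.out.println(atLeastAsCurrent(good, empty));  // true
            System.out.println(atLeastAsCurrent(empty, good));  // false
        }
    }

In your logs raw_shard1_replica1 reported no versions at all, so the sync
failed, and the election then fell back to "we were active before, so become
leader anyway" - which looks like how the empty core ended up leading.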

- Mark

> 
> 
> On Wed, Aug 14, 2013 at 11:00 AM, Manuel Le Normand <
> manuel.lenorm...@gmail.com> wrote:
> 
>> Hello,
>> My Solr cluster runs on RH Linux under the Tomcat 7 servlet container.
>> numShards=40, replicationFactor=2, 40 servers, each hosting 2 replicas. Solr
>> 4.3
>> 
>> For experimental reasons I split my cluster into 2 sub-clusters, each
>> containing a single replica of every shard.
>> When I connected these sub-clusters back together, peer sync failed (more
>> than 100 docs had been indexed per shard), so a replication process started
>> on sub-cluster #2. Because of the transient storage the replication process
>> needs, I removed all the indexes from sub-cluster #2 before connecting it
>> back, then reconnected sub-cluster #2's servers in batches of 3-4 to avoid
>> high disk load.
>> The first batches replicated well, but after a while an internal script
>> pkilled all the Solr instances, some of them mid-replication. After
>> restarting the servlet I discovered the disaster - for some of the replicas
>> that had been in the replicating stage the zookeeper leader election went
>> wrong - good-state replicas (sub-cluster #1) replicated from empty replicas
>> (sub-cluster #2), which ended up removing all documents in these shards!!
>> 
>> These are the logs from solr-prod32 (sub-cluster #2 - bad state) - here
>> raw_shard1_replica1 is elected leader although it was not the leader before
>> the replication process (and shouldn't have had the higher version number):
>> 
>> 2013-08-13 13:39:15.838 [INFO ]
>> org.apache.solr.cloud.ShardLeaderElectionContext Enough replicas found to
>> continue.
>> 2013-08-13 13:39:15.838 [INFO ]
>> org.apache.solr.cloud.ShardLeaderElectionContext I may be the new leader -
>> try and sync
>> 2013-08-13 13:39:15.839 [INFO ] org.apache.solr.cloud.SyncStrategy Sync
>> replicas to http://solr-prod32:8080/solr/raw_shard1_replica1/
>> 2013-08-13 13:39:15.841 [INFO ]
>> org.apache.solr.client.solrj.impl.HttpClientUtil Creating new http client,
>> config:maxConnectionsPerHost=20&maxConnections=10000&connTimeout=30000&socketTimeout=30000&retry=false
>> 2013-08-13 13:39:15.844 [INFO ] org.apache.solr.update.PeerSync PeerSync:
>> core=raw_shard1_replica1 url=http://solr-prod32:8080/solr START replicas=[
>> http://solr-prod02:8080/solr/raw_shard1_replica2/] nUpdates=100
>> 2013-08-13 13:39:15.847 [INFO ] org.apache.solr.update.PeerSync PeerSync:
>> core=raw_shard1_replica1 url=http://solr-prod32:8080/solr DONE. We have
>> no versions. sync failed.
>> 2013-08-13 13:39:15.847 [INFO ] org.apache.solr.cloud.SyncStrategy
>> Leader's attempt to sync with shard failed, moving to the next candidate
>> 2013-08-13 13:39:15.847 [INFO ]
>> org.apache.solr.cloud.ShardLeaderElectionContext We failed sync, but we
>> have no versions - we can't sync in that case - we were active before, so
>> become leader anyway
>> 2013-08-13 13:39:15.847 [INFO ]
>> org.apache.solr.cloud.ShardLeaderElectionContext I am the new leader:
>> http://solr-prod32:8080/solr/raw_shard1_replica1/
>> 2013-08-13 13:39:15.847 [INFO ] org.apache.solr.common.cloud.SolrZkClient
>> makePath: /collections/raw/leaders/shard1
>> 2013-08-13 13:39:17.423 [INFO ] org.apache.solr.common.cloud.ZkStateReader
>> A cluster state change: WatchedEvent state:SyncConnected
>> type:NodeDataChanged path:/clusterstate.json, has occurred - updating...
>> (live nodes size: 40)
>> 
>> While in solr-prod02 (sub-cluster #1 - good state) I get:
>> 2013-08-13 13:39:15.671 [INFO ] org.apache.solr.cloud.ZkController
>> publishing core=raw_shard1_replica2 state=down
>> 2013-08-13 13:39:15.671 [INFO ] org.apache.solr.cloud.ZkController
>> numShards not found on descriptor - reading it from system property
>> 2013-08-13 13:39:15.673 [INFO ] org.apache.solr.core.CoreContainer
>> registering core: raw_shard1_replica2
>> 2013-08-13 13:39:15.673 [INFO ] org.apache.solr.cloud.ZkController
>> Register replica - core:raw_shard1_replica2 address:
>> http://solr-prod02:8080/solr collection:raw shard:shard1
>> 2013-08-13 13:39:17.423 [INFO ] org.apache.solr.common.cloud.ZkStateReader
>> A cluster state change: WatchedEvent state:SyncConnected
>> type:NodeDataChanged path:/clusterstate.json, has occurred - updating...
>> (live nodes size: 40)
>> 2013-08-13 13:39:17.480 [INFO ] org.apache.solr.cloud.ZkController We are
>> http://solr-prod02:8080/solr/raw_shard1_replica2/ and leader is
>> http://solr-prod32:8080/solr/raw_shard1_replica1/
>> 2013-08-13 13:39:17.481 [INFO ] org.apache.solr.cloud.ZkController No
>> LogReplay needed for core=raw_shard1_replica2
>> 2013-08-13 13:39:17.481 [INFO ] org.apache.solr.cloud.ZkController Core
>> needs to recover:raw_shard1_replica2
>> 2013-08-13 13:39:17.481 [INFO ]
>> org.apache.solr.update.DefaultSolrCoreState Running recovery - first
>> canceling any ongoing recovery
>> 2013-08-13 13:39:17.485 [INFO ] org.apache.solr.common.cloud.ZkStateReader
>> Updating cloud state from ZooKeeper...
>> 2013-08-13 13:39:17.485 [INFO ] org.apache.solr.cloud.RecoveryStrategy
>> Starting recovery process. core=raw_shard1_replica2
>> 
>> Why was the wrong leader elected??
>> 
>> Thanks
>> 
