My client has a Solr 4.6 test cluster with three instances (1, 2, and 3) hosting 
shards 1, 2, and 3, respectively.  There is no replication in this cluster.  We 
started receiving OOMEs during indexing; the indexing batches were likely too 
large.  The cluster was rebooted to restore the system.  However, after the 
reboot, instance 2 now shows as a replica of shard 1, and its shard 2 is down 
with a null range.  Instance 2 is still queryable with 
shards.tolerant=true&distrib=false (example query below the snippet) and returns 
a different set of records than instance 1, as would be expected during normal 
operation.  The clusterstate.json is similar to the following:
 
mycollection: {
  shard1: {
    range: 80000000-d554ffff,
    state: active,
    replicas: {
      instance1: { ...state: active... },
      instance2: { ...state: active... }
    }
  },
  shard3: { ...state: active... },
  shard2: {
    range: null,
    state: active,
    replicas: {
      instance2: { ...state: down... }
    }
  },
  maxShardsPerNode: 1,
  replicationFactor: 1
}
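 
The non-distributed query I run directly against instance 2 looks roughly like 
this (host, port, and path are placeholders for my actual setup):

  http://instance2:8983/solr/mycollection/select?q=*:*&shards.tolerant=true&distrib=false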
 
Any ideas on how this could have come to pass?  Would manually correcting the 
clusterstate.json in ZooKeeper fix the situation?
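 
In case it helps frame an answer, here is roughly how I would attempt that 
manual edit using ZooKeeper's own zkCli.sh (the ZK host/port and the edited 
file name are assumptions on my part, and I would stop the Solr instances 
first):

  # back up and inspect the current cluster state
  ./zkCli.sh -server zkhost:2181 get /clusterstate.json

  # overwrite the znode with a hand-corrected copy
  ./zkCli.sh -server zkhost:2181 set /clusterstate.json "$(cat fixed-clusterstate.json)"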
