Looking at the master, it looks like at some point some shards went down.  I
am seeing log entries like the ones below.

INFO: A cluster state change: WatchedEvent state:SyncConnected
type:NodeChildrenChanged path:/live_nodes, has occurred - updating... (live
nodes size: 12)
Apr 2, 2013 8:12:52 PM org.apache.solr.common.cloud.ZkStateReader$3 process
INFO: Updating live nodes... (9)
Apr 2, 2013 8:12:52 PM org.apache.solr.cloud.ShardLeaderElectionContext
runLeaderProcess
INFO: Running the leader process.
Apr 2, 2013 8:12:52 PM org.apache.solr.cloud.ShardLeaderElectionContext
shouldIBeLeader
INFO: Checking if I should try and be the leader.
Apr 2, 2013 8:12:52 PM org.apache.solr.cloud.ShardLeaderElectionContext
shouldIBeLeader
INFO: My last published State was Active, it's okay to be the leader.
Apr 2, 2013 8:12:52 PM org.apache.solr.cloud.ShardLeaderElectionContext
runLeaderProcess
INFO: I may be the new leader - try and sync



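As a rough sketch of the comparison Mark describes below (PeerSync compares
the version numbers of the last ~100 updates in each node's transaction log,
not the index version shown in the admin UI), something like the following
captures the "Our versions are newer" decision seen in the replica's log.
All names here are illustrative, and the real Solr code is more involved
(it uses percentile-based thresholds on the version windows), so treat this
as a sketch of the idea only:

```python
# Illustrative sketch only: function and variable names are made up, and
# Solr's actual PeerSync.handleVersions logic differs in detail.

def our_versions_are_newer(our_versions, other_versions):
    """Mimic the "Our versions are newer" decision: if the other node's
    highest recent version is older than the oldest update we still
    track, syncing from it could only replay updates we already have,
    so PeerSync declares success without pulling anything."""
    ours = sorted(our_versions, reverse=True)    # newest first
    theirs = sorted(other_versions, reverse=True)
    our_low_threshold = ours[-1]  # oldest of our recent updates
    other_high = theirs[0]        # newest of the other node's updates
    return other_high < our_low_threshold
```

Note that "sync succeeded" in this case means "no sync was needed", which is
consistent with the replica keeping its smaller document count.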
On Tue, Apr 2, 2013 at 5:09 PM, Mark Miller <markrmil...@gmail.com> wrote:

> I don't think the versions you are thinking of apply here. PeerSync does
> not look at that - it looks at version numbers for updates in the
> transaction log - it compares the last 100 of them on leader and replica.
> What it's saying is that the replica seems to have versions that the leader
> does not. Have you scanned the logs for any interesting exceptions?
>
> Did the leader change during the heavy indexing? Did any zk session
> timeouts occur?
>
> - Mark
>
> On Apr 2, 2013, at 4:52 PM, Jamie Johnson <jej2...@gmail.com> wrote:
>
> > I am currently looking at moving our Solr cluster to 4.2 and noticed a
> > strange issue while testing today.  Specifically, the replica has a
> > higher version than the master, which is causing the index to not
> > replicate.  Because of this the replica has fewer documents than the
> > master.  What could cause this, and how can I resolve it short of
> > taking down the index and scp'ing the right version in?
> >
> > MASTER:
> > Last Modified: about an hour ago
> > Num Docs: 164880
> > Max Doc: 164880
> > Deleted Docs: 0
> > Version: 2387
> > Segment Count: 23
> >
> > REPLICA:
> > Last Modified: about an hour ago
> > Num Docs: 164773
> > Max Doc: 164773
> > Deleted Docs: 0
> > Version: 3001
> > Segment Count: 30
> >
> > in the replica's log it says this:
> >
> > INFO: Creating new http client,
> >
> > config:maxConnectionsPerHost=20&maxConnections=10000&connTimeout=30000&socketTimeout=30000&retry=false
> >
> > Apr 2, 2013 8:15:06 PM org.apache.solr.update.PeerSync sync
> >
> > INFO: PeerSync: core=dsc-shard5-core2
> > url=http://10.38.33.17:7577/solr START replicas=[
> > http://10.38.33.16:7575/solr/dsc-shard5-core1/] nUpdates=100
> >
> > Apr 2, 2013 8:15:06 PM org.apache.solr.update.PeerSync handleVersions
> >
> > INFO: PeerSync: core=dsc-shard5-core2 url=http://10.38.33.17:7577/solr
> > Received 100 versions from 10.38.33.16:7575/solr/dsc-shard5-core1/
> >
> > Apr 2, 2013 8:15:06 PM org.apache.solr.update.PeerSync handleVersions
> >
> > INFO: PeerSync: core=dsc-shard5-core2 url=http://10.38.33.17:7577/solr Our
> > versions are newer. ourLowThreshold=1431233788792274944
> > otherHigh=1431233789440294912
> >
> > Apr 2, 2013 8:15:06 PM org.apache.solr.update.PeerSync sync
> >
> > INFO: PeerSync: core=dsc-shard5-core2
> > url=http://10.38.33.17:7577/solr DONE. sync succeeded
> >
> >
> > which again seems to indicate that it thinks it has a newer version of
> > the index, so it aborts.  This happened while 10 threads were indexing
> > 10,000 items into a 6-shard (1 replica each) cluster.  Any thoughts on
> > this or what I should look for would be appreciated.
>
>
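One way to sanity-check the ourLowThreshold/otherHigh values in the PeerSync
log above: if I recall the SolrCloud version scheme correctly, the high bits
of an update version encode the milliseconds-since-epoch clock at the time
the version was assigned (roughly currentTimeMillis shifted left by 20 bits,
with a counter in the low bits). That encoding is an assumption about the
internals, but if it holds, decoding shows roughly when the diverging
updates were written:

```python
from datetime import datetime, timezone

# Assumption: SolrCloud update versions pack a millisecond timestamp in
# the bits above the low 20; the low 20 bits hold an in-JVM counter.
def version_to_datetime(version):
    millis = version >> 20
    return datetime.fromtimestamp(millis / 1000.0, tz=timezone.utc)

# The two values from the replica's PeerSync log line:
print(version_to_datetime(1431233788792274944))  # ourLowThreshold
print(version_to_datetime(1431233789440294912))  # otherHigh
```

If the assumption is right, both values decode to April 2, 2013, well under
a second apart - i.e. the conflicting updates were written moments apart
during the heavy indexing run, which fits the leader-change scenario.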
