Sorry for spamming here...

shard5-core2 is the instance we're having issues with...

Apr 2, 2013 7:27:14 PM org.apache.solr.common.SolrException log
SEVERE: shard update error StdNode: http://10.38.33.17:7577/solr/dsc-shard5-core2/:org.apache.solr.common.SolrException: Server at http://10.38.33.17:7577/solr/dsc-shard5-core2 returned non ok status:503, message:Service Unavailable
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:373)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
        at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:332)
        at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:306)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)


On Tue, Apr 2, 2013 at 5:43 PM, Jamie Johnson <jej2...@gmail.com> wrote:

> Here is another one that looks interesting:
>
> Apr 2, 2013 7:27:14 PM org.apache.solr.common.SolrException log
> SEVERE: org.apache.solr.common.SolrException: ClusterState says we are the leader, but locally we don't think so
>         at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:293)
>         at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:228)
>         at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:339)
>         at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
>         at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:246)
>         at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
>         at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
>         at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
>         at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:637)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:343)
>
>
>
> On Tue, Apr 2, 2013 at 5:41 PM, Jamie Johnson <jej2...@gmail.com> wrote:
>
>> Looking at the master, it appears that at some point some shards went down.
>> I am seeing things like what is below.
>>
>> INFO: A cluster state change: WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/live_nodes, has occurred - updating... (live nodes size: 12)
>> Apr 2, 2013 8:12:52 PM org.apache.solr.common.cloud.ZkStateReader$3 process
>> INFO: Updating live nodes... (9)
>> Apr 2, 2013 8:12:52 PM org.apache.solr.cloud.ShardLeaderElectionContext runLeaderProcess
>> INFO: Running the leader process.
>> Apr 2, 2013 8:12:52 PM org.apache.solr.cloud.ShardLeaderElectionContext shouldIBeLeader
>> INFO: Checking if I should try and be the leader.
>> Apr 2, 2013 8:12:52 PM org.apache.solr.cloud.ShardLeaderElectionContext shouldIBeLeader
>> INFO: My last published State was Active, it's okay to be the leader.
>> Apr 2, 2013 8:12:52 PM org.apache.solr.cloud.ShardLeaderElectionContext runLeaderProcess
>> INFO: I may be the new leader - try and sync
>>
>>
>>
>>> On Tue, Apr 2, 2013 at 5:09 PM, Mark Miller <markrmil...@gmail.com> wrote:
>>
>>> I don't think the versions you are thinking of apply here. PeerSync does
>>> not look at that - it looks at version numbers for updates in the
>>> transaction log - it compares the last 100 of them on leader and replica.
>>> What it's saying is that the replica seems to have versions that the leader
>>> does not. Have you scanned the logs for any interesting exceptions?
>>>
>>> Did the leader change during the heavy indexing? Did any zk session
>>> timeouts occur?
>>>
>>> - Mark
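
The check Mark describes above compares each core's window of recent update versions from the transaction log. Below is a rough conceptual sketch of that comparison; it is illustrative only (not the actual org.apache.solr.update.PeerSync code), and the class and method names in it are made up:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Conceptual sketch of a PeerSync-style check. Illustrative only, not Solr's
// real implementation. "Versions" here are the per-update version numbers kept
// in the transaction log, not the index version shown on the core admin page.
public class PeerSyncSketch {

    // Size of the recent-update window exchanged between cores (the nUpdates=100 in the logs).
    static final int N_UPDATES = 100;

    // Versions the peer reported that are absent from our own recent window.
    // If this comes back empty there is nothing to fetch and the sync finishes
    // without copying anything; otherwise the missing updates would be
    // requested from the peer.
    static List<Long> missingFromUs(List<Long> ourRecentVersions, List<Long> peerRecentVersions) {
        Set<Long> ours = new HashSet<Long>(ourRecentVersions);
        List<Long> missing = new ArrayList<Long>();
        for (Long v : peerRecentVersions) {
            if (!ours.contains(v)) {
                missing.add(v);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<Long> ours = Arrays.asList(105L, 104L, 103L, 102L, 101L);
        List<Long> peer = Arrays.asList(106L, 105L, 104L);
        System.out.println("missing from us: " + missingFromUs(ours, peer)); // prints [106]
    }
}

The real PeerSync also reasons about how the two version windows line up against each other, which is where the ourLowThreshold and otherHigh values in the replica's log further down come from.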
>>>
>>> On Apr 2, 2013, at 4:52 PM, Jamie Johnson <jej2...@gmail.com> wrote:
>>>
>>> > I am currently looking at moving our Solr cluster to 4.2 and noticed a
>>> > strange issue while testing today.  Specifically, the replica has a higher
>>> > version than the master, which is causing the index to not replicate.
>>> > Because of this the replica has fewer documents than the master.  What
>>> > could cause this, and how can I resolve it short of taking down the index
>>> > and scp'ing the right version in?
>>> >
>>> > MASTER:
>>> > Last Modified:about an hour ago
>>> > Num Docs:164880
>>> > Max Doc:164880
>>> > Deleted Docs:0
>>> > Version:2387
>>> > Segment Count:23
>>> >
>>> > REPLICA:
>>> > Last Modified: about an hour ago
>>> > Num Docs:164773
>>> > Max Doc:164773
>>> > Deleted Docs:0
>>> > Version:3001
>>> > Segment Count:30
>>> >
>>> > in the replica's log it says this:
>>> >
>>> > INFO: Creating new http client, config:maxConnectionsPerHost=20&maxConnections=10000&connTimeout=30000&socketTimeout=30000&retry=false
>>> >
>>> > Apr 2, 2013 8:15:06 PM org.apache.solr.update.PeerSync sync
>>> >
>>> > INFO: PeerSync: core=dsc-shard5-core2 url=http://10.38.33.17:7577/solr START replicas=[http://10.38.33.16:7575/solr/dsc-shard5-core1/] nUpdates=100
>>> >
>>> > Apr 2, 2013 8:15:06 PM org.apache.solr.update.PeerSync handleVersions
>>> >
>>> > INFO: PeerSync: core=dsc-shard5-core2 url=http://10.38.33.17:7577/solr Received 100 versions from 10.38.33.16:7575/solr/dsc-shard5-core1/
>>> >
>>> > Apr 2, 2013 8:15:06 PM org.apache.solr.update.PeerSync handleVersions
>>> >
>>> > INFO: PeerSync: core=dsc-shard5-core2 url=http://10.38.33.17:7577/solr Our versions are newer. ourLowThreshold=1431233788792274944 otherHigh=1431233789440294912
>>> >
>>> > Apr 2, 2013 8:15:06 PM org.apache.solr.update.PeerSync sync
>>> >
>>> > INFO: PeerSync: core=dsc-shard5-core2 url=http://10.38.33.17:7577/solr DONE. sync succeeded
>>> >
>>> >
>>> > which again seems to indicate that it thinks it has a newer version of the
>>> > index, so it aborts.  This happened while 10 threads were indexing 10,000
>>> > items into a 6-shard (1 replica each) cluster.  Any thoughts on this, or
>>> > what I should look for, would be appreciated.
>>>
>>>
>>
>
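
For completeness, the indexing load described above (10 threads writing to the 6-shard cluster) was roughly of the following shape. This is only an illustrative SolrJ 4.x sketch; the ZooKeeper address, collection name, and field names are placeholders rather than the real setup:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

// Illustrative sketch of the indexing load: several threads adding documents
// through a single CloudSolrServer pointed at the cluster's ZooKeeper ensemble.
// The ZooKeeper address, collection name, and fields below are placeholders.
public class IndexLoadSketch {

    public static void main(String[] args) throws Exception {
        final CloudSolrServer server = new CloudSolrServer("zkhost1:2181,zkhost2:2181,zkhost3:2181");
        server.setDefaultCollection("dsc");  // placeholder collection name

        final int threads = 10;
        // Assumes 10,000 documents per thread; adjust if the 10,000 was a total.
        final int docsPerThread = 10000;

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int threadId = t;
            pool.submit(new Runnable() {
                public void run() {
                    try {
                        for (int i = 0; i < docsPerThread; i++) {
                            SolrInputDocument doc = new SolrInputDocument();
                            doc.addField("id", threadId + "-" + i);
                            doc.addField("text", "sample document " + i);  // placeholder field
                            server.add(doc);
                        }
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);

        server.commit();
        server.shutdown();
    }
}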
