Did you upgrade recently to Solr 4.7? 4.7 has a bad bug which can
cause out of memory issues. Can you check your logs for out of memory
errors?
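
If it helps, here is a minimal sketch for scanning a log file for those errors. It assumes the JVM's standard "java.lang.OutOfMemoryError" text shows up in the Solr log; the helper names (find_oom_lines, scan_file) and any path you pass in are just placeholders for illustration, not anything that ships with Solr.

```python
OOM_MARKER = "java.lang.OutOfMemoryError"


def find_oom_lines(lines):
    """Return (line_number, text) pairs for lines that mention an OutOfMemoryError."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if OOM_MARKER in line:
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path):
    """Scan one log file; 'path' is whatever your Solr log location is (assumed)."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_oom_lines(handle)
```

Calling scan_file() on each node's log and checking whether any hit falls shortly before the first "Stopping recovery" warning should show whether the heap is the trigger.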

On Sun, Mar 23, 2014 at 9:07 PM, Lukas Mikuckis <lukasmikuc...@gmail.com> wrote:
> Solr version: 4.7
>
> Architecture:
> 2 solrs (1 shard, leader + replica)
> 3 zookeepers
>
> Servers:
> * zookeeper + solr (heap 4gb) - RAM 8gb, 2 cpu cores
> * zookeeper + solr (heap 4gb) - RAM 8gb, 2 cpu cores
> * zookeeper
>
> Solr data:
> * 21 collections
> * Many fields, small docs, docs count per collection from 1k to 500k
>
> About a week ago Solr started crashing. It crashes every day, 3-4 times a
> day, usually at night. I can't tell what it could be related to, because
> we hadn't made any configuration changes at that time. The load hasn't
> changed either.
>
>
> Everything starts with "Stopping recovery for .." warnings (every warning is
> repeated several times):
>
> WARN  org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for
> zkNodeName=core_node1core=******************
>
> WARN  org.apache.solr.cloud.ElectionContext; cancelElection did not find
> election node to remove
>
> WARN  org.apache.solr.update.PeerSync; no frame of reference to tell if
> we've missed updates
>
> WARN  - 2014-03-23 04:00:26.286; org.apache.solr.update.PeerSync; no frame
> of reference to tell if we've missed updates
>
> WARN  - 2014-03-23 04:00:30.728; org.apache.solr.handler.SnapPuller; File
> _f9m_Lucene41_0.doc expected to be 6218278 while it is 7759879
>
> WARN  - 2014-03-23 04:00:54.126;
> org.apache.solr.update.UpdateLog$LogReplayer; Starting log replay
> tlog{file=/path/solr/collection1_shard1_replica2/data/tlog/tlog.0000000000000003272
> refcount=2} active=true starting pos=356216606
>
> Then again Stopping recovery for .. warnings:
>
> WARN  org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for
> zkNodeName=core_node1core=******************
>
> ERROR - 2014-03-23 05:19:29.566; org.apache.solr.common.SolrException;
> org.apache.solr.common.SolrException: No registered leader was found after
> waiting for 4000ms , collection: collection1 slice: shard1
>
> ERROR - 2014-03-23 05:20:03.961; org.apache.solr.common.SolrException;
> org.apache.solr.common.SolrException: I was asked to wait on state down for
> IP:PORT_solr but I still do not see the requested state. I see state:
> active live:false
>
>
> After this the servers mostly didn't recover.



-- 
Regards,
Shalin Shekhar Mangar.
