On 7/9/2013 9:50 AM, Jed Glazner wrote:
I'll give you the high level before delving deep into setup etc. I have been 
struggeling at work with a seemingly random problem when solr will hang for 
10-15 minutes during updates.  This outage always seems to immediately be 
proceeded by an EOF exception on  the replica.  Then 10-15 minutes later we see 
an exception on the leader for a socket timeout to the replica.  The leader 
will then tell the replica to recover which in most cases it does and then the 
outage is over.

Here are the setup details:

We are currently using Solr 4.0.0 with an external ZK ensemble of 5 machines.

After 4.0.0 was released, a *lot* of problems with SolrCloud surfaced and have since been fixed. You're five releases and about nine months behind what's current. My recommendation: Upgrade to 4.3.1, ensure your configuration is up to date with changes to the example config between 4.0.0 and 4.3.1, and reindex. Ideally, you should set up a 4.0.0 testbed, duplicate your current problem, and upgrade the testbed to see if the problem goes away. A testbed will also give you practice for a smooth upgrade of your production system.

Thanks,
Shawn

Reply via email to