Hey Daniel,

Thanks for the response.  I think we'll give this a try to see if this
helps.

Jed.

On 7/10/13 10:48 AM, "Daniel Collins" <danwcoll...@gmail.com> wrote:

>We had something similar in terms of update times suddenly spiking up for
>no obvious reason.  We never got quite as bad as you in terms of the other
>knock-on effects, but we certainly saw updates jumping from 10ms up to
>30000ms, all our external queues backed up and we rejected some updates,
>then after a while things quietened down.
>
>We were running Solr 4.3.0 but with Java 6 and the CMS GC.  We swapped to
>Java 7, G1 GC (and increased heap size from 8GB to 12GB) and the problem
>went away.
>
>Now, I admit it's not exactly the same as your case, since we never had the
>follow-on effects, but I'd consider Java 7 and the G1 GC; it has certainly
>reduced the "spikes" in our indexing times.
>
>We run the following settings now (the usual caveats apply, it might not
>work for you).
>
>    GC_OPTIONS="-XX:+AggressiveOpts -XX:+UseG1GC -XX:+UseStringCache -XX:+OptimizeStringConcat -XX:-UseSplitVerifier -XX:+UseNUMA -XX:MaxGCPauseMillis=50 -XX:GCPauseIntervalMillis=1000"
>
>I set MaxGCPauseMillis/GCPauseIntervalMillis to try to minimise
>application pauses; that's our goal.  If we have to use more memory in the
>short term then so be it, but we can't afford application pauses,
>because we are using NRT (soft commits every 1s, hard commits every 60s)
>and we get a lot of updates.
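
A minimal sketch of how the settings quoted above might be wired into a Solr 4.x start command. It is only an illustration: the names SOLR_HEAP, SOLR_COMMIT_OPTS and JAVA_OPTS are hypothetical, setting -Xms equal to -Xmx is an assumption, and the two -D properties only take effect if solrconfig.xml reads them (for example via ${solr.autoCommit.maxTime:15000} and ${solr.autoSoftCommit.maxTime:-1} placeholders). The start.jar launcher assumes the Jetty bundled with the Solr 4.x example.

```shell
#!/bin/sh
# Sketch only: SOLR_HEAP, SOLR_COMMIT_OPTS and JAVA_OPTS are hypothetical names.

# GC flags exactly as quoted in the message above.
GC_OPTIONS="-XX:+AggressiveOpts -XX:+UseG1GC -XX:+UseStringCache -XX:+OptimizeStringConcat -XX:-UseSplitVerifier -XX:+UseNUMA -XX:MaxGCPauseMillis=50 -XX:GCPauseIntervalMillis=1000"

# 12 GB heap as mentioned in the message; pinning -Xms to -Xmx is an assumption.
SOLR_HEAP="-Xms12g -Xmx12g"

# Soft commit every 1 s, hard commit every 60 s (values in milliseconds).
# Assumes solrconfig.xml picks these up through property placeholders.
SOLR_COMMIT_OPTS="-Dsolr.autoSoftCommit.maxTime=1000 -Dsolr.autoCommit.maxTime=60000"

JAVA_OPTS="$SOLR_HEAP $GC_OPTIONS $SOLR_COMMIT_OPTS"

# Print the command instead of running it, so it can be reviewed first.
echo "java $JAVA_OPTS -jar start.jar"
```

Printing the command first makes it easy to try the flags on a test node before touching production; the usual caveat from the message applies, so the pause targets may need different values for another workload.
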
>
>I know there have been other discussions on G1 and it has received mixed
>results overall, but for us, it seems to be a winner.
>
>Hope that helps,
>
>
>On 10 July 2013 08:32, Jed Glazner <jglaz...@adobe.com> wrote:
>
>> We are planning an upgrade to 4.4 but it's still weeks out. We offer a
>> high availability search service and there are a number of changes in 4.4
>> that are not backward compatible (e.g. clusterstate.json and no solr.xml),
>> so there must be lots of testing. Additionally, this upgrade cannot be
>> performed without downtime.
>>
>> Regardless, I need to find a band-aid right now.  Does anyone know if it's
>> possible to set the timeout for distributed update requests to/from the
>> leader?  Currently we see it's set to 0.  Maybe via a -D startup param, or
>> something?
>>
>> Jed
>>
>> On 7/10/13 1:23 AM, "Otis Gospodnetic" <otis.gospodne...@gmail.com> wrote:
>>
>> >Hi Jed,
>> >
>> >This is really with Solr 4.0?  If so, it may be wiser to jump on 4.4
>> >that is about to be released.  We did not have fun working with 4.0 in
>> >SolrCloud mode a few months ago.  You will save time, hair, and money
>> >if you convince your manager to let you use Solr 4.4. :)
>> >
>> >Otis
>> >--
>> >Solr & ElasticSearch Support -- http://sematext.com/
>> >Performance Monitoring -- http://sematext.com/spm
>> >
>> >
>> >
>> >On Tue, Jul 9, 2013 at 4:44 PM, Jed Glazner <jglaz...@adobe.com> wrote:
>> >> Hi Shawn,
>> >>
>> >> I have been trying to duplicate this problem without success for the
>> >> last 2 weeks, which is one reason I'm getting flustered.  It seems
>> >> reasonable to be able to duplicate it but I can't.
>> >>
>> >> We do have a story to upgrade but it is still weeks if not months
>> >> before that gets rolled out to production.
>> >>
>> >> We have another cluster running the same version but with 8 shards and
>> >> 8 replicas, with each shard at 100GB, and more load and more indexing
>> >> requests, without this problem.  But we send docs in batches there and
>> >> all fields are stored, whereas the trouble index has only 1 or 2 stored
>> >> fields and we only send docs 1 at a time.
>> >>
>> >> Could that have anything to do with it?
>> >>
>> >> Jed
>> >>
>> >>
>> >> Sent from Samsung Mobile
>> >>
>> >>
>> >>
>> >> -------- Original message --------
>> >> From: Shawn Heisey <s...@elyograg.org>
>> >> Date: 07.09.2013 18:33 (GMT+01:00)
>> >> To: solr-user@lucene.apache.org
>> >> Subject: Re: Solr Hangs During Updates for over 10 minutes
>> >>
>> >>
>> >> On 7/9/2013 9:50 AM, Jed Glazner wrote:
>> >>> I'll give you the high level before delving deep into setup etc. I
>> >>> have been struggling at work with a seemingly random problem where
>> >>> Solr will hang for 10-15 minutes during updates.  This outage always
>> >>> seems to be immediately preceded by an EOF exception on the replica.
>> >>> Then 10-15 minutes later we see an exception on the leader for a socket
>> >>> timeout to the replica.  The leader will then tell the replica to
>> >>> recover, which in most cases it does, and then the outage is over.
>> >>>
>> >>> Here are the setup details:
>> >>>
>> >>> We are currently using Solr 4.0.0 with an external ZK ensemble of 5
>> >>> machines.
>> >>
>> >> After 4.0.0 was released, a *lot* of problems with SolrCloud surfaced
>> >> and have since been fixed.  You're five releases and about nine months
>> >> behind what's current.  My recommendation: Upgrade to 4.3.1, ensure your
>> >> configuration is up to date with changes to the example config between
>> >> 4.0.0 and 4.3.1, and reindex.  Ideally, you should set up a 4.0.0
>> >> testbed, duplicate your current problem, and upgrade the testbed to see
>> >> if the problem goes away.  A testbed will also give you practice for a
>> >> smooth upgrade of your production system.
>> >>
>> >> Thanks,
>> >> Shawn
>> >>
>>
>>
