I don't see any signs of GC in the logs or anywhere else. Shouldn't there be high CPU usage in that case?
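
As a sanity check, something like this should show GC activity directly, assuming
the startup script picks up ES_JAVA_OPTS; the log path and <es-pid> are just
placeholders:

    # standard JDK 7 HotSpot GC logging flags for the ES JVM
    ES_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/elasticsearch/gc.log"

    # or sample GC counters every second for the running ES process
    jstat -gcutil <es-pid> 1000

If full GCs were the problem, I'd expect the FGC/FGCT columns from jstat to keep
climbing, even with low overall CPU.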

We moved from 4 nodes to 2, and from 2 replicas to 1.


Cheers,
Florian

On 03.06.14 12:36, Mark Walkom wrote:
Am I reading that right: you're basically at 100% heap usage? If that is
the case, then it'd be GC that's killing you.

Did you add more nodes when you moved to AWS or do you have the same number?
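
A quick way to confirm, assuming the 0.90-style node stats flags, is something
like:

    curl -s 'http://localhost:9200/_nodes/stats?jvm=true&pretty'

and then compare the jvm.mem heap usage against the configured maximum on each
node.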

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [email protected] <mailto:[email protected]>
web: www.campaignmonitor.com <http://www.campaignmonitor.com>


On 3 June 2014 20:27, Florian Munz <[email protected]> wrote:

    Other than running jmap -heap, I didn't manage to look into it more
    specifically:

    https://gist.github.com/theflow/b983d512ea344545f7f6#file-jmap

    The same process runs fine on much smaller machines in our staging
    environment, without the live traffic, of course.

    Anything particular I should run that would give more insights?


    Cheers,
    Florian


    On Tuesday, June 3, 2014 12:21:32 PM UTC+2, Mark Walkom wrote:

        How does your heap look during all this?

        Regards,
        Mark Walkom

        Infrastructure Engineer
        Campaign Monitor
        email: [email protected]
        web: www.campaignmonitor.com


        On 3 June 2014 20:14, Florian Munz <[email protected]> wrote:

            Hello,

            We recently moved our ES cluster from dedicated hardware to
            AWS instances; they have less memory available, but use SSDs
            for the ES data directory. We kept the JVM (1.7.0_17) and ES
            (0.90.9) versions exactly the same. On the new hardware,
            after running a full re-index (creating a new index, pointing
            one alias to the new index and one to the old, sending
            realtime updates to both aliases, and running a script to
            fill up the new index), our cluster gets stuck.
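
            The alias moves are done with the standard _aliases actions
            API, roughly like this (index and alias names below are
            placeholders, not our real ones):

                curl -XPOST 'http://localhost:9200/_aliases' -d '{
                  "actions": [
                    { "remove": { "index": "items_v1", "alias": "items" } },
                    { "add":    { "index": "items_v2", "alias": "items" } }
                  ]
                }'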

            Ten minutes after the re-index finishes and we move both
            aliases to the new index, ES stops answering any search or
            index queries. There are no errors in the logs other than
            these rejections once it stops answering:

            
            org.elasticsearch.common.util.concurrent.EsRejectedExecutionException:
            rejected execution (queue capacity 1000) on
            org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$4@172018e5
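
            If I read that error correctly, it means the search thread
            pool's queue (capacity 1000, the default) is full and new
            requests are being rejected. That queue size would be set by
            something like this in elasticsearch.yml, though we haven't
            changed it from the default:

                threadpool.search.queue_size: 1000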

            CPU load is low; it doesn't look like the node is doing
            anything expensive. A request to hot_threads times out. I've
            put the output from jstack and jmap here:

            https://gist.github.com/theflow/b983d512ea344545f7f6
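
            For completeness, these are roughly the commands involved;
            <es-pid> stands for the ES process id:

                curl -s 'http://localhost:9200/_nodes/hot_threads'   # this is the request that times out
                jstack <es-pid> > jstack.out
                jmap -heap <es-pid> > jmap.out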

            We tried upgrading to 0.90.13, since the changelog mentioned
            a problem with infinite loops, but we saw the same behavior.
            We're planning to upgrade to a more recent version of ES
            soon, but it will take a while to fully test that.


            Any ideas what could be causing this?


            thanks,
            Florian






