Re: Cluster gets stuck after full re-index

Itamar Syn-Hershko Wed, 04 Jun 2014 14:29:17 -0700

Many concurrent costly operations (range queries or faceting that need disk
I/O, segment merges, shard allocations etc.) are known to starve ES of
threads or processing power, and this is what you are experiencing: every
thread is busy and the search queue (capacity 1000) is full, so new requests
are rejected.
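
One way to check for this kind of starvation is to look at the per-pool
"rejected" and "queue" counters in the node stats. The snippet below is only a
sketch: it assumes you have already fetched and parsed the JSON from the node
stats API (GET /_nodes/stats/thread_pool on 1.x; the URL on 0.90 may differ)
and that the response is laid out as nodes -> node id -> thread_pool -> pool
name. The function name and the sample data are made up for illustration.

```python
# Sketch: list thread pools that have rejected work, given the parsed JSON
# of a node-stats response. Field names are assumed, not verified against
# any particular ES version.

def pools_with_rejections(stats):
    """Return (node_name, pool_name, queue, rejected) for pools with rejections."""
    found = []
    for node_id, node in stats.get("nodes", {}).items():
        name = node.get("name", node_id)
        for pool, counters in node.get("thread_pool", {}).items():
            rejected = counters.get("rejected", 0)
            if rejected > 0:
                found.append((name, pool, counters.get("queue", 0), rejected))
    # Pools with the most rejections come first.
    return sorted(found, key=lambda row: row[3], reverse=True)


# Hypothetical sample data, not taken from a real cluster.
SAMPLE_STATS = {
    "nodes": {
        "abc123": {
            "name": "data-1",
            "thread_pool": {
                "search": {"threads": 12, "queue": 1000, "active": 12, "rejected": 4821},
                "index": {"threads": 4, "queue": 0, "active": 0, "rejected": 0},
                "bulk": {"threads": 4, "queue": 50, "active": 4, "rejected": 17},
            },
        }
    }
}

if __name__ == "__main__":
    for node, pool, queue, rejected in pools_with_rejections(SAMPLE_STATS):
        print(f"{node} {pool}: queue={queue} rejected={rejected}")
```

A "search" pool whose queue sits at its capacity while "rejected" keeps
growing would match the EsRejectedExecutionException quoted below.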


The immediate solution is to run a master-only node (node.data = false), so
you have one node acting as master and cluster coordinator that is known
never to be starved of system resources. Even running this node side by side
on the same server as one of the data nodes can protect you, as it doesn't
have the same memory requirements etc. as a data node.

Finally, a lot of work has gone (and is still going) into this area, so I
strongly recommend upgrading to the latest version (currently 1.2.1).

--

Itamar Syn-Hershko
http://code972.com | @synhershko <https://twitter.com/synhershko>
Freelance Developer & Consultant
Author of RavenDB in Action <http://manning.com/synhershko/>


On Tue, Jun 3, 2014 at 1:14 PM, Florian Munz <[email protected]> wrote:

> Hello,
>
> we recently moved our ES cluster from dedicated hardware to AWS instances,
> they have less memory available, but use SSDs for the ES data directory. We
> kept JVM (1.7.0_17) and ES (0.90.9) version exactly the same. On the new
> hardware, after running a full re-index (creating a new index, pointing an
> alias to the new and one alias to the old index, sending realtime updates
> to both aliases and running a script to fill up the new index) our cluster
> gets stuck.
>
> 10 minutes after the re-index finishes and we move both aliases to the new
> index, ES stops answering any search or index queries, no errors in the
> logs apart from it not answering queries anymore:
>
> org.elasticsearch.common.util.concurrent.EsRejectedExecutionException:
> rejected execution (queue capacity 1000) on
> org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$4@172018e5
>
> CPU load is low, it doesn't look like it's doing anything expensive. A
> request to hot_threads times out. I've put the output from jstack and jmap
> here:
>
> https://gist.github.com/theflow/b983d512ea344545f7f6
>
> We tried upgrading to 0.90.13, since the changelog mentioned a problem
> with infinite loops, but same behavior. We're planning to upgrade to a more
> recent version of ES soon, but it'll take a bit to fully test that.
>
>
> Any ideas what could be causing this?
>
>
> thanks,
> Florian
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/7a347529-df1a-4a21-9ac1-d3af882a035a%40googlegroups.com
> <https://groups.google.com/d/msgid/elasticsearch/7a347529-df1a-4a21-9ac1-d3af882a035a%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>

