Many concurrent costly operations (range queries or faceting that require I/O, segment merges, shard allocations, etc.) are known to starve ES of threads or processing power, and this is what you are experiencing: no threads are available to take your requests.
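To make the failure mode concrete, here is a minimal toy model in Python. It is not Elasticsearch code: the class and function names are made up for illustration, and the worker count of 3 is an arbitrary assumption. It only shows how a pool whose workers never free up fills its bounded queue and then rejects everything else.

```python
import queue


class RejectedExecution(Exception):
    """Hypothetical stand-in for a 'rejected execution' error."""


class BoundedPool:
    """Toy model: a fixed number of worker slots plus a bounded wait queue.

    Workers never finish here, which models threads stuck on costly work.
    """

    def __init__(self, workers, queue_capacity):
        self.free_workers = workers
        self.pending = queue.Queue(maxsize=queue_capacity)

    def submit(self, task):
        # A free worker picks the task up immediately.
        if self.free_workers > 0:
            self.free_workers -= 1
            return "running"
        # Otherwise the task waits in the queue, if there is room.
        try:
            self.pending.put_nowait(task)
            return "queued"
        except queue.Full:
            raise RejectedExecution(
                "rejected execution (queue capacity %d)" % self.pending.maxsize
            )


def simulate(workers=3, queue_capacity=1000, requests=1010):
    """Send `requests` tasks at a pool whose workers are all stuck."""
    pool = BoundedPool(workers, queue_capacity)
    counts = {"running": 0, "queued": 0, "rejected": 0}
    for i in range(requests):
        try:
            counts[pool.submit(i)] += 1
        except RejectedExecution:
            counts["rejected"] += 1
    return counts


if __name__ == "__main__":
    print(simulate())
```

With the defaults above, 3 requests occupy the workers, 1000 sit in the queue, and the remaining 7 are rejected. Nothing is slow or CPU-bound at that point, which is why load can look low while requests are still being refused.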
The immediate solution is to run a master-only node (node.data = false), so that one node acts as master and cluster coordinator and is known never to starve of system resources. Even running this node side by side on the same server as one of the data nodes can protect you, as it doesn't have the same memory requirements etc. as a data node.

Finally, there has been (and still is) a lot of work put into this, so I strongly recommend upgrading to the latest version (currently 1.2.1).

--

Itamar Syn-Hershko
http://code972.com | @synhershko <https://twitter.com/synhershko>
Freelance Developer & Consultant
Author of RavenDB in Action <http://manning.com/synhershko/>

On Tue, Jun 3, 2014 at 1:14 PM, Florian Munz <[email protected]> wrote:

> Hello,
>
> we recently moved our ES cluster from dedicated hardware to AWS instances,
> they have less memory available, but use SSDs for the ES data directory. We
> kept JVM (1.7.0_17) and ES (0.90.9) version exactly the same. On the new
> hardware, after running a full re-index (creating a new index, pointing an
> alias to the new and one alias to the old index, sending realtime updates
> to both aliases and running a script to fill up the new index) our cluster
> gets stuck.
>
> 10 minutes after the re-index finishes and we move both aliases to the new
> index, ES stops answering any search or index queries, no errors in the
> logs apart from it not answering queries anymore:
>
> org.elasticsearch.common.util.concurrent.EsRejectedExecutionException:
> rejected execution (queue capacity 1000) on
> org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$4@172018e5
>
> CPU load is low, it doesn't look like it's doing anything expensive. A
> request to hot_threads times out. I've put the output from jstack and jmap
> here:
>
> https://gist.github.com/theflow/b983d512ea344545f7f6
>
> We tried upgrading to 0.90.13, since the changelog mentioned a problem
> with infinite loops, but same behavior.
> We're planning to upgrade to a more
> recent version of ES soon, but it'll take a bit to fully test that.
>
> Any ideas what could be causing this?
>
> thanks,
> Florian

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHTr4ZvRJK_pR6nskv8ujpH8cCRp890UZ1d8M_iU0Zi-OULO%3DQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
