Maybe it is not OOM but running out of file descriptors; that can only be seen in the stack trace.
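If you want to watch that from inside the client JVM, here is a minimal sketch (the class name FdMonitor is made up); it assumes an Oracle/OpenJDK JVM on Unix, since the com.sun.management cast is not available everywhere. lsof -p <pid> gives you the same numbers from outside:

    import java.lang.management.ManagementFactory;
    import java.lang.management.OperatingSystemMXBean;

    public class FdMonitor {
        public static void main(String[] args) throws InterruptedException {
            OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
            if (!(os instanceof com.sun.management.UnixOperatingSystemMXBean)) {
                System.err.println("This JVM does not expose file descriptor counts.");
                return;
            }
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            while (true) {
                // A count climbing steadily towards the max while the cluster
                // is red points to leaked sockets, not a leaked heap.
                System.out.println("open fds: " + unix.getOpenFileDescriptorCount()
                        + " / max: " + unix.getMaxFileDescriptorCount());
                Thread.sleep(5000L);
            }
        }
    }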
TransportClient, by default, tries to reconnect quite aggressively, so it would help the analysis if you could monitor the number of open network ports while the OOM happens. Maybe you have sniff mode enabled (client.transport.sniff; note it is off by default) and the TransportClient retries are consuming all ports in despair... you can switch the log level to debug to find out what the TransportClient is doing.

To remedy the red-cluster situation a little, you can increase the client.transport.ping_timeout from the default 5s to, say, 30s (this will not fix the cause of the problem, of course, it only delays the OOM). You should also pay attention to the number of responding shards in each search response, so you can bail out with a fatal error when too few shards answered. And of course you could check the cluster health color once a minute or so. Rough sketches of all three follow.
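Raising the client timeouts is a settings matter when the TransportClient is built. A minimal sketch against the ES 1.x Java API; "my-cluster" and "es-host" are placeholders for your values:

    import org.elasticsearch.client.transport.TransportClient;
    import org.elasticsearch.common.settings.ImmutableSettings;
    import org.elasticsearch.common.settings.Settings;
    import org.elasticsearch.common.transport.InetSocketTransportAddress;

    Settings settings = ImmutableSettings.settingsBuilder()
            .put("cluster.name", "my-cluster")                      // placeholder
            .put("client.transport.ping_timeout", "30s")            // default is 5s
            .put("client.transport.nodes_sampler_interval", "30s")  // default is 5s
            .build();
    TransportClient client = new TransportClient(settings)
            .addTransportAddress(new InetSocketTransportAddress("es-host", 9300));

The shard-count check and the periodic color check use the standard SearchResponse and cluster health APIs. The helper names and the "less than half" threshold below are only assumptions, pick your own:

    import org.elasticsearch.action.admin.cluster.health.ClusterHealthResponse;
    import org.elasticsearch.action.admin.cluster.health.ClusterHealthStatus;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.Client;

    // Bail out when too few shards answered a search.
    static void failOnPartialResults(SearchResponse response) {
        int total = response.getTotalShards();
        int successful = response.getSuccessfulShards();
        if (successful * 2 < total) {  // assumed threshold: less than half
            throw new IllegalStateException("only " + successful + " of "
                    + total + " shards responded");
        }
    }

    // Call this once a minute or so; stop querying while the cluster is red
    // instead of letting every search run into the timeout.
    static boolean clusterIsRed(Client client) {
        ClusterHealthResponse health =
                client.admin().cluster().prepareHealth().execute().actionGet();
        return health.getStatus() == ClusterHealthStatus.RED;
    }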
Jörg

On Mon, Jun 23, 2014 at 10:29 PM, Santiago Ferrer Deheza <[email protected]> wrote:

> The strange thing is that it happens when the cluster status is red (that's
> why I think the client is the problem).
>
> This is my code:
>
> ElasticResponse response;
>
> try {
>     if (category != null && !category.isEmpty()) {
>         SearchRequestBuilder searchQuery = client
>             .prepareSearch(ConfigFactory.load().getString("elasticsearch.updater.index"))
>             .setSearchType(SearchType.QUERY_THEN_FETCH)
>             .setSize(numberOfAds);
>
>         BoolQueryBuilder qb = QueryBuilders.boolQuery();
>         FunctionScoreQueryBuilder functionQueryBuilder = createFunctionScore(external, qb);
>
>         List<Map<String, Object>> ads = new ArrayList<Map<String, Object>>();
>         String categPath = Category.getCategoryIdPath(category);
>         Deque<SearchCategory> categories = new LinkedList<ElasticSearch.SearchCategory>();
>         for (String categoryId : categPath.split("_")) {
>             categories.addFirst(new SearchCategory(categoryId, false, categPath));
>         }
>
>         functionQueryBuilder = functionQueryBuilder.boostMode(CombineFunction.MULT);
>         functionQueryBuilder = functionQueryBuilder.scoreMode("max");
>         fillWithFunctions(functionQueryBuilder, categories, INITIAL_BOOST);
>
>         SearchResponse searchResponse = searchQuery
>             .setTypes(categPath.split("_")[0])
>             .setQuery(functionQueryBuilder)
>             .execute()
>             .actionGet(50, TimeUnit.MILLISECONDS);
>
>         SearchHits hits = searchResponse.getHits();
>         Iterator<SearchHit> it = hits.iterator();
>         int count = 0;
>         while (it.hasNext() && count < numberOfAds) {
>             ads.add(it.next().sourceAsMap());
>             count++;
>         }
>
>         if (ads.isEmpty()) {
>             Logger.info("ES - No ads found. Category: " + category);
>             response = new ElasticResponse(NO_CONTENT);
>         } else {
>             response = new ElasticResponse(ads, OK, position);
>         }
>     } else {
>         Logger.info("ES - No category sent.");
>         response = new ElasticResponse(BAD_REQUEST);
>     }
> } catch (ElasticsearchTimeoutException e) {
>     Logger.info("ES - Timeout.", e);
>     MetricsManager.getElasticsearchMetrics().incrementTimeoutCounter();
>     response = new ElasticResponse(REQUEST_TIMEOUT);
> } catch (Exception e) {
>     Logger.error("Searching in elasticsearch", e);
>     response = new ElasticResponse(INTERNAL_SERVER_ERROR);
> }
>
> In the code you can see we define a timeout when executing *actionGet*. This
> works fine when the cluster is OK (we have a tight SLA), but when the ES
> cluster goes down the timeout is not honored, and our SLA is blown.
>
> Thanks!
>
> On Monday, June 23, 2014 4:07:06 PM UTC-3, Jörg Prante wrote:
>
>> Most likely you have memory leaks in your app and your client memory was
>> exhausted.
>>
>> If you can show the client code, how you submit queries and process
>> responses, and the stack traces you receive, it would be possible to offer
>> more help.
>>
>> A general hint is to switch to Java 7.
>>
>> Jörg
>>
>> On Mon, Jun 23, 2014 at 8:14 PM, Santiago Ferrer Deheza <
>> [email protected]> wrote:
>>
>>> Hi there!
>>>
>>> I'm getting this exception (*'java.lang.OutOfMemoryError: GC overhead
>>> limit exceeded'*) *in the client* when my ES 1.1.1 cluster goes down. I'm
>>> having problems with the cluster (work in progress), but it doesn't seem
>>> right that the client server throws *OutOfMemoryError*.
>>>
>>> Client specs:
>>>
>>> - Java 6u32
>>> - Ubuntu 12.04 LTS
>>> - Elasticsearch 1.1.1 jar
>>>
>>> The client is only used for searching. Any clue? If more information is
>>> needed, just let me know.
>>>
>>> Thanks,
>>> Santi!
