Hi all,

I need some help. I am load testing Elasticsearch before using it in a 
production environment. I have three EC2 instances, configured as described 
below, which together form an Elasticsearch cluster.

All three machines have the same hardware configuration:

32GB RAM
160GB SSD hard disk
8 core CPU

*Machine 01*
Elasticsearch server (16GB heap)
Elasticsearch Java client (generates a continuous load and reports it to 
ES; 4GB heap)


*Machine 02*
Elasticsearch server (16GB heap)
Elasticsearch Java client (generates a continuous load and reports it to 
ES; 4GB heap)


*Machine 03*
Elasticsearch server (16GB heap)
Elasticsearch Java client (queries ES continuously; 1GB heap)


Note that the two indexing clients together generate around 20K records per 
second and send them in bulk requests with an average size of 25 records. 
The third client runs only one query per second. My documents have the 
following format.

{
    "_index": "my_index",
    "_type": "my_type",
    "_id": "7334236299916134105",
    "_score": 3.6111107,
    "_source": {
        "long_1": 96186289301793,
        "long_2": 7334236299916134000,
        "string_1": "random_string",
        "long_3": 96186289301793,
        "string_2": "random_string",
        "string_3": "random_string",
        "string_4": "random_string",
        "string_5": "random_string",
        "long_4": 5457314198948537000
    }
}

The problem is that, after a few minutes, Elasticsearch reports errors like 
these in the logs.

[2015-02-24 08:03:58,070][ERROR][marvel.agent.exporter    ] [Gateway] 
create failure (index:[.marvel-2015.02.24] type: [cluster_stats]): 
RemoteTransportException[[Marvel 
Girl][inet[/10.167.199.140:9300]][bulk/shard]]; nested: 
EsRejectedExecutionException[rejected execution (queue capacity 50) on 
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1@76dbf01];

[2015-02-25 04:23:36,459][ERROR][marvel.agent.exporter    ] [Wildside] 
create failure (index:[.marvel-2015.02.25] type: [index_stats]): 
UnavailableShardsException[[.marvel-2015.02.25][0] [2] shardIt, [0] active 
: Timeout waiting for [1m], request: 
org.elasticsearch.action.bulk.BulkShardRequest@2e7693b7]

Note that this error happens for different indices and different types.
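
To put the "queue capacity 50" rejection in context, here is a rough sketch of the arithmetic behind the load described above. It only uses the figures from this post (20K records per second, 25 records per bulk). The class name, the method name and the bulk size of 1000 are hypothetical and chosen purely for illustration.

```java
// Back-of-the-envelope load figures based on the numbers in this post.
final class BulkLoadMath {

    /** Bulk requests per second needed to carry docsPerSecond at a given bulk size. */
    static long bulkRequestsPerSecond(long docsPerSecond, long docsPerBulk) {
        // Round up: a partially filled bulk is still one request.
        return (docsPerSecond + docsPerBulk - 1) / docsPerBulk;
    }

    public static void main(String[] args) {
        long docsPerSecond = 20_000; // both indexing clients combined

        // Bulk size used in this test: prints 800
        System.out.println(bulkRequestsPerSecond(docsPerSecond, 25));

        // Hypothetical larger bulk size, for comparison only: prints 20
        System.out.println(bulkRequestsPerSecond(docsPerSecond, 1_000));
    }
}
```

So the cluster receives roughly 800 bulk requests per second in this test. The same document rate with larger bulks would need far fewer requests, which may matter when the bulk queue holds only 50 entries.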

A few minutes later, the Elasticsearch clients start getting 
NoNodeAvailableException. I assume that is because the Elasticsearch cluster 
is malfunctioning due to the errors above. Eventually the clients fail with a 
"java.lang.OutOfMemoryError: GC overhead limit exceeded" error.

I did some profiling and found that the cause of this OutOfMemoryError is a 
steadily growing number of org.elasticsearch.action.index.IndexRequest 
instances. I even tried "index.store.type: memory", and the Elasticsearch 
cluster still cannot build the indices at the required rate.
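
The growing number of IndexRequest instances suggests that requests are created faster than the cluster acknowledges them. Below is a minimal sketch of one way to bound that: cap the number of bulk requests in flight with a semaphore, so the producer blocks instead of queuing more work on the heap. This is plain Java with no Elasticsearch dependency. `BoundedBulkSender` and the `sendBulk` callback are hypothetical names, and the callback is only a stand-in for whatever blocking bulk call the real client makes.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: limit the number of outstanding bulk requests so that pending
// requests cannot pile up on the client heap when the cluster falls behind.
final class BoundedBulkSender {
    private final Semaphore permits;
    private final ExecutorService pool = Executors.newCachedThreadPool();
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger maxObserved = new AtomicInteger();

    BoundedBulkSender(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    /** Blocks the caller while maxInFlight bulks are still outstanding. */
    void submit(Runnable sendBulk) throws InterruptedException {
        permits.acquire(); // back-pressure: wait for a free slot
        try {
            pool.execute(() -> {
                int now = inFlight.incrementAndGet();
                maxObserved.accumulateAndGet(now, Math::max);
                try {
                    sendBulk.run(); // stand-in for the real, blocking bulk call
                } finally {
                    inFlight.decrementAndGet();
                    permits.release(); // free the slot even if the call failed
                }
            });
        } catch (RuntimeException e) {
            permits.release(); // task was never scheduled, return the slot
            throw e;
        }
    }

    /** Highest number of concurrently running bulks seen so far. */
    int maxObservedInFlight() {
        return maxObserved.get();
    }

    /** Stops accepting work and waits up to 30 seconds for running bulks. */
    void shutdown() throws InterruptedException {
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
    }
}
```

With the real client, `sendBulk` would wrap a blocking call such as `client.prepareBulk()...execute().actionGet()`. As far as I understand, the Java client's `BulkProcessor` offers a similar cap through `setConcurrentRequests`, but I have not verified how it behaves under this load.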

Please point out any tuning parameters or other ways to get rid of these 
issues, or suggest a different way to index and query this amount of load.


Thanks
Malaka
