A simple cluster: 2 nodes, 1 replica. Each node has 1.5 GB of RAM, 2 cores, and SAS 
disks.
With Elasticsearch 1.1, some disconnections appeared, along with CPU load spikes. 
Logstash was being used badly (lots of tiny bulk imports, with monthly indices). 
The Logstash usage was fixed, and Elasticsearch was upgraded to 1.3: 1.3.3, then 
1.3.4 ten minutes later.
CPU usage is now at 100% (so one core fully used), LOTS of file descriptors are 
open, and memory usage keeps growing. RAM was upgraded to 2 GB.
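
For reference, the open file descriptor count and heap usage can be watched per 
node with the node stats API. A minimal Python sketch, assuming the cluster 
answers on localhost:9200 and the requests library is installed (field names 
follow the 1.x node stats response):

import requests

# Ask only for the process and jvm sections of the node stats.
resp = requests.get("http://localhost:9200/_nodes/stats/process,jvm")
resp.raise_for_status()

for node_id, node in resp.json()["nodes"].items():
    name = node.get("name", node_id)
    fds = node["process"]["open_file_descriptors"]
    heap = node["jvm"]["mem"]["heap_used_percent"]
    print("%s: %d open file descriptors, heap %d%% used" % (name, fds, heap))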

Running strace shows that 5 threads use lots of CPU and 1 thread does 7000 stat() 
calls per second.

The Elasticsearch hot threads API shows lots of time in FSDirectory.listAll. Disk 
usage is low, just a lot of stat() calls.
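
A minimal sketch of how the hot threads output can be pulled (same localhost:9200 
and requests assumptions); FSDirectory.listAll shows up in the plain-text 
response:

import requests

# The hot threads API returns plain text, one section per node.
resp = requests.get(
    "http://localhost:9200/_nodes/hot_threads",
    params={"threads": 5, "interval": "1s"},
)
resp.raise_for_status()
print(resp.text)  # look for org.apache.lucene.store.FSDirectory.listAll here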

The shard count is set to 9 per index, and Logstash opens lots of indices: 2286 
shards for 7 GB of data, and 37487 files in the indices folder.
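
A sketch of how those counts can be tallied; the data path below is an assumption 
and depends on path.data and the cluster name:

import os
import requests

# _cat/shards prints one line per shard, primaries and replicas included.
shard_lines = requests.get("http://localhost:9200/_cat/shards").text.splitlines()
print("shards:", len(shard_lines))

# Count files under the indices folder (hypothetical default Debian layout).
indices_dir = "/var/lib/elasticsearch/elasticsearch/nodes/0/indices"
file_count = sum(len(files) for _, _, files in os.walk(indices_dir))
print("files:", file_count)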

In the recovery API, everything is "done" with strange percent scores, and all 
shards show a "replica" state.
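
A sketch of dumping the per-shard recovery state (same localhost:9200 and 
requests assumptions); the percent figures sit under each shard's "index" 
section:

import requests

resp = requests.get("http://localhost:9200/_recovery")
resp.raise_for_status()

for index, data in sorted(resp.json().items()):
    for shard in data["shards"]:
        # Each entry carries the recovery type ("replica", "gateway", ...)
        # and a stage that ends up as "done".
        print(index, shard["id"], shard["type"], shard["stage"])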

Now the load comes in heavy waves, slowing the service.

Is this just a long migration between different Lucene versions (from ES 1.1 to 
1.3), a misconfiguration, a real bug, or am I just doomed?
