You should consider whether it is possible to install the latest ES (currently 1.0.0.RC1) and the latest JVM.
If you use 4 nodes, you should consider 4 shards per index by default, so resources are balanced across every index.

If you meant to tune the bulk indexing thread pool, you did not actually set anything special for it: those settings live in threadpool.bulk, not threadpool.index (I don't know whether your Logstash uses the bulk or the index API).

indices.memory.index_buffer_size is adjusted automatically; there is no need to cap it at 50%. The same goes for index.translog.flush_threshold_ops; I wonder why you adjusted that value.

By moving the search pool size away from the number of CPU cores, you reduce the automatic scaling of search in your cluster, which is bad. Using 20 instead of 18 (3 * 6 cores is the default) does not make much difference per se, but reducing the queue size from 1000 to 100 will make your search load bail out early and often.

Your heap size is very large (30g), so be prepared to take additional measures to tackle GC challenges. You should also think about dedicated master nodes if you want to drive large heaps with expected high GC on the data nodes.

The indexing load is distributed automatically, so there is nothing to configure for that in Logstash. But you should consider setting up Logstash so that it can index to more than one node, just for more resiliency. See the sketch below for the dedicated-master layout.

Jörg

On Tue, Jan 28, 2014 at 9:51 AM, Luca Belluccini <[email protected]> wrote:

> Hello,
> I am putting in place an ES cluster with 4 nodes (6 cores + 48GB RAM each).
> The aim is to use Kibana as a data analysis tool.
> I set up Logstash to feed ES and use the following:
>
> - https://gist.github.com/lucabelluccini/7563998 for index templates
> - Some tweaks to elasticsearch.yml:
>   - indices.memory.index_buffer_size: 50%
>   - index.translog.flush_threshold_ops: 50000
>   - index.number_of_shards: 3
>   - threadpool.search.type: fixed
>   - threadpool.search.size: 20
>   - threadpool.search.queue_size: 100
>   - threadpool.index.type: fixed
>   - threadpool.index.size: 60
>   - threadpool.index.queue_size: 200
>   - node.master: true
>   - node.data: true
>   - ES_HEAP_SIZE=30g
>
> Logstash is sending to one of the hosts, and I wanted to ask whether indexing is automatically distributed over all the nodes, or whether something has to be set up to exploit the processing power of all 4 nodes.
>
> Thanks in advance,
> Luca B.
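P.S. To make the dedicated-master idea concrete, here is a rough elasticsearch.yml sketch using the ES 1.x setting names mentioned above. Treat it as a starting point under the assumptions of this thread (4 data nodes, default thread pools), not a tested configuration; the 4-shard default and the small master heap are illustrative values, not recommendations from the list.

    # elasticsearch.yml on the dedicated master nodes
    # (small heap, e.g. ES_HEAP_SIZE=2g, so GC stays short; no data, no query load)
    node.master: true
    node.data: false

    # elasticsearch.yml on the 4 data nodes
    # (large heap, they carry the indexing and search work)
    node.master: false
    node.data: true

    # one primary shard per data node for new indices, so every node gets a balanced share
    index.number_of_shards: 4

    # leave threadpool.* and indices.memory.index_buffer_size unset so the
    # built-in defaults (sized from the CPU count) apply

For resiliency on the Logstash side, the same idea applies: point its elasticsearch output at more than one of the data nodes (the exact option name depends on the Logstash version you run), so losing the single node it currently targets does not stop indexing.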
