Setup:

4 nodes
Replication = 0
ES_HEAP_SIZE = 75GB
Number of indices = 59 (using logstash, one index per month)
Total shards = 234 (each index has 4 shards, one per node)
Total docs = 7.4 billion
Total size = 4.7TB

When I add a new file, which I do using logstash on all four nodes, indexing is throttled immediately. For instance:

[2014-09-18 09:41:42,326][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
[2014-09-18 09:41:45,267][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] now throttling indexing: numMergesInFlight=6, maxNumMerges=5
[2014-09-18 09:41:45,303][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
[2014-09-18 09:41:51,273][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] now throttling indexing: numMergesInFlight=6, maxNumMerges=5
[2014-09-18 09:41:51,379][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] stop throttling indexing: numMergesInFlight=4, maxNumMerges=5
[2014-09-18 09:42:06,429][INFO ][index.engine.internal ] [hdp13] [logstash-2014.09][2] now t

Where should I be looking to tune indexing performance? The query load on the cluster is very low, as it is a research cluster, so I would happily sacrifice query performance for indexing throughput.

The 4 nodes all run logstash, listening on various ports. I use netcat to 'feed' the data to the 4 nodes from a hadoop cluster:

hadoop1 netcat -------->
hadoop2 netcat --------> ES1
hadoop3 netcat -------->

And so on.

Each ES node has 24 disks, but I am only using one at the moment. This is an obvious IO bottleneck, but I am unclear how to use all the disks. If I add more disks, will ES share the data between them all? E.g.:

/mnt/disk1
/mnt/disk2
etc.

Thanks
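
P.S. To put a number on how long each shard spends throttled, log lines like the ones quoted above could be parsed with something along these lines. This is only a rough sketch: the regular expression is inferred from the handful of lines shown here, and the function name throttle_durations is my own (hypothetical), not something Elasticsearch or logstash provides.

```python
import re
from datetime import datetime

# Pattern inferred from the sample log lines in this post (an assumption,
# not an official log format specification).
LINE_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]"
    r".*?\[(?P<index>[^\]\[]+)\]\[(?P<shard>\d+)\]\s+"
    r"(?P<event>now|stop) throttling indexing"
)


def throttle_durations(lines):
    """Pair each 'now throttling' line with the next 'stop throttling'
    line for the same index and shard.

    Returns a list of (index, shard, seconds) tuples. Lines that do not
    match the pattern (including truncated ones) are skipped, as are
    'stop' lines with no preceding 'now' line.
    """
    started = {}   # (index, shard) -> datetime when throttling began
    durations = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        key = (match.group("index").strip(), int(match.group("shard")))
        if match.group("event") == "now":
            started[key] = ts
        else:
            begin = started.pop(key, None)
            if begin is not None:
                seconds = (ts - begin).total_seconds()
                durations.append((key[0], key[1], seconds))
    return durations
```

Feeding it the node's log file (for example, the lines of an open file object) gives one entry per completed throttling window, which can then be summed per shard.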
