Which ES version are you using? You should use the latest (soon to be 1.3): there have been a number of bulk-indexing improvements recently.
Are you using the bulk API with multiple/async client threads? Are you saturating either CPU or IO in your cluster (so that the test is really a full cluster capacity test)?

Also, the relationship between refresh_interval and indexing performance is tricky: it turns out that -1 is often a poor choice, because your bulk indexing threads are then sometimes tied up flushing segments, whereas with refreshing enabled a separate thread does that. So a refresh_interval of 5s is maybe a good choice.

Mike McCandless
http://blog.mikemccandless.com

On Wed, Jul 16, 2014 at 6:51 AM, Marek Dabrowski <[email protected]> wrote:
> Hello
>
> My configuration is:
> 6 nodes Elasticsearch cluster
> OS: Centos 6.5
> JVM: 1.7.0_25
>
> The cluster is working fine. I can index data, query, etc. Now I'm running a
> test on a package of about ~50 million docs (~13GB). I would like to get
> better performance when indexing data. To that end I changed the
> refresh_interval parameter. I ran the test with 1s, -1 and 600s. The indexing
> time is the same. I checked the configuration (_settings) for the index and
> the value of refresh_interval is ok (it has the proper value), e.g.:
>
> {
>   "smt_20140501_100000_20g_norefresh" : {
>     "settings" : {
>       "index" : {
>         "uuid" : "q3imiZGQTDasQUuMWS8oiw",
>         "number_of_replicas" : "1",
>         "number_of_shards" : "6",
>         "refresh_interval" : "600s",
>         "version" : {
>           "created" : "1020199"
>         }
>       }
>     }
>   }
> }
>
> Creating the index, setting refresh_interval and loading the data are all
> done on the same cluster node. Before each test the index is deleted and
> created again, and the new test starts with a new value of refresh_interval.
> All cluster nodes log that the parameter has been changed, e.g.:
> [2014-07-16 11:24:09,813][INFO ][index.shard.service ] [h6] [smt_20140501_100000_20g_norefresh][1] updating refresh_interval from [1s] to [-1]
> or
> [2014-07-16 11:32:32,928][INFO ][index.shard.service ] [h6] [smt_20140501_100000_20g_norefresh][1] updating refresh_interval from [1s] to [10m]
>
> After the test starts, new data is available immediately and the indexing
> time is the same in all 3 cases. I don't know where the problem is. Does
> anybody know what is going on?
>
> Regards
> Marek
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/f7565c36-98c7-4e3e-8132-796f9edfb3fa%40googlegroups.com
> For more options, visit https://groups.google.com/d/optout.
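For anyone reading this thread later, the two suggestions in the reply (bulk requests from several client threads, and a refresh_interval such as 5s instead of -1) might be sketched in Python roughly as below. This is only an illustration and not code from the thread. It builds the JSON body for PUT /<index>/_settings and the newline-delimited body for POST /_bulk, and leaves the actual HTTP call to a caller-supplied `send` function. The names `send`, `index_concurrently`, `bulk_body`, `chunks` and `refresh_settings_body`, and the batch size and worker count, are all assumptions made for the example.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def refresh_settings_body(interval):
    """JSON body for PUT /<index>/_settings that sets refresh_interval."""
    return json.dumps({"index": {"refresh_interval": interval}})


def bulk_body(index, doc_type, docs):
    """Newline-delimited body for POST /_bulk: an action line, then a source line, per doc."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline


def chunks(docs, size):
    """Split a list of docs into batches of at most `size`."""
    return [docs[i:i + size] for i in range(0, len(docs), size)]


def index_concurrently(send, index, doc_type, docs, batch_size=1000, workers=4):
    """Send bulk batches from several client threads.

    `send` is a hypothetical caller-supplied function that POSTs one body
    to /_bulk and returns the response; it is not part of any library.
    """
    bodies = [bulk_body(index, doc_type, batch) for batch in chunks(docs, batch_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the batches
        return list(pool.map(send, bodies))
```

A load would then set the index to `refresh_settings_body("5s")` before calling `index_concurrently(...)`, and the batch size and number of workers would be raised until CPU or IO on the cluster is saturated, which is the condition the reply asks about.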
