Which ES version are you using?  You should use the latest (soon to be
1.3): there have been a number of bulk-indexing improvements recently.

Are you using the bulk API with multiple/async client threads?  Are you
saturating either CPU or IO in your cluster (so that the test is really a
full cluster capacity test)?
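
For illustration, here is a minimal Python sketch of what "bulk API with multiple client threads" could look like. It is not taken from the original poster's setup: the host URL, the `doc_type` value, the batch size, and the worker count are all assumptions, and the helper names (`bulk_body`, `chunks`, `send_bulk`, `index_parallel`) are hypothetical. It uses only the standard library and the documented `_bulk` newline-delimited JSON format.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen


def bulk_body(index, doc_type, docs):
    """Serialize docs as newline-delimited JSON for the _bulk endpoint."""
    lines = []
    for doc in docs:
        # Action line first, then the document source on the next line.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return ("\n".join(lines) + "\n").encode("utf-8")


def chunks(docs, size):
    """Split a list of docs into batches of at most `size`."""
    return [docs[i:i + size] for i in range(0, len(docs), size)]


def send_bulk(host, body):
    """POST one bulk request and return the parsed JSON response."""
    req = Request(host + "/_bulk", data=body, method="POST",
                  headers={"Content-Type": "application/x-ndjson"})
    with urlopen(req) as resp:
        return json.load(resp)


def index_parallel(host, index, doc_type, docs, batch_size=1000, workers=4):
    """Send batches concurrently from several client threads (assumed sizes)."""
    bodies = [bulk_body(index, doc_type, b) for b in chunks(docs, batch_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda b: send_bulk(host, b), bodies))
```

If neither CPU nor IO on the cluster is saturated while something like `index_parallel("http://localhost:9200", "my_index", "doc", docs)` runs, raising `workers` or `batch_size` is the first thing to try; the right values depend on the hardware and document size.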

Also, the relationship between refresh_interval and indexing performance is
tricky: it turns out that -1 is often a poor choice, because it means your
bulk indexing threads are sometimes tied up flushing segments, whereas with
refreshing enabled a separate thread does that.  So a refresh_interval of
5s may be a good choice.
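
As a sketch of how the setting could be changed on a live index, the following Python builds a `PUT /<index>/_settings` request with the standard library. The host URL is an assumption and the function names are hypothetical; only the `index.refresh_interval` setting itself comes from this thread.

```python
import json
from urllib.request import Request, urlopen


def refresh_settings_request(host, index, interval):
    """Build a PUT /<index>/_settings request that changes refresh_interval."""
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return Request(f"{host}/{index}/_settings", data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


def set_refresh_interval(host, index, interval):
    """Send the request and return the cluster's parsed JSON response."""
    with urlopen(refresh_settings_request(host, index, interval)) as resp:
        return json.load(resp)
```

For example, `set_refresh_interval("http://localhost:9200", "smt_20140501_100000_20g_norefresh", "5s")` would apply the 5s value suggested above, assuming a node is reachable on localhost.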

Mike McCandless

http://blog.mikemccandless.com


On Wed, Jul 16, 2014 at 6:51 AM, Marek Dabrowski <[email protected]>
wrote:

> Hello
>
> My configuration is:
> 6 nodes Elasticsearch cluster
> OS: Centos 6.5
> JVM: 1.7.0_25
>
> The cluster is working fine. I can index data, query, etc. Now I'm running
> a test on a package of ~50 million docs (~13GB). I would like to get better
> indexing performance. To that end I changed the refresh_interval
> parameter. I tested 1s, -1 and 600s. The indexing time is the same in all
> three cases. I checked the index configuration (_settings) and the value
> of refresh_interval is correct, e.g.:
>
> {
>   "smt_20140501_100000_20g_norefresh" : {
>     "settings" : {
>       "index" : {
>         "uuid" : "q3imiZGQTDasQUuMWS8oiw",
>         "number_of_replicas" : "1",
>         "number_of_shards" : "6",
>         "refresh_interval" : "600s",
>         "version" : {
>           "created" : "1020199"
>         }
>       }
>     }
>   }
> }
>
>
>
> Creating the index, setting refresh_interval and loading the data are all
> done on the same cluster node. Before each test the index is deleted and
> created again with the new value of refresh_interval. All cluster nodes
> log that the parameter has been changed, e.g.:
> [2014-07-16 11:24:09,813][INFO ][index.shard.service      ] [h6]
> [smt_20140501_100000_20g_norefresh][1] updating refresh_interval from [1s]
> to [-1]
> or
> [2014-07-16 11:32:32,928][INFO ][index.shard.service      ] [h6]
> [smt_20140501_100000_20g_norefresh][1] updating refresh_interval from [1s]
> to [10m]
>
> After the test starts, new data is available immediately and the indexing
> time is the same in all 3 cases. I don't know where the problem is. Does
> anybody know what is going on?
>
> Regards
> Marek
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/f7565c36-98c7-4e3e-8132-796f9edfb3fa%40googlegroups.com
> <https://groups.google.com/d/msgid/elasticsearch/f7565c36-98c7-4e3e-8132-796f9edfb3fa%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>
