Hi Chris,

Did you get better TPS after making changes? Can you share your experience? If so, which parameters did you tweak to get linear scaling of ES?
We are in much the same situation: we aren't able to get better write TPS by adding more nodes.

Thanks in advance for your help.

Pranav

On Wednesday, December 4, 2013 12:54:23 PM UTC-5, Christopher Jones wrote:
>
> Dear Folks:
>
> I am running some scaling tests on Elasticsearch. We are considering
> using Elasticsearch to store a large volume of log file data. A key metric
> for us is the write throughput, and we've been trying to understand how
> Elasticsearch can scale. We were hoping to see some form of linear scaling:
> add new nodes, increase the shard count, and so on, in order to handle an
> increased number of writes. No matter what lever we pull, we're not seeing
> an increase in write throughput. Clearly we're missing something.
>
> We have tested several different clusters on AWS (details below). We have
> doubled the number of nodes; we have doubled (and doubled again) the number
> of shards. We have doubled the number of servers feeding the cluster log
> file data. We have put the nodes in the cluster behind an AWS load
> balancer. We have increased the number of open files allowed by Ubuntu.
> None of these changes has had any effect on the number of records indexed
> per second.
>
> Details:
>
> We are piping JSON data through fluentd, using the fluentd Elasticsearch
> plugin, and storing the log entries in the Logstash format that the plugin
> supports. Note that we are not using Logstash itself as a data pipeline.
>
> We configured the cluster with only 1 replica, the default.
>
> We increased the ulimit for open files to 32000.
>
> Elasticsearch configuration:
>
>     discovery.type: ec2
>
>     cloud:
>       aws:
>         region: us-west-2
>         groups: elastic-search
>
>     cloud.node.auto_attributes: true
>
>     index.number_of_shards: 40   # we varied this between 10 and 40
>     index.refresh_interval: 60s
>
>     bootstrap.mlockall: true
>
>     discovery.zen.ping.multicast.enabled: false
>
> Here's how we set up the memory configuration in setup.sh:
>
>     #!/bin/sh
>     export ES_MIN_MEM=16g
>     export ES_MAX_MEM=30g
>     bin/elasticsearch -f
>
> Each node in the cluster is m1.xlarge, which has the following
> characteristics:
>
>     Instance Family:          General purpose
>     Instance Type:            m1.xlarge
>     Processor Arch:           64-bit
>     vCPU:                     4
>     ECU:                      8
>     Memory (GiB):             15
>     Instance Storage (GB):    4 x 420
>     EBS-optimized Available:  Yes
>     Network Performance:      High
>
> We've monitored the cluster using http://www.elastichq.org. For each
> index, we calculated (Indexing Total)/(Indexing Time) to get the number of
> records indexed per second. Whether we have a single node with 10 shards or
> 6 nodes with 40 shards, we consistently see indexing occurring at a rate of
> around 3000-3500 records/second. It is this measure that we never seem to
> be able to increase.
>
> Here's some characteristic stats for an individual node:
>
>     Heap Used:           1.1gb
>     Heap Committed:      7.9gb
>     Non Heap Used:       42.3mb
>     Non Heap Committed:  66.2mb
>     JVM Uptime:          4h
>     Thread Count/Peak:   62 / 82
>     GC Count:            8156
>     GC Time:             4.8m
>     Java Version:        1.7.0_45
>     JVM Vendor:          Oracle Corporation
>     JVM:                 Java HotSpot(TM) 64-Bit Server VM
>
> We have not seen memory contention or, in general, high CPU utilization
> (CPU usage is around 80% out of a possible 400%). We're not doing any
> reading of the database (that would be a subsequent test).
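For anyone reproducing the rate calculation quoted above without ElasticHQ, the same numbers can be read straight from the _stats API. A minimal sketch, assuming Python 3, a node answering on localhost:9200, and the _all/primaries field layout of the 1.x stats format:

    import json
    import urllib.request

    # Fetch cumulative indexing stats for the whole cluster.
    with urllib.request.urlopen("http://localhost:9200/_stats") as resp:
        stats = json.load(resp)

    indexing = stats["_all"]["primaries"]["indexing"]
    total = indexing["index_total"]            # documents indexed so far
    millis = indexing["index_time_in_millis"]  # cumulative indexing time

    # Same ratio ElasticHQ reports: (Indexing Total) / (Indexing Time)
    print("%.0f records/sec" % (total / (max(millis, 1) / 1000.0)))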
> Here's some more stats:
>
>     Open File Descriptors:  795
>     CPU Usage:              68% of 400%
>     CPU System:             1.5m
>     CPU User:               26.6m
>     CPU Total:              28.2m
>     Resident Memory:        8.4gb
>     Shared Memory:          19.6mb
>     Total Virtual Memory:   10.3gb
>
> Thanks for any help.
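One way to tell whether the ceiling is in the cluster or in the ingest path is to take fluentd out of the loop and drive the bulk API directly. A rough probe along those lines is sketched below; the host, index name, batch size, and document shape are all hypothetical (none are from the original post), and the "_type" field matches the 0.90/1.x-era API in use at the time of this thread:

    import json
    import time
    import urllib.request

    # Build one _bulk request: alternating action and source lines,
    # newline-delimited, with a trailing newline as the API requires.
    BATCH = 5000
    action = json.dumps({"index": {"_index": "tps-test", "_type": "log"}})
    source = json.dumps({"message": "sample log line", "level": "INFO"})
    body = ("\n".join([action, source] * BATCH) + "\n").encode("utf-8")

    req = urllib.request.Request(
        "http://localhost:9200/_bulk",
        data=body,
        headers={"Content-Type": "application/x-ndjson"},
    )

    start = time.time()
    urllib.request.urlopen(req).read()
    elapsed = time.time() - start
    print("%d docs in %.2fs -> %.0f docs/sec" % (BATCH, elapsed, BATCH / elapsed))

If the raw bulk rate rises as nodes are added while the fluentd-fed rate stays flat at around 3000-3500 records/second, the bottleneck is upstream of Elasticsearch rather than in the cluster itself.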
