Hi BKeep, just a few performance tips:
- Consider using 3 replicas if you can spare the disk space. Each node can then serve search requests itself, so search speed improves a lot. That is what replicas are primarily meant for, with backup as a second advantage: https://www.elastic.co/guide/en/elasticsearch/guide/current/replica-shards.html

- Consider adding a dedicated master-only node (no data) to Elasticsearch. This node will be responsible only for managing the cluster.

- Since you are on CentOS: in /etc/grub.conf, add "cgroup_disable=memory" to the kernel line for better and full memory performance. Because you are running ES in a virtual environment, you can also change the disk scheduler by adding "elevator=noop", since your external storage is already doing that scheduling for you.

- In /etc/fstab: disable tmpfs (ramdisk) by putting a # in front of that line. At the very least, change the mount options for the volume holding /var to something like:
  /dev/mapper/vg_nagios-lv_root / ext4 defaults,noatime,nodiratime,nobarrier,data=writeback,journal_ioprio=4 1 1

- Take a look at this best-practice guide for your virtual environment: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2053145

- Disable swapping by putting "vm.swappiness = 0" at the end of /etc/sysctl.conf.

- Did you implement this (raising the ulimit for open files)? http://ithubinfo.blogspot.nl/2013/07/how-to-increase-ulimit-open-file-and.html

- You could also gain some performance with this, but you should weigh it against the amount of free memory you have: https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/

- I like using ElasticHQ for a nice view of my cluster: http://www.elastichq.org/

Like you, I would want to know whether it is better to increase the record count or the number of indices. That might depend on how far back in history your searches usually go (wild guess). I would increase the number of indices.

A.
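The replica and dedicated-master tips can be sketched as elasticsearch.yml fragments. This is a sketch for ES 1.x setting names; the node name is a hypothetical placeholder, and number_of_replicas: 3 reflects the "3 replicas" suggestion above, not the poster's current one-replica setup:

```yaml
# elasticsearch.yml on the dedicated master node (sketch, ES 1.x)
node.name: "gl-master-1"   # hypothetical name
node.master: true          # eligible for master election
node.data: false           # stores no shards, only manages cluster state

# elasticsearch.yml on each of the four data nodes
node.master: false
node.data: true

# default replica count for newly created indices ("3 replicas" tip)
index.number_of_replicas: 3
```

With 4 data nodes and 3 replicas, every node holds a copy of every shard, which is what lets any node answer a search locally.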
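The swappiness tip plus the dirty-ratio tuning from the lonesysadmin link can be collected into one sysctl fragment. A sketch: the dirty-ratio values and the file path are my assumptions, not from the thread, and /tmp is used only so the sketch runs unprivileged (on a real box this would go under /etc/sysctl.d/ and be loaded with `sysctl -p`):

```shell
# Sketch: write the suggested VM tunings to a sysctl fragment.
conf=/tmp/99-es-tuning.conf
cat > "$conf" <<'EOF'
vm.swappiness = 0
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
EOF
# sanity check: three vm.* settings written
grep -c '^vm\.' "$conf"
```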
On Friday, August 14, 2015 at 4:52:37 AM UTC+2, BKeep wrote:
> This question may be better answered on the Elasticsearch forum, but I thought I would give the GL list a try first. I recently added two additional nodes to a working cluster and would like some help/ideas on tuning for optimized performance and growth. My environment has 4 data nodes, each spec'd out with 4 vCPUs, 12GB of RAM (ES heap is at 6GB), and 250GB of storage (207GB on /var), running CentOS v6.7. Graylog is at v1.1.6, ES at v1.6.2, and OpenJDK is 1.8. I am also using the stock settings for 20 indices with 20 million records each. I have set 4 shards with one replica. The master node runs ES, GL, and GL web using the same specs, except it only has 120GB of storage instead of 250GB. All nodes are thick-provisioned VMDKs on a VMware cluster. Right now, at our current sending rate, I see indices rotate about every 4-12 hours, and shards are generally between 1.5GB and 2GB in size. The total used storage on the data nodes is ~73GB, with ~124GB available.
>
> Okay, so finally to my question. I would like to increase either the number of indices or the number of records per index. Is one method preferred over the other? If the record count increases from 20 million to 30 million, would that increase or decrease index/search performance, or should the index limit be set to 30 indices? Basically, which method would allow for increased historical data retention with the least overhead, if that makes sense.
>
> Regards,
> Brandon

--
You received this message because you are subscribed to the Google Groups "Graylog Users" group.
