Hi BKeep

Some performance tips:

Consider using 3 replicas if you can spare the disk space. Each node will then be 
able to serve search requests, so search performance improves considerably. That 
is what replicas are primarily meant for, with redundancy as a second advantage:
https://www.elastic.co/guide/en/elasticsearch/guide/current/replica-shards.html

Consider using a dedicated master-only node (no data) for Elasticsearch. This 
node will be responsible for managing the cluster.
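A minimal sketch of what that looks like in elasticsearch.yml on the dedicated 
node (these are the standard node role settings in Elasticsearch 1.x; adjust to 
your setup):

```
# elasticsearch.yml on the dedicated master node:
# eligible to be elected master, but holds no shards
node.master: true
node.data: false
```

On the four data nodes you would set node.master: false and node.data: true so 
cluster management and data serving stay separated.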

Since you are running CentOS, the following applies:

/etc/grub.conf:
Add "cgroup_disable=memory" to the kernel line for better and full memory 
performance.
Because you are running ES in a virtual environment, you could also change the 
disk scheduler by adding "elevator=noop", since your external storage is already 
managing I/O scheduling.
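As a rough illustration only (the kernel version, paths and volume names below 
are made up and will differ on your system), the kernel line in /etc/grub.conf 
would end up looking something like:

```
# example kernel line with both parameters appended
kernel /vmlinuz-2.6.32-573.el6.x86_64 ro root=/dev/mapper/vg_root-lv_root cgroup_disable=memory elevator=noop
```

Both parameters take effect on the next reboot.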

/etc/fstab:
Disable tmpfs (ramdisk) by putting a # in front of its line.
At least change the mount options for the volume holding /var to something 
like:
  /dev/mapper/vg_nagios-lv_root /  ext4  defaults,noatime,nodiratime,nobarrier,data=writeback,journal_ioprio=4  1 1

Take a look at this best-practice document for your virtual environment:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2053145

Keep the kernel from swapping by putting "vm.swappiness = 0" at the end of /etc/sysctl.conf.
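For example, as root (sysctl -p re-reads /etc/sysctl.conf so you do not need a 
reboot):

```
# append the setting and load it immediately
echo "vm.swappiness = 0" >> /etc/sysctl.conf
sysctl -p
```

Note that swappiness = 0 tells the kernel to avoid swapping as much as 
possible; it is not a hard guarantee that nothing is ever swapped out.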

Did you implement this?
http://ithubinfo.blogspot.nl/2013/07/how-to-increase-ulimit-open-file-and.html
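If not: the usual approach on CentOS 6 is to raise the open-file limit for the 
user that runs Elasticsearch in /etc/security/limits.conf. The user name and 
values below are just an example, not a recommendation for your boxes:

```
# /etc/security/limits.conf -- example entries, adjust user and values
elasticsearch soft nofile 65535
elasticsearch hard nofile 65535
```

Log the user out and back in (or restart the service) for the new limits to 
apply; you can check with "ulimit -n".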

You could also gain some performance with this, but you should consider the 
amount of free memory you have:
https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/
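That article boils down to tuning the page-cache writeback thresholds in 
/etc/sysctl.conf. The values below are only the kind of example it discusses, 
not a recommendation for your hardware:

```
# start background writeback earlier and cap dirty pages lower,
# so flushes happen in smaller, more frequent bursts
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
```

Lower ratios trade peak write throughput for smoother, more predictable I/O.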


I like using ElasticHQ for a nice view of my cluster.
http://www.elastichq.org/

Like you, I would want to know which is better: increasing the record count per 
index or the number of indices.
This might depend on how far back in history you usually search (wild guess). 
I would increase the number of indices.


A.

On Friday, August 14, 2015 at 4:52:37 AM UTC+2, BKeep wrote:
>
> This question may be better answered on the Elasticsearch forum but I 
> thought I would give the GL list a try first. I recently added two 
> additional nodes to a working cluster and would like some help/ideas on 
> tuning for optimized performance and growth. My environment has 4 data 
> nodes each spec'd out with 4 vCPU's, 12GB of Ram (ES HEAP is at 6GB), 250GB 
> of storage (207GB on /var) running CentOS v6.7. Graylog is at v1.1.6, ES at 
> v1.6.2 and openjdk 1.8. I am also using the stock settings for 20 indices 
> with 20 Million records each. I have set 4 shards with one replica. The 
> master node runs ES, GL, and GL web using the same specs, except instead of 
> 250GB of storage, it only has 120GB. All nodes are thick provisioned VMDK's 
> on a VMware cluster. Right now with our current sending rate, I see indices 
> rotate about every 4-12 hours and generally shards have a size between 
> 1.5GB's to 2GB's. The total used storage on the data nodes is ~73GB used 
> with ~124GB available. 
>
> Okay, so finally to my question. I would like to increase either the 
> number of indices or increase the number of records per index. Is one 
> method preferred over the other? If the records count increases from 20 
> Million to 30 Million, would that increase/decrease index/search 
> performance or should the index limit be set to 30 indices. Basically, 
> which method would allow for increased historical data retention with the 
> least overhead if that makes sense.
>
> Regards,
> Brandon
>

-- 
You received this message because you are subscribed to the Google Groups 
"Graylog Users" group.