Hi Arie,

Thanks for the good suggestions. I do have a dedicated ES master, which 
lives on the same server as my GL master and GL web. I did some testing 
and will make headway on trying all of the changes you mention to see which 
ones have the greatest impact. So far, I added the kernel parameter 
cgroup_disable=memory, set swappiness to 0, and disabled the tmpfs. No real 
change there, but when I was looking into the noop elevator setting, I ran 
across some info on the different I/O schedulers and did some 
reading. I also came across the RHEL performance tuning guide for 
tuned-adm. On CentOS 6, tuned has to be installed; on CentOS 7, it is there 
by default. 
http://servicesblog.redhat.com/2012/04/16/tuning-your-system-with-tuned/ 
and 
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Performance_Tuning_Guide/sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-Performance_Monitoring_Tools-tuned_and_tuned_adm.html
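For reference, here is roughly how I applied and verified those changes -- a sketch, assuming the data disk shows up as sda and the stock GRUB-legacy layout on CentOS 6 (device names will vary):

```shell
# Show which I/O scheduler is active for the disk (the one in brackets)
cat /sys/block/sda/queue/scheduler

# Set swappiness immediately, then persist it across reboots
sysctl -w vm.swappiness=0
echo 'vm.swappiness = 0' >> /etc/sysctl.conf

# cgroup_disable=memory is added to the kernel line in /etc/grub.conf, e.g.:
#   kernel /vmlinuz-2.6.32-573.el6.x86_64 ro root=/dev/mapper/... cgroup_disable=memory
```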

Anyway, I started off with the default elevator in CentOS 6, which is [cfq], 
and changed it to [noop]. This had a negative impact on system performance. 
While my testing is not as scientific as it could be, I noticed a 
difference after switching to the [deadline] elevator. I used Elastic HQ to 
monitor node stats before and after a cluster restart for each elevator 
change, and I let the cluster simmer for a few hours to get a good idea of 
the "idle" workload. Another point in favor of the [deadline] scheduler: when 
you apply the virtual-guest profile using tuned, [deadline] is what it 
defaults to, and it is also the default on CentOS 7 boxes using the 
virtual-guest profile. I am guessing Red Hat put some thought into the 
default settings for a virtualized guest profile and recommended [deadline] 
for a reason. Of course, everyone's mileage may vary depending on other 
environmental factors. 
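For anyone wanting to try the same profile, applying it is straightforward -- a sketch, with the note that on CentOS 6 tuned must be installed first (and again assuming the disk is sda):

```shell
# CentOS 6: install and enable tuned, then apply the virtual-guest profile,
# which among other things selects the deadline elevator
yum install -y tuned
service tuned start && chkconfig tuned on
tuned-adm profile virtual-guest

# Confirm the active profile and the resulting scheduler
tuned-adm active
cat /sys/block/sda/queue/scheduler
```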

The VMware Paravirtual adapter is what is used, which may make a 
difference compared to other virtual disk subsystems. The VMware 
environment is ESXi 5.5.0 build 2718055 running on Cisco UCS B230-BASE-M2's 
with Xeon E7-2830 @ 2.13GHz CPUs, VM hardware v10, and VMware Tools 
v9356, connected to an EMC5400 SAN backend. 

I am going to give the cluster a day or two to run with the current changes 
and then apply the noatime option. I am confident that will make an impact 
on performance, which is why I left it until last to isolate the effect of 
the other changes. An interesting side note: kernels since 2.6.30 
default to relatime, which is slightly better than updating the atime on 
files after every read operation, but still not optimal. 
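When I get to it, the change itself is just a mount option -- something like this (the device and mount point here are hypothetical; adjust for the real /var volume):

```shell
# /etc/fstab entry with atime updates disabled (hypothetical device name):
#   /dev/mapper/vg_data-lv_var  /var  ext4  defaults,noatime,nodiratime  1 2

# Apply without a reboot, then verify noatime shows up in the options
mount -o remount,noatime /var
mount | grep ' /var '
```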
http://www.digitalinternals.com/unix/linux-io-performance-tuning-noatime-nodiratime-relatime/388/

Overall, indexing is very fast, at under 0.50 ms. Search is slower, 
typically around 100 ms after changing to the [deadline] elevator. The 
number dropped to under 40 ms for the data nodes at "idle", but when doing 
complex searches the numbers spiked to over 200 ms -- still better than 
the [noop] scheduler, where similar searches went over 800 ms. One thing I 
am curious about is reducing the number of records to 10 million and 
increasing the index count from 20 to 40. It may also be worth reducing the 
number of shards and increasing the number of replicas. Throwing more 
resources at the problem isn't exactly where I want to go at this point, 
since the Graylog cluster is already the second-highest consumer of VM 
cluster resources, right behind our EMR system. 
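As a quick sanity check on the retention math: halving the records per index while doubling the index count keeps the total retained messages constant, so the tradeoff is really about shard size and search fan-out, not capacity.

```shell
# Total retained records = max indices x records per index
current=$((20 * 20000000))   # 20 indices x 20M records = 400000000
proposed=$((40 * 10000000))  # 40 indices x 10M records = 400000000
echo "current=$current proposed=$proposed"
[ "$current" -eq "$proposed" ] && echo "same total retention"
```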

Regards,
Brandon 

On Wednesday, August 19, 2015 at 8:19:02 AM UTC-5, Arie wrote:
>
> Hi BKeep
>
> Just for performance, some tips:
>
> Consider using 3 replicas if you can handle the space. Each node will then 
> be capable of handling a search request, so you will see a big speed 
> improvement on search. That is what replicas are meant for, with backup 
> as a second advantage.
>
> https://www.elastic.co/guide/en/elasticsearch/guide/current/replica-shards.html
>
> Consider using a master-only node with no data for Elasticsearch. This 
> node will be responsible for managing the cluster.
>
> You are using CentOS, so the following applies to it.
>
> /etc/grub.conf:
> Add "cgroup_disable=memory" to the kernel line for better and full memory 
> performance. Because you are running ES in a virtual environment, you 
> could also change the disk scheduler by adding "elevator=noop", since 
> your external disk array is managing this.
>
> /etc/fstab:
> Disable tmpfs (ramdisk) -- just put a # in front of the line.
> At least change the mount options for the volume with /var into something 
> like:
>   /dev/mapper/vg_nagios-lv_root /                       ext4    
> defaults,noatime,nodiratime,nobarrier,data=writeback,journal_ioprio=4        
> 1 1
>
> Take a look at this best practice for your virtual environment:
>
> http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2053145
>
> Disable swapping by putting "vm.swappiness = 0" at the end of /etc/sysctl.conf
>
> Did you implement this?
>
> http://ithubinfo.blogspot.nl/2013/07/how-to-increase-ulimit-open-file-and.html
>
> You could also gain some performance with this, but you should consider 
> the amount of free memory you have:
>
> https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/
>
>
> I like using Elastic HQ for a nice look on my cluster. 
> http://www.elastichq.org/
>
> Like you, I would want to know whether it is better to increase the record 
> count or the number of indexes. This might depend on the history you 
> generally search over (wild guess). I would increase the number of indexes.
>
>
> A.
>
> On Friday, August 14, 2015 at 4:52:37 AM UTC+2, BKeep wrote:
>>
>> This question may be better answered on the Elasticsearch forum but I 
>> thought I would give the GL list a try first. I recently added two 
>> additional nodes to a working cluster and would like some help/ideas on 
>> tuning for optimized performance and growth. My environment has 4 data 
>> nodes, each spec'd out with 4 vCPUs, 12GB of RAM (ES heap at 6GB), and 
>> 250GB of storage (207GB on /var), running CentOS v6.7. Graylog is at 
>> v1.1.6, ES at v1.6.2, and OpenJDK at 1.8. I am also using the stock 
>> settings of 20 indices with 20 million records each, and I have set 4 
>> shards with one replica. The master node runs ES, GL, and GL web using 
>> the same specs, except with 120GB of storage instead of 250GB. All nodes 
>> are thick-provisioned VMDKs on a VMware cluster. At our current sending 
>> rate, I see indices rotate about every 4-12 hours, and shards generally 
>> run between 1.5GB and 2GB. The total used storage on the data nodes is 
>> ~73GB, with ~124GB available. 
>>
>> Okay, so finally to my question. I would like to increase either the 
>> number of indices or the number of records per index. Is one method 
>> preferred over the other? If the record count increases from 20 million 
>> to 30 million, would that increase or decrease index/search performance, 
>> or should the index limit be set to 30 indices instead? Basically, which 
>> method would allow for increased historical data retention with the 
>> least overhead, if that makes sense.
>>
>> Regards,
>> Brandon
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Graylog Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/graylog2/b9ca2dd3-a387-4d88-8a76-cf62ff9d58ac%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
