Same problem here with too many CPUs (not on a Graylog application).

What happens is that the code swaps continuously between cores. In our case it 
helps to bind the
application to a core, but managing that is a chore. The virtual layer 
loses a lot of resources
constantly shuffling work across the cores. With 2 cores the overhead can 
already be up to 10%!
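For reference, the core binding described above can be done on Linux with `taskset` from util-linux; the core numbers below are illustrative, and whether pinning helps depends on your workload:

```shell
# Show the current CPU affinity of this shell (prints an affinity list like 0-7).
taskset -cp $$

# Run a command pinned to core 0 so the scheduler cannot migrate it between cores.
taskset -c 0 echo "pinned to core 0"

# An already-running process can be re-pinned the same way: taskset -cp 0 <pid>
```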

We are running a lot of real-time applications and the customer wants 
everything in the cloud. In our
experience, 'cloud' delivers most of our problems/glitches. We love 
having some older iron
to run graylog and elastic on.



On Friday, 22 May 2015 at 06:08:44 UTC+2, Pete GS wrote:
>
> Ok, here's where I'm at with this...
>
> I tried implementing the kernel options on one of the Graylog servers as a 
> test, but it made no appreciable difference. In fact, shortly after the first 
> reboot the VM froze with a locked-CPU error. It hasn't done that since a 
> subsequent reboot, though. We're not running the PVSCSI adapter either.
>
> After observing this, I revisited Mathieu's comment regarding too many 
> CPUs.
>
> While I still see no contention issues for CPU resources, I started 
> wondering if there was some SMP-related issue with CentOS where the extra 
> vCPUs just weren't providing enough extra capacity to cater for the workload.
>
> I scaled all the nodes back to 8 vCPUs and added another four Graylog 
> servers, so I now have 8 servers receiving the inputs.
>
> So far this is running a lot better than the four servers with 16 and 20 
> vCPUs. They still peak at 100%, but this is not sustained, even after 
> having an Elasticsearch issue (filling the disks again) that caused a 
> backlog in the message journal overnight.
>
> Almost all the message backlog in the journals has been processed again 
> and it's still working well so far; this is after 24 hours or so.
>
> I'll see how it runs over the weekend.
>
> Incidentally, it seems I have inadvertently stumbled across a good number 
> for the process buffer processors: it seems to work well at two less than 
> the number of CPUs available to the server. Running with a buffer setting 
> of 6 on 8 vCPUs seems to work well. Of course, I'm not sure if this is 
> just my particular environment or if it's a general thing.
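As a sketch, that heuristic corresponds to this setting in Graylog's server configuration file; the value shown assumes the 8-vCPU box described above:

```
# graylog server.conf: process buffer processors = vCPUs - 2 (8 vCPUs here)
processbuffer_processors = 6
```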
>
> Cheers, Pete
>
> On Thursday, 14 May 2015 19:13:24 UTC+10, Pete GS wrote:
>>
>> Thanks very much Arie, I will check these tomorrow and report back.
>>
>> One thing I can confirm is the heap size is configured correctly.
>>
>> Cheers, Pete
>>
>> On 14 May 2015, at 05:35, Arie <[email protected]> wrote:
>>
>> Let's try some more options.
>>
>> I see you are running your stuff virtualized. Then you can consider the 
>> following for CentOS 6.
>>
>> In your startup kernel config you can add the following options 
>> (/etc/grub.conf):
>>
>>   nohz=off (for highly CPU-intensive systems)
>>   elevator=noop (disk scheduling is done by the virtual layer, so disable 
>> it in the guest)
>>   cgroup_disable=memory (if memory cgroups are not used, this frees up some 
>> memory and accounting overhead)
>>   
>>   
>> If you use the PVSCSI device, add the following:
>>   vmw_pvscsi.cmd_per_lun=254
>>   vmw_pvscsi.ring_pages=32
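Put together, the `kernel` line in /etc/grub.conf might look something like the following; the kernel version, root device, and pre-existing options are illustrative:

```
title CentOS 6
    root (hd0,0)
    kernel /vmlinuz-2.6.32-504.el6.x86_64 ro root=/dev/mapper/vg_sys-lv_root nohz=off elevator=noop cgroup_disable=memory vmw_pvscsi.cmd_per_lun=254 vmw_pvscsi.ring_pages=32
    initrd /initramfs-2.6.32-504.el6.x86_64.img
```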
>>
>>  Check disk buffers on the virtual layer too; see VMware KB 2053145: 
>> http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=2053145&sliceId=1&docTypeID=DT_KB_1_1&dialogID=621755330&stateId=1%200%20593866502
>>
>>  Optimize your disks for performance (up to 30%!!! yes):
>>
>>  For the filesystems where graylog and/or elastic are located, add the 
>> following to /etc/fstab.
>>
>> Example:
>> /dev/mapper/vg_nagios-lv_root /  ext4 
>> defaults,noatime,nobarrier,data=writeback 1 1
>> And if you want to be safer:
>> /dev/mapper/vg_nagios-lv_root /  ext4 defaults,noatime,nobarrier 1 1
>>
>> Is ES_HEAP_SIZE configured in the correct place? (I got that wrong at first.)
>> It is in /etc/sysconfig/elasticsearch.
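For example, the heap setting in /etc/sysconfig/elasticsearch looks like this; the 4g value is illustrative (the usual guidance is roughly half the machine's RAM):

```
# /etc/sysconfig/elasticsearch
ES_HEAP_SIZE=4g
```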
>>
>>
>> All these options together can improve system performance hugely, especially 
>> when the systems are virtual.
>>
>> ps did you correctly cha
>>
>> ...
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"graylog2" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.
