Same problem here with too many CPUs (not on a Graylog application).

What happens is that the code swaps continuously between cores. In our case it 
helps to bind the
application to a core, but managing that is a chore. The virtual layer 
loses a lot of resources
constantly shuffling work across the cores. With 2 cores the overhead can 
already be up to 10%!
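For reference, the core binding described above can be done on Linux with `taskset` from util-linux; the core numbers below are illustrative, and whether pinning helps depends on your workload:

```shell
# Show the current CPU affinity of this shell (prints an affinity list like 0-7).
taskset -cp $$

# Run a command pinned to core 0 so the scheduler cannot migrate it between cores.
taskset -c 0 echo "pinned to core 0"

# An already-running process can be re-pinned the same way: taskset -cp 0 <pid>
```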

We are running a lot of real-time applications and the customer wants 
everything in the cloud. In our
experience, 'cloud' delivers most of our problems/glitches. We love 
having some older iron
to run graylog and elastic on.



On Friday, 22 May 2015 at 06:08:44 UTC+2, Pete GS wrote:
>
> Ok, here's where I'm at with this...
>
> I tried implementing the kernel options on one of the Graylog servers as a 
> test, but it made no appreciable difference. In fact, shortly after the first 
> reboot the VM froze with a locked-CPU error. It hasn't done that since a 
> subsequent reboot, though. We're not running the PVSCSI adapter either.
>
> After observing this, I revisited Mathieu's comment regarding too many 
> CPUs.
>
> While I still see no contention issues for CPU resources, I started 
> wondering if there was some SMP-related issue with CentOS where the extra 
> vCPUs just weren't providing enough extra capacity to cater for the workload.
>
> I scaled all the nodes back to 8 vCPUs and added another four Graylog 
> servers, so I now have 8 servers receiving the inputs.
>
> So far this is running a lot better than the four servers with 16 and 20 
> vCPUs. They still peak at 100%, but this is not sustained, even after 
> having an Elasticsearch issue (filling the disks again) that caused a 
> backlog in the message journal overnight.
>
> Almost all the message backlog in the journals has been processed again 
> and it's still working well so far; this is after 24 hours or so.
>
> I'll see how it runs over the weekend.
>
> Incidentally, it seems I have inadvertently stumbled across a good number 
> for the process buffer processors: it seems to work well at two less than 
> the number of CPUs available to the server. Running with a buffer setting 
> of 6 on 8 vCPUs seems to work well. Of course, I'm not sure if this is 
> just my particular environment or if it's a general thing.
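As a sketch, that heuristic corresponds to this setting in Graylog's server configuration file; the value shown assumes the 8-vCPU box described above:

```
# graylog server.conf: process buffer processors = vCPUs - 2 (8 vCPUs here)
processbuffer_processors = 6
```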
>
> Cheers, Pete
>
> On Thursday, 14 May 2015 19:13:24 UTC+10, Pete GS wrote:
>>
>> Thanks very much Arie, I will check these tomorrow and report back.
>>
>> One thing I can confirm is the heap size is configured correctly.
>>
>> Cheers, Pete
>>
>> On 14 May 2015, at 05:35, Arie <[email protected]> wrote:
>>
>> Let's try some more options.
>>
>> I see you are running your stuff virtualized. Then you can consider the 
>> following for CentOS 6.
>>
>> In your startup kernel config you can add the following options 
>> (/etc/grub.conf):
>>
>>   nohz=off (for highly CPU-intensive systems)
>>   elevator=noop (disk scheduling is done by the virtual layer, so disable 
>> it in the guest)
>>   cgroup_disable=memory (if memory cgroups are not used, this frees up some 
>> memory and accounting overhead)
>>   
>>   
>> If you use the PVSCSI device, add the following:
>>   vmw_pvscsi.cmd_per_lun=254
>>   vmw_pvscsi.ring_pages=32
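Put together, the `kernel` line in /etc/grub.conf might look something like the following; the kernel version, root device, and pre-existing options are illustrative:

```
title CentOS 6
    root (hd0,0)
    kernel /vmlinuz-2.6.32-504.el6.x86_64 ro root=/dev/mapper/vg_sys-lv_root nohz=off elevator=noop cgroup_disable=memory vmw_pvscsi.cmd_per_lun=254 vmw_pvscsi.ring_pages=32
    initrd /initramfs-2.6.32-504.el6.x86_64.img
```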
>>
>>  Check disk buffers on the virtual layer too; see VMware KB 2053145: 
>> http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=2053145&sliceId=1&docTypeID=DT_KB_1_1&dialogID=621755330&stateId=1%200%20593866502
>>
>>  Optimize your disks for performance (up to 30%!!! yes):
>>
>>  For the filesystems where graylog and/or elastic are located, add the 
>> following to /etc/fstab.
>>
>> Example:
>> /dev/mapper/vg_nagios-lv_root /  ext4 
>> defaults,noatime,nobarrier,data=writeback 1 1
>> And if you want to be safer:
>> /dev/mapper/vg_nagios-lv_root /  ext4 defaults,noatime,nobarrier 1 1
>>
>> Is ES_HEAP_SIZE configured in the correct place? (I got that wrong at first.)
>> It is in /etc/sysconfig/elasticsearch.
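For example, the heap setting in /etc/sysconfig/elasticsearch looks like this; the 4g value is illustrative (the usual guidance is roughly half the machine's RAM):

```
# /etc/sysconfig/elasticsearch
ES_HEAP_SIZE=4g
```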
>>
>>
>> All these options together can improve system performance hugely, especially 
>> when the systems are virtual.
>>
>> ps did you correctly cha
>>
>> ...
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"graylog2" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.
