Hi Pete, thanks for sharing this!
In general, using real hardware for database applications like Elasticsearch is always a good idea, especially since it prevents problems caused by noisy neighbors and disk-cache thrashing in virtualized environments.

Cheers,
Jochen

On Thursday, 8 January 2015 02:03:39 UTC+1, Pete GS wrote:
>
> I've recently been through this in another thread here, and some very helpful replies had me looking at ElasticHQ as well. This excellent plugin helped me understand that my Elasticsearch nodes were CPU bound.
>
> I have moved away from VMs for Elasticsearch and am instead now using physical servers. Four dual 6-core blades with Hyper-Threading enabled and 72 GB RAM run my "active" Elasticsearch indices for Graylog2. Two additional dual 4-core blades with 72 GB RAM run my "archive" indices, which we rarely access.
>
> I have also set the refresh interval to 30 seconds as recommended, and since doing this in conjunction with closing older indices, everything has been reliable and stable, and Graylog2 is very responsive to searches.
>
> I also refined the buffer processor settings and batch sizes as per recommendations here, so all these parameters made a huge difference to the environment.
>
> My Graylog2 servers are still VMs and working fine (12 CPUs, 32 GB RAM). I did, however, add two lower-spec Graylog2 servers which I use solely for the web interface, one of which is the master node. Neither of these has been added to our load balancer config for the inputs, so they are dedicated to search only.
>
> Ours is now live and happily servicing an average of 3,000 to 4,000 messages per second, but this morning it spiked to over 10,000 per second and still worked well.
>
> Hope this helps...
>
> On Saturday, 3 January 2015 23:43:18 UTC+10, Joseph DJOMEDA wrote:
>>
>> Hello guys, and happy new year.
>>
>> I am really sorry for this very noob question. I have a cluster (I am not even sure I set it up correctly) of 3 nodes.
>>
>> 1st node: central Logstash / Graylog2 UI / Graylog2 server, all installed from their Ubuntu repositories. The second and third nodes are solely Elasticsearch instances, both installed from the Ubuntu repository.
>>
>> Kindly find on Pastie the configuration <http://pastie.org/9810898> I used.
>>
>> Each box is a VM with:
>>
>> - *2 dedicated cores*
>> - *6 virtual cores*
>> - *12 GB RAM guaranteed*
>>
>> I thought I would get away with this, but I get:
>>
>> Nodes with too long GC pauses — 15 hours ago
>> There are Graylog2 nodes on which the garbage collector runs too long. Garbage collection runs should be as short as possible. Please check whether those nodes are healthy. (Node: *4108686b-xxxxxxxxxb6400*, GC duration: *1022 ms*, GC threshold: *1000 ms*)
>>
>> While memory stabilized at 4 GB, the CPUs are almost constantly at 100% for the Graylog2 UI/server node. From all I have read, this type of issue is due to a lack of memory. I had to resume some of the streams because they were automatically paused, I presume due to the same error.
>>
>> Please let me know whether my setup is OK (I'm not sure about the cluster bit on Elasticsearch).
>> Please help me fix the GC issue as well.
>>
>> Thanks
>

--
You received this message because you are subscribed to the Google Groups "graylog2" group.
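
For anyone who wants to try the refresh-interval and index-closing changes Pete describes, a minimal sketch against the Elasticsearch REST API might look like the following. This is illustrative only: the host `localhost:9200` and the index names `graylog2_0` / `graylog2_12` are placeholders, so substitute your own cluster address and the index naming scheme Graylog2 generates for you.

```shell
# Raise the refresh interval of the write-active index to 30 seconds,
# reducing the cost of near-real-time segment refreshes during heavy ingest.
curl -XPUT 'http://localhost:9200/graylog2_0/_settings' -d '{
  "index": { "refresh_interval": "30s" }
}'

# Close an older, rarely searched index. A closed index releases heap and
# file handles but keeps its data on disk, so nothing is lost.
curl -XPOST 'http://localhost:9200/graylog2_12/_close'

# Reopen it later if you need to search the archived data again.
curl -XPOST 'http://localhost:9200/graylog2_12/_open'
```

A closed index can be reopened at any time without reindexing, which is what makes closing a practical middle ground between keeping everything hot and deleting old data.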