Hello,

With the release of 1.0 we've started moving towards a new cluster of GL 
hosts. These are working very well, with one exception: for some reason, 
any reasonably significant amount of UDP traffic will choke the message 
processor, fill up the process buffers on all four hosts, and effectively 
choke all other message processing as well.
Normally we do around 2k messages per second, split roughly 50/50 between 
TCP and UDP. Sending the entire TCP load to one host doesn't present a 
problem; it doesn't even break a sweat.

I've also experimented a little with sending a large text file using 
rsyslog's imfile module. Sending it via TCP will bottleneck us at the ES 
side of things and cause the disk journal to fill up fairly rapidly, but 
it still works at ~9k messages per second, so that's fine. Sending it via 
UDP just causes GL to choke again: the journal fills up to a certain point 
and is then processed slowly, in little bursts of a few thousand messages 
followed by several seconds of apparent sleeping (i.e. pretty much no CPU 
usage).
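For reference, the rsyslog side of that test looked roughly like this (a sketch from memory; the file path and port are illustrative, not our exact config):

```
# Tail the large text file with imfile
module(load="imfile")
input(type="imfile" File="/var/log/bigfile.log" Tag="loadtest:")

# Forward everything to the GL node; @@ = TCP, @ = UDP
*.* @@gl-node1:5140    # TCP case: ES-bound bottleneck, but keeps flowing
#*.* @gl-node1:5140    # UDP case: triggers the choke described above
```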

During all of this the input buffer never fills beyond single-digit 
percentages. With TCP the output buffer sometimes climbs to 20-30%; with 
UDP it never moves at all. It's all in the process buffer. Sending a large 
burst of messages and then stopping doesn't seem to affect this behavior 
either: even after the inbound messages stop, it still takes a long time 
to process the messages already sitting in the journal and process buffer.
I'm using VisualVM to look at the CPU and memory usage; this is a 
screenshot of a UDP session:
http://i59.tinypic.com/x23xfl.png

I've tried mucking around with various knobs (processbuffer_processors, 
JVM settings, etc.) with no results whatsoever, good or bad.
There's nothing to suggest a problem in either the Graylog or system logs.

Pertinent specs and settings:
ring_size = 16384 (CPUs have 20 MB L3)
processbuffer_processors = 5
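Put together, the relevant graylog.conf fragment (only these two lines changed from defaults; everything else is stock):

```
# graylog.conf (server) -- relevant lines only
ring_size = 16384
processbuffer_processors = 5
```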

Java 8u31
Using G1GC with StringDeduplication; I've tried without the latter, and 
with plain CMS as well, no difference.
4 GB Xmx/Xms.
Linux 3.16.0
net.core.rmem_max = 8388608
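Concretely, the JVM and kernel settings amount to something like this (flag spellings reconstructed from memory; the exact command line on our end may differ slightly):

```
# JVM options for graylog-server
-Xms4g -Xmx4g -XX:+UseG1GC -XX:+UseStringDeduplication
# (also tried CMS instead: -XX:+UseConcMarkSweepGC -- no difference)

# Kernel UDP receive buffer ceiling
sysctl -w net.core.rmem_max=8388608
```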

These are virtual machines, VMware, 8 GB / 8 vCPUs, on Xeon E5-2690s.

Software-wise the old nodes are running more or less the same setup, 
except kernel 3.2.0; same JVM, G1GC, etc. Hardware-wise they're physical 
boxes, old Dell 2950s with dual quad-core E5440s. That's Core2-era, so 
quite a bit slower.

Any ideas?
