On Tue, Oct 22, 2013 at 7:16 AM, Boylan, James <[email protected]> wrote:

>
> I agree that not doing all of the expensive regex would be a better
> solution, and I'm actually in the process of making changes with our
> developers to address that, but for the short term I'm working with what we
> have on hand. My eventual goal is to just have them output in JSON. It
> saves a lot of time long term and works well with parsing the messages in
> both Elasticsearch and Hadoop.


I previously built a 100k+ syslog infrastructure... per server.  ;)  We
used imptcp - and I know David's experience has been primarily with UDP.
The difference from our side was that we wanted to know when we dropped
messages, so TCP provided that level of confidence - either the message was
dropped in rsyslog (which we could get from the queue stats) or on the
other side.

On regex:  the form of the regex itself drives the compute cost quite
significantly.  Simplify, use anchors, avoid greedy wildcards.  If you
can, move to a straight string match.
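To illustrate the regex advice, here is a minimal Python sketch (not rsyslog configuration). The log line and both patterns are made up for the example; the point is only the contrast between an unanchored pattern full of `.*` and an anchored one that sits behind a plain substring test.

```python
import re

# Hypothetical log line and patterns, for illustration only.
LINE = ("Oct 22 07:16:01 web01 sshd[4242]: "
        "Failed password for root from 10.0.0.5 port 2201 ssh2")

# Unanchored, greedy wildcards: each ".*" can force the engine to backtrack.
LOOSE = re.compile(r".*sshd.*Failed password.*from (.*) port.*")

# Anchored at the start, with specific character classes instead of ".*".
TIGHT = re.compile(
    r"^\w{3} +\d+ [\d:]{8} \S+ sshd\[\d+\]: "
    r"Failed password for \S+ from (\S+) port "
)

def source_ip(line):
    """Return the source address of a failed login, or None."""
    # Straight string match first: most lines are rejected without any regex.
    if "Failed password" not in line:
        return None
    m = TIGHT.match(line)
    return m.group(1) if m else None
```

Both patterns extract the same address from `LINE`; the difference is how much work the engine does on the lines that do not match, which is most of them at high volume.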

On instances:  rsyslog will top out around 2-3 cores.  Run 5-10 instances
on the same machine using different ports if possible, on modern hardware.
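How senders get spread across those instances is not covered above. One possible approach, sketched in Python under assumptions: the port numbers and the `port_for` helper are hypothetical, and each sender is pinned to one instance by hashing its name.

```python
import zlib

# Hypothetical listener ports, one per rsyslog instance on the same machine.
INSTANCE_PORTS = [10514, 10515, 10516, 10517, 10518]

def port_for(sender, ports=INSTANCE_PORTS):
    """Map a sender name to one instance port, stably across runs."""
    # crc32 is deterministic, unlike the built-in hash() for str,
    # so a given sender always lands on the same instance.
    return ports[zlib.crc32(sender.encode("utf-8")) % len(ports)]
```

Pinning a sender to one port keeps its messages in a single instance's queue, so per-instance queue stats still tell you where drops happened.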

On Hadoop ingest and Elasticsearch:  Take a look at
http://github.com/blackberry/logdriver-hadoop - it might be of use to you.
Additionally, you may want to consider using Kafka and/or Storm for ingest
rather than rsyslog.  That was the direction we were heading.  (Sorry
Rainer!)

It doesn't sound like your volume is that high.  You just need to segment
your ingest a bit.  One machine should comfortably be able to handle
100-200k messages per second - but the threading model (as much as it's
improved recently) still can't quite max out modern hardware.  Look at
multiple instances on the same machine to see if you can't bring the
concurrency up.

-Aaron
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.
