Can you share your regex?  Maybe there are some simple things we can do
there without going straight to C (not that I don't recommend that route -
it will be the best route, but this might be quicker).
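
To give an idea of the kind of simple changes I mean (the anchors, greedy
wildcards and plain string matching that Aaron mentions further down), here
is a rough sketch in Python. The log line, field names and patterns are all
made up for illustration, since I haven't seen your actual regex, and
rsyslog uses POSIX regexes rather than Python's re module, so treat it as a
picture of the idea rather than something to paste into a config.

```python
import re

# Hypothetical log line; the real message format is not shown in this thread.
LINE = "2013-10-22T08:00:00 host1 app[123]: user=alice action=login status=ok"

# Greedy wildcards: each ".*" first grabs the rest of the line and then
# backtracks character by character until the following literal matches.
GREEDY = re.compile(r".*user=(.*) action=(.*) status=(.*)")

# Anchored, with explicit character classes: each field stops at the first
# space, so there is very little for the engine to backtrack over.
ANCHORED = re.compile(r"^\S+ \S+ \S+ user=(\S+) action=(\S+) status=(\S+)$")


def parse_regex(line, pattern=ANCHORED):
    """Return (user, action, status), or None if the line does not match."""
    m = pattern.match(line)
    return m.groups() if m else None


def parse_plain(line):
    """Same result using straight string handling, with no regex at all."""
    fields = {}
    # Skip the first three space-separated tokens (timestamp, host, tag).
    for token in line.split(" ")[3:]:
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    try:
        return (fields["user"], fields["action"], fields["status"])
    except KeyError:
        return None


if __name__ == "__main__":
    print(parse_regex(LINE, GREEDY))
    print(parse_regex(LINE))
    print(parse_plain(LINE))
```

All three return the same fields for this line; the difference is how much
work the matcher does to get there, which is what adds up at your volume.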


On Tue, Oct 22, 2013 at 8:00 AM, Boylan, James <[email protected]> wrote:

> The initial design I'm looking at has 8 instances per server, which is
> about the maximum these servers can handle with the complex regex we're
> running. As David points out, the regex is the biggest choke point in the
> application when it comes down to it.
>
> I wonder if anyone has a link detailing how one might build out the
> parsing in a C module for Rsyslog. I'll do the footwork, but if someone has
> a link going over it at a high level it might save me some time.
>
> Thanks!
>
> -- James
>
> -----Original Message-----
> From: [email protected] [mailto:
> [email protected]] On Behalf Of Aaron Wiebe
> Sent: Tuesday, October 22, 2013 6:54 AM
> To: rsyslog-users
> Subject: Re: [rsyslog] Large Scale Rsyslog deployment
>
> Em, link fix:  https://github.com/blackberry/hadoop-logdriver
>
>
> On Tue, Oct 22, 2013 at 7:53 AM, Aaron Wiebe <[email protected]> wrote:
>
> > On Tue, Oct 22, 2013 at 7:16 AM, Boylan, James <[email protected]> wrote:
> >
> >>
> >> I agree that not doing all of the expensive regex would be a better
> >> solution, and I'm actually in the process of making changes with our
> >> developers to address that, but for the short term I'm working with
> >> what we have on hand. My eventual goal is to just have them output in
> >> JSON. It saves a lot of time long term and works well with parsing
> >> the messages in both Elasticsearch and Hadoop.
> >
> >
> > I previously built a 100k+ syslog infrastructure... per server.  ;)
> > We used imptcp - and I know David's experience has been primarily with
> > UDP.
> >  The difference from our side was that we wanted to know when we
> > dropped messages, so tcp provided that level of confidence - either
> > the message was dropped in rsyslog (which we could get from the queue
> > stats) or on the other side.
> >
> > On regex:  the format of your regex itself will feed the compute
> > requirements quite significantly.  Simplify, use anchors, avoid hungry
> > wildcards.  If you can, move to a straight string match.
> >
> > On instances:  rsyslog will top out around 2-3 cores.  Run 5-10
> > instances on the same machine using different ports if possible, on
> > modern hardware.
> >
> > On Hadoop ingest and Elasticsearch:  Take a look at
> > http://github.com/blackberry/logdriver-hadoop - it might be of use to
> > you.  Additionally, you may want to consider using Kafka and/or Storm
> > for ingest rather than rsyslog.  That was the direction we were heading.
> >  (Sorry Rainer!)
> >
> > It doesn't sound like your volume is that high.  You just need to
> > segment your ingest a bit.  One machine should comfortably be able to
> > handle 100-200k messages per second - but the threading model (as much
> > as it's improved recently) still can't quite max out modern hardware.
> > Look at multiple instances on the same machine to see if you can't
> > bring the concurrency up.
> >
> > -Aaron
> >
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL:
> This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites
> beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE
> THAT.
