On 2/21/2018 7:02 PM, David Lang wrote:
On Wed, 21 Feb 2018, deoren wrote:
On 2/20/2018 6:58 PM, David Lang wrote:
On 2/20/2018 6:39 PM, deoren wrote:
In this case, my specific goal is to look for log messages
containing "SPECIFIC_PATTERN_HERE" (as shown in sample log message)
and if a match is found parse the message to pull out specific
values. Those values are then used to generate a notification for
our ticketing system (e.g., specific URL patterns indicate abuse
that we need to review further before our vendor contacts us and
threatens to cut off service). In this case we're not matching a
possible range of patterns, but a very specific string that is known
You don't need this two-stage approach (detect a pattern, then
parse the log) with liblognorm. Instead you just create rules for all
your logs that include the various patterns that you want to match,
and liblognorm uses whichever one matches. The two-stage approach is
needed with regexes because they are so expensive to evaluate, but
since liblognorm rules are so fast, it makes far more sense to just
define all the rules.
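For contrast, here is a minimal sketch of the two-stage approach as it
usually looks with regexes: a cheap substring test first, and the regex
only on the messages that pass it. The message layout (url= and client=
fields) and the function name are hypothetical, made up for
illustration; only the marker string comes from this thread.

```python
import re
from typing import Optional

# Marker string taken from the sample discussed in the thread.
MARKER = "SPECIFIC_PATTERN_HERE"

# Hypothetical layout of the interesting part of the message.
DETAILS = re.compile(r"url=(?P<url>\S+) client=(?P<client>\S+)")


def two_stage_parse(message: str) -> Optional[dict]:
    """Return the extracted fields, or None if the message is not of interest."""
    # Stage 1: cheap substring test, so most messages never reach the regex.
    if MARKER not in message:
        return None
    # Stage 2: run the (comparatively expensive) regex on candidates only.
    match = DETAILS.search(message)
    return match.groupdict() if match else None
```

With liblognorm the stage 1 check would simply go away, since every
message is run against the full rulebase.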
Do you recommend running mmnormalize as close to the source as
possible or on the primary receiver? I'm guessing the former so that
the rules are run on the original source and not on content that may
have been modified by other receivers in transit?
There are arguments both ways.
Running it close to the source distributes the work (though if you run
it on the machine that has the source, it adds some extra load there).
On the other hand, the resulting JSON is typically a bit larger than the
original message (not always, but typically), so it can take more
network bandwidth to send the result.
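As a rough illustration of the size point, the sketch below compares a
raw log line with a JSON rendering of the fields parsed out of it. The
sample message, the field names, and the regex are all made up for
illustration; this is not liblognorm output, just a stand-in to show
where the extra bytes come from (keys, quotes, separators).

```python
import json
import re

# Hypothetical sample message, for illustration only.
RAW = (
    "Feb 21 19:02:11 web01 app[1234]: "
    "SPECIFIC_PATTERN_HERE url=/abuse/path client=192.0.2.10"
)

# Hypothetical field names; a real rulebase would define its own.
PATTERN = re.compile(
    r"^(?P<timestamp>\w{3} +\d+ [\d:]{8}) (?P<host>\S+) (?P<tag>[^:]+): "
    r"SPECIFIC_PATTERN_HERE url=(?P<url>\S+) client=(?P<client>\S+)$"
)


def to_json(raw: str) -> str:
    """Parse the sample layout and return the fields as a JSON string."""
    match = PATTERN.match(raw)
    if match is None:
        raise ValueError("message does not match the sample pattern")
    return json.dumps(match.groupdict())


normalized = to_json(RAW)
# Print both sizes so the overhead of the structured form is visible.
print(len(RAW), len(normalized))
```

How much larger the structured form is depends on how many fields the
rule extracts and how long the field names are.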
liblognorm is so fast you really have to use it to believe it. At
$lastjob I had a 1400-line ruleset handling >100K logs/sec without the
liblognorm effort being noticeable.
Wow, that's pretty impressive. I may try employing mmnormalize in both
locations to see which is easier to work with. I suspect that for some
cases it would need to be run on the receiver to handle non-rsyslog
clients (misc equipment for example).
Is mmnormalize primarily intended for content ingested via imfile, or
is it pretty standard to apply mmnormalize to all inputs? Or perhaps
just to the inputs where you expect unstructured log content?
It is very much NOT limited to imfile; it's the general-purpose tool to
convert unstructured log content to a normalized format.
Thanks for confirming. I've seen the two paired up in some guides I've
looked over, so I began to wonder if that was the common scenario.
rsyslog mailing list