Folks, please bear with me. Right now, I can't comment in a way that makes
sense, as I need to check with some third parties. Once I have done that,
you'll understand. Please bear with me for a day or maybe a few.

Rainer

2015-01-28 8:26 GMT+01:00 Kendall Green <[email protected]>:

> >>Thoughts?
>
> Thanks for the examples, as I understand what you mean about missing
> fields. I just want to clarify, for what I've described, when a field is
> not populated, the label still exists, so it's the same sample, which takes
> on a different shape, as pattern changes depending on the field values.
>
> >> - Does it make sense for users to pack unrelated samples in the same
> >> rulebase?
>
> Data input is stratified, so that each ruleset calls mmnormalize with a
> specific rulebase definition for that specific data feed. Windows
> normalization uses a separate rulebase from other operating systems and
> platforms.
>
> The rulebase has the prefix for standard message objects (timestamp
> priority syslogtag), and individual rules for each event id. Each rule
> provides the category, tag value for event id number, and annotation to
> match string literals (since that data type doesn't exist)
>
> When parsing the logs, unparsed-data identifies many messages that require a
> rule for every possible combination of populated and unpopulated fields, or
> in some cases the alternate value includes a character that conflicts with
> full parsing of the rule.
>
> So for big event logs with lots of fields, in some cases that means more
> than 10 rules for a single event id, not to mention a long time spent
> sampling to discover which combinations of fields ever show up with a hyphen
> instead of the expected value. For example, a value may be quoted if it
> contains spaces and unquoted if it is a single word, or be a space if null,
> or print "NULL" or nil, or some other 'known' string value which doesn't
> match the intended/initial data type.
>
> TL;DR, no, the samples are for the same rule in the rulebase, which are
> related to the type/log source/specific EventID.
>
> >>     * The rulebase will be composed of several unrelated rules, making
> >> it harder to read
>
> The rulebase is already hard to read, as it is currently a mess with
> multiple rules for a single event id
>
> The 'or' ('|') type would resolve that. Also, a lot of issues will be
> cleared up once we are able to match with the new to-string feature. I like
> that it's similar to the fields() extract, giving the option of char or
> string separators. While building a Windows event log rulebase, I've had to
> set special tags on each of the rules that are variations on a single
> EventID, just to verify that each is actually being used and that patterns
> are not accidentally overlapping between different EventIDs.
>
> Not having string literals or char literals in the rulebase means having to
> map tags and ontology annotations for that.
>
> >> * Multiple parse-trees may have to be maintained in order to satisfy
> >> all combinations of nullMarker (eg. a non-leaf field, marked for
> >> null-handling in one sample, but not marked for it in the other) (so
> >> matching will become O(n) in number of combinations). So it is some
> >> dev-work and little bit of perf-overhead.
>
> I'm not certain what you're referring to, but I understand that the number
> of combinations per rule in a rulebase would affect performance. Do you
> mean, for example, that hyphens could represent a nullMarker, and that the
> nullMarkers would 'potentially' be on specified fields? I think it would
> need to exist in rules for certain fields, not as a rulebase-wide option, as
> it would likely conflict with messages. That is different from a marker for
> the word type option, as in the contribution for op-quoted-string, but
> nullMarker would probably be useful for CEF, where fields that are null are
> typically not in the log...
>
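The per-field idea discussed above can be sketched in a few lines of Python. This is only an illustration of the concept, not liblognorm code: the function `match_fields`, the field names, and the marker sets are all hypothetical, and real messages are not simply whitespace-separated tokens.

```python
def match_fields(field_specs, tokens):
    """Match tokens against fields, honoring per-field null markers.

    field_specs: list of (name, parser, null_markers) tuples, where
    null_markers is the set of strings treated as "no value" for that
    field only. Returns a dict of parsed values, or None on mismatch.
    """
    if len(field_specs) != len(tokens):
        return None
    parsed = {}
    for (name, parser, null_markers), token in zip(field_specs, tokens):
        if token in null_markers:
            continue  # treated as null for this field: key is left out
        try:
            parsed[name] = parser(token)
        except ValueError:
            return None  # neither a null marker nor a valid value
    return parsed


# Hypothetical field layout; names and markers are illustrative only.
specs = [
    ("user", str, set()),                       # "-" is a legitimate value here
    ("logon_id", int, {"-"}),
    ("process_id", int, {"-", "NULL", "nil"}),  # several 'known' null spellings
]

print(match_fields(specs, ["-", "-", "NULL"]))    # {'user': '-'}
print(match_fields(specs, ["bob", "7", "4242"]))
```

Because the markers are attached to individual fields, a hyphen in the `user` position is kept as data, while the same hyphen in `logon_id` is treated as a missing value.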
> Regards,
> Kendall
>
> On Tue, Jan 27, 2015 at 11:27 PM, singh.janmejay <[email protected]> wrote:
>
> > I see what you are thinking of, but some things may be worth thinking
> > about before we decide:
> >
> > - Does it make sense for users to pack unrelated samples in the same
> > rulebase?
> >
> >   There are 3 problems with this:
> >      * The tree will become large, and back-tracking several unrelated
> > branches will be wasteful (a condition in the ruleset which calls the
> > action will be much more efficient, assuming the test is not very complex)
> >
> >      * The rulebase will be composed of several unrelated rules, making it
> > harder to read
> >
> >      * Multiple parse-trees may have to be maintained in order to satisfy
> > all combinations of nullMarker (eg. a non-leaf field, marked for
> > null-handling in one sample, but not marked for it in the other) (so
> > matching will become O(n) in number of combinations). So it is some
> > dev-work and little bit of perf-overhead.
> >
> > - The alternative is to set nullMarker at top level in a rulebase (instead
> > of being able to change it for every sample).
> >
> >   But then the flexibility is slightly lowered.
> >
> > - If we go with an action-level param, it's useful in cases where one has
> > a standard access-log format but the load-balancer level always has some
> > fields (say upstream latency or upstream-ip) which app-layer access logs
> > will not have.
> >
> >   This can use the same rulebase with nullMarker in one case, and without
> > it in another.
> >
> > Thoughts?
> >
> > On Wed, Jan 28, 2015 at 11:13 AM, David Lang <[email protected]> wrote:
> >
> > > I'm thinking that it needs to only apply to part of a ruleset. I can't
> > > see why you would use the same rulebase with different values overall,
> > > but I can easily see a rulebase that covers more than one type of logs
> > > needing different values for the different types of logs.
> > >
> > > remember that liblognorm is most effective if it has one ruleset to
> > > cover everything you are looking at rather than doing other conditionals
> > > and then picking which ruleset to use.
> > >
> > > David Lang
> > >
> > >
> > > On Wed, 28 Jan 2015, singh.janmejay wrote:
> > >
> > >> I think the action parameter is the most flexible place to have it,
> > >> because the same rulebase can be used with different values.
> > >>
> > >> Either module or rulebase level param will be less flexible compared to
> > >> this.
> > >>
> > >> --
> > >> Regards,
> > >> Janmejay
> > >>
> > >> PS: Please blame the typos in this mail on my phone's uncivilized soft
> > >> keyboard sporting it's not-so-smart-assist technology.
> > >>
> > >> On Jan 28, 2015 10:48 AM, "David Lang" <[email protected]> wrote:
> > >>
> > >>  On Wed, 28 Jan 2015, singh.janmejay wrote:
> > >>>
> > >>>> Ok, one way I can think of doing it: expose a parameter at
> > >>>> action/module level which turns on defaulting and picks a default
> > >>>> string.
> > >>>>
> > >>>> Eg.
> > >>>>
> > >>>> action(type="mmnormalize" nullMarker="-")
> > >>>>
> > >>>> Where nullMarker is a string (not a char).
> > >>>>
> > >>>> Whenever a "-" is encountered and a field is expected, it should skip
> > >>>> the key (the key will not be present at all) and continue matching
> > >>>> next token onwards.
> > >>>>
> > >>>> Thoughts?
> > >>>>
> > >>>>
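The skip-the-key behaviour proposed above can be illustrated with a small Python sketch. It is not how mmnormalize or liblognorm work internally, and `nullMarker` is only a proposal in this thread; the function name, the field list, and the whitespace tokenization are assumptions made to keep the example short.

```python
def match_with_null_marker(fields, message, null_marker="-"):
    """Parse a whitespace-separated message against typed fields.

    fields: list of (name, parser) pairs. A token equal to null_marker
    is accepted in any field position and its key is omitted from the
    result. Returns None if the message does not match.
    """
    tokens = message.split()
    if len(tokens) != len(fields):
        return None
    result = {}
    for (name, parse), token in zip(fields, tokens):
        if token == null_marker:
            continue  # key is not present at all; matching continues
        try:
            result[name] = parse(token)
        except ValueError:
            return None  # token does not fit the expected type
    return result


# Hypothetical access-log-like layout.
fields = [("user", str), ("status", int), ("bytes", int)]

print(match_with_null_marker(fields, "alice 200 -"))
# {'user': 'alice', 'status': 200}
```

With the marker disabled (for example by passing a string that never occurs), the same message would fail on the `bytes` field, which is the situation that currently forces one extra rule per placeholder position.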
> > >>> This needs to be something in the liblognorm config, not in rsyslog.
> > >>> different types of logs would have different nullMarker strings.
> > >>>
> > >>> with that adjustment, I think it's a good idea.
> > >>>
> > >>> David Lang
> > >>>
> > >>>  --
> > >>>
> > >>>> Regards,
> > >>>> Janmejay
> > >>>>
> > >>>> PS: Please blame the typos in this mail on my phone's uncivilized
> > >>>> soft keyboard sporting it's not-so-smart-assist technology.
> > >>>>
> > >>>> On Jan 28, 2015 6:38 AM, "David Lang" <[email protected]> wrote:
> > >>>>
> > >>>>  On Wed, 28 Jan 2015, singh.janmejay wrote:
> > >>>>
> > >>>>>> Maybe it'll be useful to discuss what you want to achieve with such
> > >>>>>> representations of sample. I mean if possible, take a few samples
> > >>>>>> from your existing rulebase which you think highlight the problem(s)
> > >>>>>> you are facing.
> > >>>>>>
> > >>>>> I think the example is the Apache logs, where Apache either puts a
> > >>>>> value, or it puts a placeholder '-'
> > >>>>>
> > >>>>> if you want to capture a specific type (number or ip address for
> > >>>>> example), you won't match a log entry that has a - in that field.
> > >>>>>
> > >>>>> If there are only a couple fields that are like this, you can list
> > >>>>> all the combinations in the ruleset, but if you have a lot of fields
> > >>>>> like this, the combinatorial explosion would make for a LOT of rules.
> > >>>>>
> > >>>>> So I don't think he really needs a generic 'or' allowing any types
> > >>>>> to be combined as much as a way to say "this field could be this
> > >>>>> type or this constant"
> > >>>>>
> > >>>>> David Lang
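To make the combinatorial explosion concrete, here is a rough Python sketch that enumerates every populated/placeholder variant for a handful of fields. The rule strings are loosely modeled on liblognorm's %name:type% field notation, but the field names and the helper `expand_rule_variants` are made up for illustration and the output is not a complete rulebase.

```python
from itertools import product


def expand_rule_variants(fields, placeholder="-"):
    """List every combination of 'typed field' vs. literal placeholder.

    fields: list of (name, type) pairs. Each field independently appears
    either as a typed field or as the placeholder, so n fields give
    2 ** n variants.
    """
    variants = []
    for populated_flags in product((True, False), repeat=len(fields)):
        parts = [
            f"%{name}:{ftype}%" if populated else placeholder
            for (name, ftype), populated in zip(fields, populated_flags)
        ]
        variants.append(" ".join(parts))
    return variants


# Hypothetical fields that may each be replaced by '-'.
fields = [("clientip", "ipv4"), ("status", "number"), ("bytes", "number")]
variants = expand_rule_variants(fields)

print(len(variants))  # 8, i.e. 2 ** 3
for variant in variants:
    print(variant)
```

Three such fields already need 8 variants and ten fields would need 1024, which is why a "this type or this constant" notion is attractive.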
> > >>>>> _______________________________________________
> > >>>>> rsyslog mailing list
> > >>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
> > >>>>> http://www.rsyslog.com/professional-services/
> > >>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
> > >>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
> > >>>>> myriad
> > >>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if
> > you
> > >>>>> DON'T LIKE THAT.
> > >>>>>
> >
> > --
> > Regards,
> > Janmejay
> > http://codehunk.wordpress.com