On Thu, Dec 02, 2010 at 09:22:46AM +0100, Rainer Gerhards wrote:
> Thanks for the nice review and instructions :)
> 
> I have begun to work heavily on a message modification module for rsyslog
> which will support liblognorm-style normalization inside rsyslog. In git
> there already is a branch "lognorm", which I will hopefully complete and
> merge into master soon. It provides some *very* interesting shortcuts of

        In the rsyslog git tree?  I have a specific reason for asking,
which I'll get to later... 

> pulling specific information out of syslog messages. I'll probably promote it
> some more when it is available. IMHO it's the coolest and potentially most
> valuable feature I have added in the past three years. Once I have enabled
> tags in liblognorm/libee, you can even very easily classify log messages
> based on their content.

        To be honest, it's not only going to be a valuable addition to
rsyslog, but to my software as well.  Not only that, I think once
people understand its function, I could see many other projects
benefiting from it!

> I did a couple of bug fixes yesterday. Frequently pull from git ;)

        Oh!  I know it's a moving target! :)  I'll git it frequently!

> >     So,  that worked nicely.  Nifty.  I made a few more 'complex'
> > rules, and those worked fine as well.  
> 
> I added a capability to generate graphs of the actual call tree. I think this
> is *very* useful. An article on how to do that will be posted soon to the web
> site (will make sure a notification goes to the list).

        Yes, please do!  I'd seen mention of this in the source,
but didn't really dive into what you were trying to do there.

> > However,  if the rule is off a
> > bit,  then you've got issue.  Here's what I mean..  Back on my example
> > above..  If this:
> > 
> > Dec  1 14:10:11 testbox ntpd[3821]: synchronized to 192.168.0.10
> > 
> > changes to this:
> > 
> > Dec  1 14:10:11 testbox ntpd[3821]: synchronized to 192.168.0.10,
> > stratium 1
> 
> Well, that's by intention. The normalizer must know exactly which message it
> is dealing with. This is even more important when we use it for
> classification. So these two messages are definitely different, and I would
> consider it very dangerous to automatically merge them into a single one
> (which would not be a problem from a purely implementation PoV). The more
> fuzzy the recognition is, the higher is the chance of false recognition,
> something that would be really, really bad in the context of normalization.

        I understand.  I merely meant that comment as an observation and
wasn't criticizing it.  Once I played with it a little bit, I actually
thought, "oh, it actually makes sense that it'd work like that!" :)
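
        To make the "exact match" behavior concrete, here is a toy
sketch.  It is NOT liblognorm's parser: the regex is my own stand-in for
the rule ":synchronized to %ip:ipv4%", and the function name is made up.
It only illustrates the idea that fields are extracted when the rule
covers the whole message, and nothing is extracted otherwise.

```python
import re

# Toy stand-in for the rule ":synchronized to %ip:ipv4%".
# Assumption: this regex only mimics the idea; it is not how liblognorm parses.
IPV4 = r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})"
RULE = re.compile(r"synchronized to " + IPV4)


def normalize(msg):
    """Return the extracted fields only if the rule covers the whole message."""
    m = RULE.fullmatch(msg)
    return m.groupdict() if m else None


# The exact message matches and yields the IP address.
exact = normalize("synchronized to 192.168.0.10")

# The longer variant has trailing text the rule does not know about,
# so this toy normalizer refuses to guess and returns nothing.
longer = normalize("synchronized to 192.168.0.10, stratium 1")
```

With this strict matching, the second message needs its own rule before
anything is pulled out of it.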
> 
> > The 'normalizer' will call the ",stratium 1" part of the message as
> > "unclassified".  However,  it doesn't appear that it'll grab the IP
> > address,  tag, etc.
> > 
> > Also,  I think the "real work" is going to be writing rules.  That's
> > going to take some effort,  which I hope to assist with.
> 
> Yes, that's definitely a lot of work -- and more than what a single person
> can do. In order to make the normalizer really useful, we need a community
> effort. If everyone contributes sample databases for their devices, we could
> gain good results fast. But the key is getting enough momentum, so please
> help spread the word!

        I already have been spreading the word.  Strangely,  I mentioned
it to the OSSEC guys,  and they didn't seem that interested in it.
That might change over time.  I've talked about it on the Sagan mailing
list and people there seem to "get it".

        I'm tinkering with the idea of adding some liblognorm code
into Sagan (probably today).  A couple of things dawned on me.  Many log
lines that Sagan detects as "hostile" won't need normalization.  Much of
the information I need I already have.  However, information from
appliances like firewalls, routers, etc. will.  So, I'll probably
add a 'normalize' flag to my rule set.  That way, I only attempt to
normalize log lines that I know I need critical information from, and
I don't waste CPU ticks trying to normalize log lines that don't
need it.
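
        Roughly, the 'normalize' flag idea looks like the sketch below.
This is only an illustration in Python, not Sagan code: the Rule class,
the substring matching, and the fake normalizer are all hypothetical.
The point is just that the (expensive) normalizer is called only for
rules that ask for it.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical detection rule; 'normalize' is the opt-in flag."""
    name: str
    needle: str              # simple substring match, for illustration only
    normalize: bool = False


def process(line, rules, normalizer):
    """Return (rule name, fields) for each rule that fires on the line."""
    hits = []
    for rule in rules:
        if rule.needle not in line:
            continue
        # Only spend CPU on normalization when the rule asked for it.
        fields = normalizer(line) if rule.normalize else None
        hits.append((rule.name, fields))
    return hits


# Stand-in for a real normalizer call; it records how often it is invoked.
calls = []


def fake_normalizer(line):
    calls.append(line)
    return {"raw": line}


RULES = [
    Rule("ssh-bad-login", "Failed password"),   # has all it needs already
    Rule("fw-drop", "DROP", normalize=True),    # appliance log, needs fields
]
```

Feeding an sshd line through process() never touches the normalizer,
while a firewall line triggers exactly one normalizer call.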

        Another thing....  Much of the base information I'm already
getting.  For example,  if I have this rule....

:%date:date-rfc3164% %host:word% %tag:char-to:\x3a%:synchronized to %ip:ipv4%

        Sagan already has the date, host, and tag.  Really, all I'd need is:

:synchronized to %ip:ipv4%

        This is my dilemma.  Do I make my own 'Sagan' liblognorm rules,  which
are a stripped down version of the liblognorm rules or do I just have
Sagan rebuild the 'syslog string' so that the 'standard' liblognorm
rules can be used?  I like the idea of Sagan's own rules,  but then that
means keeping up with two liblognorm rule sets.   I don't like that.  
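
        For the second option (rebuilding the 'syslog string'), the
work is small.  A minimal sketch, assuming the date, host, tag and
message are already available as separate values; the function name and
argument layout are made up for illustration and are not Sagan's
internals:

```python
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def rebuild_syslog_line(when, host, tag, message):
    """Reassemble an RFC 3164 style line from already-parsed pieces."""
    # RFC 3164 timestamps pad single-digit days with a space ("Dec  1").
    stamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"{stamp} {host} {tag}: {message}"


line = rebuild_syslog_line(
    datetime(2010, 12, 1, 14, 10, 11),
    "testbox",
    "ntpd[3821]",
    "synchronized to 192.168.0.10",
)
```

The rebuilt line could then be handed to the 'standard' rules unchanged,
at the cost of formatting and re-parsing fields I already have.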

        Lastly,  and this goes back to the beginning of this post.  It's
not a huge deal,  but how are you going to handle dependencies with
rsyslog?  That is,  in the end,  will people have to
download/compile/install libestr,  libee,  liblognorm,  {insert other
dependencies here for rsyslog},  then rsyslog?  While that's not a huge
deal for me,  IMHO added dependencies 'turn off' users from using
features.   Even though the 'dependencies' in question are usually
trivial to install,   it adds yet another layer for end users to find
issues with.  Considering the libraries in question (ee/lognorm/estr)
are pretty small (at least now!), would a one-time package/build of
something like 'liblognorm-complete' be possible?  I know all this seems
silly, but IMHO it's these little things that cause potential users
to shift away from software.  I could be wrong, and maybe I'm
overanalyzing the issue.

        Oh.. one more thing.  Do you think it's too early to start
writing liblognorm rules?

-- 
        Champ Clark III | Softwink, Inc | 800-538-9357 x 101
                     http://www.softwink.com

GPG Key ID: 58A2A58F
Key fingerprint = 7734 2A1C 007D 581E BDF7  6AD5 0F1F 655F 58A2 A58F
If it wasn't for C, we'd be using BASI, PASAL and OBOL.


_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
