Re: [rsyslog] LibLogNorm way of doing things

Pavel Levshin Fri, 22 Nov 2013 04:49:48 -0800

No, it is backward compatible with older rulebases. At least, I tried to preserve compatibility at this layer.



--
Pavel Levshin


22.11.2013 15:01, Walid Moghrabi:

Great ! :)


But does that mean that we'll have to rewrite the rulebase to be compatible
with this release?

----- Original message -----

From: "Pavel Levshin" <[email protected]>
To: [email protected], [email protected]
Sent: Thursday, 21 November 2013 17:17:30
Subject: Re: [rsyslog] LibLogNorm way of doing things



Well, I fully agree with you, and I've addressed some of these issues
already. The next version of liblognorm is here:

https://github.com/flicker581/liblognorm/tree/master-json-c

It will be merged into upstream soon, I hope. But this version is a
rewrite, and it is not compatible with the current mmnormalize. Therefore, it
will be available in rsyslog 7.5 and 8.x, but, most probably, not in
7.4. The main purpose of the rewrite was to replace libee data structures
with json-c objects, which should improve performance.

More comments below:

21.11.2013 18:39, Walid Moghrabi:

Hi,


This message will certainly come across as a complaint, but I hope it
can give some ideas on ways to improve LibLogNorm. Or maybe I'm simply not using it
properly and someone can help me, since I had a lot of difficulty finding clear
documentation or help on this topic.


So, what is wrong with it?


Well, first of all, building a rulebase file is poorly documented (to say the
least), and it is pretty difficult to build one and test it.

This is indeed a problem, and some quirks of the implementation make it
even harder. There is a tool to test your rulebase, called
"lognormalizer", did you know? Its debug output is pretty useful... for
one who knows the internals already. I've added a sample rulebase, but it is
still incomplete; for example, I've not used prefixes in it:
https://github.com/flicker581/liblognorm/blob/master-json-c/rulebases/sample.rulebase

Maybe it would be useful to have at least an indication of which rule
left the "unparsed-data" part?

But what is really annoying is the way it works: it's a go/no-go approach and it is
pretty painful... let me explain:


I use MMNormalize to normalize messages coming from my web servers: I split
the message into key/value pairs in order to store them in a database so that I
can use them with LogAnalyzer.
Great, but... when you are working with logs, since they are not all
very well normalized themselves, their content may vary a little, and sometimes your
rulebase simply doesn't work because a trailing whitespace character
was added at the end of the message, because one logger version works
a bit differently from another.
This would be just fine if MMNormalize simply ignored it and normalized
what it can, but it does not work that way: as soon as there is unparsed
data, it simply stops and passes the message through untouched, and thus
not normalized at all, so that it completely messes up the db (the whole
message is stored in the MSG field but the other fields for normalized data
are simply empty).


You might say that it is better than simply dropping the message, but really,
this is very annoying.


The same applies to added fields. At first, I was getting every field from a
classic "combined" log format from Apache, but then I had to add a few fields
(vhost, SSL state, ...).
The first part of the logFormat didn't change; I added the fields at the end of the
message. So, if MMNormalize worked the way I'd love it to, it would have
retrieved the fields in the rulebase, stored the new elements
in the "unparsed data" field, and worked normally. But no... it simply ignores
everything: I get no normalized fields, only the raw message.


Last but not least... I have some log files from dedicated applications that
are only partly normalized: let's say the first 2-3 fields are normalized and
thus usable for normalization, but the last part of the message is random text
with no normalization at all.
I can't ask for a change in the format, and thus I can't ask for a quoted
string that I could handle.
There would be a nice way to handle this: a selector that would say "from that
point, take everything until the end of the line".
That would be great.

I've implemented it as "rest" type:

# Snow White and the Seven Dwarfs
rule=tale:Snow White and %company:rest%


It matches zero or more characters till the end of the line. As far as I
understand all you've said above, this type can do everything you want.
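
To make the difference concrete, here is a minimal sketch in Python. It is
not liblognorm code: compile_rule() and normalize() are hypothetical helpers
that translate a tiny subset of the rule syntax into a regular expression,
assuming "word" means "one or more non-space characters" and that a rule must
consume the whole message (the go/no-go behaviour described above).

```python
import re

# Hypothetical helpers for illustration only; this is not how liblognorm
# is implemented. Supported field types: word, char-to:X, rest.
FIELD = re.compile(r"%(\w+):(word|rest|char-to:(.))%")

def compile_rule(rule):
    """Turn a simplified rule string into a compiled regular expression."""
    pattern, pos = "", 0
    for m in FIELD.finditer(rule):
        pattern += re.escape(rule[pos:m.start()])  # literal text between fields
        name, kind, stop = m.group(1), m.group(2), m.group(3)
        if kind == "word":
            pattern += f"(?P<{name}>[^ ]+)"        # one or more non-space chars
        elif kind == "rest":
            pattern += f"(?P<{name}>.*)"           # zero or more chars to the end
        else:
            # char-to:X, one or more characters up to (not including) X
            pattern += f"(?P<{name}>[^{re.escape(stop)}]+)"
        pos = m.end()
    pattern += re.escape(rule[pos:])
    return re.compile(pattern, re.DOTALL)

def normalize(rule, msg):
    """Return the extracted fields, or None if any part of msg is unparsed."""
    m = compile_rule(rule).fullmatch(msg)
    return m.groupdict() if m else None

strict = "%ip:word% %status:word%"
tolerant = "%ip:word% %status:word%%extra:rest%"

print(normalize(strict, "10.0.0.1 200"))             # all fields extracted
print(normalize(strict, "10.0.0.1 200 "))            # trailing space: no match
print(normalize(tolerant, "10.0.0.1 200 "))          # trailing space lands in "extra"
print(normalize(tolerant, "10.0.0.1 200 vhost=a"))   # appended fields land in "extra"
```

In this model, the rule without a trailing "rest" field rejects a message
with one extra space, while the rule ending in "rest" still yields the
leading fields and keeps whatever follows in "extra".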

I tried with char-to selectors but never found a way to do this.

For "char-to" fields, there is still an open question. In the current
implementation, it cannot match zero characters. This can be a problem,
because in some cases you may need to match a zero-length field between
two separators. I did not change this aspect, because it is an
incompatible change; with it, older rulebases could begin to match
unexpectedly. Nevertheless, it can be done. Maybe it is better done
as a separate type, based on "char-to". Comments and suggestions are
welcome.
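
The trade-off can be sketched with two regular expressions in Python. This
only models the behaviour described above and is not liblognorm code; the
comma separator, the field names and the parse() helper are assumptions made
for the example.

```python
import re

# One-or-more: models the current "char-to" behaviour (no empty fields).
current = re.compile(r"(?P<a>[^,]+),(?P<b>[^,]+),(?P<c>.*)")
# Zero-or-more: models a hypothetical variant that accepts empty fields.
relaxed = re.compile(r"(?P<a>[^,]*),(?P<b>[^,]*),(?P<c>.*)")

def parse(pattern, msg):
    """Return the captured fields, or None if the message does not match."""
    m = pattern.fullmatch(msg)
    return m.groupdict() if m else None

print(parse(current, "alice,web,login"))  # matches
print(parse(relaxed, "alice,web,login"))  # matches, same fields
print(parse(current, "alice,,login"))     # empty middle field: no match
print(parse(relaxed, "alice,,login"))     # matches, with b == ""
print(parse(relaxed, ",,login"))          # also matches now: the compatibility risk
```

Both patterns agree on well-formed input. The relaxed one additionally
accepts messages the current one rejects, which is exactly why an existing
rulebase could start matching lines it previously left alone.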


_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.
