2015-03-13 1:26 GMT+01:00 David Lang <[email protected]>:

> On Thu, 12 Mar 2015, singh.janmejay wrote:
>
>> I haven't seen the reordering code yet, but the loading does preserve
>> order.
>>
>> It still is deterministic, just that the criterion is rule-order (and
>> it being applicable only for field-subtrees makes it slightly odd).
>>
>
> this is definitely an issue
>
> looking at my cisco.endpoint ruleset
>
> originally I had:
>
> rule=:%ip:ipv4%%tail:rest%
> rule=:%ip:ipv4%/%port:number%%tail:rest%
> rule=:%ip:ipv4%/%port:number%(%label1:char-to:)%)%tail:rest%
> rule=:%ip:ipv4%/%port:number% (%label2:char-to:)%)%tail:rest%
> rule=:%ip:ipv4%/%port:number%(%label1:char-to:)%) (%label2:char-to:)%)%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%(%label1:char-to:)%)%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number% (%label2:char-to:)%)%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%(%label1:char-to:)%) (%label2:char-to:)%)%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%(%label1:char-to:)%) (%label2:char-to:)%)
>
> After learning about the rest issue I duplicated each line without the
> %tail:rest% at the end
>
> still not working without disabling the items with rest in them
>
> so after the discussion on ordering, I tried reversing all the rules, it
> still didn't work because the char-to matches better than the ipv4.
>
> so for the moment I have the rules as:
>
> rule=:%ip:ipv4%/%port:number%(%label1:char-to:)%) (%label2:char-to:)%)
> rule=:%ip:ipv4%/%port:number%(%label1:char-to:)%) (%label2:char-to:)%)%tail:rest%
> rule=:%ip:ipv4%/%port:number% (%label2:char-to:)%)
> rule=:%ip:ipv4%/%port:number% (%label2:char-to:)%)%tail:rest%
> rule=:%ip:ipv4%/%port:number%(%label1:char-to:)%)
> rule=:%ip:ipv4%/%port:number%(%label1:char-to:)%)%tail:rest%
> rule=:%ip:ipv4%/%port:number%
> rule=:%ip:ipv4%/%port:number%%tail:rest%
> rule=:%ip:ipv4%
> rule=:%ip:ipv4%%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%(%label1:char-to:)%) (%label2:char-to:)%)
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%(%label1:char-to:)%) (%label2:char-to:)%)%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number% (%label2:char-to:)%)
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number% (%label2:char-to:)%)%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%(%label1:char-to:)%)
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%(%label1:char-to:)%)%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%%tail:rest%
>
> but I'm not sure if this really will work or not without testing every
> specific case because I don't know where the order is going to matter, and
> the char-to may match cases where it isn't going to match the rest of the
> rule and it won't fall through to the shorter match.
>
> order dependency is not the right answer.
>
>
I currently consider this a bug that needs to be fixed at some time. But
again, I don't think *now* is the right time to do so (at least not for me).

Rainer

> Why does this need to be added to the end of the tree rather than being
> positioned like any other rule components?
>
> David Lang
>
>
>
>
>  On Thu, Mar 12, 2015 at 10:55 PM, Rainer Gerhards
>> <[email protected]> wrote:
>>
>>> 2015-03-12 18:16 GMT+01:00 singh.janmejay <[email protected]>:
>>>
>>>  On Thu, Mar 12, 2015 at 9:29 PM, Rainer Gerhards
>>>> <[email protected]> wrote:
>>>>
>>>>> 2015-03-12 16:41 GMT+01:00 David Lang <[email protected]>:
>>>>>
>>>>>  On Thu, 12 Mar 2015, Rainer Gerhards wrote:
>>>>>>
>>>>>>  2015-03-12 5:55 GMT+01:00 singh.janmejay <[email protected]>:
>>>>>>
>>>>>>>
>>>>>>>  On Thu, Mar 12, 2015 at 9:19 AM, David Lang <[email protected]> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>  On Thu, 12 Mar 2015, singh.janmejay wrote:
>>>>>>>>>
>>>>>>>>>> Tried re-ordering it? Put the one with /port first?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> no, lognorm rules are not supposed to be order dependent, so I didn't
>>>>>>>>> try that (especially after finding things failing to parse with
>>>>>>>>> rsyslog that worked manually)
>>>>>>>>>
>>>>>>>>>
>>>>>>>> In case of input strings being matching-rule-wise disjoint, you are
>>>>>>>> right, order won't matter. But when they are not disjoint, order does
>>>>>>>> matter, because the first one to match the string wins.
>>>>>>>>
>>>>>>>> Consider this rulebase:
>>>>>>>> rule=:%ip:ipv4%%last:rest%
>>>>>>>> rule=:%ip:ipv4%%junk:char-sep:/%/%port:number%
>>>>>>>>
>>>>>>>> If you write it the way I have above, you'll end up matching first
>>>>>>>> rule for input 10.20.30.40/5
>>>>>>>>
>>>>>>>> But if you write it this way:
>>>>>>>> rule=:%ip:ipv4%%junk:char-sep:/%/%port:number%
>>>>>>>> rule=:%ip:ipv4%%last:rest%
>>>>>>>>
>>>>>>>> You'll end up matching the first one (which is now the rule with the port).
>>>>>>>>
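
To make the "first one to match wins" behaviour concrete, here is a toy sketch in Python. It is not liblognorm: the field parsers are regex stand-ins, and the names PARSERS, compile_rule and first_match are made up for this illustration. It only shows that two overlapping rules give different results depending on the order they are listed in.

```python
import re

# Hypothetical regex stand-ins for liblognorm field parsers (assumption).
PARSERS = {
    "ipv4": r"\d{1,3}(?:\.\d{1,3}){3}",
    "rest": r".*",
    "number": r"\d+",
    "char-sep:/": r"[^/]*",
}

def compile_rule(parts):
    """Build one anchored regex from ('field', name, parser) / ('lit', text) parts."""
    pattern = ""
    for part in parts:
        if part[0] == "field":
            pattern += "(?P<%s>%s)" % (part[1], PARSERS[part[2]])
        else:
            pattern += re.escape(part[1])
    return re.compile(pattern + r"\Z")

def first_match(rules, line):
    """Return (rule_name, fields) for the first rule in list order that matches."""
    for name, rx in rules:
        m = rx.match(line)
        if m:
            return name, m.groupdict()
    return None, {}

# rule=:%ip:ipv4%%last:rest%
REST_RULE = ("ip+rest", compile_rule([
    ("field", "ip", "ipv4"), ("field", "last", "rest")]))
# rule=:%ip:ipv4%%junk:char-sep:/%/%port:number%
PORT_RULE = ("ip+port", compile_rule([
    ("field", "ip", "ipv4"), ("field", "junk", "char-sep:/"),
    ("lit", "/"), ("field", "port", "number")]))

if __name__ == "__main__":
    line = "10.20.30.40/5"
    print(first_match([REST_RULE, PORT_RULE], line))
    print(first_match([PORT_RULE, REST_RULE], line))
```

With the rest rule listed first, the port is swallowed into the "last" field; with the port rule listed first, it is extracted as its own field.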
>>>>>>>>
>>>>>>> This shouldn't happen. The theory is:
>>>>>>>
>>>>>>> Let i be the current index to be looked at in the line. If for i a parser
>>>>>>> is selected, parsers shall be tried first (in theory, according to parser
>>>>>>> ordering, but I think this is not yet fully implemented). If a parser
>>>>>>> fits, processing is advanced to the next tree node.
>>>>>>>
>>>>>>> If the node at i does not have a parser (or all parsers failed, I think
>>>>>>> [but not sure]), advance to the next node based on character match.
>>>>>>>
>>>>>>
>>>> This is precisely what it does.
>>>>
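
The walk described above (parsers first, then literal character match, backing up on failure) can be sketched roughly as follows. This is a simplified model in Python under stated assumptions, not the ptree.c code: the node layout, the regex parsers and the names new_node, add_rule and walk are all hypothetical.

```python
import re

# Hypothetical regex stand-ins for field parsers (assumption).
PARSERS = {
    "ipv4": re.compile(r"\d{1,3}(?:\.\d{1,3}){3}"),
    "number": re.compile(r"\d+"),
    "rest": re.compile(r".*"),
}

def new_node():
    # fields: list of (parser_name, child), tried in list order
    # literals: char -> child, looked up by key
    return {"fields": [], "literals": {}, "terminal": False}

def child_for_field(node, parser_name):
    for name, child in node["fields"]:
        if name == parser_name:
            return child
    child = new_node()
    node["fields"].append((parser_name, child))
    return child

def add_rule(root, parts):
    """parts: list of ('field', parser_name) or ('lit', text)."""
    node = root
    for kind, value in parts:
        if kind == "field":
            node = child_for_field(node, value)
        else:
            for ch in value:
                node = node["literals"].setdefault(ch, new_node())
    node["terminal"] = True

def walk(node, line, i=0):
    """True if some rule matches line[i:], trying parsers before literals."""
    if i == len(line) and node["terminal"]:
        return True
    for name, child in node["fields"]:
        m = PARSERS[name].match(line, i)
        # If the subtree below this parser fails, back up and try the next one.
        if m and walk(child, line, m.end()):
            return True
    if i < len(line) and line[i] in node["literals"]:
        return walk(node["literals"][line[i]], line, i + 1)
    return False

def build_demo_tree():
    root = new_node()
    add_rule(root, [("field", "ipv4")])
    add_rule(root, [("field", "ipv4"), ("lit", "/"), ("field", "number")])
    return root

if __name__ == "__main__":
    tree = build_demo_tree()
    for sample in ("10.20.30.40", "10.20.30.40/5", "10.20.30.40/x"):
        print(sample, walk(tree, sample))
```

Note that this sketch keeps trying the remaining parsers after a failed subtree; whether the real implementation does so in every case is exactly what is being questioned in this thread.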
>>>>
>>>>>>> The order of appearance of rules inside the rulebase should not affect
>>>>>>> this.
>>>>>>>
>>>>>>
>>>> It doesn't for literal-subtree, but it does for field-subtree,
>>>> because they are inserted at the tail of the linked-list.
>>>>
>>>> This code (
>>>> https://github.com/rsyslog/liblognorm/blob/master/src/ptree.c#L394)
>>>> adds new subtrees at the end of linked-list, which is what causes the
>>>> ordering-sensitive behaviour.
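
A minimal model of why tail insertion makes field-subtrees order-sensitive while literal-subtrees are not. This is a Python sketch, not the linked code: the Node class and its methods are hypothetical, and keying literals by character is an assumption made for the illustration.

```python
class Node:
    def __init__(self):
        self.literals = {}  # char -> Node, found by key, so load order is irrelevant
        self.fields = []    # [(parser_name, Node)], walked front to back

    def add_literal(self, ch):
        return self.literals.setdefault(ch, Node())

    def add_field(self, parser_name):
        for name, child in self.fields:
            if name == parser_name:
                return child
        child = Node()
        self.fields.append((parser_name, child))  # new subtree goes to the tail
        return child

def field_try_order(parser_names):
    """Load parsers in the given order and report the order they would be tried in."""
    root = Node()
    for name in parser_names:
        root.add_field(name)
    return [name for name, _ in root.fields]

def literal_keys(chars):
    """Load literal characters in the given order and report the lookup keys."""
    root = Node()
    for ch in chars:
        root.add_literal(ch)
    return set(root.literals)

if __name__ == "__main__":
    print(field_try_order(["rest", "char-sep"]))
    print(field_try_order(["char-sep", "rest"]))
```

In this model the order in which parsers are tried is simply the order in which the rules were loaded, which matches the behaviour reported above.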
>>>>
>>>>
>>> OK, it seems like I overlooked this effect. I don't think it is good to
>>> have any order dependence. Anyways, the work I am carrying out will most
>>> probably lead to algorithmic changes and I'll re-evaluate that when I
>>> reach that point (not soon). Of course, I won't break anything that
>>> exists. If things diverge too much, I'll add an alternate library. But
>>> again, this needs to be seen and it is too early to think about this.
>>>
>>> On the ordering issue: are you sure that the order is always properly
>>> preserved? I never put any effort into it (as order was designed to be
>>> irrelevant) and some reordering (IIRC) happens intentionally (parser
>>> priorities).
>>>
>>> Rainer
>>>
>>>
>>>>>>> If it does, it's either not yet implemented or a bug. This is also why I
>>>>>>> don't like the "rest" syntax - it always matches and thus terminates
>>>>>>> interpretation.
>>>>>>>
>>>>>>>
>>>>>> I'll post a simple test case when I get into the office in a bit.
>>>>>>
>>>>>> In this particular case, it's failing to check other parsers when it hits
>>>>>> a failure and backs up.
>>>>>>
>>>>>> But there are other cases where multiple rules may match. stringto, rest,
>>>>>>
>>>>>
>>>>> word, stringto are "last resort parsers", to be used only if everything
>>>>> else fails.
>>>>> rest IMHO should never be used, but I think I can propose something in the
>>>>> future that solves the need that comes with it (if there still is a need
>>>>> at that point).
>>>>>
>>>>>
>>>>>> iptables
>>>>>>
>>>>>
>>>>>
>>>>> iptables is a different story, it's actually for a different type of logs -
>>>>> at least I think so now. I am unfortunately not prepared to discuss this
>>>>> right now, as I want to keep concentrated on the log structure analyzer. It
>>>>> doesn't help if I do a bit of everything without anything ever nearing
>>>>> completion ;)
>>>>>
>>>>>
>>>>>> are all things that can easily match a lot of data where other rules may
>>>>>> also match by having more specific listings. In such cases it should still
>>>>>> be deterministic which rule 'wins'. I can think of a few ways to define
>>>>>> this.
>>>>>>
>>>>>> 1. fewest parsers needed wins
>>>>>>
>>>>>> 2. most parsers needed wins
>>>>>>
>>>>>
>>>> This is probably the closest simple approximation to best match.
>>>>
>>>> I was thinking about this too.
>>>>
>>>>
>>>>>> 3. ordering of parsers, where the 'greedier' ones are put last so they
>>>>>> only come into play if the more specific ones don't match.
>>>>>>
>>>>>
>>>> We could assist it by setting relative weights etc. Eg. ipv4 gets
>>>> weight 10, but rest gets only 1 etc.
>>>>
>>>> Once we get the coefficients right, this can probably be achieved (it's
>>>> like a costing-based picker, run once ptree has been loaded to sort
>>>> all subtree lists by cost in one shot).
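
The cost-based idea above could look roughly like this: after the whole rulebase is loaded, sort every field-subtree list once so that more specific parsers are tried before greedy ones. A Python sketch under assumptions: only the weights ipv4=10 and rest=1 come from this thread, the other weights, the dict-based node layout and the function names are invented for illustration.

```python
# Only ipv4=10 and rest=1 are from the thread; the other weights are assumptions.
WEIGHTS = {"ipv4": 10, "number": 8, "char-to": 3, "word": 2, "rest": 1}

def new_node():
    return {"fields": [], "literals": {}}

def sort_by_cost(node):
    """Recursively order each field list by descending weight.

    sorted() is stable, so parsers with equal weight keep their load order.
    """
    node["fields"] = sorted(
        node["fields"], key=lambda f: WEIGHTS.get(f[0], 0), reverse=True)
    for _, child in node["fields"]:
        sort_by_cost(child)
    for child in node["literals"].values():
        sort_by_cost(child)

def try_order(parser_names):
    """Load parsers in the given order, sort once, return the resulting try order."""
    root = new_node()
    for name in parser_names:
        root["fields"].append((name, new_node()))
    sort_by_cost(root)
    return [name for name, _ in root["fields"]]

if __name__ == "__main__":
    print(try_order(["rest", "ipv4", "char-to"]))
    print(try_order(["char-to", "rest", "ipv4"]))
```

Because the sort runs once after loading, the result no longer depends on the order of rules in the rulebase, except among parsers that share the same weight.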
>>>>
>>>>
>>>>>>
>>>>> That's the designed approach, and I am very sure it's the right one. As I
>>>>> said, it's at least not fully implemented.
>>>>>
>>>>> This also means we need many more specific parsers. I never get there,
>>>>> because of a) time shortage and b) lack of sufficient log samples, where
>>>>> "log samples" means not a single line or two, but at least several
>>>>> thousand, so that I can evaluate false positives. While b) is still a very
>>>>> big problem to me, a) has been much relaxed thanks to the thesis work.
>>>>> Also, work on the semi-automatic rule creator looks promising. As it is a
>>>>> heuristic, the lack of log samples unfortunately is a very large
>>>>> stumbling block.
>>>>>
>>>>> Rainer
>>>>> _______________________________________________
>>>>> rsyslog mailing list
>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>> http://www.rsyslog.com/professional-services/
>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>>>>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
>>>>> DON'T LIKE THAT.
>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Janmejay
>>>> http://codehunk.wordpress.com