2015-03-13 1:26 GMT+01:00 David Lang <[email protected]>:

> On Thu, 12 Mar 2015, singh.janmejay wrote:
>
>> I haven't seen the reordering code yet, but the loading does preserve
>> order.
>>
>> It still is deterministic, just that the criterion is rule order (and
>> it being applicable only for field-subtrees makes it slightly odd).
>
> this is definitely an issue
>
> looking at my cisco.endpoint ruleset
>
> originally I had:
>
> rule=:%ip:ipv4%%tail:rest%
> rule=:%ip:ipv4%/%port:number%%tail:rest%
> rule=:%ip:ipv4%/%port:number%(%label1:char-to:)%)%tail:rest%
> rule=:%ip:ipv4%/%port:number% (%label2:char-to:)%)%tail:rest%
> rule=:%ip:ipv4%/%port:number%(%label1:char-to:)%) (%label2:char-to:)%)%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%(%label1:char-to:)%)%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number% (%label2:char-to:)%)%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%(%label1:char-to:)%) (%label2:char-to:)%)%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%(%label1:char-to:)%) (%label2:char-to:)%)
>
> After learning about the rest issue I duplicated each line without the
> %tail:rest% at the end
>
> still not working without disabling the items with rest in them
>
> so after the discussion on ordering, I tried reversing all the rules, it
> still didn't work because the char-to matches better than the ipv4.
>
> so for the moment I have the rules as:
>
> rule=:%ip:ipv4%/%port:number%(%label1:char-to:)%) (%label2:char-to:)%)
> rule=:%ip:ipv4%/%port:number%(%label1:char-to:)%) (%label2:char-to:)%)%tail:rest%
> rule=:%ip:ipv4%/%port:number% (%label2:char-to:)%)
> rule=:%ip:ipv4%/%port:number% (%label2:char-to:)%)%tail:rest%
> rule=:%ip:ipv4%/%port:number%(%label1:char-to:)%)
> rule=:%ip:ipv4%/%port:number%(%label1:char-to:)%)%tail:rest%
> rule=:%ip:ipv4%/%port:number%
> rule=:%ip:ipv4%/%port:number%%tail:rest%
> rule=:%ip:ipv4%
> rule=:%ip:ipv4%%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%(%label1:char-to:)%) (%label2:char-to:)%)
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%(%label1:char-to:)%) (%label2:char-to:)%)%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number% (%label2:char-to:)%)
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number% (%label2:char-to:)%)%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%(%label1:char-to:)%)
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%(%label1:char-to:)%)%tail:rest%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%
> rule=:%iface:char-to:\x3a%\x3a%ip:ipv4%/%port:number%%tail:rest%
>
> but I'm not sure if this really will work or not without testing every
> specific case because I don't know where the order is going to matter, and
> the char-to may match cases where it isn't going to match the rest of the
> rule and it won't fall through to the shorter match.
>
> order dependency is not the right answer.

I currently consider this a bug that needs to be fixed at some time. But
again, I don't think *now* is the right time to do so (at least not for me).
Rainer

> Why does this need to be added to the end of the tree rather than being
> positioned like any other rule components?
>
> David Lang
>
>> On Thu, Mar 12, 2015 at 10:55 PM, Rainer Gerhards
>> <[email protected]> wrote:
>>
>>> 2015-03-12 18:16 GMT+01:00 singh.janmejay <[email protected]>:
>>>
>>>> On Thu, Mar 12, 2015 at 9:29 PM, Rainer Gerhards
>>>> <[email protected]> wrote:
>>>>
>>>>> 2015-03-12 16:41 GMT+01:00 David Lang <[email protected]>:
>>>>>
>>>>>> On Thu, 12 Mar 2015, Rainer Gerhards wrote:
>>>>>>
>>>>>>> 2015-03-12 5:55 GMT+01:00 singh.janmejay <[email protected]>:
>>>>>>>
>>>>>>>> On Thu, Mar 12, 2015 at 9:19 AM, David Lang <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> On Thu, 12 Mar 2015, singh.janmejay wrote:
>>>>>>>>>
>>>>>>>>>> Tried re-ordering it? Put the one with /port first?
>>>>>>>>>
>>>>>>>>> no, lognorm rules are not supposed to be order dependent, so I
>>>>>>>>> didn't try that (especially after finding things failing to
>>>>>>>>> parse with rsyslog that worked manually)
>>>>>>>>
>>>>>>>> In case of input strings being matching-rule-wise disjoint, you
>>>>>>>> are right, order won't matter. But when they are not disjoint,
>>>>>>>> order does matter, because the first one to match the string wins.
>>>>>>>>
>>>>>>>> Consider this rulebase:
>>>>>>>> rule=:%ip:ipv4%%last:rest%
>>>>>>>> rule=:%ip:ipv4%%junk:char-sep:/%/%port:number%
>>>>>>>>
>>>>>>>> If you write it the way I have above, you'll end up matching first
>>>>>>>> rule for input 10.20.30.40/5
>>>>>>>>
>>>>>>>> But if you write it this way:
>>>>>>>> rule=:%ip:ipv4%%junk:char-sep:/%/%port:number%
>>>>>>>> rule=:%ip:ipv4%%last:rest%
>>>>>>>>
>>>>>>>> You'll end up matching the first one.
>>>>>>>
>>>>>>> This shouldn't happen. The theory is:
>>>>>>>
>>>>>>> Let i be the current index to be looked at in the line.
>>>>>>> If for i a parser is selected, parsers shall be tried first (in
>>>>>>> theory, according to parser ordering, but I think this is not yet
>>>>>>> fully implemented). If a parser fits, processing is advanced to
>>>>>>> the next tree node.
>>>>>>>
>>>>>>> If the node at i does not have a parser (or all parsers failed, I
>>>>>>> think [but not sure]), advance to the next node based on character
>>>>>>> match.
>>>>
>>>> This is precisely what it does.
>>>>
>>>>>>> The order of appearance of rules inside the rulebase should not
>>>>>>> affect this.
>>>>
>>>> It doesn't for literal-subtree, but it does for field-subtree,
>>>> because they are inserted at the tail of the linked-list.
>>>>
>>>> This code (
>>>> https://github.com/rsyslog/liblognorm/blob/master/src/ptree.c#L394)
>>>> adds new subtrees at the end of linked-list, which is what causes the
>>>> ordering-sensitive behaviour.
>>>
>>> OK, it seems like I overlooked this effect. I don't think it is good to
>>> have any order dependence. Anyways, the work I am carrying out will most
>>> probably lead to algorithmic changes and I'll re-evaluate that when I
>>> reach that point (not soon). Of course, I won't break anything that
>>> exists. If things diverge too much, I'll add an alternate library. But
>>> again, this needs to be seen and it is too early to think about this.
>>>
>>> On the ordering issue: are you sure that the order is always properly
>>> preserved? I never put any effort into it (as order was designed to be
>>> irrelevant) and some reordering (IIRC) happens intentionally (parser
>>> priorities).
>>>
>>> Rainer
>>>
>>>>>>> If it does, it's either not yet implemented or a bug. This is also
>>>>>>> why I don't like the "rest" syntax - it always matches and thus
>>>>>>> terminates interpretation.
>>>>>>
>>>>>> I'll post a simple test case when I get into the office in a bit.
>>>>>>
>>>>>> In this particular case, it's failing to check other parsers when
>>>>>> it hits a failure and backs up.
>>>>>>
>>>>>> But there are other cases where multiple rules may match. stringto,
>>>>>> rest,
>>>>>
>>>>> word, stringto are "last resort parsers", to be used only if
>>>>> anything else fails.
>>>>> rest IMHO should never be used, but I think I can propose something
>>>>> in the future that solves the need that comes with it (if there
>>>>> still is a need at that point).
>>>>>
>>>>>> iptables
>>>>>
>>>>> iptables is a different story, it's actually for a different type of
>>>>> logs - at least I think so now. I am unfortunately not prepared to
>>>>> discuss this right now, as I want to keep concentrated on the log
>>>>> structure analyzer. It doesn't help if I do a bit of everything
>>>>> without anything ever nearing completion ;)
>>>>>
>>>>>> are all things that can easily match a lot of data where other
>>>>>> rules may also match by having more specific listings. In such
>>>>>> cases it should still be deterministic which rule 'wins'. I can
>>>>>> think of a few ways to define this.
>>>>>>
>>>>>> 1. fewest parsers needed wins
>>>>>>
>>>>>> 2. most parsers needed wins
>>>>
>>>> This is probably the closest simple approximation to best match.
>>>>
>>>> I was thinking about this too.
>>>>
>>>>>> 3. ordering of parsers, where the 'greedier' ones are put last so
>>>>>> they only come into play if the more specific ones don't match.
>>>>
>>>> We could assist it by setting relative weights etc. E.g. ipv4 gets
>>>> weight 10, but rest gets only 1 etc.
>>>>
>>>> Once we get the coefficients right, this can probably be achieved
>>>> (it's like a costing-based picker, run once ptree has been loaded to
>>>> sort all subtree lists by cost in one shot).
>>>>
>>>>> That's the designed approach, and I am very sure it's the right one.
>>>>> As I said, it's at least not fully implemented.
>>>>>
>>>>> This also means we need many more specific parsers. I never get
>>>>> there, because of a) time shortage and b) lack of sufficient log
>>>>> samples. Where log samples is not a single line or two, but at least
>>>>> several thousands, so that I can evaluate false positives. While b)
>>>>> is still a very big problem to me, a) has been much relaxed thanks
>>>>> to the thesis work. Also, work on the semi-automatic rule creator
>>>>> looks promising. As it is a heuristic, the lack of log samples
>>>>> unfortunately is a very large hindering block.
>>>>>
>>>>> Rainer
>>>>> _______________________________________________
>>>>> rsyslog mailing list
>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>> http://www.rsyslog.com/professional-services/
>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
>>>>> myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT
>>>>> POST if you DON'T LIKE THAT.
>>>>
>>>> --
>>>> Regards,
>>>> Janmejay
>>>> http://codehunk.wordpress.com
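
For readers of the archive, the order dependence discussed in this thread, and the weight idea proposed as a fix (ipv4 gets weight 10, rest gets 1, sort once after loading), can be illustrated with a small toy model. This is only a sketch under loud assumptions: it is not liblognorm code, the matcher and every function name are hypothetical, the weight for "number" is invented, the example rules drop the char-sep field for brevity, and it sorts whole rules where the proposal would sort the per-node subtree lists of the ptree.

```python
import re

# Toy field parsers: each returns the number of characters consumed, or None.
def parse_ipv4(s):
    m = re.match(r"(\d{1,3})(\.\d{1,3}){3}", s)
    return m.end() if m else None

def parse_number(s):
    m = re.match(r"\d+", s)
    return m.end() if m else None

def parse_rest(s):
    return len(s)  # always matches and swallows everything that is left

PARSERS = {"ipv4": parse_ipv4, "number": parse_number, "rest": parse_rest}

# Hypothetical weights, higher = more specific. Only ipv4=10 and rest=1
# come from the thread; the value for "number" is an assumption.
WEIGHTS = {"ipv4": 10, "number": 10, "rest": 1}

# A rule is a list of (kind, field_name, arg) steps.
RULE_REST = [("field", "ip", "ipv4"), ("field", "last", "rest")]
RULE_PORT = [("field", "ip", "ipv4"), ("lit", None, "/"), ("field", "port", "number")]

def match_rule(rule, text):
    """Return the extracted fields if the rule consumes all of text, else None."""
    fields, pos = {}, 0
    for kind, name, arg in rule:
        if kind == "lit":
            if not text.startswith(arg, pos):
                return None
            pos += len(arg)
        else:
            used = PARSERS[arg](text[pos:])
            if used is None:
                return None
            fields[name] = text[pos:pos + used]
            pos += used
    return fields if pos == len(text) else None

def first_match(rules, text):
    """Order-dependent behaviour: the first rule in list order that matches wins."""
    for rule in rules:
        fields = match_rule(rule, text)
        if fields is not None:
            return fields
    return None

def rule_cost(rule):
    """Weight of the greediest (lowest-weight) field parser used by the rule."""
    return min(WEIGHTS[arg] for kind, _, arg in rule if kind == "field")

def sort_by_weight(rules):
    """Put rules with greedy parsers last, regardless of rulebase order."""
    return sorted(rules, key=rule_cost, reverse=True)

if __name__ == "__main__":
    line = "10.20.30.40/5"
    print(first_match([RULE_REST, RULE_PORT], line))
    print(first_match([RULE_PORT, RULE_REST], line))
    print(first_match(sort_by_weight([RULE_REST, RULE_PORT]), line))
```

With plain list order the result for 10.20.30.40/5 depends on which rule was written first; after the one-shot sort, the rule ending in rest is only tried when the more specific rule fails, so both orderings of the rulebase give the same answer.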

