[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added Resolution|--- |FIXED Status|REOPENED|RESOLVED --- Comment #94 from Henrik Krohns --- Third time's the charm? Sendingtrunk/lib/Mail/SpamAssassin/Plugin/Check.pm Sendingtrunk/t/basic_meta2.t Transmitting file data ..done Committing transaction... Committed revision 1905214. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #93 from Kevin A. McGrail --- +1 from me too because Wolfgang's initial discovery, the subsequent testing, and the patch work from Henrik look good. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #92 from Bill Cole --- +1 to commit. I'm slightly uneasy about my (shallow) depth of understanding of this issue, but I trust those with a deeper understanding and really want to get 4.0 released. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #91 from Henrik Krohns --- I guess let's vote on the patch then. +1 Note that I did one small if tweak, $mp to $mt if (!exists $mp->{$deprule}) { --> if (!exists $mt->{$deprule}) { Basically it's the exact same result and doesn't affect anything, but more clear on what we do. Too lazy to upload new patch. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #90 from Wolfgang Breyha --- Looking at the plain results that looks good. Same score and same (Total Subtest Hits: 208 / Deduplicated Total Hits: 133) I also added a dbg output to the loop so it looks like foreach my $rulename (keys %$mp) { foreach my $deprule (@{$md->{$rulename}||[]}) { if (!exists $mp->{$deprule}) { dbg("rules: $rulename: marking $deprule ready"); $h->{$deprule} ||= 0; The output looks reasonable as far as I checked the rules mentioned in my samples. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #89 from Henrik Krohns --- Created attachment 5855 --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5855=edit Meta patch for r1905203, fix finishing metas ^ You are very right.. I already finished a quick patch that marks all non-meta rules as finished for the loop. Everything is now done in do_meta_tests. Let me know how this patch works.. atleast tests pass here. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #88 from Wolfgang Breyha --- Just brainstorming... At the point the do_meta_rules()/finish_meta_rules() block is reached: If all unavailable non-meta rules/tags are set to a "0" result before that point shouldn't that clear all the dependencies except meta<->meta? And meta<->meta should be resolved in do_meta_rules() correctly then. And for DKIMDOMAIN and SENDERDOMAIN... it is highly unlikely IMO that those get a result at a later point if header scanning didn't bring them up. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added Resolution|FIXED |--- Status|RESOLVED|REOPENED --- Comment #87 from Henrik Krohns --- (In reply to Wolfgang Breyha from comment #86) > > In this Sample no DKIM-Sig is available... so DKIMDOMAIN and all __DKIMWL_* > and DKIMWL_* stay "unrun" and a missing dependency. > > Every meta rule which depends on them somehow is run only once in > finish_meta_tests(). And if a meta depends on a meta which depends on > DKIMDOMAIN/ff somehow I get random results (race condition) Ok thanks I think I get it now.. Here's a sample test case: meta __TEST_META_I (1 || TEST_DISABLED3) meta TEST_META_I __TEST_META_I Both should always hit in this case. If unrun TEST_DISABLED3 postpones things to finish_meta_tests, and the foreach with it's random hash order processes TEST_META_I first, obviously things will get wrong. I'll see if can figure out a better logic, maybe just migrate finish_meta_tests final logic into do_meta_tests somehow. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #86 from Wolfgang Breyha --- regarding the revert... sorry, my fault. The part in Check.pm was moved up (out of my sight). regarding the new troubles: With my new sample I again see a lot of meta rules delayed to the point that they get only handled once in finish_meta_tests() leading to random results dependent from perl hashing. In this Sample no DKIM-Sig is available... so DKIMDOMAIN and all __DKIMWL_* and DKIMWL_* stay "unrun" and a missing dependency. Every meta rule which depends on them somehow is run only once in finish_meta_tests(). And if a meta depends on a meta which depends on DKIMDOMAIN/ff somehow I get random results (race condition) I'm wondering why this worked on the patched rc3. An undetermined SENDERDOMAIN seems to block a lot as well. At least I see the same dbg: check: tagrun - tag SENDERDOMAIN is still blocking action 2, 3, 4, 5, 11, 13 as for DKIMDOMAIN dbg: check: tagrun - tag DKIMDOMAIN is still blocking action 0, 1, 2, 3, 5, 6, 12, 13 But since non of my meta rules have dependencies on SENDERDOMAIN yet I can't tell for sure. But I see some PDS_SPF_* and JMQ_SPF_* rules blocked in the standard ruleset. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #85 from Henrik Krohns --- (In reply to Wolfgang Breyha from comment #84) > I noticed that the "revert" patch in comment 71 > also reverted parts (but not all) of the fix for #8060. Was this intentional? Not sure what you are referring to, all of 8060 is still committed from what I see. A detailed hint would help. :-) -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #84 from Wolfgang Breyha --- I noticed that the "revert" patch in comment 71 also reverted parts (but not all) of the fix for #8060. Was this intentional? And I'm sorry to say that I'm currently struggling with another sample where I get bad results with the current trunk compared to rc3 with all the fixes we had for it. I'm still looking into it. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added Resolution|--- |FIXED Status|REOPENED|RESOLVED --- Comment #83 from Henrik Krohns --- Hopefully no need to reopen anymore. :-) Sendingtrunk/lib/Mail/SpamAssassin/Plugin/Check.pm Transmitting file data .done Committing transaction... Committed revision 1905160. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #82 from Bill Cole --- +1 to commit latest fix. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #81 from Giovanni Bechis --- (In reply to Henrik Krohns from comment #80) > (In reply to Henrik Krohns from comment #76) > > > > --- lib/Mail/SpamAssassin/Plugin/Check.pm (revision 1905124) > > +++ lib/Mail/SpamAssassin/Plugin/Check.pm (working copy) > > @@ -316,7 +316,7 @@ > >$pms->got_hit($rulename, '', ruletype => 'meta', value => $result); > > } else { > >dbg("rules-all: ran meta rule $rulename, no hit") if > > $would_log_rules_all; > > - $h->{$rulename} = 0; # mark meta done > > + $pms->rule_ready($rulename, 1); # mark meta done > > } > > delete $mr->{$rulename}; > > delete $mp->{$rulename}; > > Vote to commit please. Even though it's quite trivial. +1 for me to commit this. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #80 from Henrik Krohns --- (In reply to Henrik Krohns from comment #76) > > --- lib/Mail/SpamAssassin/Plugin/Check.pm (revision 1905124) > +++ lib/Mail/SpamAssassin/Plugin/Check.pm (working copy) > @@ -316,7 +316,7 @@ >$pms->got_hit($rulename, '', ruletype => 'meta', value => $result); > } else { >dbg("rules-all: ran meta rule $rulename, no hit") if > $would_log_rules_all; > - $h->{$rulename} = 0; # mark meta done > + $pms->rule_ready($rulename, 1); # mark meta done > } > delete $mr->{$rulename}; > delete $mp->{$rulename}; Vote to commit please. Even though it's quite trivial. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #79 from Wolfgang Breyha --- Calling rule_ready() looks good and achieves the same results as my changes. I see you added the call to do_meta_tests() in front of finish_meta_tests() as well in October with the patch for #8061. Maybe that and the call to rule_ready() is already enough to run as many metas as possible. I had quite good results with this combination on top of rc3. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #78 from Henrik Krohns --- (In reply to Wolfgang Breyha from comment #77) > > IMO finish_meta_tests() fails to rerun rules if dependencies are resolved > later in the loop... but reading the comments in the code it tries at least. You might very well be right. Not sure what logic would work best here, it gets tedious iterating again and again.. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #77 from Wolfgang Breyha --- I will try it ASAP. But even if it achieves the same as my patch I think this is not enough to guarantee that meta rules are always run with as much dependencies resolved as possible. This fix assures that meta rules are run as early as possible and makes things better, because many rules are run before reaching the "final call" in finish_meta_tests(). IMO finish_meta_tests() fails to rerun rules if dependencies are resolved later in the loop... but reading the comments in the code it tries at least. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |--- --- Comment #76 from Henrik Krohns --- (In reply to Wolfgang Breyha from comment #75) > I checked out rev 1905124 today, built a local package and tried to test it > with some of my test E-Mails I used before. Thanks for testing, it's hard to make comprehensive tests for this.. I think it might be as simple as this: --- lib/Mail/SpamAssassin/Plugin/Check.pm (revision 1905124) +++ lib/Mail/SpamAssassin/Plugin/Check.pm (working copy) @@ -316,7 +316,7 @@ $pms->got_hit($rulename, '', ruletype => 'meta', value => $result); } else { dbg("rules-all: ran meta rule $rulename, no hit") if $would_log_rules_all; - $h->{$rulename} = 0; # mark meta done + $pms->rule_ready($rulename, 1); # mark meta done } delete $mr->{$rulename}; delete $mp->{$rulename}; Can you try 1905124 + that change? rule_ready() does the same thing as your patch. Also got_hit() calls rule_ready(), in case someone is wondering. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #75 from Wolfgang Breyha --- I checked out rev 1905124 today, built a local package and tried to test it with some of my test E-Mails I used before. I compared the results with my currently running 4.0-rc3 with the patches Henrik provided before and one patch I sent him privately (4 in sum). The results from trunk are worse then the results of my 4.0-rc3. I can't tell what's wrong exactly yet, but many rules are marked as "not hit" while the patched rc3 correctly hits them. On first sight it looks like some rules are not rerun if others (with dependencies) got hit after they were run. After reapplying this: -- --- /usr/local/spamassassin/share/perl5/Mail/SpamAssassin/Plugin/Check.pm.orig 2022-11-07 16:27:36.063764282 +0100 +++ /usr/local/spamassassin/share/perl5/Mail/SpamAssassin/Plugin/Check.pm 2022-11-07 16:29:54.476177667 +0100 @@ -297,6 +297,7 @@ my $mr = $pms->{meta_check_ready}; my $mp = $pms->{meta_pending}; my $md = $pms->{conf}->{meta_dependencies}; + my $mdr = $pms->{conf}->{meta_deprules}; my $mt = $pms->{conf}->{meta_tests}; my $h = $pms->{tests_already_hit}; my $retry; @@ -318,6 +319,9 @@ dbg("rules-all: ran meta rule $rulename, no hit") if $would_log_rules_all; $h->{$rulename} = 0; # mark meta done } +foreach (keys %{$mdr->{$rulename}}) { + $mr->{$_} = 1; +} delete $mr->{$rulename}; delete $mp->{$rulename}; # Reiterate all metas again, in case some meta depended on us -- ... I get better results. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added Resolution|--- |FIXED Status|REOPENED|RESOLVED --- Comment #74 from Henrik Krohns --- Committed, let's see how it goes. Sendingtrunk/MANIFEST Sendingtrunk/UPGRADE Sendingtrunk/lib/Mail/SpamAssassin/Conf/Parser.pm Sendingtrunk/lib/Mail/SpamAssassin/Plugin/Check.pm Sendingtrunk/t/basic_meta2.t Deleting trunk/t/basic_meta_net.t Sendingtrunk/t/dcc.t Sendingtrunk/t/dnsbl.t Sendingtrunk/t/shortcircuit.t Transmitting file data done Committing transaction... Committed revision 1904981. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #73 from Kevin A. McGrail --- I have not tested it but I've reviewed it and +1 to commit Meta patch for r1904929, revert to old logic so we can get it into trunk. We'll then look to test. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Giovanni Bechis changed: What|Removed |Added CC||giova...@paclan.it --- Comment #72 from Giovanni Bechis --- +1 for me to commit latest patch (attachment 5854) to v4.0 -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #71 from Henrik Krohns --- Created attachment 5854 --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5854=edit Meta patch for r1904929, revert to old logic Here's patch to revert to old meta logic. Basically just removed unrun stuff from finish_meta_tests and fix all tests accordingly. So if something depends on undefined or timed out etc rules, they will be evaluated only at finish_meta_tests stage with the dependencies result as 0. Please test and vote. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #70 from Henrik Krohns --- I think Bill summed it up quite nicely. There's some agreement happening, so I'll start tuning things to the way of old logic. Nothing is going to waste, as the new code will still help make metas run in order without pesky priorities. Other than that, Kevin, Loren, please read the bug through again with thought, and if there's still something unclear, you can try asking. But I really don't have extra time to go over it all again as the details should be all there (admittedly scattered). -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Bill Cole changed: What|Removed |Added CC||billc...@apache.org --- Comment #69 from Bill Cole --- (In reply to Henrik Krohns from comment #62) > Ok so all pending commits are now done. > > This bug is still open for: > > - Discussion about what to do with "unrun" rules > - Documentation > > Please vote or discuss: > > - Keep logic as is, unrun rules may prevent many metas from running > - Revert to old logic, unrun rules are evaluated as not hitting (0) > > I probably lean a bit to the old logic. It doesn't require much changes to > code, just some tweaking to last do_meta_rules/finish_meta_rules call. I lean very strongly to the old logic because: 1. It is certain that some meta rules in the wild rely on it by design and will have their performance changed if we use the new logic. The change won't necessarily be obvious to people who have deployed such rules. 2. I don't think we understand the changed logic well enough to get it right in a rush to a major release. 3. It limits the impact of disabling costly rules, because 'unrun' subrules don't create a tree of "don't run" dependencies. 4. The logic is more obvious to users. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Loren Wilton changed: What|Removed |Added CC|lwil...@earthlink.net | -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #68 from Loren Wilton --- (In reply to Kevin A. McGrail from comment #66) > So I believe the new logic is trying to be efficient and not run rules that > are unnecessary. I believe the change was done more for proper rule evaluation ordering, delaying dependent meta rules until all dependencies are available. If some of the dependencies are net rules that time out or are (I think) undefined because they are disabled we get to the case of having unrun meta rules waiting for results that will never be available, so the meta is also marked unrun. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #67 from Loren Wilton --- (In reply to Henrik Krohns from comment #65) > Not sure I follow, would it matter why a rule is unrun? People have been talking about meta rules that evaluate to something useful, and then "season" the result with a term from something like a net rule, that might not be run. If I understand them correctly, they are saying that they would like the result of that rule to be the result of it's expression, with the unrun rule zero. The result of this meta may not be zero, so the rule will end up scoring. If this meta were just set to zero because it was unrun, it likely would not produce the desired result. >From your comment #14: > Let's take this stock rule as example. > meta DIGEST_MULTIPLE RAZOR2_CHECK + DCC_CHECK + PYZOR_CHECK > 1 > This gets problematic even when considering disabled rules. If simply one of > these is not used/loaded, or lookup times out, it won't hit even if two of > them hit. So I claim that if razor and dcc run, but pyzor doesn't, the result should be: meta DIGEST_MULTIPLE RAZOR2_CHECK + DCC_CHECK + 0 > 1 Obviously the rule could still hit. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Kevin A. McGrail changed: What|Removed |Added CC||kmcgr...@apache.org --- Comment #66 from Kevin A. McGrail --- So I believe the new logic is trying to be efficient and not run rules that are unnecessary. However, with meta rules, I think the efficiency ended up not running rules we should have. Is this correct, Henrik? If so, the old logic is a good triage while we try and revisit the efficiency logic. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #65 from Henrik Krohns --- (In reply to Loren Wilton from comment #64) > Can you tell the difference in the unrun rules list between a rule FOO that > didn't run because it was a net test that timed out (or is undefined, or > similar), and a rule BAR that is unrun because it depends on FOO? If so, I'd > say to zero the FOO result and then evaluate BAR. This is more complex than > simply marking all finally-unrun rules with a zero value, but is I think the > desired result. Not sure I follow, would it matter why a rule is unrun? -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Loren Wilton changed: What|Removed |Added CC||lwil...@earthlink.net --- Comment #64 from Loren Wilton --- Can you tell the difference in the unrun rules list between a rule FOO that didn't run because it was a net test that timed out (or is undefined, or similar), and a rule BAR that is unrun because it depends on FOO? If so, I'd say to zero the FOO result and then evaluate BAR. This is more complex than simply marking all finally-unrun rules with a zero value, but is I think the desired result. I suppose there is a pathological problem of two rules that depend on each other. That could be resolved by arbitrarily marking one rule as zero and then running the other. Maybe that is already detected earlier and both rules are discarded with a lint error. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #63 from Wolfgang Breyha --- I vote for the old logic. The new one is unpredictable and makes maintenance of "homebrew rules" a burden. And in most cases they are used with "||" or "+" where a zeroed unrun rule makes no difference at all. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #62 from Henrik Krohns --- Ok so all pending commits are now done. This bug is still open for: - Discussion about what to do with "unrun" rules - Documentation Please vote or discuss: - Keep logic as is, unrun rules may prevent many metas from running - Revert to old logic, unrun rules are evaluated as not hitting (0) I probably lean a bit to the old logic. It doesn't require much changes to code, just some tweaking to last do_meta_rules/finish_meta_rules call. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #61 from Henrik Krohns --- (In reply to Loren Wilton from comment #60) > My mind keeps coming back to the idea of marking unrun rules that are > dependencies as zero score, and running the dependent metas. > > Or simplify that to marking all unruns as zero score and as having run. I > mean, if the rule didn't run, then it didn't hit, and a rule that doesn't > hit has zero score. Not sure I understand why you bring up "scores" here. Either a rule hits or doesn't hit. Or neither if it's unrun. > I think unrun is a good internal concept for ordering dependent rule > evaluations, but I'm much less convinced that it is a good external concept. > I tend to think that it is a bit of mechanics that should be hidden. It has been a logical enough concept from the beginning. If we don't know how an unrun rule would affect a meta, then we either can't run the meta or must accept that the result will not be "reliable". The latter probably not being a deal breaker, given how the rules are usually used. But as the discussion is slow and bugs keep piling on, I don't know how soon we can resolve this at this snail pace. Please vote on the open bugs atleast to get them committed. Then we can talk about reverting the meta logic. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Loren Wilton changed: What|Removed |Added CC|lwil...@earthlink.net | -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Loren Wilton changed: What|Removed |Added CC||lwil...@earthlink.net --- Comment #60 from Loren Wilton --- My mind keeps coming back to the idea of marking unrun rules that are dependencies as zero score, and running the dependent metas. Or simplify that to marking all unruns as zero score and as having run. I mean, if the rule didn't run, then it didn't hit, and a rule that doesn't hit has zero score. I think unrun is a good internal concept for ordering dependent rule evaluations, but I'm much less convinced that it is a good external concept. I tend to think that it is a bit of mechanics that should be hidden. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #59 from Henrik Krohns --- Maybe new "tflags FOOBAR allow_unrun" or such for a meta rule? A global toggle to allow unrun rules? -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #58 from Henrik Krohns --- (In reply to Henrik Krohns from comment #57) > (In reply to Wolfgang Breyha from comment #55) > > I see other rules which fail that way: > > > > eg. FSL_BULK_SIG > > If the rule without DCC_CHECK will result in 0, but DCC_CHECK is toggled 0/1 > > the "||" will produce different results and the rule stays "unrun". > > There's no reason this rule should fail, it's basically the same case as > comment 52. Maybe you were just looking at the faulty logging fixed in > previous comment. Sorry read things now with more thought.. yeah the arithmetic stuff without >= 1 will fail. Have to think if it can be fixed somehow. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #57 from Henrik Krohns --- (In reply to Wolfgang Breyha from comment #55) > I see other rules which fail that way: > > eg. FSL_BULK_SIG > If the rule without DCC_CHECK will result in 0, but DCC_CHECK is toggled 0/1 > the "||" will produce different results and the rule stays "unrun". There's no reason this rule should fail, it's basically the same case as comment 52. Maybe you were just looking at the faulty logging fixed in previous comment. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #56 from Henrik Krohns --- I see that the "unrun" rules logging was still buggy. dbg: rules: RESULT1 DIGEST_MULTIPLE $VAR1 = ''; dbg: rules: RESULT2 DIGEST_MULTIPLE $VAR1 = ''; dbg: rules-all: ran meta rule DIGEST_MULTIPLE, no hit dbg: rules-all: unrun dependencies prevented meta DIGEST_MULTIPLE from running: DCC_CHECK It didn't check that double evaluation succeeded.. I'll add this to Bug 8061 patch since it's pending for Check.pm if ($would_log_rules_all) { foreach my $rulename (sort keys %{$pms->{conf}->{meta_tests}}) { next if !$pms->{conf}->{scores}->{$rulename}; + next if exists $h->{$rulename}; -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #55 from Wolfgang Breyha --- I see other rules which fail that way: eg. FSL_BULK_SIG If the rule without DCC_CHECK will result in 0, but DCC_CHECK is toggled 0/1 the "||" will produce different results and the rule stays "unrun". -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #54 from Wolfgang Breyha --- (In reply to Henrik Krohns from comment #52) > What does the __METATEST rule contain? It is the __ZID_PHISH_BASE rule for example from my ruleset I sent you. It is a arithmetic rule dependent on several other (meta) rules. I think I see the crucial difference to your example ... My rule does not use "<>="! It simply adds up some other rules. So two different runs for eg. DCC_CHECK with 0 and 1 will not come to the same result! I added debug output to the if ($result != $result2) part in finish_meta_tests() which confirms this. I do the ">" later with an additional meta rule. But if this rule is postponed to finish_meta_tests() it stays "unrun" because $result != $result2. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #53 from Kevin A. McGrail --- Interesting. I think meta was designed to run if rules with dependencies wouldn't be able to run. I know I used this in rules where I would add a condition that might helpful but not required for a rule. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #52 from Henrik Krohns --- (In reply to Wolfgang Breyha from comment #51) > I took the time and read the whole thread and bug 4549 now and understand > how "running a rule twice" should work in theory. No problem with that > anymore. > > But still, it doesn't work that way currently, because many meta rules do > not run because they get not triggered. At least I can "destroy" a working > arithmetic meta rule by adding "+ DCC_CHECK" for example if the plugin is > disabled. > > All I see is > unrun dependencies prevented meta __METATEST from running: DCC_CHECK > and no result, while without "+ DCC_CHECK" I get > rules: ran meta rule __METATEST ==> got hit (4) What does the __METATEST rule contain? Just as an example, this stock rule won't break: meta DIGEST_MULTIPLE RAZOR2_CHECK + DCC_CHECK + PYZOR_CHECK > 1 1 + 0 + 1 > 1 1 + 1 + 1 > 1 If DCC_CHECK is unrun, and RAZOR and PYZOR hit, it will work. > And this still happens with my patch I sent you last week in place, which > helps meta rules running as soon as other metas finished they depend on. So, > delaying a meta rule to the point that it is handled by finish_meta_tests() > seems to break it in some cases. Yes I see now that it's needed in case check_post_dnsbl or check_cleanup hooks finish some rules. I have to wait until I can commit changes from previous bugs first. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #51 from Wolfgang Breyha --- I took the time and read the whole thread and bug 4549 now and understand how "running a rule twice" should work in theory. No problem with that anymore. But still, it doesn't work that way currently, because many meta rules do not run because they get not triggered. At least I can "destroy" a working arithmetic meta rule by adding "+ DCC_CHECK" for example if the plugin is disabled. All I see is unrun dependencies prevented meta __METATEST from running: DCC_CHECK and no result, while without "+ DCC_CHECK" I get rules: ran meta rule __METATEST ==> got hit (4) And this still happens with my patch I sent you last week in place, which helps meta rules running as soon as other metas finished they depend on. So, delaying a meta rule to the point that it is handled by finish_meta_tests() seems to break it in some cases. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #50 from Wolfgang Breyha --- All that I can say is that meta rules did/do not work as expected with SA4. This might be the case because of the one issue which is not in bugzilla yet which I sent you personally and I've patched myself where finished meta rules do not trigger a run of dependent meta rules. But still, even patched, I see meta rules which have no result at all if they depend on an undefined (meta) rule, because they stay unrun till the end. And I ask myself what happens exactly if you run the same rule twice with different parameters for undefined dependencies? Which result is taken at last? If it is a list of "||" the one run with "1" will always result in a hit. This is most likely not intended. And what happens if more then one binary or arithmetic clause is "unrun" and you have to run it even more often with all combinations? The 3.4.x style to handle "unhit" rules as 0 was easy to understand and clear. And undefined was unhit. >From my current experience with SA4 this change kills 3rd party rulesets like KAM. I will ask Kevin McGrail to comment on this topic as well. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |--- --- Comment #49 from Henrik Krohns --- (In reply to Wolfgang Breyha from comment #48) > Honestly I believe that the change to handle undefined rules as "unrun" > makes more harm than it helps. Up to 3.4.6 it was possible to include rules > in a " BLA1 || BLA2 || BLA3" chain without having to check those > dependencies with every run of sa-update. With 4.x this is impossible, > because each and every "unrun" rule makes the whole chain and further > dependencies "unrun". You are wrong there as simple metas like this do still work: meta FOOBAR BLA1 || BLA2 || BLA3 As previously described, the double evaluation evaluates the meta twice, with all unrun results as 0 and second time as 1. So the meta above will always work if one of those hits, regardless of the others being undefined/unrun. But you are correct that documentation is still missing, I'll reopen this for that. Of course more discussion is welcome on the unrun stuff. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Wolfgang Breyha changed: What|Removed |Added CC||wbre...@gmx.net --- Comment #48 from Wolfgang Breyha --- Honestly I believe that the change to handle undefined rules as "unrun" makes more harm than it helps. Up to 3.4.6 it was possible to include rules in a " BLA1 || BLA2 || BLA3" chain without having to check those dependencies with every run of sa-update. With 4.x this is impossible, because each and every "unrun" rule makes the whole chain and further dependencies "unrun". At least if an "unrun" rule is used in an logical "or" or in an arithmetic operation it should be considered "unhit" as used in the man page which still only says: "meta meta SYMBOLIC_TEST_NAME boolean arithmetic expression Can also define an arithmetic expression in terms of other tests, with an unhit test having the value "0" and a hit test having a nonzero value. The value of a hit meta test is that of its arithmetic expression. The value of a hit eval test is that returned by its method. The value of a hit header, body, rawbody, uri, or full test which has the "multiple" tflag is the number of times the test hit. The value of any other type of hit test is "1"." "unrun" or "undefined" values are not mentioned here. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added Blocks||8062, 8065 Referenced Bugs: https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8062 [Bug 8062] no URL makes uridnsbl rules "unrun" https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8065 [Bug 8065] disabled plugins cause undefined dependencies for meta rules -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #47 from Henrik Krohns --- The more I've seen complex real world meta rulesets, network rules behaving badly, being blocked etc... I'm wondering if the current "unrun" handling makes more trouble than it's worth. No one has tackled the issue yet on mass checks either: https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735#c28 -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added Blocks||8061 Referenced Bugs: https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8061 [Bug 8061] Fix meta handling for $suppl_attrib->{rule_hits} -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added Blocks||8059, 8060 Referenced Bugs: https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8059 [Bug 8059] [review] Fix meta handling for URIDNSBL NS/A lookups https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8060 [Bug 8060] [review] Fix meta handling for metas without dependencies -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Loren Wilton changed: What|Removed |Added CC|lwil...@earthlink.net | -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added Blocks||7987 Referenced Bugs: https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 [Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added Status|REOPENED|RESOLVED Resolution|--- |FIXED --- Comment #46 from Henrik Krohns --- I guess it was due to score 0 rules messing up the list. Closing this, can be reopened if someone detects a failing test later. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #45 from Sidney Markowitz --- (In reply to Henrik Krohns from comment #43) > Not seeing any unrun variations after that commit. Can you make any tests > fail? Everything I tried passed. If you are happy with this, you can close this bug. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #44 from Sidney Markowitz --- (In reply to Henrik Krohns from comment #43) > Not seeing any unrun variations after that commit. Can you make any tests > fail? That fix makes a lot more sense to me. My tests looking for unrun variations all work. I'm now running the full suite of tests to make sure nothing else has gone wrong. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #43 from Henrik Krohns --- Not seeing any unrun variations after that commit. Can you make any tests fail? -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #42 from Henrik Krohns --- - Disabled (score 0) metas no longer added to meta_pending - Unrun debug reporting should now be foolproof Committed revision 1901397. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #41 from Henrik Krohns --- (In reply to Sidney Markowitz from comment #39) > (In reply to Sidney Markowitz from comment #38) > > In finish_meta_tests, when it does > > delete $mp->{$rulename}; > > should it also do > > delete $unrun_metas{$rulename}; > > in case one that was found as unrun in a previous loop now can be run? > > That does make the symptoms disappear. Is it correct? No. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #40 from Henrik Krohns --- (In reply to Sidney Markowitz from comment #37) > > I got confused. I really have two separate questions. One is how the checks > for meta and dependencies work when a meta is defined with a header rule > like __THREAD_INDEX_GOOD, which I see is used with && operator, and a > separate question is how the use of || operator works out with the > dependencies. I don't know whether or not any of the header rules are used > with a || in a meta definition. It makes absolutely no difference what kind type of rules or operators are used in a meta. Meta only works with and depends on "rulenames". There is no "dependencies" other than waiting for all rules/rulenames to be ready ($pms->{tests_already_hit}->{$rulename} created by got_hit or rule_ready). meta T_THREAD_INDEX_BAD __HAS_THREAD_INDEX && !__THREAD_INDEX_GOOD For that do_meta_tests basically does: $h = $pms->{tests_already_hit}; if (exists $h->{__HAS_THREAD_INDEX} && exists $h->{__THREAD_INDEX_GOOD}) { $result = eval { $h->{__HAS_THREAD_INDEX} && !$h->{__THREAD_INDEX_GOOD} }; } else { # loop until they exist, or declare unrun in finish_meta_tests } -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #39 from Sidney Markowitz --- (In reply to Sidney Markowitz from comment #38) > In finish_meta_tests, when it does > delete $mp->{$rulename}; > should it also do > delete $unrun_metas{$rulename}; > in case one that was found as unrun in a previous loop now can be run? That does make the symptoms disappear. Is it correct? -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #38 from Sidney Markowitz --- In finish_meta_tests, when it does delete $mp->{$rulename}; should it also do delete $unrun_metas{$rulename}; in case one that was found as unrun in a previous loop now can be run? -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #37 from Sidney Markowitz --- (In reply to Sidney Markowitz from comment #36) > How does this work? I got confused. I really have two separate questions. One is how the checks for meta and dependencies work when a meta is defined with a header rule like __THREAD_INDEX_GOOD, which I see is used with && operator, and a separate question is how the use of || operator works out with the dependencies. I don't know whether or not any of the header rules are used with a || in a meta definition. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #36 from Sidney Markowitz --- How does this work? rules/72_active.cf:header __THREAD_INDEX_GOOD is used in a number of meta rule definitions with a || operator. What does that do with the dependency checks and the do_meta loops in Check.pm? Is it certain that the header rules are evaluated before meta ones? -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #35 from Henrik Krohns --- Sort is just hiding the bug. It should be enough to run a single do_meta_tests/finish_meta_tests in the end, which should iterate all metas over and over until all are ready. I just can't figure out why rule order would matter, even rewrote things now to work in "deterministic" way, firing metas from rule_ready when deps are ready.. and it actually behaves the same way. There's some logical fallacy here that I just can't grasp yet. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #34 from Sidney Markowitz --- (In reply to Henrik Krohns from comment #30) > There's still some strange bug somewhere that makes meta rules behave > differently on different runs. The following change in Check.pm makes that go away. I have no idea if this is a fix or a clue, so I leave it up to you what to do about it: Index: lib/Mail/SpamAssassin/Plugin/Check.pm === --- lib/Mail/SpamAssassin/Plugin/Check.pm (revision 1901358) +++ lib/Mail/SpamAssassin/Plugin/Check.pm (working copy) @@ -329,7 +329,7 @@ my %unrun_metas; RULE: - foreach my $rulename (keys %$mp) { + foreach my $rulename (sort(keys %$mp)) { my %unrun; # Meta is not ready if some dependency has not run yet foreach my $deprule (@{$md->{$rulename}||[]}) { -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #33 from Sidney Markowitz --- (In reply to Henrik Krohns from comment #30) > There should be no > reason for the list to change between runs. > > spamassassin -t -L -D rules-all < message 2>&1 | grep unrun As an easy way to show this, in a trunk directory after a make I ran once ./spamassassin -t -L -D rules-all --siteconfigpath=. < sample-nonspam.txt 2>&1 | grep -o 'dbg: rules-all: unrun.*$' | sort > ~/tmp/onerun and then repeated runs of ./spamassassin -t -L -D rules-all --siteconfigpath=. < sample-nonspam.txt 2>&1 | grep -o 'dbg: rules-all: unrun.*$' | sort | diff -u0 - ~/tmp/onerun show different results every time. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #32 from Sidney Markowitz --- I can't reproduce the sporadic failures in t/basic_meta2.t since the latest fixes to tests, but since it was always sporadic that is not a definitive result. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Sidney Markowitz changed: What|Removed |Added CC||sid...@sidney.com --- Comment #31 from Sidney Markowitz --- I'm seeing this in sporadic t/basic_meta2.t failures -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |--- --- Comment #30 from Henrik Krohns --- Reopening to investigate and keep 4.0.0 blocked. There's still some strange bug somewhere that makes meta rules behave differently on different runs. You can see this by checking the same message multiple times, and looking at the "unrun dependencies prevented meta XXX from running". There should be no reason for the list to change between runs. spamassassin -t -L -D rules-all < message 2>&1 | grep unrun -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #29 from Henrik Krohns --- Considering this ready for trunk commit and usage. Please check and fix affected mass-check rules. Transmitting file data ...done Committing transaction... Committed revision 1890274. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added Attachment #5746|0 |1 is obsolete|| --- Comment #28 from Henrik Krohns --- Created attachment 5747 --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5747=edit Meta patch for r1890185 v5 Latest version now supports feature check: if can(Mail::SpamAssassin::Conf::feature_local_rules_only) && local_rules_only endif ... AND pseudo-rule type check: meta __SPOOFED DKIM_INVALID || local_tests_only -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #27 from Henrik Krohns --- (In reply to Loren Wilton from comment #26) > I don't understand the concern expressed in Comment 19. It's been exaplained a few times. Rules that are simply intended for reducing FPs are not run in daily masschecks. meta __NOT_SPOOFED SPF_PASS || DKIM_VALID || !__LAST_EXTERNAL_RELAY_NO_AUTH || ALL_TRUSTED SPF_PASS or DKIM_VALID are unrun. Double eval will not make the meta hit. (0||0||0||0) && (1||1||0||0) It makes it harder to tune rules as you need to wait a week for results. There should be a way to bypass it, if the dependencies are not considered critical. > Does this mean that an undefined rule appearing in a meta evaluates to true > rather than to "unrun"? If so I see that as a possible problem; I'd prefer > an undefined rule to be treated as unrun, but I can see the logic of > treating it as a constant value (but I'd think false makes more sense). Sorry, to clarify, this only applies to "if" clauses. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #26 from Loren Wilton --- > Loren, I might be a bit tired from long work day, but I'm not sure what you > are implying/suggesting.. tried reading it thrice. :-) Sorry, I'm too verbose because I don't know the actual code at all well. > Unrun rule currently has simple definition: either it was not run at all, or > result was not received (timeout). Unrun rules in metas are evaluated twice > as That is what I think should happen. I don't understand the concern expressed in Comment 19. > Ugh it gets a bit wonky. Current conf parser considers any unknown string as > true. Does this mean that an undefined rule appearing in a meta evaluates to true rather than to "unrun"? If so I see that as a possible problem; I'd prefer an undefined rule to be treated as unrun, but I can see the logic of treating it as a constant value (but I'd think false makes more sense). -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #25 from Henrik Krohns --- (In reply to Henrik Krohns from comment #24) > I will look if there is a some 3.4 compatible way to implement "if > (local_tests_only)". Ugh it gets a bit wonky. Current conf parser considers any unknown string as true. # Will always apply on old versions if (local_rules_only) meta __SPOOFED 0 endif # Something like this would be required, mandatory feature check if can(Mail::SpamAssassin::Conf::feature_local_rules_only) && local_rules_only meta __SPOOFED 0 else meta __SPOOFED DKIM_INVALID || STUFF endif -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #24 from Henrik Krohns --- Loren, I might be a bit tired from long work day, but I'm not sure what you are implying/suggesting.. tried reading it thrice. :-) Unrun rule currently has simple definition: either it was not run at all, or result was not received (timeout). Unrun rules in metas are evaluated twice as before mentioned. I don't think tflags should matter here, unrun is unrun. Let me know in clear terms, if you think something should be changed there. I will look if there is a some 3.4 compatible way to implement "if (local_tests_only)". -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #23 from Loren Wilton --- It seems to me that if net rules are turned off, they are unrun and the result is indeterminate: that is, we don't know how they would have evaluated if they had been run. In that sense, they are then no different than a net rule that was fired but the result didn't arrive in time. With the recent changes, meta rules are (I believe) run twice with the result both true and false for an unreceived net result. I would say that with net rules disabled, we would treat all meta rules that depend on net rule results the same way: the net results weren't received in time, so they need to be evaluated twice. This should extend to any missing result a meta rule depends on, but I don't know if it makes more sense to do the double evaluation based on not having the net rule result, or based on checking tflags net for the dependency and some flag that says net rules are disabled. Given that the case: #body SOME_DISABLED_RULE /./ meta COMPOUND_TEST SOME_DISABLED_RULE && OTHER_RULE should evaluate the meta based solely on OTHER_RULE, it seems to me that disabled net rules are the same case as the undefined disabled rule, so possibly the same logic works to ignore the results if disabled net rules. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #22 from Henrik Krohns --- (In reply to Henrik Krohns from comment #21) > But net-flag makes no difference on when rules run or are considered unrun. Ok correction, eval-rules marked as net are not run if local rules only. But metas or any other types don't have such a check (should they?). -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #21 from Henrik Krohns --- (In reply to John Hardin from comment #20) > Is that in there to include what we're currently doing? Or is it not > inheriting tflags net and thus it is running with the assumption I stated > above? Not sure I understand the questions. But net-flag makes no difference on when rules run or are considered unrun. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #20 from John Hardin --- (In reply to Henrik Krohns from comment #19) > Example: BITCOIN_SPAM_05 will not run in daily masschecks, because > __SPOOFED_FREEMAIL -> __NOT_SPOOFED depends on DKIM/SPF checks, which > obviously don't run with local checks only. That's in there solely as a FP avoidance. In that use case it's reasonable to run the rule assuming __SPOOFED_FREEMAIL = 0. I don't think that would be a valid assumption in all cases, though. > Solutions? > > - Ignore ifplugin SPF/DKIM without network checks ? Requires hardcoding, not > nice > - "if (local_tests_only)" in .cf ? That would be preferable, because then you could do something like: ifplugin SPF/DKIM && !local_tests_only {actual __SPOOFED_FREEMAIL definition} else meta __SPOOFED_FREEMAIL 0 endif > - New rule __LOCAL_TESTS_ONLY which can help due to false/true > meta-evaluation ? > - Don't run BITCOIN_SPAM_05 in daily, since it doesn't tell the whole truth ? Is that in there to include what we're currently doing? Or is it not inheriting tflags net and thus it is running with the assumption I stated above? -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #19 from Henrik Krohns --- Example: BITCOIN_SPAM_05 will not run in daily masschecks, because __SPOOFED_FREEMAIL -> __NOT_SPOOFED depends on DKIM/SPF checks, which obviously don't run with local checks only. Solutions? - Ignore ifplugin SPF/DKIM without network checks ? Requires hardcoding, not nice - "if (local_tests_only)" in .cf ? - New rule __LOCAL_TESTS_ONLY which can help due to false/true meta-evaluation ? - Don't run BITCOIN_SPAM_05 in daily, since it doesn't tell the whole truth ? -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added Attachment #5744|0 |1 is obsolete|| --- Comment #18 from Henrik Krohns --- Created attachment 5746 --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5746=edit Meta patch for r1890037 v4 Managed to close the gap to 10% as tested properly with stock ruleset (which has much less rules than my custom setup and makes meta checks runtime proportionally longer). Calling eval 'foobar' anytime is very expensive, especially in such a loop. Now the meta checks are pre-compiled as subs. A bit similarly as is done on other ruletypes. Will commit when it gets a bit of runtime on my live system. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #17 from Henrik Krohns --- Doing some more testing, mass check 4000 messages with -j4 trunk 138s trunk.meta 190s Seems the performance gap is a bit too unacceptable here. Still looking for ways to optimize. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added Attachment #5743|0 |1 is obsolete|| --- Comment #16 from Henrik Krohns --- Created attachment 5744 --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5744=edit Meta patch for r1889916 v3 Hopefully the final patch version. Dual meta evaluation as described in Bug 4549, nonexistent, disabled, timed out lookups are all considered unrun. Some examples of passed tests from basic_meta2.t: # Non-existing rule # Should not hit, meta is evaled twice: (!0) && (!1) meta TEST_META_2 !NONEXISTINGRULE # Should hit, meta is evaled twice: (!0 || 0) && (!1 || 1) meta TEST_META_3 !NONEXISTINGRULE || NONEXISTINGRULE # Disabled rule, same as above body TEST_DISABLED /a/ score TEST_DISABLED 0 # Should not hit meta TEST_META_4 !TEST_DISABLED # Should hit meta TEST_META_5 !TEST_DISABLED || TEST_DISABLED # Unrun rule (due to local tests only), same as above askdns TEST_DISABLED2 spamassassin.org TXT /./ # Should not hit meta TEST_META_6 !TEST_DISABLED2 # Should hit meta TEST_META_7 !TEST_DISABLED2 || TEST_DISABLED2 # Should not hit meta TEST_META_8 __FOO_1 + NONEXISTINGRULE == 2 # Should not hit meta TEST_META_9 __FOO_1 + NONEXISTINGRULE + __FOO_2 == 2 # Should hit (both eval checks are true thanks to >1) meta TEST_META_A __FOO_1 + NONEXISTINGRULE + __FOO_2 > 1 For reference some repeatable benchmarks checking 100 messages: - 3.4 55s 155MB - trunk 50s 168MB - trunk.meta 56s 162MB Slight speed loss from current trunk, but on par with 3.4. Memory usage did drop a bit from ditching merging and other stuff. Will commit in few days unless hear otherwise. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #15 from Henrik Krohns --- Argh never mind the previous comment, brain freeze.. that specific example works, as both evals are true, let's say DCC_CHECK is disabled 1 + 0 + 1 > 1 1 + 1 + 1 > 1 Any other scenario to consider? -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #14 from Henrik Krohns --- Ok I've finished the dual evaluation code, but I'm still wondering whether timed out lookups should count as unrun rules. Let's take this stock rule as example. meta DIGEST_MULTIPLE RAZOR2_CHECK + DCC_CHECK + PYZOR_CHECK > 1 This gets problematic even when considering disabled rules. If simply one of these is not used/loaded, or lookup times out, it won't hit even if two of them hit. Maybe a tflag that allows unrun rules for that rule? -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added Attachment #5742|0 |1 is obsolete|| --- Comment #13 from Henrik Krohns --- Created attachment 5743 --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5743=edit Meta patch for r1889731 v2 Updated patch for current trunk. - do_meta_tests() updated a bit, now also uses get_pending_lookups (renamed from has_pending_lookups) to check if rule is pending, this should fix most 3rd party async plugins (like my iXhash2) Currently both of these are considered ready/done rules evaluating to 0: - non-existing rules - disabled rules (score 0) Still considering if it's manageable to do the proposed "evaluate un-run as both true and false". -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #12 from Henrik Krohns --- (In reply to Henrik Krohns from comment #11) > - Duplicate rule merging code has been removed, since it would require new > bloaty extra logic to work with all the above. It saved very little > resources anyway, makes no difference. PS. While doing some tests I found a bug in merging too.. if you have duplicate rules but set some score to 0, it will disable the other rule too. Hit my head for 15 minutes until I realized why my test was acting strange. *groan* body FOO1 /./ body FOO2 /./ score FOO1 0 # FOO2 will never hit Farewell kludgy merging code, won't be missing you. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #11 from Henrik Krohns --- Created attachment 5742 --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5742=edit Meta patch for trunk v1 Took too many days, but here's patch for a brute force approach. Things done: - Rules now keep track of when they are ready/done in $pms->{tests_already_hit} (0 == done, but didn't hit). This is of obviously required feature for metas to know when evaluating can be done. - $pms->rule_pending() must be called from eval-rule function, if the result can arrive later than when exiting the function. - $pms->rule_done() need be called when above result has arrived. Note that if a rule is waiting for multiple DNS queries, it is considered done when either 1) a positive hit from lookup 2) all queries are returned. ($pms->rule_done() does automatic check with has_pending_lookups() and ignores the call if there are still pending queries) - This might break backwards compatibility for third party async plugins, which do not use above functions. In some cases metas using such rules might not hit then. Still TODO to check if it's a problem and if it can be helped (maybe try calling rule_pending() automatically from AsyncLoop, requires some kludging as it's independent from PMS..) - Check.pm do_meta_tests() basically runs at end of every priority and brute force iterates through all non-finished metas, checking if their dependent rules are ready. A final do_meta_tests() at scan end will also accept "lookup pending" rules as ready (not sure yet if this is desired or not?) - Duplicate rule merging code has been removed, since it would require new bloaty extra logic to work with all the above. It saved very little resources anyway, makes no difference. Some 5% overall scan time penalty from the brute force meta iteration, probably could be improved with some extra lookup tables. But that's without network lookups, there might be some speed up too as metas now don't need to wait for any pending lookups, they just evaluate when it's ready. All tests run ok, testing on my live server.. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Henrik Krohns changed: What|Removed |Added CC||quin...@pathname.com --- Comment #10 from Henrik Krohns --- *** Bug 4549 has been marked as a duplicate of this bug. *** -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #9 from Loren Wilton --- Minor correction to the previous post. I said there that there was a sync rule list and a sync meta list, and the sync rules were processed in priority order, then the sync meta rules. This isn't quite correct. The two sync lists need to be interleaved. The sync rules of priority N are run, then the sync meta rules of the same priority. The list assignment processing will have seen to it that a meta rule is at a higher (later evaluation) priority than any rules that it depends on. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Loren Wilton changed: What|Removed |Added CC||lwil...@earthlink.net --- Comment #8 from Loren Wilton --- Just thinking off the top of my head, I think I'd consider something along the following lines. 1. Conceptually build four prioritized tree-lists of rules: a. non-net, non-meta b. non-net, meta c. net, non-meta d. net, meta The tree nodes are the priority order, for each priority where rules exist there is a list of rules of that priority. The rules are evaluated in tree order and branch order, in the order given above. While I've described this as four trees of lists, it could equally be done with one partitioned tree, where each partition is assigned a base rule priority higher than any rule priority that can be manually assigned, for instance 0-99, 100-199, etc. The goal is to create the above-described rule evaluation order constraints. Non-meta rules are trivially assigned a priority, as it is a given for each rule. Meta rules may have been given priorities, but that needs to be considered as a minimum priority, and can only be used to delay the meta evaluation. As an implementation I think I'd initially throw all meta rules into a bucket/list of some sort, remembering their priorities, but not assigning them to the trees until all non-meta rules are assigned to the correct tree nodes. Then I would pull meta rules out of the bucket and look up each referenced rule. I would remember the highest tree node of any referenced rule. If all referenced rules are found for a meta, it does not depend on any other meta rules, and can be assigned to the meta tree at either the level of the highest dependency rule found, or the level given for the meta, only if that is higher than the highest dependency rule. If not all dependencies are found, I would throw it at the back of the unprocessed meta list, with a flag saying it had been seen once. (Alternately, into a reprocessing meta list separate from the current list.) When the current meta list is depleted, it can be replaced with the postponed meta list, and that list again processed exactly as above. A flag can be set if at least one meta is removed from the list into a tree. Any metas still with unresolved dependencies can again be thrown back into an alternate pool. As long as at least one meta is removed from the unprocessed list the reprocessing can be repeated. If all remaining metas are passed and none are placed into the tree, the remaining metas are either circular or depend on undefined rules. I would throw them out as errors at that point. Obviously there are faster ways to perform the above evaluation, but since it only needs to be done once, simplicity may be a virtue in the implementation. In any case, the end result is an ordered list if rules to evaluate in the order they are found in the list. The non-net stuff gets done quickly, and the net rules queued quickly. Metas that depend on async net rules are a pain because the net results are returned out of order. I can't think of a clever way to deal with this, but a simple brute-force approach is easy. Each net rule needs a list of the metas that depend on it. Each net meta needs a count of dependencies required to complete, and a counter initialized to zero. As each net rule completes, it walks it's dependent list and increments the count for the dependency. If the incremented count matches the total dependency count the meta rule can be run. Overall I think that will handle getting things into the right evaluation order fairly easily and also evaluating them in the correct order without too much pain. (Of course I don't recall what the short-circuit flag does exactly. It probably throws a wrench in the above evaluation mechanism.) -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #7 from Henrik Krohns --- Perhaps meta rules should not use the "rule priority" system at all. Any metas that depend on async rules simply hang waiting at that priority, which is pointless. There should be some system that evaluates metas everytime rule execution completes (hit or no hit), some steps like: - rule completes - iterate all metas that use the rule - check if all the rules used by meta are completed - evaluate meta (true / false) I guess it would need some sort of fast lookup tables. Perhaps only check things once at the end of one priority. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 Stingertough changed: What|Removed |Added CC||wolfsp...@gmail.com -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #6 from Henrik Krohns --- But if some of these rules do indeed disappear from our daily masschecks, I might revert this change to my previous code which was simply Reuse.pm doing the same thing: forcing reuse of the net-dependent metas. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #5 from John Hardin --- (In reply to Henrik Krohns from comment #4) > Yes it's a bit tricky, but honestly, who uses "non-net rulesets" these days > anyway? True. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 --- Comment #4 from Henrik Krohns --- Yes it's a bit tricky, but honestly, who uses "non-net rulesets" these days anyway? You can't even use SpamAssassin without sa-update. And those with some strange private network usage, probably don't care that much about the scoring. Either this or we have to come up with some other way for foolproof "reuse" masschecking. Again today I notifed I was blasting DNS queries all over, due to missing reuse flags. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7735] Meta rules need to handle missing/unrun dependencies
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 John Hardin changed: What|Removed |Added CC||jhar...@impsec.org --- Comment #3 from John Hardin --- I object to many of these rules being marked as TFLAGS NET. Your trivial examples are indeed valid, but for more-complex rules where the NET rule is used simply as an exclusion to reduce FPs (rather than a critical component of the logical rule itself), the effect of the NET rule being zero is not important, as the non-NET masscheck scores will reflect the results of that NET exclusion rule not having an effect: the score will be lower for non-NET because the S/O will be lower. The overall rule should not be *entirely disabled* in a non-NET environment. For this reason I also dislike the inheritance being automatic. -- You are receiving this mail because: You are the assignee for the bug.