Huh! The problem is the usual one: GIGO

Y 8 .../8bcaeebfaa ...,RCVD_IN_PBL,RCVD_IN_PBL,...
It's assumed that the rule list should have a unique set of names, so
hit-frequencies just adds the entry twice.

So now the question is: why does mass-check put the same rule in multiple
times, and apparently only for weekly runs, and apparently only for this
rule (pcregrep '([A-Z0-9_]+),\1(,|$)' shows only this rule duplicating)?

<sigh>

On Tue, Jul 03, 2007 at 12:02:12PM -0400, Theo Van Dinter wrote:
> On Tue, Jul 03, 2007 at 10:24:02AM +0100, Justin Mason wrote:
> > no "aha"s here unfortunately :( -- is this in your own local freqs,
> > or the freqs on the server (with everyone else's logs too)?
>
> This is from hit-frequencies off of my net-theo weekly logs.
>
> It's very reproducible too:
>
> ~/SA/spamassassin-head/masses/hit-frequencies -a -c \
>   ~corpus/SA/spamassassin-corpora/rules -x -p | awk \
>   '$1 > 100 || $2 > 100 || $3 > 100'
> OVERALL  SPAM%     HAM%    S/O    RANK  SCORE  NAME
>       0  142976    25826   0.847  0.00  0.00   (all messages)
>  91.555  108.0930  0.0000  1.000  1.00  0.00   RCVD_IN_PBL
>
> and doing a little bit of debugging yesterday, the spam count for that rule
> goes to 154547. I just haven't figured out why yet though.
>
> --
> Randomly Selected Tagline:
> "Our users will know fear and cower before our software! Ship it!
> Ship it and let them flee like the dogs they are!"
> - Klingon Programmer's Manual

--
Randomly Selected Tagline:
"I would never have sex with a cow. Cause that is wrong, and I am lactose
intolerant." - Dave Attell
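
For illustration only, here is a minimal Python sketch of the effect described above. It is not the hit-frequencies code. The function names and the sample data are hypothetical, and it assumes the rule field of a log line is a plain comma-separated list of rule names. Counting every occurrence lets one message contribute more than one hit for the same rule, which is how a percentage can exceed 100 (the figures quoted above fit this: 154547 / 142976 is about 108.09%).

```python
from collections import Counter


def duplicated_rules(rule_field):
    """Return the rule names that occur more than once in one
    comma-separated rule field (assumed format, see note above)."""
    counts = Counter(name for name in rule_field.split(",") if name)
    return sorted(name for name, n in counts.items() if n > 1)


def tally(rule_fields, dedupe):
    """Count hits per rule over many messages.

    With dedupe=False every occurrence counts, so a duplicated name
    is counted twice for one message. With dedupe=True each rule
    counts at most once per message.
    """
    hits = Counter()
    for field in rule_fields:
        names = [name for name in field.split(",") if name]
        hits.update(set(names) if dedupe else names)
    return hits


# Made-up rule fields for two messages, both with a duplicated rule name.
SAMPLE_RULE_FIELDS = [
    "RULE_A,RCVD_IN_PBL,RCVD_IN_PBL",
    "RCVD_IN_PBL,RCVD_IN_PBL,RULE_B",
]

if __name__ == "__main__":
    total = len(SAMPLE_RULE_FIELDS)
    for dedupe in (False, True):
        hits = tally(SAMPLE_RULE_FIELDS, dedupe)["RCVD_IN_PBL"]
        print(f"dedupe={dedupe}: {hits} hits, {100.0 * hits / total:.1f}%")
```

Whether the right fix is to de-duplicate when counting or to stop the duplicate from being logged in the first place is a separate question; the sketch only shows why the numbers go over 100.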
