So let's look at the following rule which isn't promotable in QA: https://ruleqa.spamassassin.org/20190615-r1861371-n/URI_WP_HACKED_2/detail
This rule has the publish tflag. Because of the publish tflag it is included in the active.list, and because it is in the active.list it is considered for rescoring. When it is rescored, the iterative process scores the rules from that day's rev# against both ham and spam over several thousand iterations. During these iterations, the score that came out triggered minimal FPs (ham mail scoring > 5.0) and contributed the most towards the spam score.

In my opinion the rescore seems to be doing the right thing. The QA site might show scores for rules that hit more ham than spam, but during the check of the corpus the generated score caused minimal FPs.

Paul

On Sat, 15 Jun 2019 at 18:06, John Hardin <[email protected]> wrote:

> On Fri, 14 Jun 2019, Henrik K wrote:
>
> > PS. John, all these rules from your sandbox seem to have very broken
> > scores, could you perhaps add informative scores to
> > 73_sandbox_manual_scores.cf for these? At least that method should work
> > 100% for now..
> >
> > FROM_IN_TO_AND_SUBJ 2.199
> > OBFU_TEXT_ATTACH 1.699
> > MIME_NO_TEXT 1.542
> > AD_PREFS 1.399
> > URI_WP_HACKED_2 1.304
> > STYLE_GIBBERISH 1.111
> > UC_GIBBERISH_OBFU 1.000
> > LUCRATIVE 1.000
> > HEXHASH_WORD 1.000
> > FROM_WORDY 1.000
> > AC_HTML_NONSENSE_TAGS 1.000
> > LONG_HEX_URI 0.896
> > FROM_PAYPAL_SPOOF 0.727
>
> Not all of those are in my sandbox. For example, AC_HTML_NONSENSE_TAGS is
> in KAM's.
>
> I spent some time today (which I did not have yesterday) to review and
> update the tuning on many of those rules to improve their S/O.
>
> I also tried adding scores to 73_sandbox_manual_scores.cf for them to
> suppress the net scores until those changes can be evaluated by the weekly
> masscheck, but ran into a problem - see SA bug 7721.
>
> The tuning should minimize the problem from the stale net scores, so I'm
> reluctant to alter their global scores, except for AD_PREFS, which is a
> very simple rule that seems to be falling afoul of a lot of "legitimate"
> marketing emails (i.e. actually subscribed to) in the masscheck ham
> corpora and thus can't really be tuned.
>
>
> --
> John Hardin KA7OHZ http://www.impsec.org/~jhardin/
> [email protected] FALaholic #11174 pgpk -a [email protected]
> key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
> -----------------------------------------------------------------------
> Are you a mildly tech-literate politico horrified by the level of
> ignorance demonstrated by lawmakers gearing up to regulate online
> technology they don't even begin to grasp? Cool. Now you have a
> tiny glimpse into a day in the life of a gun owner. -- Sean Davis
> -----------------------------------------------------------------------
> 3 days until SWMBO's Birthday
>
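
To illustrate the rescoring argument at the top of this message, here is a toy sketch in Python. It is not the actual rescoring code: the real process tunes all rule scores together over thousands of iterations, while this only tries a few candidate scores for one rule against a made-up corpus. The function names, the candidate list and the corpus rows are all hypothetical; only the 5.0 threshold and the rule name come from the thread.

```python
# Toy illustration only: one rule, a handful of hypothetical messages.
THRESHOLD = 5.0  # ham scoring above this counts as an FP (from the message)

def evaluate(score, messages):
    """Return (false_positives, spam_caught) if the rule is given `score`.

    Each message is (is_spam, base_score, hits_rule), where base_score is
    the total contributed by all other rules (hypothetical values).
    """
    fp = caught = 0
    for is_spam, base_score, hits_rule in messages:
        total = base_score + (score if hits_rule else 0.0)
        if total > THRESHOLD:
            if is_spam:
                caught += 1
            else:
                fp += 1
    return fp, caught

def pick_score(candidates, messages):
    """Prefer the fewest FPs, then the most spam pushed over the threshold."""
    def key(score):
        fp, caught = evaluate(score, messages)
        return (fp, -caught)
    return min(candidates, key=key)

# Hypothetical corpus: the rule (say URI_WP_HACKED_2) hits 3 ham and 2 spam.
messages = [
    (True, 4.0, True),
    (True, 3.8, True),
    (False, 3.0, True),
    (False, 2.0, True),
    (False, 1.0, True),
    (True, 6.0, False),
    (False, 0.5, False),
]
candidates = [0.5, 1.0, 1.3, 2.5]

best = pick_score(candidates, messages)
print(best, evaluate(best, messages))  # 1.3 (0, 3)
```

In this made-up corpus the rule hits more ham than spam, yet a score of 1.3 pushes no ham over 5.0 while lifting two more spam over it. That is the sense in which a rule can look bad on the QA site and still receive a positive score from the rescore.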
