https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6155
--- Comment #132 from Mark Martinec <[email protected]> 2009-10-26 12:26:49 UTC ---

> Other whitelisting rules (HABEAS_*, RCVD_IN_IADB_*, RCVD_IN_BSP_TRUSTED etc)
> have the same scores as in the previous 50_scores.cf.

They do not have the same scores, seems to me they are all mostly much lower.
Please ignore the comments in 50_scores_newest3.cf, just take into account
uncommented 'score' lines:

score HABEAS_ACCREDITED_COI 0
score HABEAS_ACCREDITED_SOI 0 -1.634 0 -0.475
score RCVD_IN_BSP_TRUSTED 0 -0.001 0 -0.001
score RCVD_IN_IADB_DK 0 -0.044 0 -0.001
score RCVD_IN_IADB_DOPTIN 0
score RCVD_IN_IADB_DOPTIN_GT50 0
score RCVD_IN_IADB_DOPTIN_LT50 0 -0.001 0 -0.001
score RCVD_IN_IADB_EDDB 0
score RCVD_IN_IADB_EPIA 0
score RCVD_IN_IADB_GOODMAIL 0
score RCVD_IN_IADB_LISTED 0 -1.144 0 -0.001
score RCVD_IN_IADB_LOOSE 0
score RCVD_IN_IADB_MI_CPEAR 0
score RCVD_IN_IADB_MI_CPR_30 0
score RCVD_IN_IADB_MI_CPR_MAT 0 -0.079 0 -0.001
score RCVD_IN_IADB_ML_DOPTIN 0
score RCVD_IN_IADB_NOCONTROL 0
score RCVD_IN_IADB_OOO 0
score RCVD_IN_IADB_OPTIN 0 -3.265 0 -2.791
score RCVD_IN_IADB_OPTIN_GT50 0 -0.219 0 -1.041
score RCVD_IN_IADB_OPTIN_LT50 0
score RCVD_IN_IADB_OPTOUTONLY 0
score RCVD_IN_IADB_RDNS 0 -0.018 0 -0.001
score RCVD_IN_IADB_SENDERID 0 -0.001 0 -0.001
score RCVD_IN_IADB_SPF 0 -0.006 0 -0.042
score RCVD_IN_IADB_UNVERIFIED_1 0
score RCVD_IN_IADB_UNVERIFIED_2 0
score RCVD_IN_IADB_UT_CPEAR 0
score RCVD_IN_IADB_UT_CPR_30 0
score RCVD_IN_IADB_UT_CPR_MAT 0 -0.001 0 -0.052
score RCVD_IN_IADB_VOUCHED 0 -1.718 0 -0.956
score RCVD_IN_DNSWL_LOW 0 -0.6 0 -1.1
score RCVD_IN_DNSWL_MED 0 -1.5 0 -1.2
score RCVD_IN_DNSWL_HI 0 -1.8 0 -1.8

> I was wondering why the dnswl.org rules have specifically lower scores than in
> previous versions - and extremely low scores. This is worrying me, as it would
> indicate we have a quality issue in the dnswl.org data.
These all have pretty low rank:

$ grep RCVD_IN_DNSWL_ freqs.full
OVERALL   SPAM%     HAM%    S/O   RANK  SCORE  NAME
  0.184  0.0005   0.5708  0.001   0.76  -1.80  RCVD_IN_DNSWL_HI
  7.410  0.1094  22.7527  0.005   0.67  -1.20  RCVD_IN_DNSWL_MED
  2.551  0.1810   7.5322  0.023   0.59  -1.10  RCVD_IN_DNSWL_LOW

The _HI gets a low automatic score probably because it hits very little mail,
so it probably needs manual tweaking. The _MED seems to hit too many spam
messages in the submitted logs for rescoring runs, or perhaps it has a high
overlap with other similar rules.

It is quite possible that some of these hits are still false positives,
despite several iterations of cleaning:

for j in spam*.log; do echo -n $j; grep RCVD_IN_DNSWL_HI $j | \
  wc -l; done | sort -k2nr

spam-bayes-net-bb-jhardin.log 3
spam-bayes-net-bb-kmcgrail.log 2
spam-bayes-net-bb-guenther_fraud.log 1
spam-bayes-net-hege.log 1

same on _MED:

spam-bayes-net-bluestreak.log 381
spam-bayes-net-hege.log 79
spam-bayes-net-bb-jhardin.log 23
spam-bayes-net-wt-en1.log 15
spam-bayes-net-bb-kmcgrail.log 14
spam-bayes-net-jm-decimated.log 11
spam-bayes-net-ahenry.log 9
spam-bayes-net-dos-decimated.log 6
spam-bayes-net-bb-zmi.log 3
spam-bayes-net-mmartinec.log 3
spam-bayes-net-wt-en4.log 2
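
As a rough cross-check of the freqs.full rows quoted above, the S/O column looks consistent with SPAM% / (SPAM% + HAM%). That formula is an assumption inferred from the three rows in this comment, not taken from the rescoring tools themselves, and the function name below is made up for illustration. A minimal Python sketch:

```python
# Rows copied from the `grep RCVD_IN_DNSWL_ freqs.full` output in this comment:
# (rule name, SPAM%, HAM%, S/O as printed)
ROWS = [
    ("RCVD_IN_DNSWL_HI", 0.0005, 0.5708, 0.001),
    ("RCVD_IN_DNSWL_MED", 0.1094, 22.7527, 0.005),
    ("RCVD_IN_DNSWL_LOW", 0.1810, 7.5322, 0.023),
]


def spam_over_overall(spam_pct, ham_pct):
    """Assumed S/O definition: SPAM% / (SPAM% + HAM%).

    Hypothetical helper; returns 0.0 when the rule hits nothing.
    """
    total = spam_pct + ham_pct
    return spam_pct / total if total else 0.0


if __name__ == "__main__":
    for name, spam_pct, ham_pct, printed in ROWS:
        computed = spam_over_overall(spam_pct, ham_pct)
        print(f"{name:20s} computed S/O = {computed:.3f}  printed = {printed:.3f}")
```

Under that assumption, an S/O close to 0 means nearly all hits are on ham, which is what a whitelist rule should show. The ratio alone does not say whether the remaining spam hits are genuine or mislabelled corpus entries, which is why the per-log counts are worth checking.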
