https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6155

Mark Martinec <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #4542|0                           |1
        is obsolete|                            |
   Attachment #4553|0                           |1
        is obsolete|                            |

--- Comment #124 from Mark Martinec <[email protected]> 2009-10-26 07:49:13 UTC ---
Created an attachment (id=4558)
 --> (https://issues.apache.org/SpamAssassin/attachment.cgi?id=4558)
resulting 50_scores.cf from garescorer runs - V3

Attached is the latest 50_scores.cf file, obtained in a couple of
iterations during the last few days.

It takes into account the updated results files from the rsync submit
area, in particular the updated net-wt* (Comment 99, 102, 103) and
net-hege* files. The binnocenti* files are still excluded. The rest of
the corpora tweaks/decimation are as per my previous run, Comment 96.

The RCVD_IN_DNSWL_* scores are hand-tweaked (according to Comment 101);
otherwise the _MED stands out above the _HI due to its significantly
higher hit rate.

The KB_RATWARE_OUTLOOK_08, KB_RATWARE_OUTLOOK_12, KB_RATWARE_OUTLOOK_16
and KB_RATWARE_BOUNDARY are now zeroed out according to Comment 115.

I tried leaving RDNS_NONE and RDNS_DYNAMIC floating (Comment 116, 120,
122), and it seems to me the obtained score is perfectly sensible and
useful, and still not so high as to punish incompetent admins too hard:

  score RDNS_NONE    0 1.1 0 0.7
  score RDNS_DYNAMIC 0 0.5 0 0.5

so I'm leaving these floating.

According to Comment 122 I zeroed out (actually, 0.001'd out) the
HTML_MESSAGE, MIME_QP_LONG_LINE, FREEMAIL_FROM, TVD_SPACE_RATIO,
and MSGID_MULTIPLE_AT.

Some further tweaks: I reduced the BAYES scores somewhat (e.g. from 4.5
to 3.5 for BAYES_99 scoreset3) and tamed down BAYES_50, which was
standing out from the crowd.

For DCC_* rules I used the already described approach: obtain the
DCC_CHECK score from a GA run with all DCC_REPUT_* zeroed out, then fix
the obtained DCC_CHECK, and let DCC_REPUT_* float for the final run.

The NML_ADSP_CUSTOM_MED was obtained from a GA run, but the others
(_LOW, _HIGH) were set manually (currently no hits).

The DKIM_ADSP_ALL, DKIM_ADSP_DISCARD, and DKIM_ADSP_NXDOMAIN are based
on GA runs, but hand-tweaked somewhat due to inconsistencies between
corpora.

A word about JM_SOUGHT_FRAUD_{1,2,3}: these three rules come out of a
GA run with scores between 2 and 3, but are somewhat inconsistent
between runs and corpora. As requested by Comment 38 their scores were
fixed at zero for the final run, but I'd set these manually to 2.2 each
for the published 50_scores.cf.

After preparing my manual fixes from a couple of trial runs, I made a
final run for each scoreset with these fixed scores, so as to allow
other scores to adjust themselves to the new constraints.

So here are the manual fixes:

score SPF_PASS -0.001
score SPF_HELO_PASS -0.001
score BAYES_00 0 0 -1.2 -1.9
score BAYES_05 0 0 -0.2 -0.5
score BAYES_20 0 0 -0.001 -0.001
score BAYES_40 0 0 -0.001 -0.001
score BAYES_50 0 0 2.0 0.8
score BAYES_60 0 0 2.5 1.5
score BAYES_80 0 0 2.7 2.0
score BAYES_95 0 0 3.2 3.0
score BAYES_99 0 0 3.8 3.5
score RCVD_IN_DNSWL_LOW 0 -0.6 0 -1.1
score RCVD_IN_DNSWL_MED 0 -1.5 0 -1.2
score RCVD_IN_DNSWL_HI 0 -1.8 0 -1.8
score HTML_MESSAGE 0.001
score NO_RELAYS -0.001
score UNPARSEABLE_RELAY 0.001
score NO_RECEIVED -0.001
score NO_HEADERS_MESSAGE 0.001
score DKIM_ADSP_ALL 0 1.1 0 0.8
score DKIM_ADSP_DISCARD 0 1.8 0 1.8
score DKIM_ADSP_NXDOMAIN 0 0.8 0 0.9
score NML_ADSP_CUSTOM_LOW 0 0.7 0 0.7
score NML_ADSP_CUSTOM_MED 0 1.2 0 0.9
score NML_ADSP_CUSTOM_HIGH 0 2.6 0 2.5
score JM_SOUGHT_FRAUD_1 0
score JM_SOUGHT_FRAUD_2 0
score JM_SOUGHT_FRAUD_3 0
score MIME_QP_LONG_LINE 0.001
score FREEMAIL_FROM 0.001
score TVD_SPACE_RATIO 0.001
score MSGID_MULTIPLE_AT 0.001
score EXTRA_MPART_TYPE 1.0
score RDNS_NONE 0 1.1 0 0.7
score RDNS_DYNAMIC 0 0.5 0 0.5
score KB_RATWARE_OUTLOOK_08 0
score KB_RATWARE_OUTLOOK_12 0
score KB_RATWARE_OUTLOOK_16 0
score KB_RATWARE_BOUNDARY 0
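
For anyone reading the fixed scores above by hand: a `score` line with a
single value applies to all four scoresets, while a line with four values
gives scoresets 0 to 3 in order (no Bayes/no net, no Bayes/net, Bayes/no
net, Bayes/net). The following is only a minimal Python sketch of that
expansion, not part of garescorer or SpamAssassin; the function name
`parse_scores` and the `SAMPLE` excerpt are made up for illustration, and
relative scores written in parentheses are not handled.

```python
from typing import Dict, List

# Scoreset index -> which tests are enabled (order used by `score` lines).
SCORESETS = ("no Bayes, no net", "no Bayes, net", "Bayes, no net", "Bayes, net")


def parse_scores(text: str) -> Dict[str, List[float]]:
    """Map each rule name to a list of four scores, one per scoreset.

    Hypothetical helper: handles only plain 'score NAME v' and
    'score NAME v0 v1 v2 v3' lines; anything else is skipped or rejected.
    """
    scores: Dict[str, List[float]] = {}
    for line in text.splitlines():
        # Drop trailing comments, then split on whitespace.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 3 or fields[0] != "score":
            continue
        name = fields[1]
        values = [float(v) for v in fields[2:]]
        if len(values) == 1:
            # A single value is used for every scoreset.
            values = values * 4
        if len(values) != 4:
            raise ValueError(f"{name}: expected 1 or 4 scores, got {len(values)}")
        scores[name] = values
    return scores


# A few lines copied from the manual fixes in this comment.
SAMPLE = """\
score HTML_MESSAGE 0.001
score BAYES_99 0 0 3.8 3.5
score RCVD_IN_DNSWL_LOW 0 -0.6 0 -1.1
score RCVD_IN_DNSWL_MED 0 -1.5 0 -1.2
score RCVD_IN_DNSWL_HI 0 -1.8 0 -1.8
"""

scores = parse_scores(SAMPLE)

# Check the hand-tweaked ordering: _HI must be at least as negative as
# _MED, and _MED at least as negative as _LOW, in every scoreset.
dnswl_ordered = all(
    scores["RCVD_IN_DNSWL_HI"][i]
    <= scores["RCVD_IN_DNSWL_MED"][i]
    <= scores["RCVD_IN_DNSWL_LOW"][i]
    for i in range(4)
)

for i, label in enumerate(SCORESETS):
    print(f"scoreset {i} ({label}): BAYES_99 = {scores['BAYES_99'][i]}")
print("DNSWL ordering holds:", dnswl_ordered)
```

The ordering check at the end mirrors the reason the RCVD_IN_DNSWL_*
scores were hand-tweaked: without it, a GA run can leave _MED scoring
stronger than _HI.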
