https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6155

Mark Martinec <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #4550|0                           |1
        is obsolete|                            |

--- Comment #96 from Mark Martinec <[email protected]> 2009-10-14 16:21:44 UTC ---
Created an attachment (id=4553)
 --> (https://issues.apache.org/SpamAssassin/attachment.cgi?id=4553)
resulting 50_scores.cf from garescorer runs - V2

Here is a 50_scores.cf from my second attempt, after cleaning some logs:
removed the binnocenti and wt-en6 logs as per Comment 93, and removed
DKIM_ADSP_DISCARD hits from ham-bayes-net-bluestreak.log. I have also
limited the log entries to fewer months, following the RescoreMassCheck
wiki page: -m 6 for spam and -m 25 for ham (after the 25th month there
is a large gap in the data until the next peak, which is too far in the
past).
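
The two cleaning steps above could be approximated along these lines. This
is only a sketch, not the commands actually used: it assumes each log line
has the layout "<Y|.> <score> <message> <RULE1,RULE2,...> <key=value,...>"
with a time=<epoch seconds> entry in the last field, it reads "removed
hits" as dropping the rule name from the hit list, and it treats a month
as 30 days. The function names are hypothetical.

```python
import re

# Assumed log line layout (not confirmed by this comment):
#   <Y|.> <score> <message> <RULE1,RULE2,...> <key=value,...>
SECONDS_PER_MONTH = 30 * 24 * 3600  # rough month length, an assumption


def parse_line(line):
    """Split one log line into (verdict, score, ident, hits, meta)."""
    parts = line.split(None, 4)
    if len(parts) < 4:
        raise ValueError(f"unexpected log line: {line!r}")
    verdict, score, ident, hits = parts[:4]
    meta = parts[4].strip() if len(parts) > 4 else ""
    return verdict, score, ident, hits.split(","), meta


def message_time(meta):
    """Return the time= value from the metadata field, or None if absent."""
    match = re.search(r"(?:^|,)time=(\d+)", meta)
    return int(match.group(1)) if match else None


def within_months(meta, months, now):
    """True if the entry is at most `months` old; undated entries are kept."""
    stamp = message_time(meta)
    if stamp is None:
        return True
    return now - stamp <= months * SECONDS_PER_MONTH


def clean(lines, rule, months, now):
    """Yield log lines that are recent enough, with `rule` removed from hits."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            yield line  # pass comments and blank lines through unchanged
            continue
        verdict, score, ident, hits, meta = parse_line(line)
        if not within_months(meta, months, now):
            continue
        kept = [hit for hit in hits if hit != rule]
        fields = [verdict, score, ident, ",".join(kept), meta]
        yield " ".join(fields).rstrip() + "\n"
```

With these assumptions, a ham log would be processed as
clean(open(path), "DKIM_ADSP_DISCARD", 25, time.time()). An entry whose
only hit was the removed rule is left with an empty hit field, which a
real tool may need to handle differently.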

This leaves us with the following numbers of entries in the merged logs:
score set 1 (no data from score set 3), which provides data for set0 and set1:
  360070 ham-full-set1.log
  472682 spam-full-set1.log
score set 3, provides data for set2 and set3:
  210603 ham-full-set3.log
  442709 spam-full-set3.log
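
As a quick check on the balance of these logs, the totals and the spam
share per score set can be derived from the counts above. The snippet is
only illustrative arithmetic on the numbers listed; the names in it are
hypothetical and it is not part of the rescoring tools.

```python
# Entry counts copied from the merged log listing above.
counts = {
    "set1": {"ham": 360070, "spam": 472682},
    "set3": {"ham": 210603, "spam": 442709},
}


def spam_share(ham, spam):
    """Fraction of all entries that are spam."""
    return spam / (ham + spam)


for name, c in counts.items():
    total = c["ham"] + c["spam"]
    print(f"{name}: {total} entries, {spam_share(**c):.1%} spam")
# set1: 832752 entries, 56.8% spam
# set3: 653312 entries, 67.8% spam
```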

For the DCC_ rules, I took the DCC_CHECK value of 1.1 from a preliminary
run which had all the DCC_REPUT_* scores fixed at 0. For the next run I
fixed DCC_CHECK at that value but left the DCC_REPUT_* scores floating.
This should cope with both types of sites: those with a commercial
license that do receive reputation results from DCC servers, and those
without one that don't.
