On Sep 15, 2017, at 12:24 PM, David Jones <djo...@ena.com> wrote:
> 1. Actually start here with the runGA call:
>
> http://svn.apache.org/viewvc/spamassassin/trunk/masses/rule-update-score-gen/generate-new-scores.sh?revision=1798589&view=markup#l271
>
> 2. Here is the runGA script (not changed in almost 8 years):
>
> http://svn.apache.org/viewvc/spamassassin/trunk/masses/runGA?view=log
>
> 3. Somewhere in the bottom of the runGA script there is a problem generating
> a complete scores-set0 which becomes the 72_scores.cf:
>
> http://svn.apache.org/viewvc/spamassassin/trunk/rulesrc/scores/
>
> 4. This shows the resulting different in the last good 72_scores.cf and the
> latest version:
>
> http://svn.apache.org/viewvc/spamassassin/trunk/rulesrc/scores/72_scores.cf?r1=1786976&r2=1808406

I don't know if it helps, but I (manually) removed the lines that looked just like score changes from that diff and I get the following (on the assumption we want to have a good example of what to look for after making changes to know that this is 'fixed'):

-score AC_SPAMMY_URI_PATTERNS1 1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS10 1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS11 1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS12 1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS2 1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS3 1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS4 1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS8 1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS9 1.000 1.000 1.000 1.000
-score ADVANCE_FEE_2_NEW_FORM 1.000 1.000 1.000 1.000
-score ADVANCE_FEE_2_NEW_MONEY 1.997 0.001 1.997 0.001
-score AXB_XM_FORGED_OL2600 1.190 2.699 1.190 2.699
-score BODY_EMPTY 1.997 1.999 1.997 1.999
-score CANT_SEE_AD 2.996 0.500 2.996 0.500
-score CN_B2B_SPAMMER 0.001 0.001 0.001 0.001
-score COMMENT_GIBBERISH 1.498 1.499 1.498 1.499
-score ENCRYPTED_MESSAGE -1.000 -1.000 -1.000 -1.000
-score FORM_LOW_CONTRAST 1.000 1.000 1.000 1.000
-score FREEMAIL_DOC_PDF_BCC 2.596 2.599 2.596 2.599
-score FREEMAIL_FORGED_FROMDOMAIN 0.001 0.199 0.001 0.199
-score FROM_MISSPACED 0.001 0.001 0.001 0.001
-score FROM_MISSP_SPF_FAIL 0.001 1.000 0.001 1.000
-score FROM_MISSP_XPRIO 0.001 0.001 0.001 0.001
-score FROM_WORDY_SHORT 1.000 1.000 1.000 1.000
-score GOOGLE_DOCS_PHISH 1.000 1.000 1.000 1.000
-score GOOGLE_DOCS_PHISH_MANY 1.000 1.000 1.000 1.000
-score GOOG_MALWARE_DNLD 1.000 1.000 1.000 1.000
-score HEADER_FROM_DIFFERENT_DOMAINS 0.001 0.001 0.001 0.001
-score HEXHASH_WORD 1.000 1.000 1.000 1.000
-score HK_RANDOM_FROM 0.998 0.001 0.998 0.001
-score HK_SCAM_N15 1.935 2.499 1.935 2.499
-score HTML_OFF_PAGE 1.000 1.000 1.000 1.000
-score LIST_PRTL_PUMPDUMP 1.000 1.000 1.000 1.000
-score LONG_HEX_URI 2.194 2.290 2.194 2.290
-score LOTS_OF_MONEY 0.001 0.001 0.001 0.001
-score LOTTO_DEPT 0.001 0.001 0.001 0.001
-score LUCRATIVE 1.000 1.000 1.000 1.000
-score MIMEOLE_DIRECT_TO_MX 1.445 0.381 1.445 0.381
-score MIME_NO_TEXT 1.000 1.000 1.000 1.000
-score MONEY_FRAUD_3 2.896 0.001 2.896 0.001
-score MONEY_FRAUD_5 3.096 0.001 3.096 0.001
-score MONEY_FRAUD_8 2.548 0.001 2.548 0.001
-score MONEY_LOTTERY 2.498 1.611 2.498 1.611
-score MSGID_NOFQDN1 2.395 3.299 2.395 3.299
-score MSM_PRIO_REPTO 2.497 0.180 2.497 0.180
-score NSL_RCVD_FROM_USER 0.548 0.001 0.548 0.001
-score NSL_RCVD_HELO_USER 1.273 0.001 1.273 0.001
-score PHP_NOVER_MUA 1.000 1.000 1.000 1.000
-score PHP_ORIG_SCRIPT 0.502 2.499 0.502 2.499
-score PHP_SCRIPT_MUA 1.000 1.000 1.000 1.000
-score PP_MIME_FAKE_ASCII_TEXT 0.429 0.001 0.429 0.001
-score PP_TOO_MUCH_UNICODE02 0.500 0.500 0.500 0.500
-score PP_TOO_MUCH_UNICODE05 1.000 1.000 1.000 1.000
-score PUMPDUMP 1.000 1.000 1.000 1.000
-score PUMPDUMP_MULTI 1.000 1.000 1.000 1.000
-score RAND_HEADER_MANY 1.000 1.000 1.000 1.000
-score RCVD_IN_MSPIKE_BL 0.001 0.010 0.001 0.010
-score RCVD_IN_MSPIKE_H2 0.001 -2.800 0.001 -2.800
-score RCVD_IN_MSPIKE_H3 0.001 -0.010 0.001 -0.010
-score RCVD_IN_MSPIKE_H4 0.001 -0.010 0.001 -0.010
-score RCVD_IN_MSPIKE_H5 0.001 -1.000 0.001 -1.000
-score RCVD_IN_MSPIKE_L2 0.001 0.001 0.001 0.001
-score RCVD_IN_MSPIKE_L3 0.001 0.001 0.001 0.001
-score RCVD_IN_MSPIKE_L4 0.001 0.001 0.001 0.001
-score RCVD_IN_MSPIKE_L5 0.001 0.001 0.001 0.001
-score RCVD_IN_MSPIKE_WL 0.001 -0.010 0.001 -0.010
-score RCVD_IN_MSPIKE_ZBI 0.001 0.001 0.001 0.001
-score RP_MATCHES_RCVD -1.050 -0.001 -1.050 -0.001
-score SHARE_50_50 2.121 1.818 2.121 1.818
-score SPOOFED_FREEM_REPTO 2.498 1.368 2.498 1.368
-score SPOOFED_FREEM_REPTO_CHN 1.000 1.000 1.000 1.000
-score STATIC_XPRIO_OLE 1.997 0.001 1.997 0.001
-score STOCK_LOW_CONTRAST 2.030 2.347 2.030 2.347
-score STOCK_TIP 1.000 1.000 1.000 1.000
-score STYLE_GIBBERISH 2.800 3.093 2.800 3.093
-score SURBL_BLOCKED 0.001 0.001 0.001 0.001
-score SYSADMIN 1.000 1.000 1.000 1.000
-score THIS_AD 0.596 2.200 0.596 2.200
-score TO_EQ_FM_DIRECT_MX 2.497 0.622 2.497 0.622
-score TO_EQ_FM_DOM_SPF_FAIL 0.001 0.001 0.001 0.001
-score TO_EQ_FM_SPF_FAIL 0.001 0.001 0.001 0.001
-score TO_IN_SUBJ 0.099 0.099 0.099 0.099
-score TO_NO_BRKTS_FROM_MSSP 0.001 0.001 0.001 0.001
-score TO_NO_BRKTS_HTML_IMG 0.001 2.000 0.001 2.000
-score TO_NO_BRKTS_HTML_ONLY 1.997 0.001 1.997 0.001
-score TO_NO_BRKTS_MSFT 2.497 0.001 2.497 0.001
-score TO_NO_BRKTS_NORDNS_HTML 0.398 0.001 0.398 0.001
-score TO_NO_BRKTS_PCNT 2.497 0.001 2.497 0.001
-score TVD_SPACE_ENCODED 2.497 0.001 2.497 0.001
-score TVD_SPACE_ENC_FM_MIME 1.997 0.001 1.997 0.001
-score TVD_SPACE_RATIO_MINFP 2.497 0.001 2.497 0.001
-score TW_GIBBERISH_MANY 1.000 1.000 1.000 1.000
-score UC_GIBBERISH_OBFU 1.000 1.000 1.000 1.000
-score URI_DATA 1.000 1.000 1.000 1.000
-score URI_GOOGLE_PROXY 0.710 1.378 0.710 1.378
-score URI_ONLY_MSGID_MALF 0.001 1.191 0.001 1.191
-score URI_OPTOUT_3LD 1.000 1.000 1.000 1.000
-score URI_PHISH 3.995 3.999 3.995 3.999
-score URI_TRY_3LD 0.195 0.001 0.195 0.001
-score URI_TRY_USME 0.001 0.001 0.001 0.001
-score URI_WPADMIN 3.396 3.014 3.396 3.014
-score URI_WP_DIRINDEX 1.000 1.000 1.000 1.000
-score URI_WP_HACKED 2.996 3.000 2.996 3.000
-score URI_WP_HACKED_2 1.187 1.764 1.187 1.764
-score XPRIO 2.248 2.249 2.248 2.249
-score XPRIO_SHORT_SUBJ 1.000 1.000 1.000 1.000
+score ADVANCE_FEE_3_NEW_FRM_MNY 0.001 2.296 0.001 2.296
+score ADVANCE_FEE_4_NEW_FRM_MNY 2.799 2.141 2.799 2.141
+score ADVANCE_FEE_4_NEW_MONEY 3.200 2.508 3.200 2.508
+score ADVANCE_FEE_5_NEW_FRM_MNY 3.199 3.099 3.199 3.099
+score ADVANCE_FEE_5_NEW_MONEY 2.976 0.558 2.976 0.558
+score AXB_X_FF_SEZ_S 3.600 3.399 3.600 3.399
+score BODY_SINGLE_URI 0.001 1.607 0.001 1.607
+score BODY_SINGLE_WORD 2.602 0.001 2.602 0.001
+score COMPENSATION 0.001 0.000 0.001 0.000
+score DEAR_BENEFICIARY 0.483 1.470 0.483 1.470
+score DEAR_EMAIL 3.499 2.715 3.499 2.715
+score DX_TEXT_01 2.699 2.599 2.699 2.599
+score FROM_MISSP_DYNIP 1.536 2.399 1.536 2.399
+score FROM_MISSP_EH_MATCH 1.685 1.263 1.685 1.263
+score HK_NAME_MR_MRS 4.085 2.994 4.085 2.994
+score HK_SCAM_N3 2.799 2.699 2.799 2.699
+score HTML_FONT_TINY 2.194 2.648 2.194 2.648
+score KHOP_DYNAMIC 3.030 1.997 3.030 1.997
+score LIST_PARTIAL_SHORT_MSG 2.499 2.276 2.499 2.276
+score MILLION_USD 3.157 2.189 3.157 2.189

> You kinda have to work backwards through the scripts to find what is
> generating the scores-set0 file and turning it into 72_scores.cf. I am
> grep'ing through the work dir on the SA server now but it contains a lot of
> files. I need to find the large dirs and exclude them.

--
Daniel J. Luke
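
P.S. In case anyone wants to repeat the filtering above without doing it by hand, here is a rough Python sketch. It assumes the diff is plain text in which dropped and added entries look like the "-score NAME ..." and "+score NAME ..." lines listed above. The names split_rule_changes and SCORE_LINE are made up for this example and are not part of the SpamAssassin tree.

```python
import re

# Matches diff lines such as "-score RULE_NAME 1.000 ..." or "+score RULE_NAME ...".
# Assumption: every score entry in the diff has this shape.
SCORE_LINE = re.compile(r"^([+-])score\s+(\S+)\s")


def split_rule_changes(diff_text):
    """Return (dropped, added) rule names from a diff of 72_scores.cf.

    A rule that shows up on both a '-' and a '+' line only had its score
    changed, so it is left out of both lists.
    """
    removed, added = set(), set()
    for line in diff_text.splitlines():
        match = SCORE_LINE.match(line)
        if match is None:
            continue  # context lines, diff headers, anything that is not a score entry
        sign, rule = match.groups()
        (removed if sign == "-" else added).add(rule)
    changed = removed & added  # present on both sides: just a score change
    return sorted(removed - changed), sorted(added - changed)
```

Run against a diff between the last good 72_scores.cf and a freshly generated one, an empty "dropped" list would be one sign that the scores-set0 generation is complete again. That is only a sanity check, not proof that runGA is fixed.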