On Sep 15, 2017, at 12:24 PM, David Jones <djo...@ena.com> wrote:
> 1. Actually start here with the runGA call:
> 
> http://svn.apache.org/viewvc/spamassassin/trunk/masses/rule-update-score-gen/generate-new-scores.sh?revision=1798589&view=markup#l271
> 
> 2. Here is the runGA script (not changed in almost 8 years):
> 
> http://svn.apache.org/viewvc/spamassassin/trunk/masses/runGA?view=log
> 
> 3. Somewhere near the bottom of the runGA script there is a problem 
> generating a complete scores-set0, which becomes the 72_scores.cf:
> 
> http://svn.apache.org/viewvc/spamassassin/trunk/rulesrc/scores/
> 
> 4. This shows the resulting difference between the last good 72_scores.cf 
> and the latest version:
> 
> http://svn.apache.org/viewvc/spamassassin/trunk/rulesrc/scores/72_scores.cf?r1=1786976&r2=1808406

I don't know if it helps, but I manually removed the lines from that diff that 
looked like plain score changes. What's left is below: score lines that show up 
on only one side of the diff. The idea is to have a good example of what to 
look for after making changes, so we can tell when this is 'fixed':

-score AC_SPAMMY_URI_PATTERNS1               1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS10              1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS11              1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS12              1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS2               1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS3               1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS4               1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS8               1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS9               1.000 1.000 1.000 1.000
-score ADVANCE_FEE_2_NEW_FORM                1.000 1.000 1.000 1.000
-score ADVANCE_FEE_2_NEW_MONEY               1.997 0.001 1.997 0.001
-score AXB_XM_FORGED_OL2600                  1.190 2.699 1.190 2.699
-score BODY_EMPTY                            1.997 1.999 1.997 1.999
-score CANT_SEE_AD                           2.996 0.500 2.996 0.500
-score CN_B2B_SPAMMER                        0.001 0.001 0.001 0.001
-score COMMENT_GIBBERISH                     1.498 1.499 1.498 1.499
-score ENCRYPTED_MESSAGE                     -1.000 -1.000 -1.000 -1.000
-score FORM_LOW_CONTRAST                     1.000 1.000 1.000 1.000
-score FREEMAIL_DOC_PDF_BCC                  2.596 2.599 2.596 2.599
-score FREEMAIL_FORGED_FROMDOMAIN            0.001 0.199 0.001 0.199
-score FROM_MISSPACED                        0.001 0.001 0.001 0.001
-score FROM_MISSP_SPF_FAIL                   0.001 1.000 0.001 1.000
-score FROM_MISSP_XPRIO                      0.001 0.001 0.001 0.001
-score FROM_WORDY_SHORT                      1.000 1.000 1.000 1.000
-score GOOGLE_DOCS_PHISH                     1.000 1.000 1.000 1.000
-score GOOGLE_DOCS_PHISH_MANY                1.000 1.000 1.000 1.000
-score GOOG_MALWARE_DNLD                     1.000 1.000 1.000 1.000
-score HEADER_FROM_DIFFERENT_DOMAINS         0.001 0.001 0.001 0.001
-score HEXHASH_WORD                          1.000 1.000 1.000 1.000
-score HK_RANDOM_FROM                        0.998 0.001 0.998 0.001
-score HK_SCAM_N15                           1.935 2.499 1.935 2.499
-score HTML_OFF_PAGE                         1.000 1.000 1.000 1.000
-score LIST_PRTL_PUMPDUMP                    1.000 1.000 1.000 1.000
-score LONG_HEX_URI                          2.194 2.290 2.194 2.290
-score LOTS_OF_MONEY                         0.001 0.001 0.001 0.001
-score LOTTO_DEPT                            0.001 0.001 0.001 0.001
-score LUCRATIVE                             1.000 1.000 1.000 1.000
-score MIMEOLE_DIRECT_TO_MX                  1.445 0.381 1.445 0.381
-score MIME_NO_TEXT                          1.000 1.000 1.000 1.000
-score MONEY_FRAUD_3                         2.896 0.001 2.896 0.001
-score MONEY_FRAUD_5                         3.096 0.001 3.096 0.001
-score MONEY_FRAUD_8                         2.548 0.001 2.548 0.001
-score MONEY_LOTTERY                         2.498 1.611 2.498 1.611
-score MSGID_NOFQDN1                         2.395 3.299 2.395 3.299
-score MSM_PRIO_REPTO                        2.497 0.180 2.497 0.180
-score NSL_RCVD_FROM_USER                    0.548 0.001 0.548 0.001
-score NSL_RCVD_HELO_USER                    1.273 0.001 1.273 0.001
-score PHP_NOVER_MUA                         1.000 1.000 1.000 1.000
-score PHP_ORIG_SCRIPT                       0.502 2.499 0.502 2.499
-score PHP_SCRIPT_MUA                        1.000 1.000 1.000 1.000
-score PP_MIME_FAKE_ASCII_TEXT               0.429 0.001 0.429 0.001
-score PP_TOO_MUCH_UNICODE02                 0.500 0.500 0.500 0.500
-score PP_TOO_MUCH_UNICODE05                 1.000 1.000 1.000 1.000
-score PUMPDUMP                              1.000 1.000 1.000 1.000
-score PUMPDUMP_MULTI                        1.000 1.000 1.000 1.000
-score RAND_HEADER_MANY                      1.000 1.000 1.000 1.000
-score RCVD_IN_MSPIKE_BL                     0.001 0.010 0.001 0.010
-score RCVD_IN_MSPIKE_H2                     0.001 -2.800 0.001 -2.800
-score RCVD_IN_MSPIKE_H3                     0.001 -0.010 0.001 -0.010
-score RCVD_IN_MSPIKE_H4                     0.001 -0.010 0.001 -0.010
-score RCVD_IN_MSPIKE_H5                     0.001 -1.000 0.001 -1.000
-score RCVD_IN_MSPIKE_L2                     0.001 0.001 0.001 0.001
-score RCVD_IN_MSPIKE_L3                     0.001 0.001 0.001 0.001
-score RCVD_IN_MSPIKE_L4                     0.001 0.001 0.001 0.001
-score RCVD_IN_MSPIKE_L5                     0.001 0.001 0.001 0.001
-score RCVD_IN_MSPIKE_WL                     0.001 -0.010 0.001 -0.010
-score RCVD_IN_MSPIKE_ZBI                    0.001 0.001 0.001 0.001
-score RP_MATCHES_RCVD                       -1.050 -0.001 -1.050 -0.001
-score SHARE_50_50                           2.121 1.818 2.121 1.818
-score SPOOFED_FREEM_REPTO                   2.498 1.368 2.498 1.368
-score SPOOFED_FREEM_REPTO_CHN               1.000 1.000 1.000 1.000
-score STATIC_XPRIO_OLE                      1.997 0.001 1.997 0.001
-score STOCK_LOW_CONTRAST                    2.030 2.347 2.030 2.347
-score STOCK_TIP                             1.000 1.000 1.000 1.000
-score STYLE_GIBBERISH                       2.800 3.093 2.800 3.093
-score SURBL_BLOCKED                         0.001 0.001 0.001 0.001
-score SYSADMIN                              1.000 1.000 1.000 1.000
-score THIS_AD                               0.596 2.200 0.596 2.200
-score TO_EQ_FM_DIRECT_MX                    2.497 0.622 2.497 0.622
-score TO_EQ_FM_DOM_SPF_FAIL                 0.001 0.001 0.001 0.001
-score TO_EQ_FM_SPF_FAIL                     0.001 0.001 0.001 0.001
-score TO_IN_SUBJ                            0.099 0.099 0.099 0.099
-score TO_NO_BRKTS_FROM_MSSP                 0.001 0.001 0.001 0.001
-score TO_NO_BRKTS_HTML_IMG                  0.001 2.000 0.001 2.000
-score TO_NO_BRKTS_HTML_ONLY                 1.997 0.001 1.997 0.001
-score TO_NO_BRKTS_MSFT                      2.497 0.001 2.497 0.001
-score TO_NO_BRKTS_NORDNS_HTML               0.398 0.001 0.398 0.001
-score TO_NO_BRKTS_PCNT                      2.497 0.001 2.497 0.001
-score TVD_SPACE_ENCODED                     2.497 0.001 2.497 0.001
-score TVD_SPACE_ENC_FM_MIME                 1.997 0.001 1.997 0.001
-score TVD_SPACE_RATIO_MINFP                 2.497 0.001 2.497 0.001
-score TW_GIBBERISH_MANY                     1.000 1.000 1.000 1.000
-score UC_GIBBERISH_OBFU                     1.000 1.000 1.000 1.000
-score URI_DATA                              1.000 1.000 1.000 1.000
-score URI_GOOGLE_PROXY                      0.710 1.378 0.710 1.378
-score URI_ONLY_MSGID_MALF                   0.001 1.191 0.001 1.191
-score URI_OPTOUT_3LD                        1.000 1.000 1.000 1.000
-score URI_PHISH                             3.995 3.999 3.995 3.999
-score URI_TRY_3LD                           0.195 0.001 0.195 0.001
-score URI_TRY_USME                          0.001 0.001 0.001 0.001
-score URI_WPADMIN                           3.396 3.014 3.396 3.014
-score URI_WP_DIRINDEX                       1.000 1.000 1.000 1.000
-score URI_WP_HACKED                         2.996 3.000 2.996 3.000
-score URI_WP_HACKED_2                       1.187 1.764 1.187 1.764
-score XPRIO                                 2.248 2.249 2.248 2.249
-score XPRIO_SHORT_SUBJ                      1.000 1.000 1.000 1.000
+score ADVANCE_FEE_3_NEW_FRM_MNY      0.001 2.296 0.001 2.296
+score ADVANCE_FEE_4_NEW_FRM_MNY      2.799 2.141 2.799 2.141
+score ADVANCE_FEE_4_NEW_MONEY        3.200 2.508 3.200 2.508
+score ADVANCE_FEE_5_NEW_FRM_MNY      3.199 3.099 3.199 3.099
+score ADVANCE_FEE_5_NEW_MONEY        2.976 0.558 2.976 0.558
+score AXB_X_FF_SEZ_S                 3.600 3.399 3.600 3.399
+score BODY_SINGLE_URI                0.001 1.607 0.001 1.607
+score BODY_SINGLE_WORD               2.602 0.001 2.602 0.001
+score COMPENSATION                   0.001 0.000 0.001 0.000
+score DEAR_BENEFICIARY               0.483 1.470 0.483 1.470
+score DEAR_EMAIL                     3.499 2.715 3.499 2.715
+score DX_TEXT_01                     2.699 2.599 2.699 2.599
+score FROM_MISSP_DYNIP               1.536 2.399 1.536 2.399
+score FROM_MISSP_EH_MATCH            1.685 1.263 1.685 1.263
+score HK_NAME_MR_MRS                 4.085 2.994 4.085 2.994
+score HK_SCAM_N3                     2.799 2.699 2.799 2.699
+score HTML_FONT_TINY                 2.194 2.648 2.194 2.648
+score KHOP_DYNAMIC                   3.030 1.997 3.030 1.997
+score LIST_PARTIAL_SHORT_MSG         2.499 2.276 2.499 2.276
+score MILLION_USD                    3.157 2.189 3.157 2.189
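
The manual filtering above could probably be scripted. Here is a minimal Python 
sketch, assuming the diff is in unified format and that every score line looks 
like "score NAME v1 v2 v3 v4". The function name rules_added_or_removed is made 
up for illustration, and the "+score XPRIO" line in the sample is invented to 
show a pure score change being dropped; it is not from the real diff.

```python
import re

# Matches unified-diff lines such as "-score RULE_NAME 1.000 1.000 1.000 1.000".
# File headers ("--- old", "+++ new") do not match because "score" must follow
# the sign character directly.
SCORE_LINE = re.compile(r"^([-+])score\s+(\S+)\s+(.*)$")


def rules_added_or_removed(diff_lines):
    """Return (removed, added) dicts mapping rule name -> score text.

    Rules that appear on both the '-' and '+' side are treated as plain
    score changes and left out of both dicts.
    """
    removed, added = {}, {}
    for line in diff_lines:
        match = SCORE_LINE.match(line.rstrip("\n"))
        if not match:
            continue  # headers, hunk markers, context lines
        sign, name, scores = match.groups()
        target = removed if sign == "-" else added
        target[name] = scores.strip()

    # Names present on both sides only had their scores changed.
    for name in removed.keys() & added.keys():
        del removed[name]
        del added[name]
    return removed, added


# Small sample: two lines taken from the diff above plus one invented
# "+score XPRIO" line (hypothetical) to exercise the score-change case.
sample = [
    "-score BODY_EMPTY                            1.997 1.999 1.997 1.999",
    "-score XPRIO                                 2.248 2.249 2.248 2.249",
    "+score XPRIO                                 1.000 1.000 1.000 1.000",
    "+score MILLION_USD                    3.157 2.189 3.157 2.189",
]

removed, added = rules_added_or_removed(sample)
for name in sorted(removed):
    print(f"-score {name} {removed[name]}")
for name in sorted(added):
    print(f"+score {name} {added[name]}")
```

Running the same function over the full diff after a fix should give two empty 
dicts if the regenerated 72_scores.cf has the same set of rules as the last good 
revision.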

> You kinda have to work backwards through the scripts to find what is 
> generating the scores-set0 file and turning it into 72_scores.cf.  I am 
> grep'ing through the work dir on the SA server now but it contains a lot of 
> files.  I need to find the large dirs and exclude them.

-- 
Daniel J. Luke


