Re: 72_scores.cf compared to the one from march 15
On 2017-11-16 10:06, Merijn van den Kroonenberg wrote:
> On 11/15/2017 07:10 AM, Dave Jones wrote:
>> I got my SVN authentication issue figured out on my laptop and committed
>> these. Fingers crossed for the run in about 5 hours.
>
> I have been comparing last night's 72_scores.cf against the one from
> March and it looks *really* good now. That last commit pushed the number
> of lines right up to the number we had in March.
>
> I also ran the compare-rulefiles script just like yesterday.
>
> ./compare-rulefiles -d 72_scores_20170315.cf 72_scores-1815405.cf > deleted_rules.txt
> ./compare-rulefiles -r 0 -d deleted_rules.txt active-1815421.list > deactivated_rules.txt

Small mistake: I used a too-new active.list. The corrected command:

./compare-rulefiles -r 0 -d deleted_rules.txt active-1815296.list > deactivated_rules.txt

> ./compare-rulefiles -r 0 -a deactivated_rules.txt deleted_rules.txt > disappeared_rules.txt
>
> cat disappeared_rules.txt
> ADVANCE_FEE_4_NEW
> ADVANCE_FEE_5_NEW
> CN_B2B_SPAMMER
> URI_GOOGLE_PROXY

cat disappeared_rules.txt
ADVANCE_FEE_4_NEW
CN_B2B_SPAMMER
URI_GOOGLE_PROXY

So even fewer with the correct active.list.

> So that's only 4 rules which are not in our new scores file but which
> were in the March one (discounting deactivated rules).
>
> When looking at the changes between now and then, I see nothing
> suspicious. I am now pretty confident the score generation is running as
> it did in March. Anything which is not right probably wasn't right in
> March either ;)
>
> I would say, let's get people testing!
>
> Here are the full changes between now and March so you can see for
> yourself:
>
> ./compare-rulefiles 72_scores_20170315.cf 72_scores-1815405.cf
>
> Only in 1 (removed in 2)
> ADVANCE_FEE_4_NEW
> ADVANCE_FEE_5_NEW
> AXB_XMAILER_MIMEOLE_OL_1ECD5
> AXB_XM_FORGED_OL2600
> BODY_EMPTY
> CN_B2B_SPAMMER
> FREEMAIL_DOC_PDF_BCC
> FSL_HELO_BARE_IP_2
> HDRS_LCASE
> HK_SCAM_N15
> LOTTO_AGENT
> LOTTO_DEPT
> MONEY_LOTTERY
> MSGID_NOFQDN1
> RP_MATCHES_RCVD
> SHARE_50_50
> TO_NO_BRKTS_FROM_MSSP
> URI_GOOGLE_PROXY
>
> Only in 2 (added in 2)
> ADVANCE_FEE_4_NEW_MONEY
> ADVANCE_FEE_5_NEW_FRM_MNY
> ADVANCE_FEE_5_NEW_MONEY
> APOSTROPHE_TOCC
> AXB_X_AOL_SEZ_S
> DEAR_BENEFICIARY
> FROM_MISSP_DYNIP
> FSL_HELO_FAKE
> FSL_MIME_NO_TEXT
> FUZZY_UNSUBSCRIBE
> HDRS_MISSP
> MANY_PILL_PRICE
> MILLION_USD
> MONEY_ATM_CARD
> MONEY_FORM
> MONEY_FORM_SHORT
> MONEY_FROM_41
> MONEY_FROM_MISSP
> SERGIO_SUBJECT_VIAGRA01
> SHORTENED_URL_SRC
> SINGLETS_LOW_CONTRAST
> SPOOFED_FREEM_REPTO_RUS
> TO_NO_BRKTS_DYNIP
>
> Changed
> AC_HTML_NONSENSE_TAGS         1.000 0.001 1.000 0.001   1.000 1.000 1.000 1.000
> ADVANCE_FEE_2_NEW_MONEY       1.997 0.001 1.997 0.001   0.001 0.020 0.001 0.020
> ADVANCE_FEE_3_NEW             3.496 0.001 3.496 0.001   3.001 3.467 3.001 3.467
> ADVANCE_FEE_3_NEW_MONEY       2.796 0.001 2.796 0.001   3.099 2.696 3.099 2.696
> AXB_XMAILER_MIMEOLE_OL_024C2  0.367 0.001 0.367 0.001   1.816 0.006 1.816 0.006
> BODY_URI_ONLY                 0.998 0.001 0.998 0.001   1.000 0.999 1.000 0.999
> BOGUS_MSM_HDRS                0.909 0.001 0.909 0.001   0.795 1.377 0.795 1.377
> CANT_SEE_AD                   2.996 0.500 2.996 0.500   1.000 1.000 1.000 1.000
> CK_HELO_DYNAMIC_SPLIT_IP      1.350 0.001 1.350 0.001   1.500 0.107 1.500 0.107
> CK_HELO_GENERIC               0.249 0.249 0.249 0.249   0.250 0.248 0.250 0.248
> COMMENT_GIBBERISH             1.498 1.499 1.498 1.499   1.000 1.000 1.000 1.000
> DATE_IN_FUTURE_96_Q           3.296 3.299 3.296 3.299   2.899 2.696 2.899 2.696
> FBI_MONEY                     0.696 0.001 0.696 0.001   1.000 1.000 1.000 1.000
> FBI_SPOOF                     1.999 1.999 1.999 1.999   1.000 1.000 1.000 1.000
> FILL_THIS_FORM                2.748 0.001 2.748 0.001   0.113 1.488 0.113 1.488
> FORM_FRAUD                    0.998 0.001 0.998 0.001   1.000 0.998 1.000 0.998
> FORM_FRAUD_3                  2.696 0.001 2.696 0.001   2.899 0.999 2.899 0.999
> FORM_FRAUD_5                  0.209 0.001 0.209 0.001   3.499 1.594 3.499 1.594
> FOUND_YOU                     3.013 0.001 3.013 0.001   1.000 1.000 1.000 1.000
> FREEMAIL_FORGED_FROMDOMAIN    0.001 0.199 0.001 0.199   0.001 0.001 0.001 0.001
> FROM_IN_TO_AND_SUBJ           0.287 0.262 0.287 0.262   0.001 0.001 0.001 0.001
> FROM_MISSP_FREEMAIL           3.595 0.001 3.595 0.001   2.213 1.781 2.213 1.781
> FROM_MISSP_MSFT               0.001 0.001 0.001 0.001   1.097 1.596 1.097 1.596
> FROM_MISSP_REPLYTO            0.001 0.001 0.001 0.001   2.443 0.001 2.443 0.001
> FROM_MISSP_SPF_FAIL           0.001 1.000 0.001 1.000   0.001 0.001 0.001 0.001
> FROM_MISSP_TO_UNDISC          1.438 0.001 1.438 0.001   1.472 0.448 1.472 0.448
> FROM_MISSP_USER               0.001 0.001 0.001 0.001   3.316 1.188 3.316 1.188
> FROM_MISSP_XPRIO              0.001 0.001 0.001 0.001   1.785 2.497 1.785 2.497
> FROM_WORDY                    2.497 0.001 2.497 0.001   2.500 2.498 2.500 2.498
> FSL_CTYPE_WIN1251             0.001 0.001 0.001 0.001   3.515 3.080 3.515 3.080
> FSL_NEW_HELO_USER             0.083 0.001 0.083 0.001   1.719 0.750 1.719 0.750
> HELO_MISC_IP                  0.248 0.250 0.248 0.250   0.250 0.249 0.250 0.249
> HK_RANDOM_FROM                0.998 0.001 0.998 0.001   0.999 0.999 0.999 0.999
> HK_SCAM_N2                    3.249 0.001 3.249 0.001   1.498 2.696 1.498 2.696
> IMG_DIRECT_TO_MX              2.397 2.400 2.397 2.400   3.599 1.744 3.599 1.744
> LIST_PRTL_SAME_USER           0.001 0.286 0.001 0.286   1.000 1.000 1.000 1.000
> LONG_HEX_URI                  2.194 2.290 2.194 2.290   1.102 0.853 1.102 0.853
> LONG_IMG_URI                  0.553 0.100
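The three compare-rulefiles filtering steps earlier in this message amount to two set subtractions followed by a third. The sketch below is only one reading of those steps, not the compare-rulefiles script itself: the function names are hypothetical, the inputs are plain lists of rule names, and the rule names in the example are placeholders rather than data from the thread.

```python
def subtract(names, exclude):
    """Return the names that are not in `exclude`, keeping input order."""
    excluded = set(exclude)
    return [name for name in names if name not in excluded]


def disappeared_rules(old_scores, new_scores, active):
    """Rules scored in the old file, absent from the new one, yet still active.

    Assumed reading of the three steps:
      1. deleted     = rules in the old scores file but not in the new one
      2. deactivated = deleted rules that are no longer in active.list
      3. disappeared = deleted rules that are not deactivated
    """
    deleted = subtract(old_scores, new_scores)
    deactivated = subtract(deleted, active)
    return subtract(deleted, deactivated)


# Placeholder rule names, purely for illustration.
old = ["RULE_A", "RULE_B", "RULE_C", "RULE_D"]
new = ["RULE_A"]
active = ["RULE_A", "RULE_B", "RULE_D"]
print(disappeared_rules(old, new, active))  # ['RULE_B', 'RULE_D']
```

Under this reading, a rule that was dropped from active.list (RULE_C here) is expected to vanish from the scores file, so only rules that are still active but unscored are worth investigating.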
Re: 72_scores.cf compared to the one from march 15
> On 11/15/2017 07:10 AM, Dave Jones wrote:
>
> I got my SVN authentication issue figured out on my laptop and committed
> these. Fingers crossed for the run in about 5 hours.

I have been comparing last night's 72_scores.cf against the one from March
and it looks *really* good now. That last commit pushed the number of
lines right up to the number we had in March.

I also ran the compare-rulefiles script just like yesterday.

./compare-rulefiles -d 72_scores_20170315.cf 72_scores-1815405.cf > deleted_rules.txt
./compare-rulefiles -r 0 -d deleted_rules.txt active-1815421.list > deactivated_rules.txt
./compare-rulefiles -r 0 -a deactivated_rules.txt deleted_rules.txt > disappeared_rules.txt

cat disappeared_rules.txt
ADVANCE_FEE_4_NEW
ADVANCE_FEE_5_NEW
CN_B2B_SPAMMER
URI_GOOGLE_PROXY

So that's only 4 rules which are not in our new scores file but which were
in the March one (discounting deactivated rules).

When looking at the changes between now and then, I see nothing
suspicious. I am now pretty confident the score generation is running as
it did in March. Anything which is not right probably wasn't right in
March either ;)

I would say, let's get people testing!

Here are the full changes between now and March so you can see for
yourself:

./compare-rulefiles 72_scores_20170315.cf 72_scores-1815405.cf

Only in 1 (removed in 2)
ADVANCE_FEE_4_NEW
ADVANCE_FEE_5_NEW
AXB_XMAILER_MIMEOLE_OL_1ECD5
AXB_XM_FORGED_OL2600
BODY_EMPTY
CN_B2B_SPAMMER
FREEMAIL_DOC_PDF_BCC
FSL_HELO_BARE_IP_2
HDRS_LCASE
HK_SCAM_N15
LOTTO_AGENT
LOTTO_DEPT
MONEY_LOTTERY
MSGID_NOFQDN1
RP_MATCHES_RCVD
SHARE_50_50
TO_NO_BRKTS_FROM_MSSP
URI_GOOGLE_PROXY

Only in 2 (added in 2)
ADVANCE_FEE_4_NEW_MONEY
ADVANCE_FEE_5_NEW_FRM_MNY
ADVANCE_FEE_5_NEW_MONEY
APOSTROPHE_TOCC
AXB_X_AOL_SEZ_S
DEAR_BENEFICIARY
FROM_MISSP_DYNIP
FSL_HELO_FAKE
FSL_MIME_NO_TEXT
FUZZY_UNSUBSCRIBE
HDRS_MISSP
MANY_PILL_PRICE
MILLION_USD
MONEY_ATM_CARD
MONEY_FORM
MONEY_FORM_SHORT
MONEY_FROM_41
MONEY_FROM_MISSP
SERGIO_SUBJECT_VIAGRA01
SHORTENED_URL_SRC
SINGLETS_LOW_CONTRAST
SPOOFED_FREEM_REPTO_RUS
TO_NO_BRKTS_DYNIP

Changed
AC_HTML_NONSENSE_TAGS         1.000 0.001 1.000 0.001   1.000 1.000 1.000 1.000
ADVANCE_FEE_2_NEW_MONEY       1.997 0.001 1.997 0.001   0.001 0.020 0.001 0.020
ADVANCE_FEE_3_NEW             3.496 0.001 3.496 0.001   3.001 3.467 3.001 3.467
ADVANCE_FEE_3_NEW_MONEY       2.796 0.001 2.796 0.001   3.099 2.696 3.099 2.696
AXB_XMAILER_MIMEOLE_OL_024C2  0.367 0.001 0.367 0.001   1.816 0.006 1.816 0.006
BODY_URI_ONLY                 0.998 0.001 0.998 0.001   1.000 0.999 1.000 0.999
BOGUS_MSM_HDRS                0.909 0.001 0.909 0.001   0.795 1.377 0.795 1.377
CANT_SEE_AD                   2.996 0.500 2.996 0.500   1.000 1.000 1.000 1.000
CK_HELO_DYNAMIC_SPLIT_IP      1.350 0.001 1.350 0.001   1.500 0.107 1.500 0.107
CK_HELO_GENERIC               0.249 0.249 0.249 0.249   0.250 0.248 0.250 0.248
COMMENT_GIBBERISH             1.498 1.499 1.498 1.499   1.000 1.000 1.000 1.000
DATE_IN_FUTURE_96_Q           3.296 3.299 3.296 3.299   2.899 2.696 2.899 2.696
FBI_MONEY                     0.696 0.001 0.696 0.001   1.000 1.000 1.000 1.000
FBI_SPOOF                     1.999 1.999 1.999 1.999   1.000 1.000 1.000 1.000
FILL_THIS_FORM                2.748 0.001 2.748 0.001   0.113 1.488 0.113 1.488
FORM_FRAUD                    0.998 0.001 0.998 0.001   1.000 0.998 1.000 0.998
FORM_FRAUD_3                  2.696 0.001 2.696 0.001   2.899 0.999 2.899 0.999
FORM_FRAUD_5                  0.209 0.001 0.209 0.001   3.499 1.594 3.499 1.594
FOUND_YOU                     3.013 0.001 3.013 0.001   1.000 1.000 1.000 1.000
FREEMAIL_FORGED_FROMDOMAIN    0.001 0.199 0.001 0.199   0.001 0.001 0.001 0.001
FROM_IN_TO_AND_SUBJ           0.287 0.262 0.287 0.262   0.001 0.001 0.001 0.001
FROM_MISSP_FREEMAIL           3.595 0.001 3.595 0.001   2.213 1.781 2.213 1.781
FROM_MISSP_MSFT               0.001 0.001 0.001 0.001   1.097 1.596 1.097 1.596
FROM_MISSP_REPLYTO            0.001 0.001 0.001 0.001   2.443 0.001 2.443 0.001
FROM_MISSP_SPF_FAIL           0.001 1.000 0.001 1.000   0.001 0.001 0.001 0.001
FROM_MISSP_TO_UNDISC          1.438 0.001 1.438 0.001   1.472 0.448 1.472 0.448
FROM_MISSP_USER               0.001 0.001 0.001 0.001   3.316 1.188 3.316 1.188
FROM_MISSP_XPRIO              0.001 0.001 0.001 0.001   1.785 2.497 1.785 2.497
FROM_WORDY                    2.497 0.001 2.497 0.001   2.500 2.498 2.500 2.498
FSL_CTYPE_WIN1251             0.001 0.001 0.001 0.001   3.515 3.080 3.515 3.080
FSL_NEW_HELO_USER             0.083 0.001 0.083 0.001   1.719 0.750 1.719 0.750
HELO_MISC_IP                  0.248 0.250 0.248 0.250   0.250 0.249 0.250 0.249
HK_RANDOM_FROM                0.998 0.001 0.998 0.001   0.999 0.999 0.999 0.999
HK_SCAM_N2                    3.249 0.001 3.249 0.001   1.498 2.696 1.498 2.696
IMG_DIRECT_TO_MX              2.397 2.400 2.397 2.400   3.599 1.744 3.599 1.744
LIST_PRTL_SAME_USER           0.001 0.286 0.001 0.286   1.000 1.000 1.000 1.000
LONG_HEX_URI                  2.194 2.290 2.194 2.290   1.102 0.853 1.102 0.853
LONG_IMG_URI                  0.553 0.100 0.553 0.100   0.554 1.000 0.554 1.000
LOTS_OF_MONEY                 0.001 0.001 0.001 0.001   0.001 0.005 0.001 0.005
MIMEOLE_DIRECT_TO_MX          1.445 0.381 1.445 0.381   1.999 0.738 1.999 0.738
MIME_NO_TEXT                  1.000 1.000 1.000 1.000   1.803 1.997 1.803 1.997
MONEY_FRAUD_3                 2.896 0.001 2.896 0.001   3.099 0.263 3.099 0.263
MONEY_FRAUD_5
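A three-way report of this shape (only in 1, only in 2, changed) can be sketched in a few lines. This is an illustration, not the compare-rulefiles script: it assumes each relevant line of a scores file looks like `score RULE_NAME v1 v2 v3 v4`, the function names are hypothetical, and the sample data uses placeholder rules.

```python
def parse_scores(text):
    """Map rule name -> tuple of score strings.

    Assumes lines of the form 'score RULE_NAME v1 v2 v3 v4';
    comments and any other lines are ignored.
    """
    scores = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "score":
            scores[fields[1]] = tuple(fields[2:])
    return scores


def compare(old_text, new_text):
    """Return (removed, added, changed) rule names, each sorted."""
    old, new = parse_scores(old_text), parse_scores(new_text)
    removed = sorted(old.keys() - new.keys())
    added = sorted(new.keys() - old.keys())
    changed = sorted(n for n in old.keys() & new.keys() if old[n] != new[n])
    return removed, added, changed


# Placeholder data, purely for illustration.
OLD = """# old file
score RULE_A 1.000 0.001 1.000 0.001
score RULE_B 2.000 2.000 2.000 2.000
score RULE_C 0.500 0.500 0.500 0.500
"""
NEW = """# new file
score RULE_A 1.000 1.000 1.000 1.000
score RULE_C 0.500 0.500 0.500 0.500
score RULE_D 3.000 3.000 3.000 3.000
"""
print(compare(OLD, NEW))  # (['RULE_B'], ['RULE_D'], ['RULE_A'])
```

Scores are compared as strings here, so `1.0` and `1.000` would count as a change; converting to `float` first is the obvious variation if that matters.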
Re: 72_scores.cf compared to the one from march 15
On 11/15/2017 4:43 PM, Dave Jones wrote:
> I got my SVN authentication issue figured out on my laptop and committed
> these. Fingers crossed for the run in about 5 hours.

Excellent. Sorry, today was an ASF board meeting so hectic!
Re: 72_scores.cf compared to the one from march 15
On 11/15/2017 07:10 AM, Dave Jones wrote:
> On 11/15/2017 06:40 AM, Kevin A. McGrail wrote:
>> On 11/15/2017 6:33 AM, Merijn van den Kroonenberg wrote:
>>> That, or maybe Kevin can step in for now and do the commit for you?
>>> Good to know you are on the road and thanks for still trying to help!
>>
>> Happy to try and help!
>>
>> Regards,
>> KAM
>
> On the sa-vm1 server, I need to get these two files committed:
>
> /usr/local/spamassassin/automc/svn/trunk/masses/rule-update-score-gen$ svn status
> M       generate-new-scores.sh
> M       lock-scores
>
> I would normally copy these to /tmp then scp them down to my local
> desktop/laptop check out location to commit them.
>
> The generate-new-scores.sh has the SVN $REVISION determined from the
> majority masscheck submissions and we think the lock-scores is the one
> that was running on the old server back in March but wasn't committed to
> the main dir like it should have been.
>
> Dave

I got my SVN authentication issue figured out on my laptop and committed
these. Fingers crossed for the run in about 5 hours.

Dave
Re: 72_scores.cf compared to the one from march 15
On 11/15/2017 06:40 AM, Kevin A. McGrail wrote:
> On 11/15/2017 6:33 AM, Merijn van den Kroonenberg wrote:
>> That, or maybe Kevin can step in for now and do the commit for you?
>> Good to know you are on the road and thanks for still trying to help!
>
> Happy to try and help!
>
> Regards,
> KAM

On the sa-vm1 server, I need to get these two files committed:

/usr/local/spamassassin/automc/svn/trunk/masses/rule-update-score-gen$ svn status
M       generate-new-scores.sh
M       lock-scores

I would normally copy these to /tmp then scp them down to my local
desktop/laptop check out location to commit them.

The generate-new-scores.sh has the SVN $REVISION determined from the
majority masscheck submissions and we think the lock-scores is the one
that was running on the old server back in March but wasn't committed to
the main dir like it should have been.

Dave
Re: 72_scores.cf compared to the one from march 15
On 11/15/2017 6:33 AM, Merijn van den Kroonenberg wrote:
> That, or maybe Kevin can step in for now and do the commit for you?
> Good to know you are on the road and thanks for still trying to help!

Happy to try and help!

Regards,
KAM
Re: 72_scores.cf compared to the one from march 15
> On 11/15/2017 05:22 AM, Merijn van den Kroonenberg wrote:
>>> I updated the masses/rule-update-score-gen/lock-scores file from
>>> rulesrc/sandbox/dos/new-rule-score-gen/lock-scores on the
>>> sa-vm1.apache.org server so fingers crossed on the 72_scores.cf here in
>>> about 5 hours.
>>
>> This script is always freshly checked out, so uncommitted changes can
>> never be tested. If you check automc/tmp you will still see the old
>> version of the script. I must admit, I fell for it too, and only found
>> out after actually checking the script in the temp dir, when I wondered
>> why there was no change in ranges.data.
>>
>>> Dave
>
> Darn. I am having problems with my SVN ID right now so I was hoping I
> didn't have to commit these changes to test them on the server. I am
> travelling with my laptop that doesn't have something set up quite right
> so I will have to figure out the SVN authentication setup since I won't
> be back at my primary desktop PC for about 10 days.

That, or maybe Kevin can step in for now and do the commit for you?
Good to know you are on the road and thanks for still trying to help!

> Dave
Re: 72_scores.cf compared to the one from march 15
On 11/15/2017 05:22 AM, Merijn van den Kroonenberg wrote:
>> I updated the masses/rule-update-score-gen/lock-scores file from
>> rulesrc/sandbox/dos/new-rule-score-gen/lock-scores on the
>> sa-vm1.apache.org server so fingers crossed on the 72_scores.cf here in
>> about 5 hours.
>
> This script is always freshly checked out, so uncommitted changes can
> never be tested. If you check automc/tmp you will still see the old
> version of the script. I must admit, I fell for it too, and only found
> out after actually checking the script in the temp dir, when I wondered
> why there was no change in ranges.data.
>
>> Dave

Darn. I am having problems with my SVN ID right now so I was hoping I
didn't have to commit these changes to test them on the server. I am
travelling with my laptop that doesn't have something set up quite right
so I will have to figure out the SVN authentication setup since I won't
be back at my primary desktop PC for about 10 days.

Dave
Re: 72_scores.cf compared to the one from march 15
> I updated the masses/rule-update-score-gen/lock-scores file from
> rulesrc/sandbox/dos/new-rule-score-gen/lock-scores on the
> sa-vm1.apache.org server so fingers crossed on the 72_scores.cf here in
> about 5 hours.

This script is always freshly checked out, so uncommitted changes can
never be tested. If you check automc/tmp you will still see the old
version of the script. I must admit, I fell for it too, and only found
out after actually checking the script in the temp dir, when I wondered
why there was no change in ranges.data.

> Dave
Re: 72_scores.cf compared to the one from march 15
> On 11/14/2017 07:28 AM, Merijn van den Kroonenberg wrote:
>>> Hi,
>>>
>>> When I compare the current 72_scores.cf with the one from march 15 I
>>> can see we are getting closer and closer.
>>> The march one has 144 lines and the current one has 108.
>>
>> Actually, personally I think the issue below should be addressed before
>> going live with the new score generation. Without it there is still too
>> big a difference and I would not feel confident that some major issue
>> is not lurking below this. I understand you want to go live sooner
>> rather than later, but well, these are my thoughts :)
>
> Last night's issue was my goof. Yesterday's 72_scores.cf was much
> closer to March's size.

It has been hovering around 103 lines for some time now. But it used to
be 140-160 lines.

> Keep in mind that the size of the 72_scores.cf has fluctuated over the
> years so we aren't sure that it has to be the same number of lines that
> it was back in March to be correct now. If you know something is still
> broken with the 72_scores.cf we can hold off and get it corrected. Do
> you know of anything we need to address still?

Yes, that's why I do not only look at the number of lines, but also check
which lines are missing (or new) and why.

In the part below (which you cut off in this mail, but which was still in
my previous mail) I found another change in the "dos" lock-scores which is
not in the lock-scores that is actually used. It is similar to what
happened to the merge-scores script.

I think with this fixed the 72_scores.cf will look much more like the
March one, so it's much easier for me to check and explain any remaining
differences. Then I could see if suspicious rules are missing or added.

> I have installed yesterday's ruleset manually on my SA platforms and
> will check the scoring levels today and tomorrow.
>
> Dave

Due to the way I debug and solve problems I need to theorize about them
instead of just testing. That's why I really would like to have
lock-scores fixed.
Re: 72_scores.cf compared to the one from march 15
> Hi,
>
> When I compare the current 72_scores.cf with the one from march 15 I can
> see we are getting closer and closer.
> The march one has 144 lines and the current one has 108.

I have been looking at this, and by backtracking I see that the
lock-scores script has a definite impact on ranges.data, and through that
on which rules are used by the garescorer.

Looking at the script I remembered I already had a note about it. It's
also in rulesrc/sandbox/dos/new-rule-score-gen/lock-scores, which has been
changed compared to the masses/rule-update-score-gen/lock-scores version
which we use now.

The changes seem to be related to assigning ranges to rules if they have
scores defined in the sandboxes. I think it's likely this was also running
in production in March. So I would like to see what happens if these
changes are ported to masses/rule-update-score-gen/lock-scores (they must
be committed to svn for testing).

When I have some time I want to make a write-up of which rules are
considered for score generation and what happens if scores are not
generated for rules. We probably need to have a good look at what the
intention should be, after we have updates running again.

> When looking at the rules which are missing, then one case stands out
> clearly:
> All rules in the march version with a score like this:
> 1.000 1.000 1.000 1.000
> Are missing from our current 72_scores.cf
> [edit: they all seem to be in active.list with a tflags publish]
>
> I will see if I can find where they get lost ;)
>
> One other rule which is still missing is RP_MATCHES_RCVD, which I could
> imagine being used in custom meta rules.
>
> So I compiled a list of all rules in the March 72_scores.cf which are
> not in our current one:
>
> AC_SPAMMY_URI_PATTERNS1
> AC_SPAMMY_URI_PATTERNS10
> AC_SPAMMY_URI_PATTERNS11
> AC_SPAMMY_URI_PATTERNS12
> AC_SPAMMY_URI_PATTERNS2
> AC_SPAMMY_URI_PATTERNS3
> AC_SPAMMY_URI_PATTERNS4
> AC_SPAMMY_URI_PATTERNS8
> AC_SPAMMY_URI_PATTERNS9
> AXB_XMAILER_MIMEOLE_OL_1ECD5
> AXB_XM_FORGED_OL2600
> BODY_EMPTY
> CANT_SEE_AD
> CN_B2B_SPAMMER
> COMMENT_GIBBERISH
> ENCRYPTED_MESSAGE
> FORM_LOW_CONTRAST
> FOUND_YOU
> FREEMAIL_DOC_PDF_BCC
> FROM_WORDY_SHORT
> FSL_HELO_BARE_IP_2
> GOOGLE_DOCS_PHISH
> GOOGLE_DOCS_PHISH_MANY
> GOOG_MALWARE_DNLD
> HDRS_LCASE
> HEXHASH_WORD
> HK_SCAM_N15
> HTML_OFF_PAGE
> LIST_PRTL_PUMPDUMP
> LIST_PRTL_SAME_USER
> LOTTO_AGENT
> LOTTO_DEPT
> LUCRATIVE
> MIME_NO_TEXT
> MONEY_LOTTERY
> MSGID_NOFQDN1
> MSM_PRIO_REPTO
> PHP_NOVER_MUA
> PHP_ORIG_SCRIPT
> PHP_SCRIPT_MUA
> PP_TOO_MUCH_UNICODE02
> PP_TOO_MUCH_UNICODE05
> PUMPDUMP
> PUMPDUMP_MULTI
> RAND_HEADER_MANY
> RP_MATCHES_RCVD
> SHARE_50_50
> SPOOFED_FREEM_REPTO_CHN
> STOCK_LOW_CONTRAST
> STOCK_TIP
> SYSADMIN
> TO_NO_BRKTS_PCNT
> TW_GIBBERISH_MANY
> UC_GIBBERISH_OBFU
> URI_DATA
> URI_OPTOUT_3LD
> XPRIO_SHORT_SUBJ
>
> That is 57 rules, more than the difference in rule count. This means
> there are also many rules in our current 72_scores.cf which are not in
> the March version.
>
> Can someone explain to me why or in which cases rules are added or
> removed from the 72_scores.cf?
>
> What I already know:
> 1) during rule promotion rules are added to/removed from active.list,
> which in turn will add/remove them from 72_scores.cf

2) when the hit rate in the corpus falls below 0.01% they are removed too,
it seems. So this also depends on absolute corpus size. In this case they
get the default score (which also sounds weird to me).

> A few from the above list of rules can be tracked to active.list changes
> (rule promotions) between then and now. But most are still in
> active.list.
>
> Cheers,
> Merijn
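
Point 2 above says a rule drops out when its hit rate falls below 0.01%, which makes the outcome depend on corpus size. A small sketch of that arithmetic follows. It is an illustration only, not code from the score-generation scripts: the function names are hypothetical, the numbers in the example are made up, and treating exactly 0.01% as "kept" is an assumption.

```python
# 0.01% is one hit per 10,000 messages.
MESSAGES_PER_HIT = 10_000


def meets_hit_rate(hits, corpus_size):
    """True if hits / corpus_size is at least 0.01% (integer arithmetic)."""
    return hits * MESSAGES_PER_HIT >= corpus_size


def min_hits(corpus_size):
    """Smallest hit count that still reaches 0.01% (ceiling division)."""
    return -(-corpus_size // MESSAGES_PER_HIT)


# The same 5 hits pass in a small corpus and fail in a larger one.
print(meets_hit_rate(5, 40_000))   # True  (0.0125%)
print(meets_hit_rate(5, 100_000))  # False (0.005%)
print(min_hits(100_000))           # 10
```

So a rule with a fixed, small number of hits can fall out of 72_scores.cf simply because more mail was submitted to the masscheck corpus, without the rule itself changing.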