Hello Justin,

Friday, May 21, 2004, 1:32:51 AM, you wrote:

>> header   RM_hex_MessageInfo  exists:X-Message-Info
>> describe RM_hex_MessageInfo  X-Message-Info header found
>> score    RM_hex_MessageInfo  4.000 # type=spamp
>> #stype   RM_hex_MessageInfo  spamp
>> #counts  RM_hex_MessageInfo  1392s/0h of 115937 corpus (94614s/21323h) 04/29/04

JM> hey Robert -- what does the "spamp" mean?

My home-grown mass-check script (which calls masses/mass-check and
hit-frequencies) not only gives me the #counts line above, but also
recommends scores based on a series of algorithms. I indicate which
algorithm should apply to any given rule in my special #stype line
above. (The "# type=keyword" on the score line is an older version of
the same thing.)

The default algorithm is my "spam" rule, which starts at a very minimal
score for a single spam hit and grows slowly to 1/3 of required-hits at
200s/0h, 400s/1h, 600s/2h, etc. The great majority of my custom rules
fall into this category.

My "spamp" rule (probable spam) is used for email characteristics which
very strongly suggest spam, such as the header above, which is not used
by any non-spam email client. Another example:

header   RM_ft_KS5601  From:raw =~ /\=\?ks_c_5601\-1987\?/i
describe RM_ft_KS5601  From header specifies display in Korean?, unnecessary unless spam hides subject
score    RM_ft_KS5601  1.000 # type=spamp
#stype   RM_ft_KS5601  spamp
#counts  RM_ft_KS5601  9s/0h of 125163 corpus (104972s/20191h) 03/28/04

The "spamp" algorithm scores RH/9 (RH = required-hits) for 1-9 spam
hits, 2*RH/9 for 10-99 spam hits, 3*RH/9 for 100-999 spam hits, etc.

My "spamg" rule (guaranteed spam) is used for BigEvil-type rules, where
I'm very confident that they won't match ham. Example:

header   RM_hr_carat  Received =~ /\^/
describe RM_hr_carat  Received header has apparently invalid character
score    RM_hr_carat  3.000 # type=spamg
#stype   RM_hr_carat  spamg
#counts  RM_hr_carat  8s/0h of 96854 corpus (75458s/21396h) 05/03/04
#hist    RM_hr_carat  Created by Bob Menschel May 3 2004

Scoring for these rules starts at RH/3 and goes up from there (provided
there are no ham hits).
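
For readers who want to see the arithmetic, here is a rough Python
sketch of the "spamp" tiers and of the point at which the default "spam"
algorithm reaches RH/3. It is not the actual mass-check script, which is
not shown in this message; the function names are hypothetical, and the
handling of zero hits is an assumption. The growth curve of the "spam"
and "spamg" algorithms is not described above, so it is left out.

```python
def spamp_score(spam_hits: int, required_hits: float) -> float:
    """Sketch of the "spamp" tiers: RH/9 for 1-9 spam hits,
    2*RH/9 for 10-99, 3*RH/9 for 100-999, and so on.

    Assumption: a rule with no spam hits scores 0. How ham hits
    affect "spamp" scoring is not stated, so they are ignored here.
    """
    if spam_hits <= 0:
        return 0.0
    tier = len(str(spam_hits))  # number of decimal digits in the hit count
    return tier * required_hits / 9


def spam_hits_for_third(ham_hits: int) -> int:
    """Spam hits at which the default "spam" algorithm reaches RH/3,
    following the 200s/0h, 400s/1h, 600s/2h progression."""
    return 200 * (ham_hits + 1)


if __name__ == "__main__":
    rh = 9.0  # hypothetical required-hits value, not stated in the message
    print(spamp_score(9, rh))      # RM_ft_KS5601: 9s/0h
    print(spamp_score(1392, rh))   # RM_hex_MessageInfo: 1392s/0h
    print(spam_hits_for_third(2))  # spam hits needed with 2 ham hits
```

With a required-hits value of 9 (an assumption; the actual setting is
not given), the tier formula yields 1.0 for 9s/0h and 4.0 for 1392s/0h,
which agrees with the two "spamp" scores listed above.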

I'm expecting/hoping much of this will go away when I'm able to migrate
to 3.0 and use the new perceptron methods for scoring my rules.

Bob Menschel
