I have a question about how rulesets are generated for SpamAssassin.

For example, consider this rule in 20_drugs.cf:

header SUBJECT_DRUG_GAP_C Subject =~ /\bc.{0,2}i.{0,2}a.{0,2}l.{0,2}i.{0,2}s\b/i
describe SUBJECT_DRUG_GAP_C Subject contains a gappy version of 'cialis'

Who generated the regular expression "/\bc.{0,2}i.{0,2}a.{0,2}l.{0,2}i.{0,2}s\b/i"?

a. Is it done manually, with people writing regexes and checking how efficiently they capture spam?
b. Is there an algorithm that analyzes a large corpus of spam and then comes up with these regexes on its own?
c. Is it a combination of (a) and (b)?

I know that scores for rules are generated using "a neural network trained with error back propagation":
http://wiki.apache.org/spamassassin/HowScoresAreAssigned

But how are the rules themselves generated?

Thanks
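
To make concrete what the rule above matches, here is a minimal Python sketch (not SpamAssassin code; SpamAssassin itself evaluates the rule with Perl regexes). The pattern is copied from the rule. The names `gappy_pattern` and `subject_matches` are hypothetical helpers introduced only for illustration. `gappy_pattern` shows that the pattern has a purely mechanical shape (each letter of the word separated by up to two arbitrary characters), which is part of why I wonder whether such rules could be machine-generated. It is not a claim about how the rule was actually produced.

```python
import re

# Regex from the SUBJECT_DRUG_GAP_C rule, copied verbatim from the post.
# The trailing /i in the rule corresponds to re.IGNORECASE here.
SUBJECT_DRUG_GAP_C = re.compile(
    r"\bc.{0,2}i.{0,2}a.{0,2}l.{0,2}i.{0,2}s\b", re.IGNORECASE
)


def gappy_pattern(word, max_gap=2):
    """Hypothetical helper: build a 'gappy' regex source string for `word`.

    Each letter is separated by up to `max_gap` arbitrary characters, and
    the whole match must start and end on a word boundary. This only
    mirrors the shape of the rule above; it is an assumption, not the
    method SpamAssassin's rule authors use.
    """
    gap = ".{0,%d}" % max_gap
    return r"\b" + gap.join(re.escape(ch) for ch in word) + r"\b"


def subject_matches(subject):
    """Return True if the Subject text would trigger the gappy pattern."""
    return SUBJECT_DRUG_GAP_C.search(subject) is not None


if __name__ == "__main__":
    for subject in [
        "Buy cialis now",             # plain spelling
        "C-I-A-L-I-S cheap",          # one filler character between letters
        "Ask a specialist today",     # 'c' is mid-word, so \b fails
        "Meeting notes for Tuesday",  # unrelated subject
    ]:
        print(subject_matches(subject), subject)
```

Building the pattern from the word reproduces the rule's regex exactly, so the open question is only whether the word list and gap size were chosen by hand or derived from a spam corpus.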