It's easier than that! You don't need to check any ratio at all, because legitimate mail doesn't have non-word characters between the letters of a word. People who aren't spammers don't write subjects like that. Spammers have to do it to avoid putting sex, porn, xxx, or viagra directly in the subject (which is more or less easily detected), but when they put things in between the letters you can be 99.999% confident it is spam!
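A rough sketch of that check, in Python rather than as a SpamAssassin rule: count the symbols that sit directly between two letters and flag the subject once there are enough of them. The function names and the `min_hits` default of 2 are made up for illustration, not taken from any existing plugin. Contractions and hyphenated words ("don't", "e-mail") match too, so the threshold would need tuning against real mail.

```python
import re

# A symbol (neither a word character nor whitespace) that sits
# directly between two letters, e.g. the "^" in "Se^x".
_EMBEDDED_SYMBOL = re.compile(r"(?<=[^\W\d_])[^\w\s](?=[^\W\d_])")


def embedded_symbol_count(subject: str) -> int:
    """Count symbols wedged directly between two letters."""
    return len(_EMBEDDED_SYMBOL.findall(subject))


def looks_chickenpoxed(subject: str, min_hits: int = 2) -> bool:
    """Hypothetical cutoff: min_hits is an assumed, tunable value."""
    return embedded_symbol_count(subject) >= min_hits


if __name__ == "__main__":
    for s in ("Se^x M-o ^v ~l e -", "Re: Meeting notes for Monday"):
        print(repr(s), embedded_symbol_count(s), looks_chickenpoxed(s))
```

This only catches symbols with a letter on both sides. Subjects padded with spaces as well ("T !r (a") get fewer hits than they look like they should, so a real rule would probably need a looser pattern.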
2011/10/16 <dar...@chaosreigns.com>:
> On 10/15, John Hardin wrote:
>> >Subject: T !r (a -n*n =l&e ` S !e .x|
>> >Subject: Se^x M-o ^v ~l e -
>>
>> More chickenpoxed subjects.
>
> Might be fun to create a plugin to check the ratio of word characters to
> non-word characters, possibly roughly based on html_title_subject_ratio()
> in Mail::SpamAssassin::Plugin::HTMLEval.
>
> We could then run it through RuleQA with a few ratio thresholds to find
> the optimal hit rate (highest RuleQA rank).
>
> --
> "Hermes will help you get your wagon unstuck, but only if you push on it."
> - Greek Alphabet Oracle
> http://www.ChaosReigns.com
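For comparison, the ratio idea from the quoted message could be sketched like this in Python. It is not the logic of html_title_subject_ratio() and not a SpamAssassin plugin; the function name is hypothetical, and it reports the share of non-word characters (ignoring whitespace) rather than a word-to-non-word ratio. Any cutoff would still have to come from something like the RuleQA runs mentioned above.

```python
import re

_WORD_CHAR = re.compile(r"\w")
_SYMBOL = re.compile(r"[^\w\s]")  # neither word character nor whitespace


def symbol_fraction(subject: str) -> float:
    """Share of non-whitespace characters that are symbols (0.0 to 1.0)."""
    words = len(_WORD_CHAR.findall(subject))
    symbols = len(_SYMBOL.findall(subject))
    total = words + symbols
    # An empty or whitespace-only subject has nothing to measure.
    return symbols / total if total else 0.0


if __name__ == "__main__":
    for s in (
        "T !r (a -n*n =l&e ` S !e .x|",
        "Se^x M-o ^v ~l e -",
        "Re: Meeting notes for Monday",
    ):
        print(f"{symbol_fraction(s):.3f}  {s}")
```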