On 31 May 2007, Graham Murray said:

> Nix <[EMAIL PROTECTED]> writes:
>
>> (And, let's be blunt, the pure this-word-is-spammy recognition part of
>> FuzzyOCR is much less smart than the Bayesian system already present
>> in SA: FuzzyOCR should really use the Bayesian system to determine the
>> spamminess of words, I suppose...)
>
> Or even just act as a MIME part 'decoding' system (like Base64) and feed
> all words it finds in images into Bayes, along with all other text in
> the mail, rather than generating a score itself.

Perhaps, but in that case those words should have a score multiplier of
some sort applied, because the fact that they originated in images is
itself an obfuscation technique that should be reflected in the score.
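
To make the multiplier idea concrete, here is a rough sketch in Python.
It is not SpamAssassin's or FuzzyOCR's actual code or API: the function
name, the (word, from_image) token format, the per-word probability table
and the 1.5 default multiplier are all assumptions made up for
illustration. It combines per-word spam probabilities as log-odds, naive
Bayes style, and scales the contribution of any word that came out of an
image.

```python
import math


def spam_probability(tokens, token_probs, image_weight=1.5, default=0.5):
    """Combine per-word spam probabilities into one message probability.

    tokens       -- list of (word, from_image) pairs; from_image is True
                    when the word was recovered from an image by OCR
                    (hypothetical format, not a SpamAssassin structure)
    token_probs  -- dict mapping word -> learned P(spam | word)
    image_weight -- assumed multiplier applied to image-derived words
    default      -- probability used for words the table has never seen
    """
    log_odds = 0.0
    for word, from_image in tokens:
        p = token_probs.get(word, default)
        # Clamp so that log() never sees 0 or 1.
        p = min(max(p, 0.01), 0.99)
        weight = image_weight if from_image else 1.0
        # Each word adds its log-odds, scaled up if it came from an image.
        log_odds += weight * math.log(p / (1.0 - p))
    # Convert the summed log-odds back into a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    probs = {"viagra": 0.9, "meeting": 0.2}  # made-up training results
    as_text = spam_probability([("viagra", False)], probs)
    as_image = spam_probability([("viagra", True)], probs)
    print(as_text, as_image)
```

Note that a plain multiplier like this amplifies in both directions: a
hammy word found in an image also counts as more hammy. If the point is
purely that image text is suspicious, applying the weight only to words
with a probability above 0.5, or adding a fixed bonus per image word,
would be closer to the intent.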

-- 
`On a scale of one to ten of usefulness, BBC BASIC was several points ahead
 of the competition, scoring a relatively respectable zero.' --- Peter Corlett
