On 31 May 2007, Graham Murray said:
> Nix <[EMAIL PROTECTED]> writes:
>
>> (And, let's be blunt, the pure this-word-is-spammy recognition part of
>> FuzzyOCR is much less smart than the Bayesian system already present
>> in SA: FuzzyOCR should really use the Bayesian system to determine the
>> spamminess of words, I suppose...)
>
> Or even just act as a MIME part 'decoding' system (like Base64) and feed
> all words it finds in images into Bayes, along with all other text in
> the mail, rather than generating a score itself.

Perhaps so, but if so those words should have a score-multiplier of some
sort applied, because the fact that those words originated in images is
itself an obfuscation technique that should be noted in the score.

-- 
`On a scale of one to ten of usefulness, BBC BASIC was several points ahead
 of the competition, scoring a relatively respectable zero.' --- Peter Corlett
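
To make the score-multiplier suggestion concrete, here is a minimal sketch in
Python. It is not SpamAssassin's or FuzzyOCR's actual code or API: the token
table, the names TOKEN_SPAM_PROB, IMAGE_MULTIPLIER and spam_log_odds, and the
multiplier value of 1.5 are all hypothetical. It assumes words from the mail
body and words recovered from images by OCR go through the same Bayes-style
token lookup, with spammy evidence weighted more heavily when it came from an
image.

```python
import math

# Hypothetical per-token spam probabilities, standing in for a trained
# Bayes database. These numbers are invented for illustration.
TOKEN_SPAM_PROB = {"viagra": 0.99, "stock": 0.90, "meeting": 0.10}

# Assumed weight for spammy tokens that were found only by OCR.
IMAGE_MULTIPLIER = 1.5


def log_odds(p):
    """Convert a spam probability into log-odds (positive means spammy)."""
    return math.log(p / (1.0 - p))


def spam_log_odds(body_words, image_words, multiplier=IMAGE_MULTIPLIER):
    """Sum token evidence from body text and OCR'd image text."""
    total = 0.0
    sources = [(body_words, False), (image_words, True)]
    for words, from_image in sources:
        for word in words:
            p = TOKEN_SPAM_PROB.get(word.lower())
            if p is None:
                continue  # unknown token: contributes no evidence
            evidence = log_odds(p)
            # Only spammy evidence is amplified, so hiding an innocent
            # word in an image does not make the mail look more hammy.
            if from_image and evidence > 0:
                evidence *= multiplier
            total += evidence
    return total


in_body = spam_log_odds(["Viagra"], [])
in_image = spam_log_odds([], ["Viagra"])
print(in_image > in_body)  # prints True
```

Scaling the log-odds rather than the raw probability keeps the result a valid
sum of evidence, and amplifying only the positive side is one design choice
among several; a flat bonus for any mail whose spammy tokens appear only in
images would be another.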