-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

"Ken Anderson (Pacific Internet)" writes:
>This is a dev question.
>
>Since SA scores for identical messages are identical if the message is 
>simply passed to SA via perl (like from MailScanner), wouldn't it 
>increase SA's performance if it cached scores for a short time based on 
>message checksum?

It would, but ...  the issue is that spammers try very hard to
*avoid* sending hashable messages, since multiple messages with the same
hash mean "bulk email", and they do not want their messages to be
identified as that.  Hence the low accuracy rates of DCC, Pyzor and Razor
(low relative to what they *could* be, that is).

Making hashing schemes that are resistant to spammer evasion, without
FPs, is quite hard.
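
To illustrate the evasion problem, here is a minimal sketch in Python.
It assumes an exact-match checksum (SHA-256 here, purely as an example)
and made-up message bodies; it is not how DCC, Pyzor or Razor actually
compute their hashes.  One random token per copy is enough to give
every copy a different digest:

```python
import hashlib

def checksum(body: str) -> str:
    """Exact-match digest of a message body (illustrative only)."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

base = "Buy cheap pills now!\n"

# Hypothetical spam run: same text, one random token appended per copy.
variant_a = base + "xqzt81\n"
variant_b = base + "mmrk27\n"

# Identical bodies collide, which is what a cache or bulk detector wants ...
same = checksum(base) == checksum(base)
# ... but the two variants no longer share a digest.
evaded = checksum(variant_a) != checksum(variant_b)
```

A fuzzier hash would tolerate the random token, but the looser the
match, the greater the risk of lumping legitimate mail in with the spam.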

>This would be beneficial in typical dictionary attacks when messages are 
>not unique in some way, or when sendmail is splitting recipients using 
>queue groups so that a message with 10 recipients is actually passed to 
>SA 10 times. That might sound odd, but using MailScanner with SA and 
>sendmail on a mail gateway/relay, this is commonly done to permit per 
>user rules. The time SA sometimes spends scanning identical messages is 
>a waste of cpu.

No, makes perfect sense -- that's the thing that's initially
counterintuitive until you consider what per-user customisation means ;).

But that then points out the other problem with the idea.  What if
user A has a score for MIME_HTML_ONLY of 0.1, but user B has a score
of 5.0?   We can't simply cache scores; we'd have to cache the rules hit.

But then what if user A has a bayes DB that says that the "Daily Blah
Newsletter" is ham, but user B has trained that as spam?  We'd have
to cache all hits *except* for bayes, and run that separately.

It gets messy very quickly.  As far as I can see, with per-user
customisation in the mix, this is not necessarily a good idea at all.

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFATgRMQTcbUG5Y7woRAjsyAKCQQLs+yQxfY9W3LZw6YogTjjQ9fQCgyhTo
w8rXoAwz/C9/JyYRLU5SHms=
=dGmS
-----END PGP SIGNATURE-----
