On Jun 8, 2007, at 4:32 AM, Michael Monnerie wrote:
On Friday, 8 June 2007 Tom Allison wrote:
- there are distributed checksums like DCC, pyzor, razor
This makes sense, but how do you want to support it?
If you stored the checksum(s) from these projects in searchable,
indexed DB fields for each e-mail, you could easily do spam checking
yourself. Imagine that every 30 minutes you recheck the most recently
received e-mails and find that 30% of them have the same DCC checksum.
There's a BIG chance that those messages are spam.
The point is that it must be cheap (in terms of I/O and CPU) to find
out which e-mails you want to double-check, as you can't reprocess all
messages (at least not as an ISP).
So there's not really a lot of support to implement in dbmail. Just
one or more fields per e-mail for the checksum(s). A spam_score field
would also be good. But it'd be good to speak with the developers of
SA, DCC and so on first; maybe they've got their own ideas already.
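To make the idea above concrete, here is a minimal sketch of a
per-message checksum field plus a cheap periodic recheck. It is only an
illustration: the table message_checksums, its columns, and the helper
names are hypothetical and are not part of the dbmail schema, and the
checksum is treated as an opaque string rather than anything DCC, pyzor
or razor actually produce. SQLite is used just so the example runs on
its own.

```python
import sqlite3


def open_db(path=":memory:"):
    """Create a hypothetical side table holding one checksum per message."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS message_checksums (
               message_id  INTEGER PRIMARY KEY,
               received_at INTEGER NOT NULL,   -- Unix timestamp
               checksum    TEXT    NOT NULL,   -- opaque checksum string
               spam_score  REAL                -- optional, filled in later
           )"""
    )
    # The index keeps the periodic recheck cheap: only recent rows are read.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_checksums_time "
        "ON message_checksums (received_at, checksum)"
    )
    return conn


def suspicious_checksums(conn, now, window_seconds=1800, min_share=0.30):
    """Return (checksum, count) pairs that make up at least min_share
    of the messages received in the last window_seconds."""
    since = now - window_seconds
    total = conn.execute(
        "SELECT COUNT(*) FROM message_checksums WHERE received_at >= ?",
        (since,),
    ).fetchone()[0]
    if total == 0:
        return []
    rows = conn.execute(
        "SELECT checksum, COUNT(*) AS n FROM message_checksums "
        "WHERE received_at >= ? GROUP BY checksum ORDER BY n DESC, checksum",
        (since,),
    ).fetchall()
    return [(checksum, n) for checksum, n in rows if n / total >= min_share]
```

Only the messages whose checksum comes back from suspicious_checksums
would then need to be double-checked, which is what keeps the I/O and
CPU cost low. The 30-minute window and the 30% threshold are simply the
figures from the example above, not tuned values.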
I still see this as an add-on and not part of dbmail, with the
exception of building a signature hash into the MDA process.
But is DCC going to want you sweeping thousands of emails a minute
against their servers to see if they're spam yet?
I'm thinking of a few hundred dbmail installations, each doing an
hourly sweep of what may be a dozen to 100 emails at a time. That's
pretty much a continuous stream of checking. I'm not sure exactly how
DCC works, but I thought it was effective because signatures get
reported as mail is delivered. By delaying this as you describe, you
penalize DCC by not reporting hashes until well after delivery.
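As a rough back-of-the-envelope check of the load described above, the
following sketch turns those figures into lookups per minute. The
value 300 for "a few hundred" installations is an assumption made for
illustration, and the function name is hypothetical; nothing here says
what rate the DCC servers would actually accept.

```python
def lookups_per_minute(installations, emails_per_sweep, sweeps_per_hour=1):
    """Aggregate checksum lookups per minute across all installations."""
    return installations * emails_per_sweep * sweeps_per_hour / 60


# Assumed: 300 installations, each sweeping once an hour.
low = lookups_per_minute(300, 12)    # a dozen emails per sweep
high = lookups_per_minute(300, 100)  # 100 emails per sweep
print(low, high)  # 60.0 500.0
```

Under those assumptions the combined rate is somewhere between 60 and
500 lookups a minute, spread more or less evenly over the hour if the
sweeps are not synchronized.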
I wasn't aware that SA had anything like this (hashes or volume
checking).
_______________________________________________
DBmail mailing list
https://mailman.fastxs.nl/mailman/listinfo/dbmail