Hi,

I have written a new command script that creates a filter, and it seems promising. The rationale rests on two observations: most spam has several recipients, and most spam is sent to invalid addresses. Our server rejects about 75% of all incoming messages for the latter reason.

This script analyses an IMail log file and extracts all attempts to send e-mail to invalid users. These invalid addresses are then used to create an ALLRECIPS filter. To keep the number of false positives to a minimum (preferably 0), I have included two settings. The first uses regular expressions to describe valid address patterns. Addresses that are constructed in a certain way (in our case [EMAIL PROTECTED] or [EMAIL PROTECTED]) are never added to the filter, which minimizes false positives caused by misspelled addresses. The second setting is perhaps more important: a threshold for the minimum number of attempts required for filter inclusion. Only invalid addresses that have been used more than a certain number of times during a 24-hour period are added to the filter.
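The selection logic described above could be sketched roughly as follows in Python. This is only an illustration and not the actual script: the log line pattern ("unknown user <address>"), the valid-address pattern and the name build_filter_addresses are all assumptions made for the example, since the real IMail log layout is not shown here.

```python
import re
from collections import Counter

# ASSUMPTION: hypothetical log line layout; the real IMail format may differ.
INVALID_RCPT = re.compile(r"unknown user <([^>]+)>", re.IGNORECASE)

# ASSUMPTION: hypothetical pattern for how valid addresses are constructed.
VALID_PATTERNS = [
    re.compile(r"^[a-z]+\.[a-z]+@example\.com$"),
]


def build_filter_addresses(log_lines, threshold, valid_patterns=VALID_PATTERNS):
    """Return invalid addresses that qualify for the filter, sorted."""
    counts = Counter()
    for line in log_lines:
        match = INVALID_RCPT.search(line)
        if match:
            counts[match.group(1).lower()] += 1

    selected = []
    for address, attempts in counts.items():
        # Keep only addresses used MORE than `threshold` times.
        if attempts <= threshold:
            continue
        # Never add addresses that look like misspellings of valid ones.
        if any(p.match(address) for p in valid_patterns):
            continue
        selected.append(address)
    return sorted(selected)
```

Both settings act as independent vetoes here: an address must exceed the threshold and must not resemble a valid address pattern before it is included.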

If I use a threshold of 30, I get an ALLRECIPS filter with about 700 invalid addresses. Most of these addresses are "blind tries", but a few must have arisen from broken address harvesters, as they are very short and partly match a valid address. So I needed a way to do an exact match on the addresses in the ALLRECIPS variable. Until the "angled bracket delimiters" are implemented in Declude, I have used the commas between the addresses as a provisional solution. Since the first address in the variable has no comma in front of it, the filter can't match that address at present.
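To illustrate the comma workaround and its limitation, here is a small Python sketch. It assumes that the ALLRECIPS value is a comma-separated list without spaces, which may not be exactly how Declude formats it; the function name and the addresses are made up for the example.

```python
def comma_delimited_match(allrecips, address):
    """Substring test that uses a leading comma as the left delimiter.

    ASSUMPTION: `allrecips` is a comma-separated list without spaces.
    """
    return ("," + address) in allrecips


# A short invalid address that is also the tail of a valid one.
invalid = "hn@example.com"

valid_mail = "alice@example.com,john@example.com"
spam_mail = "alice@example.com,hn@example.com"
spam_first = "hn@example.com,alice@example.com"

# A plain substring test wrongly hits the valid recipient john@example.com.
plain_hit = invalid in valid_mail

# The comma-prefixed test avoids that false positive ...
delimited_valid = comma_delimited_match(valid_mail, invalid)
delimited_spam = comma_delimited_match(spam_mail, invalid)

# ... but misses the invalid address when it is first in the list.
delimited_first = comma_delimited_match(spam_first, invalid)
```

With a delimiter on both sides of every address (such as angled brackets), the last case would match as well.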

I have only tested this filter for a few hours. So far it has caught about 25% of the spam when counting unique messages, but when the analysis is based on the total number of accepted messages this figure rises to about 60%. The script doesn't need any user input and always extracts invalid addresses from the last log file (24-hour period), so it can be scheduled to update the current filter at regular intervals.

Does this seem to be a good approach for a spam filter, or is there something that I have forgotten to take into consideration? I will make the script available for download if it proves to work well after some additional testing.

/Roger
---
This E-mail came from the Declude.JunkMail mailing list.  To
unsubscribe, just send an E-mail to [EMAIL PROTECTED], and
type "unsubscribe Declude.JunkMail".  The archives can be found
at http://www.mail-archive.com.
