Thanks for the suggestions.
With a maxweight variable it would also make sense to add a body weight variable (so that this weight can easily be set to a value other than 0). At present the processor load shouldn't be a problem, since the number of entries is fairly small. That could change quickly if SURBL changes the expiration time of the records, though, which is why I included a limit variable for the maximum number of entries; the filter will not be updated if this limit is exceeded.
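Roughly, this is the kind of thing I have in mind. It is only a sketch: the variable and file names (MAXWEIGHT, BODYWEIGHT, MAXENTRIES, surbl-domains.txt, surbl.txt) are placeholders, and the MAXWEIGHT and BODY ... CONTAINS lines should be checked against the Declude filter documentation before anyone relies on them:

    rem --- Settings (names and values are only examples) ---
    set MAXWEIGHT=4
    set BODYWEIGHT=4
    set MAXENTRIES=20000

    rem Count the downloaded entries and skip the update if the list
    rem has grown past the limit.
    for /f %%c in ('find /c /v "" ^< surbl-domains.txt') do set ENTRYCOUNT=%%c
    if %ENTRYCOUNT% GTR %MAXENTRIES% goto skipupdate

    rem Write the filter file: a MAXWEIGHT cap first, then one weighted
    rem CONTAINS line per listed domain. The redirection is written first
    rem so a numeric weight just before ">" isn't taken as a file handle.
    >surbl.txt echo MAXWEIGHT %MAXWEIGHT%
    for /f %%d in (surbl-domains.txt) do >>surbl.txt echo BODY %BODYWEIGHT% CONTAINS %%d
    :skipupdate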
I also thought of an exclude file, but decided that the exclude variable would be enough (but maybe it isn't). After all, this filter script is just a provisional solution until this kind of test is directly implemented in Declude.
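If a separate exclude file does turn out to be needed after all, one way would be to filter the downloaded list through findstr before the filter file is written, something like this (file names again only placeholders):

    rem Drop every domain listed in exclude.txt (one domain per line).
    rem /v = keep non-matching lines, /i = ignore case, /l = literal match,
    rem /x = whole-line match, /g: = read the patterns from a file.
    findstr /v /i /l /x /g:exclude.txt surbl-domains.txt > surbl-domains.tmp
    move /y surbl-domains.tmp surbl-domains.txt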
We also have to remember that SURBL is still very experimental, and the listing criteria haven't settled yet. It also has some other problems, e.g. the permanent test entry example.com (which is excluded in the filter file since it would catch many legitimate messages). I only use this filter with a low weight (15% of my hold weight) to push spam over the edge, but the FP rate should be rather low (or at least lower than for similar lists).
/Roger
Roger,
Thanks for the fine work. I finally got around to setting this up after figuring out that it wasn't thousands of URLs long, and my server seems to be handling it well enough for now.
I have two suggestions for the script.
1) Add a MAXWEIGHT variable. If you score each line at, say, 4 points and set MAXWEIGHT to 4, the filter will stop processing on the first hit and save resources. I tried playing around with this to get it to work, but I'm totally clueless when it comes to batch file programming, and I think I was hitting some sort of reserved word.
2) Add the ability to remove listings contained in a text file (an exclude list). Looking over the current list of domains, I found the following:
- norton.com
- webhosting.yahoo.com
These were probably in spam, but they are not unique to spam. I also found an entry for "pe.kg" in the list that doesn't resolve and was probably the result of a parsing error. A list of top sites on a page linked from the project's site shows that yahoo.com is one of the most frequently spamvertised domains, though it is clearly not listed in this file due to an exception on their end.
This type of test is definitely very vulnerable to pollution, and it would be great to have a way to detect such entries and add them to an exclude list.
Long term, this is best suited to a DNS lookup because of the various limitations of a contains filter, but for now it seems to be working very well at adding points to messages that come in below my drop weight; in fact, it may well be tagging the majority of what scores at my Hold level and pushing it over the top.
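For reference, the DNS version is just a lookup of each message's domains against a SURBL zone; from a command prompt it would look roughly like this (the domain below is only a placeholder, and the zone name depends on which SURBL list you want, e.g. sc.surbl.org, ws.surbl.org, or the combined multi.surbl.org):

    rem A 127.0.0.x answer means the domain is listed; no answer
    rem (NXDOMAIN) means it is not.
    nslookup somedomain.com.multi.surbl.org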
Matt
