Jon Carnes wrote:
You could expand the earlier script and add a nospamdb file (listing IPs that the script should ignore). To do so, simply add a line that exits the script if the IP is in your nospamdb file: if grep -wq "$BADIP" nospamdb; then exit; fi
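A minimal sketch of that whitelist check, assuming nospamdb holds one IP address per line. The function name is_whitelisted and the NOSPAMDB variable are illustrative and not part of the earlier script. It swaps the word match for a fixed-string, whole-line match so that the dots in an address are not treated as regex wildcards.

```shell
#!/bin/sh
# Sketch only: NOSPAMDB and is_whitelisted are illustrative names.
# Assumes the file lists one IP address per line.
NOSPAMDB="${NOSPAMDB:-nospamdb}"

# Succeeds (exit status 0) when the IP in $1 is listed in $NOSPAMDB.
is_whitelisted() {
    # -F: treat the IP as a literal string, so "." is not a wildcard
    # -x: the whole line must match, so 10.0.0.1 does not match 10.0.0.10
    # -q: print nothing, report the result through the exit status only
    [ -f "$NOSPAMDB" ] && grep -Fxq -- "$1" "$NOSPAMDB"
}

# Usage inside the blocking script:
#   if is_whitelisted "$BADIP"; then exit 0; fi
```

If the file also carries comments or several addresses per line, the original -w (whole word) match is the closer fit.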
Also, with a bit of trial and error you can gauge roughly how many entries land in your info file during a minute of being attacked, and instead of grepping the whole file, you can simply grep the end of it. To grep the last 200 entries: tail -n 200 $INFO | grep $ENTRIES | grep " 550 " | ...
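As a rough illustration of the tail-then-grep idea, the sketch below counts the 550 rejects for one pattern within the last N lines of a log. The function name count_recent_rejects and its arguments are hypothetical, and it only covers the visible part of the pipeline above, not whatever follows the final pipe.

```shell
#!/bin/sh
# Sketch only: count_recent_rejects is an illustrative name.
# Usage: count_recent_rejects LOGFILE PATTERN [LINES]
# Prints how many of the last LINES lines (default 200) contain PATTERN
# as a literal string and also contain " 550 ".
count_recent_rejects() {
    tail -n "${3:-200}" "$1" | grep -F -- "$2" | grep -c ' 550 '
}

# Example (INFO and BADIP as in the earlier script):
#   hits=$(count_recent_rejects "$INFO" "$BADIP" 200)
```

Because only the tail of the file is scanned, the cost stays roughly constant no matter how large the log has grown.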
This is extremely fast and makes the script take under a second to execute.
Jon
Disclaimer: This rant is my opinion. Take it with a grain of salt.
Part of my point was that yes, this concept is entirely possible. It's just one of those projects that would start as a simple script and then consume your entire life as you tried to properly implement it. :)
As a suggested improvement on the theme, your script should be run as a daemon (to avoid startup costs) and you should open the log file for reading directly in Perl. Sleep for 60 seconds, then read from your current point in the file to the end of file. Sleep for another 60 seconds, repeat.
Keeping a single running process also lets you easily keep a hash of the senders, and how often they've sent in the past 5 minutes, or hour, or however long you prefer. You can keep a separate hash of senders who've sent you more than X messages in the past 24 hours, and for each 24 hour period in which a sender has sent you more than 5 valid messages, you bump the count on their domain - this way you can naturally develop a "learned" white list. Anyone in that 24 hour hash whose value is more than 5 (they've sent you more than 5 valid messages on each of more than 5 separate days) is considered to be a safe sender. You can of course tune those values to be appropriate for your domain. Don't forget to write them out to a file every X hours so that you won't lose your learned white list every time you restart.
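The read, sleep, repeat loop and the running per-sender count could be sketched roughly as follows. To stay in one language with the earlier script this is plain shell rather than the Perl suggested above, so a small counts file stands in for the in-memory hash. Every name here (OFFSET, COUNTS, read_new_lines, rejected_ips, tally, watch_log) is illustrative, and the sed pattern assumes Postfix-style reject lines of the form "... from host[1.2.3.4]: 550 ...".

```shell
#!/bin/sh
# Sketch only: all names are illustrative, and the log format is an assumption.
OFFSET=0             # bytes of the log already processed
COUNTS=$(mktemp)     # "ip count" lines; stands in for the in-memory hash

# Print what was appended to file $1 since the last call and advance OFFSET.
# Call it with a redirect, not inside $(...), or the OFFSET update is lost
# in the subshell.
read_new_lines() {
    size=$(($(wc -c < "$1")))
    if [ "$size" -lt "$OFFSET" ]; then OFFSET=0; fi   # file shrank: assume rotation
    if [ "$size" -gt "$OFFSET" ]; then
        # Start at the first unread byte and stop at the size measured above.
        tail -c "+$((OFFSET + 1))" "$1" | head -c "$((size - OFFSET))"
        OFFSET=$size
    fi
}

# Read log lines on stdin and print the client IP of each 550 reject.
rejected_ips() {
    sed -n 's/.* from [^[]*\[\([0-9.]*\)\]: 550 .*/\1/p'
}

# Add the IPs on stdin (one per line) to the running totals in $COUNTS.
tally() {
    awk -v counts="$COUNTS" '
        BEGIN { while ((getline line < counts) > 0) { split(line, f, " "); n[f[1]] = f[2] } }
        NF    { n[$1]++ }
        END   { for (ip in n) print ip, n[ip] }
    ' > "$COUNTS.new" && mv "$COUNTS.new" "$COUNTS"
}

# Main loop, wrapped in a function so that sourcing this file does not block.
# Usage: watch_log LOGFILE [SECONDS]
watch_log() {
    while :; do
        read_new_lines "$1" > "$COUNTS.chunk"
        rejected_ips < "$COUNTS.chunk" | tally
        sleep "${2:-60}"
    done
}
```

A real daemon would also need to expire old counts and handle a partial last line. Since the counts already live in a file here, they survive a restart, which is the same effect as periodically writing the hash out.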
Since you don't have the overhead of start-up costs, you can even check the log file more often, say every 15 seconds. This would allow you to respond more quickly to bursts of traffic (the whole purpose of the script to start with).
By the time you've gotten this far, you've begun to realize how much faster (yet less flexible) this program would be in C, as opposed to Perl. Once you've fully explored the problem domain (and worked out all of the problem-specific bugs), you begin to rewrite it in C, for performance. Then, and only then, have you reached a point where you're saving CPU cycles as well as bandwidth by running the daemon.
Of course, then as things settle, you look around, upgrade Postfix, and realize that months ago anvil(8) stabilized, and that it's a better implementation of a solution to the same problem (as it doesn't have to use the log file and the file system as intermediate steps). :)
At this point, you realize that if only you could share this information with others, you'd be one up on anvil(8) again. All that work would not have been in vain. If only you could provide the information as a blacklist - perhaps through DNS! Then hopefully, before you implement pushing the block data out to zone files, you realize that you'd be writing yet another DNS blacklist based on tracking zombies. :)
Can you tell that I've been down this road (with similar-yet-not-the-same projects) before? :) I don't entirely want to discourage anyone from it, as it can be a very useful learning experience, and a great way to learn Perl and C if you don't know them already. But this particular reaction (gah, block the bastards from storming me with mail!) is a very natural knee-jerk response, yet frequently not the best one, in my experience.
Aaron S. Joyner
