I'd like everybody to run it in a daily cron job (along with your mass-checks, if you're doing them).
http://www.chaosreigns.com/iprep/dl/iprep.pl

Works like:

    ./iprep.pl ham:dir:~/masscheckwork/ham spam:dir:~/masscheckwork/spam/

where the arguments are the same as for mass-check.

The config file is ~/.ipreprc:

    $trusted_networks = '';
    $user = 'username';
    $pass = 'password';

Email me for an account. There are more detailed instructions in the Perl script (like argument definitions, for those not familiar with mass-check targets).

It uploads the IP address and date of each ham and spam to my server via rsync. (Everybody gets their own chroot jail, and I consider the data confidential.)

I'm planning to aggregate the data and make it available as:

    IP <percent ham> <count>

where <count> is a logarithm of the total number of emails seen from that IP, and <percent ham> is normalized the same way as the S/O value in ruleqa. Old values will receive less weight than new values. (Maybe 0.99^(age in days)?)

I kind of like the idea of only making the data available via rsync. Seems like it would reduce bandwidth usage, relative to serving via DNS?

Next I'm planning to create a plugin to create tests to record values (like iprep_ham_<percent>, iprep_count_<count>). Then I can use them to determine what tests would be most useful.

Output from my own corpora:
http://www.chaosreigns.com/iprep/iprep.txt

With 2618 hams and 2956 spams, there were only *two* IP addresses that were not 100% spam or 100% ham. Both belong to Google.

For IPv6, I'm thinking about aggregating at /48, just because that's what he.net is letting me allocate. That leaves 80 bits of addresses. This is an attempt to deal with a problem Warren worded well: "IPv6 makes it possible to send one spam per IPv6 address and never run out of IP addresses".

-- 
"For every complex problem, there is a solution that is simple, neat, and wrong." - H. L. Mencken
http://www.ChaosReigns.com
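
To make the planned aggregation concrete, here is a rough Python sketch. It is not code from iprep.pl. The function names, the input tuple layout, and the base-10 logarithm are all hypothetical. It assumes the ruleqa-style normalization means comparing each IP's share of all ham with its share of all spam, so that unequal corpus sizes do not skew the percentage. It also assumes <count> is taken over the raw, undecayed number of messages. IPv6 addresses are collapsed to their /48, as proposed above.

```python
import ipaddress
import math
from collections import defaultdict

DECAY = 0.99  # per-day weight floated in the post: 0.99^(age in days)


def reputation_key(ip):
    """Aggregation key: the address itself for IPv4, the enclosing /48 for IPv6."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 6:
        return str(ipaddress.ip_network(f"{addr}/48", strict=False))
    return str(addr)


def aggregate(observations):
    """observations: iterable of (ip, is_ham, age_days) tuples (hypothetical layout).

    Returns {key: (percent_ham, count)}, where count is log10 of the raw
    number of messages seen for that key (the log base is an assumption).
    """
    ham = defaultdict(float)
    spam = defaultdict(float)
    seen = defaultdict(int)
    for ip, is_ham, age_days in observations:
        key = reputation_key(ip)
        weight = DECAY ** age_days  # older mail counts for less
        if is_ham:
            ham[key] += weight
        else:
            spam[key] += weight
        seen[key] += 1

    total_ham = sum(ham.values())
    total_spam = sum(spam.values())

    result = {}
    for key, n in seen.items():
        # Assumed normalization: compare the share of all ham with the
        # share of all spam rather than comparing raw message counts.
        ham_rate = ham.get(key, 0.0) / total_ham if total_ham else 0.0
        spam_rate = spam.get(key, 0.0) / total_spam if total_spam else 0.0
        percent_ham = 100.0 * ham_rate / (ham_rate + spam_rate)
        result[key] = (percent_ham, math.log10(n))
    return result
```

With this normalization, an IP that sent one ham and one spam does not come out at exactly 50% unless the ham and spam corpora carry the same total weight.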
