On Thursday, September 9, 2004, 2:28:07 PM, Ryan Thompson wrote:
> However, for all we know *so far*, 219.254.32.111 could be a HA cluster
> of a few dozen machines, and, while there may be 200 pill spammers on
> that cluster, there may be 20,000 other legit sites.
> With our current data, we can't make either determination. But, using
> forward zone data, we can do forward lookups, and track them in a database.
> Then, do forward lookups on SURBL data to get the IPs of spammers, and
> (algorithmically!) find correlations.
> The programming effort to implement this would not be trivial, not to
> mention processing power and bandwidth, to do the initial run. The
> datasets (.com!) are huge. After that, we just have to periodically
> sample for new, removed, and changed domains, at which point the
> processing will be reduced.

.com is so large and changes so rapidly as to be practically unknowable.
That's what I mean by "can't". By the time you have all of .com fully
cataloged, it will have changed significantly.

Really, the only ones who could collectively determine how spammy a
particular virtual host IP is are the domain registrars. They would have
to work together and pool all their registration data, then resolve every
hostname and build a database mapping each resolved IP back to all of the
domain names hosted on it.

If you can't see all the good guy domains on a virtual hosting IP, then
you can't see who else you would block.

Jeff C.
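
P.S. To make the last point concrete, here is a rough sketch of the
correlation step being discussed. It assumes the forward lookups have
already been done and handed to us as (domain, IP) pairs. The function
names, the example domains, and the 192.0.2.x addresses are all made up
for illustration, and nothing here queries DNS or SURBL.

```python
from collections import defaultdict


def group_by_ip(resolved):
    """Build IP -> set of domains from (domain, ip) pairs.

    `resolved` stands in for the output of forward lookups on zone data;
    how those lookups are performed is outside this sketch.
    """
    by_ip = defaultdict(set)
    for domain, ip in resolved:
        by_ip[ip].add(domain)
    return by_ip


def listed_fraction(by_ip, listed):
    """For each IP, return (listed_count, known_count, fraction).

    The fraction is only as good as our knowledge of *all* domains on
    the IP. Domains we never resolved are simply invisible here.
    """
    report = {}
    for ip, domains in by_ip.items():
        hits = len(domains & listed)
        report[ip] = (hits, len(domains), hits / len(domains))
    return report


# Hypothetical data: one shared host with mixed tenants, and one IP
# where we only happen to know about a single listed domain.
resolved = [
    ("pills-one.example", "192.0.2.10"),
    ("pills-two.example", "192.0.2.10"),
    ("bakery.example", "192.0.2.10"),
    ("blog.example", "192.0.2.10"),
    ("pills-three.example", "192.0.2.20"),
]
listed = {"pills-one.example", "pills-two.example", "pills-three.example"}

report = listed_fraction(group_by_ip(resolved), listed)
for ip, (hits, known, frac) in sorted(report.items()):
    print(f"{ip}: {hits}/{known} known domains listed ({frac:.0%})")
```

Note that the second IP comes out as fully listed only because we know
of no other domains on it. That is exactly the blind spot: without the
complete set of domains per IP, the ratio says little about who else
would be blocked.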