On Thursday, September 9, 2004, 2:28:07 PM, Ryan Thompson wrote:
> However, for all we know *so far*, 219.254.32.111 could be a HA cluster
> of a few dozen machines, and, while there may be 200 pill spammers on
> that cluster, there may be 20,000 other legit sites.

> With our current data, we can't make either determination. But, using
> forward zone data, we can do forward lookups, and track them in a database.
> Then, do forward lookups on SURBL data to get the IPs of spammers, and
> (algorithmically!) find correlations.

> The programming effort to implement this would not be trivial, not to
> mention processing power and bandwidth, to do the initial run. The
> datasets (.com!) are huge. After that, we just have to periodically
> sample for new, removed, and changed domains, at which point the
> processing will be reduced.

.com is so large and rapidly changing as to be practically
unknowable.  That's what I mean by "can't".

By the time you have all of .com fully cataloged, it will have
changed significantly.

Really the only ones who could determine how spammy a particular
virtual host IP is are the domain registrars, working together.
They would have to pool all their registration data, resolve
every hostname, and build a database mapping each resolved IP
back to all of the domain names hosted on it.

If you can't see all the good guy domains on a virtual hosting
IP, then you can't see who else you would block.
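
To make the point concrete, here is a rough sketch of the reverse
mapping described above. It assumes the forward lookups have already
been done and handed to us as a plain dict; the function names and the
.example domains are made up for illustration, and this is not how
SURBL or any registrar actually processes its data.

```python
from collections import defaultdict


def domains_by_ip(resolved):
    """Invert {domain: [ip, ...]} into {ip: {domain, ...}}."""
    index = defaultdict(set)
    for domain, ips in resolved.items():
        for ip in ips:
            index[ip].add(domain)
    return dict(index)


def listed_fraction(index, listed):
    """For each IP, return (listed_count, known_count, fraction).

    The fraction only covers domains we happened to resolve. Any
    domain missing from `index` is invisible here, which is the
    whole problem with partial zone data.
    """
    report = {}
    for ip, domains in index.items():
        hits = len(domains & listed)
        report[ip] = (hits, len(domains), hits / len(domains))
    return report


if __name__ == "__main__":
    # Hypothetical, already-resolved data (no DNS queries are made).
    resolved = {
        "pills.example": ["219.254.32.111"],
        "bakery.example": ["219.254.32.111"],
        "school.example": ["219.254.32.111", "192.0.2.7"],
    }
    listed = {"pills.example"}  # stand-in for blacklisted domains
    index = domains_by_ip(resolved)
    for ip, row in sorted(listed_fraction(index, listed).items()):
        print(ip, row)
```

The known_count in each row is only as good as the zone data behind
it. If bakery.example and school.example had never been resolved, the
same IP would look 100% spammy.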

Jeff C.
