As you may be aware, we have a minor problem deciding which whitelisted
domains should be skipped in SURBL queries.

  SpamAssassin now ships with a list of domains, taken from the SURBL
  whitelist, that are excluded from SURBL lookups.  This list is the
  125 most commonly queried domains.

  SURBL counts the number of queries each domain receives to track the
  most commonly queried domains so we can produce an accurate list of
  domains.

  But once we skip a domain, its relative volume is going to drop way
  off in the SURBL data, so the counts no longer tell us whether it
  still belongs on the list.

One idea I had to fix this is for SA to ignore the exclusion list for 1
in 10 queries and to direct those queries to a different zone.  However,
that would be somewhat counterproductive in terms of DNS caching, and
I'm not sure how happy Jeff would be about the idea.
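
To make the 1-in-10 idea concrete, here is a rough sketch in Python (not
SA's actual code).  The function name, the sampling zone
"sample.surbl.example", and the use of a random draw per query are all
assumptions for illustration only.

```python
import random

SURBL_ZONE = "multi.surbl.org"        # normal lookup zone
SAMPLE_ZONE = "sample.surbl.example"  # hypothetical zone used only for counting
SAMPLE_RATE = 0.1                     # "1 in 10"

def lookup_name(domain, excluded, rng=random):
    """Return the DNS name to query for `domain`, or None to skip the lookup."""
    if domain not in excluded:
        # Not on the exclusion list: query the normal zone as usual.
        return f"{domain}.{SURBL_ZONE}"
    if rng.random() < SAMPLE_RATE:
        # Excluded domain, but this query was picked for sampling:
        # send it to the separate zone so its volume can still be counted.
        return f"{domain}.{SAMPLE_ZONE}"
    # Excluded and not sampled: skip the lookup entirely.
    return None
```

Counts seen in the sampling zone would need to be multiplied by 10 before
being compared with counts for domains that are not excluded.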

Another way would be to turn off the exclusion list for certain periods
of time and generate the volume data from just those periods.  That
seems a bit too hacky.

Another way to fix the problem would be to rank the domains using some
other source of volume data (not SURBL-related), such as the DNS cache
at a large ISP.
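
As a sketch of that last option, again in Python: given the whitelist and
per-domain query counts from some outside source, pick the top 125.  The
function name and the shape of the counts data (a plain dict of domain to
count) are assumptions; I don't know what an ISP could actually export.

```python
from collections import Counter

def top_excluded(whitelist, external_counts, n=125):
    """Rank whitelisted domains by externally measured volume; return the top n."""
    # Only whitelisted domains are candidates; unseen domains count as 0.
    counts = Counter({d: external_counts.get(d, 0) for d in whitelist})
    return [domain for domain, _ in counts.most_common(n)]
```

Because the ranking never looks at SURBL query counts, skipping a domain
does not change its position in the list.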

Any other ideas?

Daniel

-- 
Daniel Quinlan
http://www.pathname.com/~quinlan/
