[Bug 1375] do RBL look-ups on URLs

bugzilla-daemon 25 Jan 2004 03:03:28 -0000

http://bugzilla.spamassassin.org/show_bug.cgi?id=1375

------- Additional Comments From [EMAIL PROTECTED]  2004-01-24 19:03 -------
Dan said:

'Confirming email addresses is something
we cannot do.  What about just looking up the A/MX record for the domain itself
and checking those?  That should be safe.  Nobody is going to register one
domain per spam victim, but doing a one-way hash between hostname and user is
not too hard.  It doesn't have to be something that easily decodes into an
email address, it could just be an English word in a table, like:

  Hamlet -> [EMAIL PROTECTED]
  Mouse  -> [EMAIL PROTECTED]'
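
To make Dan's concern concrete, here is a minimal sketch of how a word in the
hostname could confirm an address once the spammer's wildcard DNS sees a query
for it. The table contents, the example.com addresses, and the function name
are hypothetical and only illustrate the idea:

```python
# Hypothetical table a spammer might keep: hostname token -> recipient.
# The addresses are placeholders, not taken from the original comment.
TOKEN_TO_RECIPIENT = {
    "hamlet": "user-one@example.com",
    "mouse": "user-two@example.com",
}


def confirmed_recipient(queried_hostname):
    """Return the recipient that a DNS query for this hostname would reveal.

    Only the leftmost label is inspected; returns None for unknown tokens.
    """
    token = queried_hostname.lower().split(".", 1)[0]
    return TOKEN_TO_RECIPIENT.get(token)
```

A lookup of "hamlet.spammer.example" would give the address away, while a
lookup with a randomized host part would reveal nothing.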

ok -- how's about this algorithm:

1. split hostname into host, domain parts, e.g. "www.slashdot.org" becomes
"www", "slashdot.org"; "www.foo.co.uk" becomes "www", "foo.co.uk";
"three.levels.of.crap.foo.org" becomes "three.levels.of.crap", "foo.org". (we
already have a RE in 2.70 to match the CCTLDs that do ".co.uk"-style 
subdelegation.)

2. if host is anything other than "www", empty, or one of a known set of ok
hostnames (determined empirically from our corpora), replace it with something
different (like random text) to avoid confirmation.

3. perform the lookups (A/MX, RBL, etc.) on the resulting hostname.
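
Steps 1 and 2 could be sketched roughly like this. It is only an illustration
in Python, not SpamAssassin code: the suffix pattern is a small stand-in for
the ccTLD RE mentioned in step 1, and the OK_HOSTS set and function names are
made up for the example. Step 3 (the actual DNS lookups) is left out.

```python
import re
import secrets

# Hypothetical, deliberately incomplete pattern for ".co.uk"-style suffixes.
TWO_LEVEL_SUFFIX = re.compile(r"\.(?:co|org|ac|gov|net)\.[a-z]{2}$")

# Hypothetical "known ok" host parts; the proposal is to derive the real
# set empirically from the corpora.
OK_HOSTS = {"", "www"}


def split_hostname(hostname):
    """Step 1: split a hostname into (host, domain) parts."""
    name = hostname.lower().rstrip(".")
    labels = name.split(".")
    # Keep three labels for ".co.uk"-style domains, two otherwise.
    keep = 3 if TWO_LEVEL_SUFFIX.search(name) else 2
    if len(labels) <= keep:
        return "", name
    return ".".join(labels[:-keep]), ".".join(labels[-keep:])


def sanitize_hostname(hostname):
    """Step 2: swap an unrecognized host part for random text."""
    host, domain = split_hostname(hostname)
    if host not in OK_HOSTS:
        host = secrets.token_hex(4)  # 8 random hex characters
    return f"{host}.{domain}" if host else domain
```

With this sketch, "www.slashdot.org" passes through unchanged, while something
like "joe1234.spammer.com" is looked up under a random host part of the same
domain.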

This should still work OK, because:

1. spammers are using wildcard DNS either to do address confirmation or, if
not that, to evade URL filters with random hostnames.  Either way, the host
part is throwaway and the domain part stays the same.

2. the distinction between a spammer URI and a nonspam one will be at the
domain level.  Can anyone think of a case where

     host-a.domain.com = spammer
     host-b.domain.com = nonspam

?  All I can think of is something like demon.co.uk, who assign subdomains, but
they have a strong antispam clue, do not have a spammer infestation, and theirs
are not the kind of URLs we're talking about catching with these rules anyway.

3. spammers cannot register enough domains to act as an address-confirmation
mechanism.  If they have a list of 300000 addresses, that'd require 300000
domains.  Expensive!




------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
