I'm still fiddling around with these spams that have a bunch of one-letter
words hiding drugs for sale:

    V k I p A m G i R u A v
    V j A v L s I t U w M g
    X g A f N a A f X q
    C x I e A a L g I c S l

followed by a URL:

    http://www.prouceteir.com

followed by some presumably benign text:

    physiolog
    resis
    comminute
    Phoeb
    ideologis
    not called for; local anesthetics were sufficient for the cleansing and
    suturing, followed by generous injections of antibiotics. The foreign
    objects had passed through their bodies, explained the chief doctor.
    I presume you mean bullets when you speak so reverently of foreign
    objects, said Krupkin in high dudgeon.
    He means bullets, confirmed Alex hoarsely in Russian. The retired

I don't think there's much to grab onto in the benign text section.  However,
the URL tends to vary a lot, and the domain name generally seems very new.
For instance, according to whois, the above domain was created on April
28th.  I received the spam containing it on April 30th.  The others of this
ilk I've looked at also used new domains.  That suggests a couple of
possibilities to me:

    * look up the age of the domains via whois (preferably caching those
      lookups for a reasonable period - 90 days, one year?)

    * note whether or not you've seen the domain before

    * look up (and cache) other information about the domain name -
      registrar, registrant, etc.

The creation date currently seems the hardest to fake, though it's expensive
to look up, and I suppose eventually the spammers will start creating their
own registrars (if they haven't already) and backdating the information they
provide.
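
Here's a rough sketch of the first two ideas from the list above (cached
domain age plus "have I seen this domain before"), just to make them
concrete.  Everything named here is made up for illustration:
DomainAgeCache, check, and in particular lookup_creation_date, which stands
in for whatever whois query you'd actually use and is assumed to return a
datetime.date (or None).  The cache is in-memory only, and stripping a
leading "www." is a naive way to get at the registered domain.

```python
from datetime import date, timedelta
from urllib.parse import urlparse


class DomainAgeCache:
    """Cache domain creation dates and remember which domains were seen."""

    def __init__(self, lookup_creation_date, ttl_days=90):
        # lookup_creation_date: caller-supplied callable (hypothetical) that
        # takes a domain name and returns a datetime.date, or None if unknown.
        self._lookup = lookup_creation_date
        self._ttl = timedelta(days=ttl_days)
        self._entries = {}  # domain -> (creation date, date fetched)

    def check(self, url, today):
        """Return (age in days or None, whether the domain was seen before)."""
        host = urlparse(url).hostname or ""
        # Naive simplification: only a leading "www." is stripped.
        domain = host[4:] if host.startswith("www.") else host
        entry = self._entries.get(domain)
        seen_before = entry is not None
        # Query again only if there is no entry or the entry has expired.
        if entry is None or today - entry[1] > self._ttl:
            entry = (self._lookup(domain), today)
            self._entries[domain] = entry
        created = entry[0]
        age = None if created is None else (today - created).days
        return age, seen_before
```

The age and the seen-before flag could then be turned into tokens (say,
bucketed by days) so the classifier decides how much weight they deserve,
rather than hard-coding a cutoff.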

I suppose you could start tokenizing these one-letter runs as well and see
if they contain embedded words:

    C x I e A a L g I c S l ==> CIALIS
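
As a minimal sketch of what I mean, assuming the hidden word always sits in
the even positions of the run (which holds for the four samples above, but
is only a guess about the spammers' scheme in general).  The function name
and the min_letters threshold are made up for illustration:

```python
def decode_single_letter_run(line, min_letters=4):
    """Return the word hidden in a run of one-letter tokens, or None.

    Assumes the payload occupies every other token, starting with the
    first; the interleaved letters are treated as filler.
    """
    tokens = line.split()
    # Require a reasonably long run so ordinary text is not matched.
    if len(tokens) < 2 * min_letters:
        return None
    # Every token must be a single alphabetic character.
    if any(len(tok) != 1 or not tok.isalpha() for tok in tokens):
        return None
    return "".join(tokens[0::2]).upper()
```

The decoded word could then be emitted as an extra token alongside the
normal ones, so it gets scored like any other clue.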

Thoughts?  Anybody else seeing lots of this stuff sneak through as unsure?

Skip
_______________________________________________
spambayes-dev mailing list
spambayes-dev@python.org
http://mail.python.org/mailman/listinfo/spambayes-dev
