Hi,

>> > >> http://pastebin.com/P0cJdf2V
>>
>> The URLs in the body of these messages don't give consistent results for
>> a domain lookup and a reverse lookup on the IP:

I was hoping to write a rule that matches a short message body
containing little more than a URL. I thought this would be a good basis
for a meta, perhaps combined with RDNS_NONE or BAYES_99. However, my
attempt has fallen far short:

body            __SHORT_BODY    /.{1,150}$/
describe        __SHORT_BODY    Short email body
body            __BODY_URI      m{https?://.{1,50}$}
describe        __BODY_URI      Message body contains URI
meta            LOC_SHORT       (__SHORT_BODY && __BODY_URI)
describe        LOC_SHORT       Contains short body and URI
score           LOC_SHORT       0.2

I'd appreciate it if someone could help me create rules that identify a
message body shorter than 150 chars that contains a URL shorter than
50 chars.
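
To make the intent concrete, here is a rough Python sketch of the check
I'm after. It is not SpamAssassin syntax, only an illustration of the
logic. The function name, the URL pattern and the default thresholds are
my own assumptions, and it treats the whole decoded body as one string.

```python
import re

# Assumed, simplistic URL pattern: scheme followed by non-whitespace.
URL_RE = re.compile(r"https?://\S+")

def is_short_body_with_short_url(body, max_body=150, max_url=50):
    """Return True if the whole body is shorter than max_body characters
    and contains at least one URL shorter than max_url characters."""
    text = body.strip()
    # The length test applies to the entire body, not to a single line.
    if not text or len(text) >= max_body:
        return False
    return any(len(url) < max_url for url in URL_RE.findall(text))
```

The point is that the length limit has to apply to the body as a whole,
while the URL only needs to appear somewhere inside it.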

Would it make sense to parse the interpreted HTML or analyze the
rawbody directly? Many times the spam doesn't contain any HTML at all.

Thanks,
Alex
