On 30/12/10 19:15, Lawrence @ Rogers wrote:
> Lately, I notice we are getting a fair amount (10-12 per day per client)
> of spam coming from freemail users (FREEMAIL_FROM triggers). Usually the
> Subject is non-existent or empty, and the message is always just an URL

I see a fair amount matching that description, and corresponding complaints. In the past few weeks there seems to have been a shift from Hotmail/MSN/Live to cracked Yahoo and AOL/AIM accounts as well. Someone at the freemail providers should know whether the passwords are obtained by phishing (such as tabnabbing), by a keylogger, or even by a dictionary attack.

There's no text for Bayes or body rules to match; because the URL is on a cracked site, URIBL_* isn't usually appropriate; and because the mail comes from a cracked account, the headers are fine and it may even reach users who've chosen to accept email only from friends/contacts. More of the originating IPs should hit deep-parsing RBLs than actually do.

So it could be argued that the best response is not to block, but to let owners of cracked accounts know they need to change their password and secret questions (or close the account if it can't be recovered), and also to report the cracked sites and originating IPs, possibly by educating users about SpamCop.

> Is there a good rule for flagging these as possible spam? I understand
> that there may be some legit e-mails that would hit all 3 factors, so I
> would score the rule low.
>
> Thoughts?

Something like:

meta FREEMAIL_PHARM_PROB ((FREEMAIL_FROM + MISSING_SUBJECT + LINK_NR_TOP) >=3)
describe FREEMAIL_PHARM_PROB Looks like simple link from cracked account
score FREEMAIL_PHARM_PROB 2.5

LINK_NR_TOP is the only additional element needed, to indicate message length:

rawbody LINK_NR_TOP /^.{0,20}http:(?<!src=.http:)(?<!xmlns=.http:)\S{5,100}.{0,100}$/si
describe LINK_NR_TOP Short message with link near top
score LINK_NR_TOP 0.1

The length of text either side of the URL could be adjusted as needed.

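To see what the length limits in LINK_NR_TOP do before adjusting them, the regex can be tried outside SpamAssassin. A rough Python sketch, assuming the pattern is applied to the whole text as one string (how SpamAssassin splits rawbody text before matching is not modelled here); the helper name and the sample bodies are made up for illustration:

```python
import re

# The LINK_NR_TOP pattern from the rule above. Perl's /si flags map to
# re.S (dot also matches newline) and re.I (case-insensitive).
LINK_NR_TOP = re.compile(
    r"^.{0,20}http:(?<!src=.http:)(?<!xmlns=.http:)\S{5,100}.{0,100}$",
    re.S | re.I,
)


def hits_link_nr_top(text: str) -> bool:
    """Return True if the pattern matches the text (hypothetical helper)."""
    return LINK_NR_TOP.search(text) is not None


# Made-up sample bodies, for illustration only.
samples = {
    "bare link": "http://example.com/images/a.php\n",
    "short note": "Hi,\nhttp://example.com/x.htm\nBob",
    "long preamble": "x" * 200 + " http://example.com/x.htm",
    "long tail": "http://example.com/x.htm " + "word " * 100,
    "img src": '<img src="http://example.com/a.png">',
}

if __name__ == "__main__":
    for name, body in samples.items():
        print(f"{name}: {hits_link_nr_top(body)}")
```

Changing the {0,20} and {0,100} bounds in the sketch and re-running it gives a quick feel for how much surrounding text the rule tolerates, but real testing should still be done against a corpus with SpamAssassin itself.
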
A stricter variant that allows only whitespace either side of the link:

rawbody LINK_ONLY /^\s{0,20}http:\S{5,100}\s{0,100}$/si

TVD_SPACE_RATIO usually hits when there is no whitespace, and could also be used in the meta. GENERIC_IXHASH <http://sourceforge.net/projects/ixhash/> seems to hit a greater percentage than other body checksums (the body being empty or very short).

There are also short-lived patterns in the name of the uploaded abusive file:

uri FREEMAIL_PHARM1 /\/mtxtsx\.htm/
describe FREEMAIL_PHARM1 Particular link on cracked site, Jan 2011
score FREEMAIL_PHARM1 8.0

uri FREEMAIL_PHARM2 /\/(?:2011\.php\?\w+=\w+$|foto2011\.php|clickhere\.php|important\.php|mywork\.html)/
describe FREEMAIL_PHARM2 Particular link on cracked site, Jan 2011
score FREEMAIL_PHARM2 4.0

uri FREEMAIL_PHARM3 /\/\/[a-z0-9A-Z.-]+\/images\/[A-Za-z0-9\-]+\.(?:php|htm)/
describe FREEMAIL_PHARM3 Top-level images folder, php or htm extension
score FREEMAIL_PHARM3 0.1

HTH

CK