Re: Spam from compromised accounts scoring just under block threshold
Following up on this... I've been consistently seeing a lot of spam like this, with multi-dot usernames. Sometimes with "person.from.spam" but more often just a punctuated phrase like "some.spammy.item.sold" or whatever. Most often only two dots (three words), sometimes four or more.

Has anyone been testing this as a meta rule?

Cheers.

--- Amir

> On Mar 6, 2018, at 9:37 AM, John Hardin wrote:
>
> On Mon, 5 Mar 2018, Amir Caspi wrote:
>
>> On Mar 5, 2018, at 11:13 PM, John Hardin wrote:
>>>
>>> *before* the @ sign.
>>>
>>> It may be perfectly valid to do that, but if it happens more often in spam
>>> than in legitimate mail it is useful to us.
>>
>> I’m seeing a lot of spam lately with usernames like
>> “bob.from.somespamcompany”. Could definitely be at least a meta rule.
>
> ...or potentially from:addr =~ /[^@]*\.from\.[^@]*@/ if ".from." is
> literally in the username part.
>
> --
> John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
> jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
> key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
> ---
> Failure to plan ahead on someone else's part does not constitute
> an emergency on my part. -- David W. Barts in a.s.r
> ---
> 5 days until Daylight Saving Time begins in U.S. - Spring Forward
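For anyone who wants to try the two ideas against a corpus before writing a real rule, here is a rough Python sketch. The first pattern is the ".from." regex quoted above. The second pattern (two or more dots in the username part) and the sample addresses are my own assumptions for illustration, not anything that has been tested as a rule.

```python
import re

# The regex quoted in the thread: ".from." somewhere before the @ sign.
FROM_IN_LOCALPART = re.compile(r"[^@]*\.from\.[^@]*@")

# Assumed pattern (not from the thread): a username made of three or more
# dot-separated words, i.e. two or more dots before the @ sign.
MULTI_DOT_LOCALPART = re.compile(r"^[^@.]+(?:\.[^@.]+){2,}@")


def check(addr):
    """Return which of the two candidate sub-rules fire for an address."""
    return {
        "from_in_localpart": bool(FROM_IN_LOCALPART.search(addr)),
        "multi_dot": bool(MULTI_DOT_LOCALPART.search(addr)),
    }


if __name__ == "__main__":
    # Made-up sample addresses, for illustration only.
    for addr in (
        "bob.from.somespamcompany@example.com",
        "some.spammy.item.sold@example.com",
        "john.smith@example.com",
        "alice@mail.from.example.com",
    ):
        print(addr, check(addr))
```

A single dot ("john.smith") is deliberately left alone, since that is very common in legitimate mail, and a ".from." in the domain part does not trigger the first pattern.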
Bayes and hyphens
Hi all,

Does Bayes tokenize on word boundaries, and hence ignore hyphens? Or does it include them?

I've seen a lot of spam lately inserting random hyphens between key spammy words (like "economic-crisis"), presumably in an attempt to bypass word filters and/or Bayes. So would word1-word2 get tokenized as a single item or as two words?

If hyphens are currently included, then perhaps Bayes should be updated to ignore hyphens and/or tokenize at word boundaries?

Cheers.

--- Amir
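To make the question concrete, here is a small Python sketch of the two behaviours being contrasted. Neither function is claimed to be what the SpamAssassin Bayes tokenizer actually does; both are simplified stand-ins with names I made up.

```python
import re


def tokens_whitespace(text):
    """Split on whitespace only, so a hyphenated pair stays one token."""
    return text.lower().split()


def tokens_word_boundary(text):
    """Split on anything that is not a letter or digit, so hyphens vanish."""
    return re.findall(r"[a-z0-9]+", text.lower())


if __name__ == "__main__":
    sample = "Economic-crisis relief"
    print(tokens_whitespace(sample))
    print(tokens_word_boundary(sample))
```

With the first approach "economic-crisis" is learned as a single new token that has no history in the database. With the second, the already-learned tokens "economic" and "crisis" are seen regardless of the hyphen.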
Re: Lots of money, score of 0??
On Thu, 29 Mar 2018 08:50:48 -0700 (PDT)
John Hardin wrote:

> On Thu, 29 Mar 2018, RW wrote:
>
> > The rule is matching on "$10.99 o" and "£1.70 2 6" respectively.
>
> Sadly that's kind of unavoidable given spammer obfuscation and the
> fact that cultures differ on what character to use for the decimal
> point and thousands separator.
>
> > I've seen other types too, e.g.
> >
> > https://example.com/?f=a37688909bc4f6
> >
> > £20 M voucher
>
> *that* is a bit unexpected...

It's understandable though, because it's "£20 M" followed by a word boundary.

The other one could be seen as a bug: __LOTSA_MONEY_01 is an ordinary body rule, so a "=a3" that represents a "£" should have already been decoded.
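Both effects can be reproduced with a toy pattern in Python. To be clear, the regexes below are hypothetical and heavily simplified. They are not the real __LOTSA_MONEY_01 expression, and they only assume that the rule accepts a literal "=a3" as an alternative to "£".

```python
import re

# Quoted-printable writes the Latin-1 byte 0xA3 as "=A3"; that byte is "£".
POUND = bytes.fromhex("a3").decode("latin-1")

# Hypothetical amount pattern: a currency sign followed by either a long
# run of digits or a number with an "M" (millions) suffix and a word boundary.
AMOUNT = r"\s?(?:\d{7,}|\d+\s?M\b)"

# Variant that also accepts an undecoded "=a3" in place of the pound sign.
WITH_RAW_QP = re.compile(r"(?:[$£]|=a3)" + AMOUNT)

# Variant that relies on the body text having been decoded already.
DECODED_ONLY = re.compile(r"[$£]" + AMOUNT)


def first_match(pattern, text):
    """Return the matched substring, or None if the pattern does not hit."""
    m = pattern.search(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    url = "https://example.com/?f=a37688909bc4f6"
    print(POUND)
    print(first_match(DECODED_ONLY, "£20 M voucher"))
    print(first_match(DECODED_ONLY, "£20 Members"))
    print(first_match(WITH_RAW_QP, url))
    print(first_match(DECODED_ONLY, url))
```

In this sketch "£20 M voucher" hits because the space after "M" is a word boundary, while "£20 Members" does not. The URL only hits when the literal "=a3" alternative is present, which is the part that looks unnecessary for a body rule running on decoded text.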