Re: A little help with a local.cf rule... please!

2009-12-30 Thread Michael Alan Dorman
 So my rule:
 # hotmail drug spam
 uri MY_HOTMAIL_SPAM m{https?://{1,30}\.{1,30}\.(com|ru|cn)/[0-9][0-9][0-9][0-9]/i}
 describe MY_HOTMAIL_SPAM Druggy hotmail.com links
 score MY_HOTMAIL_SPAM 5.0
 
 And running emails through it using -D, it does not hit it as far as
 I can tell - scores 3.5 due to other tests.
 Yes, it IS reading it, because if I mess with the rule and give it
 bad syntax, SA --lint complains loudly. Right now, no complaints -
 and no results.
 Any ideas? Suggestions?

//{1,30} matches a slash, followed by 1-30 more slashes.
\.{1,30} matches 1-30 periods.

I think you forgot a \S or something before each of those quantifiers.
Also, [0-9]{4} would do what you want for the numeric component.  And I
think you want the i *after* the closing brace, no?  As written, the /i
is matched as literal text.
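
To make that concrete, here is a quick sketch in Python rather than SpamAssassin's Perl (the quantifier behaviour is the same for these constructs). The test URLs are made up, and \S is only a guess at what was meant before each quantifier, so treat the "fixed" pattern as one possibility and not as a tested rule.

```python
import re

# The rule's pattern as posted. Each {1,30} applies to the character right
# before it ("/" and "\."), and the "/i" sits inside the delimiters, so it
# is matched as literal text instead of acting as a case-insensitive flag.
broken = re.compile(r"https?://{1,30}\.{1,30}\.(com|ru|cn)/[0-9][0-9][0-9][0-9]/i")

# One possible repair (an assumption, not a tested SpamAssassin rule):
# \S before each quantifier, [0-9]{4} for the digits, and the flag moved out.
fixed = re.compile(r"https?://\S{1,30}\.\S{1,30}\.(?:com|ru|cn)/[0-9]{4}", re.IGNORECASE)

# Hypothetical URLs, for illustration only.
spam_url = "http://abc.example.com/1234"
odd_url = "http://..com/1234/i"

print(bool(broken.search(spam_url)))  # broken pattern vs. a plausible URL
print(bool(broken.search(odd_url)))   # broken pattern vs. slashes-and-dots text
print(bool(fixed.search(spam_url)))   # repaired pattern vs. a plausible URL
```

Translated back into local.cf, the repaired pattern would presumably read m{https?://\S{1,30}\.\S{1,30}\.(?:com|ru|cn)/[0-9]{4}}i, but I have not run that through --lint.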

Mike.


Re: Dear Santa

2009-12-20 Thread Michael Alan Dorman
On Sat, 19 Dec 2009 10:06:11 -0600
Dave Pooser dave...@pooserville.com wrote:
 share the code so that some of us could auto-generate rules based on
 our own ham/spam mailstreams, and then share those rules with you for
 possible SOUGHT inclusion?

I think that's already done, though not well documented; check
$SRC/masses/rule-dev. The blog posts referenced on the SOUGHT page of
the wiki describe some of the process.

Mike.


Re: Serious problem with scores file for todays rule update?

2008-12-30 Thread Michael Alan Dorman
On Tue, 30 Dec 2008 09:55:52 +
Justin Mason jma...@gmail.com wrote:

 Does the sa-compile step complete with an exit code of 0?  If there
 are problems with re2c (which has happened in the past) it should exit
 with !=0.

There were no errors visible in the output, but the script I was using
to do the update is, of course, one of the few that I've written
without using /bin/sh -e, so even if sa-compile had failed, it would
have continued.
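
For what it's worth, the stop-on-first-failure behaviour I was missing can be sketched like this. It is in Python, not shell, purely so it runs anywhere; the two stand-in commands are hypothetical placeholders for the sa-compile call and whatever restart step a site uses, and run_steps is a made-up helper name.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each command in order; stop at the first non-zero exit code."""
    for cmd in steps:
        # check=True raises CalledProcessError on a non-zero exit status,
        # which is roughly what "sh -e" does for a shell script.
        subprocess.run(cmd, check=True)


# Stand-in commands so the sketch runs anywhere; in a real update script
# these would be the sa-compile call and the (site-specific) restart step.
failing_step = [sys.executable, "-c", "import sys; sys.exit(1)"]
later_step = [sys.executable, "-c", "print('restarting')"]

try:
    run_steps([failing_step, later_step])
    stopped_early = False
except subprocess.CalledProcessError as err:
    # The later step never ran, so nothing gets restarted on bad output.
    stopped_early = True
    print("step failed with exit code", err.returncode)
```

Without the check (or without -e in the shell version), the second step runs regardless of how the first one exited.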

I suspect we can mark this down to re2c not liking something yesterday,
plus an I/O error.

Thanks, Justin,

Mike.


Serious problem with scores file for todays rule update?

2008-12-29 Thread Michael Alan Dorman
Hey, all,

I have a bunch of servers that picked up a rule update, 729912, this
morning at about 10am EST, at which point all hell broke loose: scores
for everything but bayes dropped to almost nothing.

Has anyone else experienced anything like this?

Mike.


Re: Serious problem with scores file for todays rule update?

2008-12-29 Thread Michael Alan Dorman
On Mon, 29 Dec 2008 23:21:48 +
j...@jmason.org (Justin Mason) wrote:

 hmm.  What do you have in /var/lib/spamassassin for the scores files?
 they should look like this:
 
 : 183...; ls -l /var/lib/spamassassin/3.002006/updates_spamassassin_org/50_scores.cf /var/lib/spamassassin/3.002006/updates_spamassassin_org/72_scores.cf
 -rw-r--r-- 1 root root 48928 Dec 29 23:20 /var/lib/spamassassin/3.002006/updates_spamassassin_org/50_scores.cf
 -rw-r--r-- 1 root root  1392 Dec 29 23:20 /var/lib/spamassassin/3.002006/updates_spamassassin_org/72_scores.cf

Hey, Justin, thanks for the quick response.

My 50_scores.cf is 48923 bytes, so it differs, but it's close enough.

In fact, it didn't occur to me immediately, but further investigation
(for lack of a better word for the last rather tense 45 minutes :) seems
to be pointing the finger at sa-compile, rather than the scores.

I got fixated on 72_scores.cf and totally forgot about 50_scores.cf,
which is why I was thinking scores at first.

I'll be doing more testing and such, later, but just zapping the
compiled files and restarting the processes seems to have taken care of
it.

If there's anything in particular you'd like me to do to try and help
track the interaction down, please let me know.  I'm using re2c 0.13.5
on debian amd64 boxes, and am happy to throw some time and resources at
figuring out what's going on.

Mike.


Re: BOTNET Exceptions for Today

2007-08-21 Thread Michael Alan Dorman
On Tue, 21 Aug 2007 16:56:27 -0500
Andy Sutton [EMAIL PROTECTED] wrote:

 On Tue, 2007-08-21 at 13:42 -0700, John Rudd wrote:
  b) Botnet gets 0% false positives at one of my services (not just 
  borked DNS == bad, as you're suggesting, but actual everything
  that triggered botnet was actually spam).  And, yes, I actually
  check
 
 I never suggested that.

Um, you suggested _exactly_ that.  From the message John was replying to
([EMAIL PROTECTED]):

  On Tue, 2007-08-21 at 13:08 -0700, Bret Miller wrote:
   When I see on the list that many people run botnet with ZERO false
   positives, I have to ask myself, how?   

  Anyone who claims that isn't really looking at the email they are
  blocking, or don't believe borked DNS qualify as a FP.

 A bit tetchy today?

When you're presenting hyperbole as reasoned commentary, seems to me
John has a right to be tetchy.

If you had said what you said in this message originally, I suspect you
would have gotten a different response.

Mike.


Re: Bayes column 'token'

2006-11-21 Thread Michael Alan Dorman
On Tue, 21 Nov 2006 13:42:09 +0100
Jonas Eckerman [EMAIL PROTECTED] wrote:

  CREATE TABLE bayes_token (
    PRIMARY KEY (id, token),
    INDEX bayes_token_idx1 (token),
    INDEX bayes_token_idx2 (id, atime)
  ) TYPE=MyISAM;
 
  PRIMARY for `id` and `token` should not have INDEX for `id` and
  `token` added, too.
 
 Why not?
 
 IIRC the three indexes above make perfect sense. Like this:
 
 WHERE id=xxx AND token=xxx will use the primary index.
 
 WHERE token=xxx will use the bayes_token_idx1 index.
 
 WHERE id=xxx AND atime=xxx will use the bayes_token_idx2 index.
 
 Again IIRC, the clause WHERE token=xxx should be faster with the
 existence of the bayes_token_idx1 index than without it.

If the primary key were changed to (token, id), it could serve the
second sort of query as well as the first, no? Or is MySQL not smart
enough to recognize that it's got an index it could match on a prefix
basis?

 Or is it simply that the MySQL bayes store module never queries with
 token as the first column in a WHERE clause?

The position of a column in the WHERE clause shouldn't affect whether an
index is used; the nature of SQL is such that the conditions in a WHERE
clause should be reorderable.  I'm a PostgreSQL guy myself, but I would
still be surprised if MySQL were limited in this way.
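
As a rough illustration of the prefix idea, here is a sketch using SQLite through Python's sqlite3 module. This is SQLite, not MySQL, so it says nothing definite about MySQL's planner; the table names id_first and token_first and the simplified column list are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified, assumed column list; only the key order differs between tables.
columns = "id INTEGER, token TEXT, spam_count INTEGER, ham_count INTEGER, atime INTEGER"
conn.execute(f"CREATE TABLE id_first ({columns}, PRIMARY KEY (id, token))")
conn.execute(f"CREATE TABLE token_first ({columns}, PRIMARY KEY (token, id))")


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Lookup by token alone: only the (token, id) key has token as its prefix.
plan_id_first = plan("SELECT * FROM id_first WHERE token = 'abc'")
plan_token_first = plan("SELECT * FROM token_first WHERE token = 'abc'")

# The same two conditions, written in either order.
plan_ab = plan("SELECT * FROM id_first WHERE id = 1 AND token = 'abc'")
plan_ba = plan("SELECT * FROM id_first WHERE token = 'abc' AND id = 1")

for label, text in [
    ("key (id, token), token only", plan_id_first),
    ("key (token, id), token only", plan_token_first),
    ("id then token", plan_ab),
    ("token then id", plan_ba),
]:
    print(label, "->", text)
```

In SQLite, at least, the token-only lookup falls back to a table scan when the key starts with id, uses the key's index when it starts with token, and the order of the two conditions makes no difference to the plan.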

Mike.


Re: RelayChecker 0.3

2006-11-17 Thread Michael Alan Dorman
On Thu, 16 Nov 2006 17:56:21 -0800
Derek Harding [EMAIL PROTECTED] wrote:

 On Sun, 2006-11-12 at 17:26 -0800, John Rudd wrote:
 
  http://people.ucsc.edu/~jrudd/spamassassin/RelayChecker.tar
 
 I've been running this for a few days now and am finding it to be
 pretty effective, especially against the bots that are producing all
 the image spam.
 
 Currently it's running about 87.55% hit rate with only two false
 positives so far (one a company on adsl, the other a mail server with
 no reverse DNS).

For reasons that I haven't investigated closely, I'm finding that
RelayChecker consistently tags mail from the Dojo Toolkit's mailing
list as well as the Catalyst toolkit's mailing list.

I lowered the score from 6 to 4.5, though, and it's continued to be
effective, while letting those emails through.

Mike.