Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-08 Thread Matthias Andree

On Sat, 07 Oct 2006, Jo Rhett wrote:

> > 2. SA's self-registering stuff should be disabled - Bayes filters need
> >    to know what the human that is to receive the messages considers
> >    spam, not what a machine second-guesses.
>
> Thanks.  I knew most of this, which is why I wasn't going to bother
> trying to train databases this time around.  But what do you mean by
> self-registering?

I mean that SpamAssassin automatically enters messages with a sufficiently
low or high score into the Bayes filtering databases. ISTR SA calls this
autolearning or something like that (bear with me, I don't use it).

-- 
Matthias Andree

-
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys -- and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
___
AMaViS-user mailing list
AMaViS-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/amavis-user
AMaViS-FAQ:http://www.amavis.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/howto/

Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-08 Thread Mark Martinec

Jo,

> Nope.  I'm coming off of using CanIt, which does a damn good job.  Since
> I can't afford the pro version for my personal machine, I was just
> checking out how good the open source interfaces have gotten.  (besides
> spamassassin, which is obviously the core of CanIt's rules too)
...
> I do plan to look at SARE and sa-update and various other things, but
> right now I'm working deliberately with as-bone-stock-as-possible just
> to see how well it works out of the box.

As an experiment it is valid to try bare-bones SA, although
for production use most sites use some subset of SARE rules,
and with more recent versions of SA the use of sa-update is
very much recommended, as it adds a couple of very useful
last-minute additions or fixes to base rules. It would not be fair
to judge SA without SARE, sa-update (and network test and bayes).

> I'm fairly pleased with amavis, but struggling with the lack of
> documentation.  Something that I plan to spend a lot of November
> improving, once I've gotten my head around all of it.

Patrick Ben Koetter <p at state-of-mind.de> is investing his time
in documenting amavisd-new, the project is progressing steadily.
If you have serious intentions in helping with documentation,
please contact him to avoid duplicating efforts.

  Mark


Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-08 Thread Jo Rhett

Mark Martinec wrote:
> As an experiment it is valid to try bare-bones SA, although
> for production use most sites use some subset of SARE rules,
> and with more recent versions of SA the use of sa-update is
> very much recommended, as it adds a couple of very useful
> last-minute additions or fixes to base rules. It would not be fair
> to judge SA without SARE, sa-update (and network test and bayes).

I had come to that conclusion myself, but only by research.  This should 
be documented better (which leads to the next point...)

> Patrick Ben Koetter <p at state-of-mind.de> is investing his time
> in documenting amavisd-new, the project is progressing steadily.
> If you have serious intentions in helping with documentation,
> please contact him to avoid duplicating efforts.

Will do.  Thanks.

-- 
Jo Rhett
Network/Software Engineer
Net Consonance


Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-07 Thread Matthias Andree

Jo Rhett [EMAIL PROTECTED] writes:

> I keep seeing spam messages with negative scores, like this:
>
> No, score=-0.317 tagged_above=-1.99 required=4.01
> tests=[BAYES_00=-2.599, DNS_FROM_RFC_ABUSE=0.2, DNS_FROM_RFC_POST=1.708,
> HTML_30_40=0.374, HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.001,
> NO_RECEIVED=-0.001, NO_RELAYS=-0.001]
>
> So I'm trying to figure out what amavisd is doing with the sa bayes
> stuff, but all I can find is a few notes that amavisd-new uses a single
> bayes database for all users.  Nothing about how to tweak it ...

Take this with a grain of salt, I'm biased (being the bogofilter co-maintainer):

IMHO the SpamAssassin Bayes stuff should be disabled -- for various reasons:

1. It causes a major performance issue, even on lightly loaded
   Pentium D and Xeon 2.8 equipped servers, which causes filtering to be
   aborted through amavis's timeouts (at least for my older amavisd-new
   version).

2. SA's self-registering stuff should be disabled - Bayes filters need
   to know what the human that is to receive the messages considers
   spam, not what a machine second-guesses.

3. Centralized Bayes filtering doesn't work well,
   as you've just seen in one of the failure modes that are sort of
   expected for this class of filters.

4. Per-user databases don't scale too well.

5. Bayes requires users to be permanently alert and correct mistakes.

I'm not saying that SpamAssassin alone or Bayes are bad, but
SpamAssassin's Bayes implementation is, and integrating it with
amavisd-new doesn't ameliorate the pain.

3 to 5 are generic issues that affect all trainable/Bayesian filters.

-- 
Matthias Andree


Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-07 Thread Jo Rhett

Gary V wrote:
> It's sad, but we are at war with the spammers and the virus writers
> and the adware companies and the hackers and the spyware writers.
> Their tactics change on a daily basis, and so must ours. War often
> requires effort, sorry.

I know that.  Which is why I keep pointing out that spammers have 
adopted tactics to deliberately poison bayes caches.  But everyone 
ignores that.

Unless the bayes mechanisms are updated to be smarter, the current crop 
of bayes-polluting auto-bot spam will invalidate everyone's bayes.

> What version of SpamAssassin are you running? Are you using any SARE
> rules?

Nope.  I'm coming off of using CanIt, which does a damn good job.  Since 
I can't afford the pro version for my personal machine, I was just 
checking out how good the open source interfaces have gotten.  (besides 
spamassassin, which is obviously the core of CanIt's rules too)

I'm fairly pleased with amavis, but struggling with the lack of 
documentation.  Something that I plan to spend a lot of November 
improving, once I've gotten my head around all of it.

I do plan to look at SARE and sa-update and various other things, but 
right now I'm working deliberately with as-bone-stock-as-possible just 
to see how well it works out of the box.

-- 
Jo Rhett
Network/Software Engineer
Net Consonance


Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-07 Thread Jo Rhett

Matthias Andree wrote:
> Take this with a grain of salt, I'm biased (being the bogofilter
> co-maintainer):
>
> IMHO the SpamAssassin Bayes stuff should be disabled -- for various reasons:
>
> 1. is that it causes a major performance issue, even on lightly loaded
>    Pentium D and Xeon 2.8 equipped servers, which causes filtering to be
>    aborted through amavis's timeouts (at least for my older amavisd-new
>    version).
>
> 2. SA's self-registering stuff should be disabled - Bayes filters need
>    to know what the human that is to receive the messages considers
>    spam, not what a machine second-guesses.
>
> 3. centralistic Bayes filtering doesn't work well,
>    as you've just seen in one of the failure modes that are sort of
>    expected for this class of filters
>
> 4. per-user databases don't scale too well
>
> 5. Bayes requires users to be permanently alert and correct mistakes.
>
> I'm not saying that SpamAssassin alone or Bayes are bad, but
> SpamAssassin's Bayes implementation is, and integrating it with
> amavisd-new doesn't ameliorate the pain.
>
> 3 to 5 are generic issues that affect all trainable/Bayesian filters.

Thanks.  I knew most of this, which is why I wasn't going to bother 
trying to train databases this time around.  But what do you mean by 
self-registering?

-- 
Jo Rhett
Network/Software Engineer
Net Consonance


Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-07 Thread Clifton Royston

On Sat, Oct 07, 2006 at 09:58:49AM -0700, Jo Rhett wrote:
> Matthias Andree wrote:
> > Take this with a grain of salt, I'm biased (being the bogofilter
> > co-maintainer):
> >
> > IMHO the SpamAssassin Bayes stuff should be disabled -- for various reasons:
> >
> > 1. is that it causes a major performance issue, even on lightly loaded
> >    Pentium D and Xeon 2.8 equipped servers, which causes filtering to be
> >    aborted through amavis's timeouts (at least for my older amavisd-new
> >    version).
> >
> > 2. SA's self-registering stuff should be disabled - Bayes filters need
> >    to know what the human that is to receive the messages considers
> >    spam, not what a machine second-guesses.
...
> Thanks.  I knew most of this, which is why I wasn't going to bother
> trying to train databases this time around.  But what do you mean by
> self-registering?

  autolearn - The default is for SA to feed into Bayes anything that
scores high enough (as spam) or low enough (as ham).  I agree that this
is a fundamentally and philosophically flawed idea.  At the least it
should not be a default and should use thresholds much further apart
than it did when I last checked.

  If you weren't aware of this, this may be how your Bayes DBs ended up
poisoned.  With autolearn it can happen anytime a new style of spam
comes through which is not caught by SA's existing rules.

  -- Clifton

-- 
Clifton Royston  --  [EMAIL PROTECTED] / [EMAIL PROTECTED]
   President  - I and I Computing * http://www.iandicomputing.com/
 Custom programming, network design, systems and network consulting services


Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-07 Thread Gary V

Jo wrote:

> Gary V wrote:
>
> > It's sad, but we are at war with the spammers and the virus writers
> > and the adware companies and the hackers and the spyware writers.
> > Their tactics change on a daily basis, and so must ours. War often
> > requires effort, sorry.
>
> I know that.  Which is why I keep pointing out that spammers have
> adopted tactics to deliberately poison bayes caches.  But everyone
> ignores that.

I would say that because of the way Bayes works, this does
happen, but also because of the way Bayes works, it can be
prevented. I appreciate the fact you have illustrated that if
one chooses not to maintain their database, this may result.

> Unless the bayes mechanisms are updated to be smarter, the current crop
> of bayes-polluting auto-bot spam will invalidate everyone's bayes.

This could be rephrased:

Unless the anti-spam mechanisms are updated to be smarter, the current crop
of spam will invalidate everyone's anti-spam mechanisms.

Not surprising then that a new version of SpamAssassin comes out so
often, and rules (SA or 3rd party) change on a nearly daily basis.

Personally, my only goal is to attempt to solve problems with what is
currently available. For your system, if your Bayes is corrupt, then
work toward setting it right or get rid of it. SpamAssassin will work
without it, and is designed to compensate for the loss.

Gary V



Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-06 Thread Mark Martinec

Jo,

> Now, the spammers are putting lots of junk text in their spam and
> polluting the databases to such an extent that Bayes is much less useful.
>
> So I guess I'm saying that I have very little interest in spending the
> effort to retrain a new Bayes database, and none of my other users are
> capable or clueful enough to do so.

I hardly ever need to train bayes (1000 users, an organization, not an ISP),
I just feed it half a dozen spam messages per week that got through. It is
essential that your other rules are good, including dcc, razor, uribls,
sa-update rules, SARE rules, FuzzyOCR and possibly a handful of custom rules.
Also p0f rules help. So other rules apply their collective knowledge to
auto-train bayes, and bayes pays back with its digested knowledge, like
a large flywheel.

  Mark


Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-06 Thread Jo Rhett

> > Okay, so I used to deal with Bayes quite a bit.  I spent a very long
> > time specially training my Bayes database, and it seemed to work.
> >
> > Now, the spammers are putting lots of junk text in their spam and
> > polluting the databases to such an extent that Bayes is much less
> > useful.
> >
> > So I guess I'm saying that I have very little interest in spending
> > the effort to retrain a new Bayes database, and none of my other
> > users are capable or clueful enough to do so.
> >
> > Short of disabling Bayes entirely, is there anything I can do with
> > minimal effort that yields a minimal return?  I'm just not convinced
> > that Bayes databases are going to stack up well against the modern
> > bayes-scatter spam.

On Oct 5, 2006, at 10:32 PM, Gary V wrote:
> If your effort was some time ago, those tokens are long gone by now.
> The rate at which new tokens are added is somewhat the same as the
...
> If you quarantine, consider feeding all the spam in your quarantine to
> Bayes (after you remove any false positives). Sometimes spam like the
...
> deletes everything once it is learned. I admit I don't do it every
> day, but at least two or three times a week. I also have a script that
> deletes anything older than 30 days from both accounts - simply so the
> server won't croak if I should happen to. :)

I really, really don't want to be rude but who are you replying to?   
You apparently didn't read a single word of what I wrote above.   
Really, not trying to be rude -- just can't follow this thread.

NO.  I won't spend another minute of my day training Bayes, and  
nobody else on this server is clueful enough.  Period, end of  
subject.  Done.

Now, how can I prevent Bayes from SUBTRACTING 2.6 from every message  
short of completely disabling it?

-- 
Jo Rhett
Senior Network Engineer
Network Consonance



Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-06 Thread Jo Rhett

On Oct 6, 2006, at 1:26 AM, Mark Martinec wrote:
> I hardly ever need to train bayes (1000 users, an organization, not
> an ISP), I just feed it half a dozen spam messages per week that got
> through. It is essential that your other rules are good, including
> dcc, razor, uribls, sa-update rules, SARE rules, FuzzyOCR and possibly
> a handful of custom rules. Also p0f rules help. So other rules apply
> their collective knowledge to auto-train bayes, and bayes pays back
> with its digested knowledge, like a large flywheel.

Okay, that's good.  Now, how can I avoid having Bayes subtract 2.6  
from every message until it gets a clue?

Also, are there any commands to see what bayes knows about, thinks  
about, etc?

-- 
Jo Rhett
Senior Network Engineer
Network Consonance



Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-06 Thread Peter Olsson

On Fri, 6 Oct 2006 10:44 -0700, Jo Rhett wrote:

> Now, how can I prevent Bayes from SUBTRACTING 2.6 from every message
> short of completely disabling it?

These lines in /usr/local/etc/mail/spamassassin/local.cf
(or whatever path your local.cf is in) and then amavisd reload
should do it I think:

score BAYES_00 0 0 0 0
score BAYES_05 0 0 0 0
score BAYES_20 0 0 0 0
score BAYES_40 0 0 0 0

Those four categories are the ones with negative values as default.
See grep BAYES /usr/local/share/spamassassin/50_scores.cf.
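
To see what zeroing BAYES_00 would do to the message Jo posted, the report can be re-summed with and without the Bayes hit. This is only a minimal Python sketch using the test scores quoted in this thread; the dictionary layout and the helper name `total` are illustrative and are not part of SpamAssassin or amavisd-new.

```python
# Test hits copied from the report quoted in this thread.
hits = {
    "BAYES_00": -2.599,
    "DNS_FROM_RFC_ABUSE": 0.2,
    "DNS_FROM_RFC_POST": 1.708,
    "HTML_30_40": 0.374,
    "HTML_MESSAGE": 0.001,
    "MIME_HTML_ONLY": 0.001,
    "NO_RECEIVED": -0.001,
    "NO_RELAYS": -0.001,
}
REQUIRED = 4.01  # the "required=" value from the same report


def total(hits, zeroed=()):
    """Sum rule scores, treating any rule named in `zeroed` as scoring 0."""
    return round(sum(s for name, s in hits.items() if name not in zeroed), 3)


with_bayes = total(hits)
without_bayes = total(hits, zeroed={"BAYES_00"})
print(with_bayes, without_bayes, without_bayes >= REQUIRED)
```

On these numbers the message moves from -0.317 to 2.282, which is still under the required 4.01. Zeroing the rule removes the penalty, but it would not by itself get this particular message tagged.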

Peter Olsson [EMAIL PROTECTED]


Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-06 Thread Gary V

Jo wrote:

> Also, are there any commands to see what bayes knows about, thinks
> about, etc?

This one can (at least) show number of learned spam and ham:

su vscan -c 'sa-learn --dump magic'

0.000  0 158089  0  non-token data: nspam
0.000  0  19527  0  non-token data: nham
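
If you want the spam-to-ham ratio out of that output, something like the following Python sketch would do. It assumes the column layout of the two lines shown above (count in the third column, label after "non-token data:"); the function name `parse_magic` is made up for this example, and real output contains further lines that this sketch simply ignores.

```python
import re

# The two lines shown above, used as sample input.
SAMPLE = """\
0.000  0 158089  0  non-token data: nspam
0.000  0  19527  0  non-token data: nham
"""


def parse_magic(text):
    """Return {label: count} for single-word 'non-token data' lines."""
    pattern = re.compile(
        r"\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data: (\w+)\s*$"
    )
    counts = {}
    for line in text.splitlines():
        m = pattern.match(line)
        if m:
            counts[m.group(2)] = int(m.group(1))
    return counts


counts = parse_magic(SAMPLE)
ratio = counts["nspam"] / counts["nham"]
print(counts, round(ratio, 1))
```

In this sample the database has seen roughly eight times as many spam messages as ham messages.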

Gary V



Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-06 Thread jrhett

> Jo wrote:
> > I really, really don't want to be rude but who are you replying to?
> > You apparently didn't read a single word of what I wrote above.
> > Really, not trying to be rude -- just can't follow this thread.

On Fri, October 6, 2006 11:01 am, Gary V wrote:
> It is my opinion that only people who feel they are need to say
> stuff like that.

Yes exactly so.  If I had known you better, I would have been rude :-) But
without context and being new on the list, you wouldn't know me well
enough to know I'm joking.  People can take things far too seriously ;-)

> > Now, how can I prevent Bayes from SUBTRACTING 2.6 from every message
> > short of completely disabling it?
>
> adjust scores, here are likely current settings:

This I knew already.  I was questioning if doing so would make bayes
invalid enough that I should simply disable Bayes entirely?  There might
be more and better logic for this...





Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-06 Thread Gary V

jrhett wrote:

> > > Now, how can I prevent Bayes from SUBTRACTING 2.6 from every message
> > > short of completely disabling it?
> >
> > adjust scores, here are likely current settings:
>
> This I knew already.  I was questioning if doing so would make bayes
> invalid enough that I should simply disable Bayes entirely?  There might
> be more and better logic for this...

Adjusting the BAYES_* rule scores does not affect the Bayes database itself. Bayes is not recursive.

If you know all the stuff that has been discussed, then you know
enough about Bayes to realize that it needs good data to make decisions.
A lot of stuff can be auto-learned, but when you find spam like the
one that scored BAYES_00, then you need to manually feed it to Bayes.
You desire to not spend any time doing that (none of us do), but I
don't see how you can correct your out-of-whack (possibly poisoned)
database without putting *some* sort of effort into retraining it
(even if that means dumping it and starting over - and letting it
simply autolearn - which is a valid option for a poisoned database).

It's sad, but we are at war with the spammers and the virus writers
and the adware companies and the hackers and the spyware writers.
Their tactics change on a daily basis, and so must ours. War often
requires effort, sorry.

Mark's suggestions are good ones: make sure everything else is working
as well as it can so it is more likely spam will score over your
autolearn threshold. I also suggested lowering your thresholds. The
only harm that zeroing out your lower scoring Bayes rules will cause
is that your legitimate mail is likely to score higher. Whether it makes
a difference or not depends on your tag2_level, kill_level and, in
general, what typical legitimate mail is currently scoring. You
should probably keep an eye on it if you zero them out.

What version of SpamAssassin are you running? Are you using any SARE
rules?

Gary V



Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-05 Thread Gary V

Jo wrote:

> I keep seeing spam messages with negative scores, like this:
>
> No, score=-0.317 tagged_above=-1.99 required=4.01
> tests=[BAYES_00=-2.599, DNS_FROM_RFC_ABUSE=0.2, DNS_FROM_RFC_POST=1.708,
> HTML_30_40=0.374, HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.001,
> NO_RECEIVED=-0.001, NO_RELAYS=-0.001]
>
> So I'm trying to figure out what amavisd is doing with the sa bayes
> stuff, but all I can find is a few notes that amavisd-new uses a single
> bayes database for all users.  Nothing about how to tweak it ...

This is a SpamAssassin issue, but generally it appears your Bayes
could use some manual training:
http://spamassassin.apache.org/full/3.1.x/dist/doc/sa-learn.html

Run sa-learn as the amavisd-new user, e.g.:
su vscan -c 'sa-learn --spam /path/to/spam'
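
If you end up scripting this, the command above can be assembled safely in Python. This is only a sketch: the helper name `sa_learn_command` is invented here, `vscan` is simply the user name from the example above, and the `ham` variant assumes the matching `sa-learn --ham` option. The block builds the argument list and does not run anything.

```python
import shlex


def sa_learn_command(path, kind="spam", user="vscan"):
    """Build the argv for: su <user> -c 'sa-learn --<kind> <path>'."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    # Quote the path so spaces or shell metacharacters survive the su -c shell.
    inner = "sa-learn --{} {}".format(kind, shlex.quote(path))
    return ["su", user, "-c", inner]


cmd = sa_learn_command("/path/to/spam")
print(cmd)
# To actually run it (as root, on the amavisd-new host):
#   import subprocess; subprocess.run(cmd, check=True)
```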

If the spam is not on your amavisd-new system, see if you can save the
spam as plain text files (that are readable as such) and transfer them
to your amavisd-new system using something like WinSCP.

Having the mail in a format usable by sa-learn is a frequent topic on
the SA list. It's not an issue when the mail store is on the same server
as SpamAssassin, but often difficult when it is not. For example:
http://wiki.apache.org/spamassassin/SiteWideBayesFeedback

Gary V



Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-05 Thread Jo Rhett

Gary V wrote:
> This is a SpamAssassin issue, but generally it appears your Bayes
> could use some manual training:
> http://spamassassin.apache.org/full/3.1.x/dist/doc/sa-learn.html

Okay, so I used to deal with Bayes quite a bit.  I spent a very long 
time specially training my Bayes database, and it seemed to work.

Now, the spammers are putting lots of junk text in their spam and
polluting the databases to such an extent that Bayes is much less useful.

So I guess I'm saying that I have very little interest in spending the 
effort to retrain a new Bayes database, and none of my other users are 
capable or clueful enough to do so.

Short of disabling Bayes entirely, is there anything I can do with 
minimal effort that yields a minimal return?  I'm just not convinced 
that Bayes databases are going to stack up well against the modern 
bayes-scatter spam.

-- 
Jo Rhett
Network/Software Engineer
Net Consonance


Re: [AMaViS-user] BAYES scoring is allowing more spam through?

2006-10-05 Thread Gary V

Jo wrote:

> Gary V wrote:
> > This is a SpamAssassin issue, but generally it appears your Bayes
> > could use some manual training:
> > http://spamassassin.apache.org/full/3.1.x/dist/doc/sa-learn.html
>
> Okay, so I used to deal with Bayes quite a bit.  I spent a very long
> time specially training my Bayes database, and it seemed to work.
>
> Now, the spammers are putting lots of junk text in their spam and
> polluting the databases to such an extent that Bayes is much less useful.
>
> So I guess I'm saying that I have very little interest in spending the
> effort to retrain a new Bayes database, and none of my other users are
> capable or clueful enough to do so.
>
> Short of disabling Bayes entirely, is there anything I can do with
> minimal effort that yields a minimal return?  I'm just not convinced
> that Bayes databases are going to stack up well against the modern
> bayes-scatter spam.

If your effort was some time ago, those tokens are long gone by now.
The rate at which new tokens are added is somewhat the same as the
rate that old tokens are discarded. I assume you are auto-learning.
It looks like if nothing else, you might have to feed stuff like the
spam in question to Bayes. You can adjust the auto-learn thresholds
down, this might help a little, possibly something like:

bayes_auto_learn_threshold_nonspam -0.5
bayes_auto_learn_threshold_spam 9.0
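
As a rough illustration of what those two settings control, the Python sketch below treats autolearn as a plain comparison of a message's score against the two thresholds. This is a deliberately simplified model and not SpamAssassin's code: the real autolearn logic applies further conditions and does not use the final message score as-is. The function name `autolearn_decision` is invented for the example, and the 0.1 in the loop is just an assumed looser threshold for comparison.

```python
def autolearn_decision(score, nonspam_threshold=-0.5, spam_threshold=9.0):
    """Simplified model: learn as ham below the low threshold, as spam
    above the high threshold, and learn nothing in between."""
    if score < nonspam_threshold:
        return "ham"
    if score > spam_threshold:
        return "spam"
    return None


# Scores to try: the message from this thread, plus two made-up values.
for s in (-0.317, 9.5, -1.2):
    print(s, autolearn_decision(s), autolearn_decision(s, nonspam_threshold=0.1))
```

In this model a message scoring -0.317 would be learned as ham with a non-spam threshold of 0.1, but would be left alone with the -0.5 suggested above.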

If you quarantine, consider feeding all the spam in your quarantine to
Bayes (after you remove any false positives). Sometimes spam like the
one in question will score low on Bayes, but high enough on other stuff
to get quarantined. I have a relay server but I use the virtual map to
fork a copy of my personal mail to local Maildir style folders. I also
quarantine locally to a mail box and use Courier IMAP to access these
two accounts. I move spam in my account to a spam folder and check the
quarantine for false positives (which almost never happens). Once this
is handled, I run a script to learn everything - ham and spam. Everything
in both accounts is disposable at this point, so my script also
deletes everything once it is learned. I admit I don't do it every
day, but at least two or three times a week. I also have a script that
deletes anything older than 30 days from both accounts - simply so the
server won't croak if I should happen to. :)

Gary V


