One more thing: make sure the files are owned by the proper account.  In
our case that's filter.users with rw-r----- on the files.  If you copied
them as root, they may have ended up as root.root rw-rw----, in which
case Bayes can't read them.
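
A minimal sketch of that check in Python, in case it helps.  The owner,
group and mode are arguments (filter / users / rw-r----- in our case),
and the helper names here are made up for illustration; this is not part
of SpamAssassin.

```python
import grp
import os
import pwd
import stat


def user_name(uid):
    """Resolve a uid to a name, falling back to the number itself."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def group_name(gid):
    """Resolve a gid to a name, falling back to the number itself."""
    try:
        return grp.getgrgid(gid).gr_name
    except KeyError:
        return str(gid)


def ownership_problems(path, owner, group, mode=0o640):
    """Return a list of mismatches between a file and what is expected.

    0o640 is rw-r-----; a file copied as root might show up as 0o660
    (rw-rw----) owned by root.root instead.
    """
    st = os.stat(path)
    problems = []
    actual_owner = user_name(st.st_uid)
    actual_group = group_name(st.st_gid)
    actual_mode = stat.S_IMODE(st.st_mode)
    if actual_owner != owner:
        problems.append(f"owner is {actual_owner}, expected {owner}")
    if actual_group != group:
        problems.append(f"group is {actual_group}, expected {group}")
    if actual_mode != mode:
        problems.append(f"mode is {actual_mode:o}, expected {mode:o}")
    return problems
```

Run it over each bayes_* file; an empty list means the file matches.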

BTW, we fixed this by putting the chmod statement into the spamd
startup/shutdown script.  It bit me too many times while manually
resetting the database.  I even wrote a wrapper for sa-learn for the
same reason.  Sometimes the journal file doesn't exist, and when I run
sa-learn as root, root creates it, which leaves it r/o to spamd.
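
For what it's worth, a wrapper along those lines could look roughly like
this in Python.  This is only a sketch, not the actual wrapper we run:
the function name, the db_dir argument and the default mode are
assumptions, and changing the owner only works when run as root.

```python
import glob
import os
import shutil
import subprocess


def learn_then_fix(command, db_dir, mode=0o640, owner=None, group=None):
    """Run a learning command, then reset permissions on bayes_* files.

    command -- argument list for the command to run, e.g. an sa-learn
               invocation (whatever you would normally type)
    db_dir  -- directory holding the bayes_* files (assumed location)
    Returns the command's exit status.
    """
    result = subprocess.run(command, check=False)
    # Fix the files even if the command failed; a journal file may have
    # been created before the failure.
    for path in glob.glob(os.path.join(db_dir, "bayes_*")):
        os.chmod(path, mode)
        if owner is not None or group is not None:
            shutil.chown(path, user=owner, group=group)
    return result.returncode
```

The point is just that the permission reset always follows the learn
step, so a freshly created journal file never stays owned by root.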

Gary

-----Original Message-----
From: Steve Dimoff [mailto:[EMAIL PROTECTED] 
Sent: Thursday, July 15, 2004 11:28 AM
To: Gary Smith; Spam Admin; [EMAIL PROTECTED]
Subject: RE: Bayes Bit Me

Hmmm, I tried this.  Then when I run spamassassin on a sample spam email
it says 0 spam in database, not using.  But the other machine works fine.

I restarted spamd.  Anything else I'm missing?

Steve


-----Original Message-----
From: Gary Smith [mailto:[EMAIL PROTECTED] 
Sent: Thursday, July 15, 2004 1:18 PM
To: Spam Admin; [EMAIL PROTECTED]
Subject: RE: Bayes Bit Me

It's common...  The short answer is yes, copy bayes_* over, and restart.
We have a job that does this every so often to our offsite cluster.

Actually, we do this on 6 machines.  We take the bayes db files from one
of our load balanced nodes and copy them to a bunch of different
servers.  We have a single one that we use as a primary so we can
manually feed it daily.
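
The local half of that job, sketched in Python.  It assumes the files
have already been transferred to a staging directory on the target box;
the function name and both directory arguments are hypothetical, and you
still need to restart spamd afterwards with whatever script you normally
use.

```python
import glob
import os
import shutil


def copy_bayes_files(src_dir, dst_dir, mode=0o640, owner=None, group=None):
    """Copy every bayes_* file from src_dir to dst_dir.

    Each copy gets the given mode (0o640 is rw-r-----).  If owner or
    group is given, ownership is changed too, which needs root.
    Returns the list of destination paths.
    """
    copied = []
    for src in sorted(glob.glob(os.path.join(src_dir, "bayes_*"))):
        dst = os.path.join(dst_dir, os.path.basename(src))
        shutil.copy2(src, dst)
        os.chmod(dst, mode)
        if owner is not None or group is not None:
            shutil.chown(dst, user=owner, group=group)
        copied.append(dst)
    return copied
```

Setting the mode and owner as part of the copy avoids ending up with
files that spamd can't read.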

Gary Wayne Smith

-----Original Message-----
From: Spam Admin [mailto:[EMAIL PROTECTED] 
Sent: Thursday, July 15, 2004 10:14 AM
To: [EMAIL PROTECTED]
Subject: Bayes Bit Me

I have a dual SA system, two different servers running identical configs.
As noted in prior posts, my primary MX box takes care of the majority
of the load, but my secondary box still gets hits from spammers trying
to bypass spam filtering (expecting, I suppose, a lower level of
protection. That'll show 'em.)

I've never had to use the secondary box until yesterday afternoon, when
a clumsy co-worker accidentally pulled out the NIC cable on my primary
box. He didn't notice the transgression and for about 30-45 minutes my
secondary box picked up the slack. When I found out about the failure
and fixed it, I looked at the logs of the secondary box to see how well
it worked and noticed a CRAPLOAD of diversions to my quarantine email
account.

As I reviewed the quarantined emails (hundreds of them) the one thing
that stuck out was a BAYES_99 rule slap. Then it hit me: that secondary
box pretty much gets nothing but spam, so its cynical view of the world
is that almost all email is spam. Thus, a lot of "good" email was
slapped with BAYES_99 and quarantined; I got hundreds of false
positives. Once the primary box came back up the problem went away and
everything was back to normal. I turned off Bayes on the secondary box
for now, but I need a longer-term solution.

I know you're going to tell me to feed email to Bayes to train it, but
that's a problem: I'm using the SA boxes as spam-filtering relays to my
internal GroupWise system. I've yet to figure out a way to get the email
back to the box for learning. The other option I'm considering is
copying the Bayes database from the primary to the secondary server, but
I'm not quite sure how to do that. Do I simply copy over the bayes_*
files and restart?

Worst case, I'll leave off the Bayes autolearn on the secondary and
continue relying on blacklists for the time being...

Thanks,

Greg Amy

