I know this hot potato has been discussed before, but I'm afraid it's
back to haunt me and I can't work it out. I'm again getting different
Bayes results when I test a message on the command line, compared with
when it goes through Exim -> SpamAssassin.
The header of the message received in the Inbox contains the following
report:
Content analysis details:   (10.5 points, 4.2 required)
 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.4 STOX_REPLY_TYPE        No description available.
 3.0 DATE_IN_FUTURE_03_06   Date: is 3 to 6 hours after Received: date
 3.2 BAYES_50               BODY: Bayes spam probability is 40 to 60%
                            [score: 0.5000]
 0.0 MIME_QP_LONG_LINE      RAW: Quoted-printable line longer than 76 chars
 0.0 UNPARSEABLE_RELAY      Informational: message has unparseable relay lines
 1.8 STOX_REPLY_TYPE_WITHOUT_QUOTES No description available.
 2.1 FREEMAIL_FORGED_REPLYTO Freemail in Reply-To, but not From
If I test it on the command line instead (spamc -R < /test_message.eml),
I get very different results:
Content analysis details:   (20.2 points, 4.2 required)
 pts rule name              description
---- ---------------------- --------------------------------------------------
 4.9 BAYES_99               BODY: Bayes spam probability is 99 to 100%
                            [score: 1.0000]
 0.4 STOX_REPLY_TYPE        No description available.
 3.0 DATE_IN_FUTURE_03_06   Date: is 3 to 6 hours after Received: date
 8.0 BAYES_999              BODY: Bayes spam probability is 99.9 to 100%
                            [score: 1.0000]
 0.0 MIME_QP_LONG_LINE      RAW: Quoted-printable line longer than 76 chars
 0.0 UNPARSEABLE_RELAY      Informational: message has unparseable relay lines
 1.8 STOX_REPLY_TYPE_WITHOUT_QUOTES No description available.
 2.1 FREEMAIL_FORGED_REPLYTO Freemail in Reply-To, but not From
On the command line it hits BAYES_99 and BAYES_999, while through Exim
it doesn't (it gets BAYES_50 instead). I know the first thing to look at
is the file permissions on the bayes databases, and I've checked them.
Also, I have spamassassin listening on a TCP port, and both Exim and
spamc connect to it this way (I believe), so permissions shouldn't make
a difference between the two methods of testing the email. Is that
correct?
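As a sanity check that Bayes really is the only thing that differs, here is a minimal Python sketch that compares the rule hits from the two reports above. The rule lines are copied from those reports (wrapped descriptions joined, score lines dropped); the names parse_rules and compare are just illustrative helpers, not part of SpamAssassin.

```python
import re

# Rule lines copied from the two reports: "<points> <RULE_NAME> <description>".
EXIM_REPORT = """\
0.4 STOX_REPLY_TYPE No description available.
3.0 DATE_IN_FUTURE_03_06 Date: is 3 to 6 hours after Received: date
3.2 BAYES_50 BODY: Bayes spam probability is 40 to 60%
0.0 MIME_QP_LONG_LINE RAW: Quoted-printable line longer than 76 chars
0.0 UNPARSEABLE_RELAY Informational: message has unparseable relay lines
1.8 STOX_REPLY_TYPE_WITHOUT_QUOTES No description available.
2.1 FREEMAIL_FORGED_REPLYTO Freemail in Reply-To, but not From
"""

SPAMC_REPORT = """\
4.9 BAYES_99 BODY: Bayes spam probability is 99 to 100%
0.4 STOX_REPLY_TYPE No description available.
3.0 DATE_IN_FUTURE_03_06 Date: is 3 to 6 hours after Received: date
8.0 BAYES_999 BODY: Bayes spam probability is 99.9 to 100%
0.0 MIME_QP_LONG_LINE RAW: Quoted-printable line longer than 76 chars
0.0 UNPARSEABLE_RELAY Informational: message has unparseable relay lines
1.8 STOX_REPLY_TYPE_WITHOUT_QUOTES No description available.
2.1 FREEMAIL_FORGED_REPLYTO Freemail in Reply-To, but not From
"""

# A rule line starts with a score followed by an upper-case rule name.
RULE_LINE = re.compile(r"^\s*(-?\d+\.\d+)\s+([A-Z][A-Z0-9_]*)\b")


def parse_rules(report):
    """Return {rule_name: score} for every rule line in a report."""
    rules = {}
    for line in report.splitlines():
        match = RULE_LINE.match(line)
        if match:
            rules[match.group(2)] = float(match.group(1))
    return rules


def compare(first, second):
    """Return the rules that appear only in first and only in second."""
    only_first = {n: s for n, s in first.items() if n not in second}
    only_second = {n: s for n, s in second.items() if n not in first}
    return only_first, only_second


exim = parse_rules(EXIM_REPORT)
spamc = parse_rules(SPAMC_REPORT)
only_exim, only_spamc = compare(exim, spamc)

total_diff = round(sum(spamc.values()) - sum(exim.values()), 1)
bayes_diff = round(sum(only_spamc.values()) - sum(only_exim.values()), 1)

print("only via Exim: ", only_exim)
print("only via spamc:", only_spamc)
print("total difference:", total_diff, "from differing rules:", bayes_diff)
```

The only rules that differ are the BAYES_* ones, and they account for the whole gap between the totals (4.9 + 8.0 - 3.2 = 9.7 = 20.2 - 10.5). Every other rule fires identically on both paths.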
Also, I use a site-wide bayes database - so only one set of files.
I'm running spamd under the "spamd" user - which owns the bayes database
files and directory:
/usr/bin/spamd -d -l --pidfile=/var/run/spamd/spamd.pid --username=spamd
What could possibly account for the large discrepancy in bayes results?