I could be wrong, but I believe that for the learning process to be
useful, you also need to train SA on ham.
(IIRC, an equal amount of each.)
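
One way to check this is to look at how many spam and ham messages the
Bayes database has actually learned. The sketch below is only an
illustration, not something taken from your setup: it parses the output
of `sa-learn --dump magic` and compares the nspam and nham counters
against 200, which is SpamAssassin's default minimum for each
(bayes_min_spam_num and bayes_min_ham_num) before the Bayes rules start
scoring. The function names are made up for this example, and it
assumes sa-learn is on the PATH and is run as the same user whose
database the cron job trains.

```python
import re
import subprocess

# Assumption: the stock defaults for bayes_min_spam_num / bayes_min_ham_num.
# If local.cf overrides them, change this value to match.
MIN_MESSAGES = 200

# Matches lines such as:
#   0.000          0         45          0  non-token data: nspam
MAGIC_LINE = re.compile(
    r"^\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data: (.+?)\s*$"
)


def parse_magic(dump_text):
    """Map each 'non-token data' name (e.g. 'nspam', 'nham') to its integer value."""
    counts = {}
    for line in dump_text.splitlines():
        match = MAGIC_LINE.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


def bayes_is_active(counts, minimum=MIN_MESSAGES):
    """True only when both the spam and the ham counters reach the minimum."""
    return counts.get("nspam", 0) >= minimum and counts.get("nham", 0) >= minimum


def read_magic():
    """Run `sa-learn --dump magic` and return its text output."""
    result = subprocess.run(
        ["sa-learn", "--dump", "magic"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `bayes_is_active(parse_magic(read_magic()))` on the server
should show whether the database has enough of both kinds of mail. If
nham is still at or near zero, that would fit with the Bayes rules never
contributing to the score, however much spam is fed in.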
Evan
NGSS wrote:
Hi,
I am losing confidence in SA: the training process is either very slow
or it doesn’t seem to be learning at all.
I am training SA with around 30-50 manually identified spam messages. I
move them to a spam folder created in SquirrelMail, and a cron job runs
the sa-train command on that folder every hour to learn from them and
then delete them. I tested the script from the shell and it worked
before I put it in cron.
However, I found that the learning process is either not right or it
is rather slow.
I went through the headers of the spam and found that even messages
that are almost identical in content always get a score of 0.1, although
they were received on separate occasions across several days. This has
made me lose confidence in SA.
I wonder whether I have it set up correctly to detect and learn spam. I
am using the default setup from qmail-toaster cnt50. Do I need more
filters to harden my defenses? Any recommendations would be
appreciated.
Here are some samples I took from my mailbox on this server. For
example, sample spams 1 and 8 are almost identical in content, but both
scored only 0.1 … :(
http://www.keac.com/id3303/spam-egs.txt