Hi SA experts,
We have procmail filters that see emails before SA. They can:
1. whitelist emails directly to our Inbox,
2. send emails directly to the bit bucket (/dev/null),
3. send emails to the Junk folder for review, or
4. leave them for processing by SA.
So SA never sees the emails in categories 1-3.
SA can also send emails to /dev/null or send emails to the Junk
folder for review.
For some time now I've been saving up emails with which to train SA's
Bayesian filters, in 3 categories:
a. spam that was sent to the Junk folder by custom filters and SA,
b. spam that got through, and
c. ham for the same date range (the last 6 months).
So, all 3 categories include emails that SA has already seen and
presumably included in its Bayesian filters, and emails that it has
never seen.
My question is, should I write a program to take out emails that SA
has already seen before I send them through Bayesian processing, or
is it smart enough not to process those again?
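
A rough sketch of what such a pre-filter might look like, in Python. It
assumes that the Message-ID header is a good enough identity for "the same
email", which may not match however SA itself identifies messages. The
names message_key, unseen_messages and seen_ids are hypothetical; seen_ids
would have to be filled from whatever record exists of mail SA has already
processed.

```python
from email.message import Message
from typing import Iterable, Iterator, Set


def message_key(msg: Message) -> str:
    """Return the Message-ID without angle brackets, or "" if it is missing."""
    return (msg.get("Message-ID") or "").strip().strip("<>")


def unseen_messages(messages: Iterable[Message],
                    seen_ids: Set[str]) -> Iterator[Message]:
    """Yield only messages whose Message-ID is not already in seen_ids.

    Each new ID is added to seen_ids, so duplicates within the same run
    are dropped too. Messages without a Message-ID are always passed
    through, since there is nothing to compare them on (an assumption).
    """
    for msg in messages:
        key = message_key(msg)
        if key and key in seen_ids:
            continue  # already seen: skip it
        if key:
            seen_ids.add(key)
        yield msg
```

The messages could come from iterating over a mailbox.mbox object from the
Python standard library, and the survivors would then be written out to a
separate mbox or directory for training.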
Best Regards,
Craig MacKenna