[Bug 7890] Integration with IMAP servers

bugzilla-daemon Mon, 15 Mar 2021 15:38:38 -0700

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7890


--- Comment #8 from [email protected] <[email protected]> ---
(In reply to RW from comment #7)
> (In reply to [email protected] from comment #6)
> > (In reply to RW from comment #5)
> > > (In reply to [email protected] from comment #4)
> 
> > It is still polling -- an inferior method: increasing frequency increases
> > load, but the reaction is still delayed.
> 
> It makes little difference as long as the interval is small compared to
> the typical time the users take to react to misclassifications.

It should still be done without /further/ delay. Also, Thunderbird's own Bayes
is invoked automatically, without any action by the user.

A "small" interval means the check runs too often -- and there is still a delay
of, on average, half the polling interval. This is an inevitable flaw of polling.
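
To put a number on the "half the polling interval" claim, here is a minimal
sketch. It assumes a message lands at a uniformly random moment within one
polling period and then waits for the next poll; the function name and the
sample intervals are made up for illustration.

```python
import random


def mean_polling_delay(interval, events=100_000, seed=0):
    """Estimate the average wait between an arrival and the next poll."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(events):
        # Assumption: the arrival time is uniform within one polling period.
        arrival = rng.uniform(0.0, interval)
        # The message then waits until the period ends and the poll fires.
        total += interval - arrival
    return total / events


if __name__ == "__main__":
    # Illustrative intervals in seconds: one minute, five minutes, one hour.
    for interval in (60, 300, 3600):
        print(interval, round(mean_polling_delay(interval), 1))
```

Shrinking the interval shrinks the average wait proportionally, but only by
running the check proportionally more often.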

> I use ls to determine whether there is anything in a training folder before
> running sa-learn on it. Typically it isn't even accessing the drive as it's
> working on cached metadata.

Human beings cannot distinguish a millisecond from a microsecond. That's not a
good reason to stop caring about things taking 1000 times longer than they need
to take...

> I'm not sure that your idea can be made reliable without keeping an extra
> database or doing a periodic full retrain.

Such a retrain can still be done -- via cron -- but a lot less often. Say, once
a day, or even at reboot.

> spamc can be used to train to spamd if you prefer

Really? Can you elaborate? If spamc can -- without itself loading the Bayesian
functionality -- tell spamd to process yet another file (as either spam or
ham), that will solve a big part of the problem.

One'd still need a daemon, but it can be as simple as inotifyd...
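
If spamc can indeed do this, the per-file step might look roughly like the
sketch below. It assumes spamc's -L (--learntype) option with the values
spam, ham, and forget, and, as far as I can tell, spamd has to be started with
--allow-tell for it to accept such requests. The function names are
hypothetical, and whatever notices the new file (inotifyd or anything else)
is left out.

```python
import subprocess

# Learn types that spamc -L is assumed to accept.
LEARN_TYPES = ("spam", "ham", "forget")


def build_learn_command(learn_type):
    """Return the spamc argument list for one learning request."""
    if learn_type not in LEARN_TYPES:
        raise ValueError("unknown learn type: %r" % (learn_type,))
    return ["spamc", "-L", learn_type]


def learn_file(path, learn_type):
    """Feed one message file to spamd via spamc; return spamc's exit code."""
    with open(path, "rb") as message:
        # spamc reads the message from standard input.
        result = subprocess.run(build_learn_command(learn_type), stdin=message)
    return result.returncode
```

The process that watches the folder would then call
learn_file(new_file, "spam") or learn_file(new_file, "ham"), depending on
which folder the file appeared in, without loading any Bayes code itself.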

> Doing it from the IMAP server has the advantage that you can train as ham
> when mail is moved from the spam folder, and it can distinguish the special
> case of spam being sent to a trash folder.

Yes, that is the situation I'm describing here:
1. sa-learn runs on the same machine as the imap-server.
2. sa-learn trains the same database used by spamd guarding the incoming mail
to the same server.

> I'm not sure you should even be training directly on a Cyrus mailbox, I
> think they contain additional metadata files.

Yes, there are metadata files there, but they do not appear /anew/, unlike
e-mail messages, which show up as new files, one message per file. Very
convenient.
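
A minimal sketch of telling the two apart. It assumes message files are named
with the UID followed by a dot ("1.", "2.", ...), while the metadata files
have names like "cyrus.index"; check that against your own spool before
relying on it. The function name and the in-memory "seen" set are
hypothetical.

```python
import os
import re

# Assumption: a message file name is digits followed by a single dot.
MESSAGE_NAME = re.compile(r"^\d+\.$")


def new_messages(directory, seen):
    """Return message files in directory not reported before, in UID order."""
    fresh = [
        name
        for name in os.listdir(directory)
        if MESSAGE_NAME.match(name) and name not in seen
    ]
    # Sort numerically so that "10." comes after "2.".
    fresh.sort(key=lambda name: int(name[:-1]))
    # Remember what was reported so each message is handed over only once.
    seen.update(fresh)
    return fresh
```

Metadata files never match the pattern, so rewriting them triggers nothing.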

> Training from IMAP would avoid any problems around that.

Teaching the IMAP server to talk to spamd's database is (much) harder than
teaching spamd to monitor a few directories.

