Soeren D. Schulze wrote:
> Hello,
>
> I found the following patch:
>
> http://da.andaka.org/Doku/imapspamfilter.html
>
> To describe it briefly, it automatically trains the SPAM filter when the 
> user moves messages to a SPAM or HAM folder.
>
> First, what do you think about this in principle?
>
> I see two design issues:
> 1. The user does not have the chance to use his own preferred settings, 
> as everything is controlled by an environment variable.
>   

What would be the benefit of that? Too much flexibility kills usability!

Here is the setup I use:

- maildrop puts spam in the .Junk/ folder, if this folder exists (so the 
user can opt out by deleting this folder).
- False positives: the user can move a misclassified message to .Junk.Error/.
- False negatives: the user can move confirmed spam, or an undetected spam 
message, to .Junk.Trash/.
- sa-learn is called on the FPs and FNs. The messages are then moved to an 
"invisible" folder (no leading dot) to avoid having them around. They 
may be deleted or kept for use as a corpus.
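
The learning step above can be sketched roughly as follows. This is only 
an illustration, not my actual script: the function name learn_folder, the 
LEARN variable and the archive folder names (learned-ham, learned-spam) are 
made up for the example. Only the sa-learn --spam/--ham options and the 
.Junk.Error/ and .Junk.Trash/ folders come from the real setup.

```shell
#!/bin/sh
# Hypothetical sketch: learn from user-sorted messages, then archive them.
# LEARN defaults to sa-learn; override it (e.g. LEARN=true) for a dry run.
LEARN=${LEARN:-sa-learn}

# learn_folder MODE SRC DEST
#   MODE: "spam" or "ham" (passed to sa-learn as --spam / --ham)
#   SRC:  Maildir folder the user moved messages into
#   DEST: "invisible" archive directory (no leading dot)
learn_folder() {
    mode=$1
    src=$2
    dest=$3
    mkdir -p "$dest" || return 1
    for f in "$src"/cur/* "$src"/new/*; do
        [ -f "$f" ] || continue      # skip globs that matched nothing
        # Archive only when learning succeeded, so failures get retried
        # on the next run.
        if $LEARN "--$mode" "$f" >/dev/null; then
            mv "$f" "$dest"/
        fi
    done
}

# Example calls (assumed paths):
#   learn_folder ham  "$HOME/Maildir/.Junk.Error" "$HOME/Maildir/learned-ham"
#   learn_folder spam "$HOME/Maildir/.Junk.Trash" "$HOME/Maildir/learned-spam"
```

Since a message leaves the source folder once it has been learned, a later 
run never sees it again.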

If the user doesn't move messages, sa-learn is not run. If the user 
complains about errors, he gets a recommendation to move messages.


> 2. The server freezes until the SPAM learner has done its job.
>   

This is bad indeed. The patch could hard link the file and run a 
background process on the new link.
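
A rough sketch of that idea, assuming the learner is sa-learn and that the 
queue directory is on the same filesystem as the Maildir (hard links do not 
cross filesystems). The names learn_in_background, LEARN and the queue 
directory are invented for the example and are not part of the patch:

```shell
#!/bin/sh
# Hypothetical sketch: hard link the message, then learn in the background.
LEARN=${LEARN:-sa-learn}

# learn_in_background MODE FILE QUEUE
#   MODE:  "spam" or "ham"
#   FILE:  message file the user just moved
#   QUEUE: scratch directory on the same filesystem as FILE
learn_in_background() {
    mode=$1
    file=$2
    queue=$3
    mkdir -p "$queue" || return 1
    link="$queue/$(basename "$file")"
    # The hard link keeps the message content reachable even if the user
    # renames, moves or deletes the original while learning is running.
    ln "$file" "$link" || return 1
    # Learn from the link in a background subshell, then drop the link.
    # The caller returns immediately instead of waiting for the learner.
    ( $LEARN "--$mode" "$link" >/dev/null 2>&1; rm -f "$link" ) &
}
```

The caller only pays for an ln and a fork, so the server does not have to 
wait for the learner.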

> Personally, I would solve it by specifying a new column (or more than 
> one) in the user database which includes the SPAM policy.  The learning 
> would be done in the background without the server waiting for the 
> process to finish.
>
> I am ready to do the coding, but as I am quite new to Courier, I would 
> like to hear about your opinion.
>   

I'm not convinced of the value of this compared to a periodic job. After 
all, the user doesn't sort his mail at delivery time, so why hurry?
Regarding the "reprocess" issue, it is enough to move the processed 
messages out of the way. If they must stay in place, then an "ln" may be 
done on each file; if it fails because the link already exists, the 
message has been seen before and is not processed again (ln 
cur/$f domedo/$f && do_process domedo/$f). This still requires reading 
the whole directory, but so what? "Large" folders cause performance 
issues anyway...
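
Spelled out for a periodic job, the ln trick could look like this. It is a 
sketch under assumptions: process_once and LEARN are made-up names, and 
do_process is only a placeholder that calls sa-learn --spam. Stripping the 
":2,flags" suffix is my addition, so that a message is not picked up again 
when the client changes its Maildir flags and thereby renames the file.

```shell
#!/bin/sh
# Hypothetical sketch: process each message in a folder at most once,
# using a hard link in a second directory as the "already done" marker.
LEARN=${LEARN:-sa-learn}

# Placeholder for the real work; replace as needed.
do_process() {
    $LEARN --spam "$1" >/dev/null
}

# process_once FOLDER DONE_DIR
#   FOLDER:   Maildir folder whose cur/ is scanned
#   DONE_DIR: marker directory on the same filesystem
process_once() {
    folder=$1
    done_dir=$2
    mkdir -p "$done_dir" || return 1
    for f in "$folder"/cur/*; do
        [ -f "$f" ] || continue      # skip a glob that matched nothing
        name=$(basename "$f")
        name=${name%:2,*}            # drop the Maildir flag suffix
        # ln refuses to overwrite an existing link, so it succeeds only
        # the first time a given message is seen.
        if ln "$f" "$done_dir/$name" 2>/dev/null; then
            do_process "$done_dir/$name"
        fi
    done
}
```

Every run still lists the whole directory, but messages already linked 
cost one failed ln each.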





_______________________________________________
courier-users mailing list
[email protected]
Unsubscribe: https://lists.sourceforge.net/lists/listinfo/courier-users
