On 23 Jun 2015, at 14:58, Michael B Allen wrote:

On Tue, Jun 23, 2015 at 12:48 PM, Bill Cole
<sausers-20150...@billmail.scconsult.com> wrote:
Yes, I want a system-wide bayes db. And I am running spamd and spamc
and I assume that is all working (but of course I have no idea if it
really is).

But I want users to be able to put spams that get through into
~/Maildir/.LearnAsSpam and then, every once in a while, I want to run
sa-learn on all of those messages for the system-wide db.

So can that be done without running sa-learn as root?


Of course. As I said in other words that you quoted but apparently
misunderstood:

***** sa-learn IS NOT THE RIGHT TOOL FOR LEARNING MESSAGES INTO A
SYSTEM-WIDE DB ****

Use 'spamc -L (spam|ham)'. Have users run it if they like, or have it run as the user whose magic maildirs are being learned. It talks to the spamd daemon, running as the spamd user, managing the system-wide Bayes DB. If it isn't run as root, it can't do random violence limited only by your capacity for typos.
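
For a single message, that can be wrapped up roughly like the sketch below. The learn_one function and the SPAMC override variable are made up for illustration. The exit codes (5 for "learned", 6 for "already learned", and only with --no-safe-fallback) are how I read the spamc man page, so check the one for your version. spamd also has to be started with --allow-tell or it will not accept learning requests.

```shell
#!/bin/sh
# SPAMC is a hypothetical override so the sketch can be pointed at a stub.
SPAMC=${SPAMC:-spamc}

# learn_one CLASS FILE: hand one message to spamd for learning.
# learn_one is an illustrative name, not part of SpamAssassin.
learn_one() {
    class=$1
    file=$2
    case $class in
        spam|ham) ;;
        *) echo "usage: learn_one spam|ham FILE" >&2; return 2 ;;
    esac
    # spamc reads the message on stdin; its normal output is not needed here.
    rc=0
    "$SPAMC" --no-safe-fallback -L "$class" < "$file" >/dev/null || rc=$?
    # Assumed exit codes (per the spamc man page): 5 learned, 6 already learned.
    case $rc in
        5) echo "learned: $file" ;;
        6) echo "already learned: $file" ;;
        *) echo "unexpected spamc exit status $rc for $file" >&2; return 1 ;;
    esac
}
```

Called as 'learn_one spam some-message-file', it never touches the Bayes files itself; spamd does all the writing as its own user.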

Well, ever since we stopped using The UNIX® Time-Sharing System back
in '87 generally "users" don't run stuff on their own like this
anymore.

Sure, but if you're using Real Users (i.e. if diverse ownership of Maildirs is an actual system issue) then maybe you populate a crontab for each one as well. Or not. My point is that if you have spamd running as the user spamd, it will only ever operate based on the SpamAssassin configuration for the user spamd, never as if it were root. No matter how you run spamc, it can't make spamd break ownership of the DB files so that spamd can't continue to use them. Because sa-learn running as root is a root process manipulating files itself (not mediated by spamd) you need to be careful about how you invoke it because you MIGHT end up with something like this:

# ls -l ~spamd/.spamassassin/
total 400461
-rw-------  1 spamd  spamd   80642048 Jun 23 19:24 auto-whitelist
-rw-------  1 root   spamd      51264 Jun 23 04:29 bayes_journal
-rw-------  1 spamd  spamd  324435968 Jun 23 19:24 bayes_seen
-rw-------  1 spamd  spamd    5046272 Jun 23 19:24 bayes_toks
-rw-r--r--  1 spamd  spamd       1869 Jul 17  2011 user_prefs

(Sigh.... gotta go spank someone...)
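
A quick way to spot that kind of damage is to ask find for anything in the directory that the expected user does not own. This is only a sketch; foreign_owned is a name invented here.

```shell
#!/bin/sh
# foreign_owned DIR OWNER: print regular files under DIR not owned by OWNER.
# OWNER may be a user name or a numeric uid. Illustrative helper only.
foreign_owned() {
    find "$1" -type f ! -user "$2" -print
}
```

With the listing above, 'foreign_owned ~spamd/.spamassassin spamd' would print the bayes_journal path, which root can then hand back with chown.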

But if spamc -L could consume an entire Maildir without requiring an
awk expert, that would be great.

No awk needed. Assuming the Maildir gets cleaned out so you aren't constantly trying to re-learn an ever-growing pile of old messages:

cd ~/Maildir/.LearnAsSpam/cur
for x in *; do spamc -L spam < "$x" & done

Replace the '&' with a ';' if you find the concurrency a problem.

A bit fancier: run it hourly for rapid learning of fewer messages, still no awk:

for x in $( find /home/*/Maildir/.LearnAsSpam/cur/ -type f -cmin -61 ) ; do spamc -L spam < "$x" & done
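
If you would rather have something that runs one message at a time and copes with spaces in file names, a function along these lines might do. It is a sketch, not a tested production script: learn_recent and the SPAMC override are names invented here, and -cmin is a GNU/BSD find extension, the same one used in the one-liner above.

```shell
#!/bin/sh
# SPAMC is a hypothetical override so the sketch can be pointed at a stub.
SPAMC=${SPAMC:-spamc}

# learn_recent DIR MINUTES: submit as spam every regular file under DIR
# whose status changed within the last MINUTES minutes.
learn_recent() {
    find "$1" -type f -cmin "-$2" -print | {
        n=0
        while IFS= read -r msg; do
            # stdin is redirected per message, so spamc cannot eat the
            # file list coming from find. Its exit status is ignored here.
            "$SPAMC" -L spam < "$msg" >/dev/null || :
            n=$((n + 1))
        done
        echo "$n message(s) submitted"
    }
}
```

From an hourly cron job that could be driven with something like 'for d in /home/*/Maildir/.LearnAsSpam/cur; do learn_recent "$d" 61; done'.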



