On 23 Jun 2015, at 14:58, Michael B Allen wrote:
On Tue, Jun 23, 2015 at 12:48 PM, Bill Cole
<sausers-20150...@billmail.scconsult.com> wrote:
Yes, I want a system-wide bayes db. And I am running spamd and spamc
and I assume that is all working (but of course I have no idea if it
really is).
But I want users to be able to put spams that get through into
~/Maildir/.LearnAsSpam and then, every once in a while, I want to run
sa-learn on all of those messages for the system-wide db.
So can that be done without running sa-learn as root?
Of course. As I said in other words that you quoted but apparently
misunderstood:
***** sa-learn IS NOT THE RIGHT TOOL FOR LEARNING MESSAGES INTO A
SYSTEM-WIDE DB ****
Use 'spamc -L (spam|ham)'. Have users run it if they like, or have it run as
the user whose magic maildirs are being learned. It talks to the spamd
daemon, running as the spamd user, managing the system-wide Bayes DB. If it
isn't run as root, it can't do random violence limited only by your capacity
for typos.
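For a single message, that boils down to something like the sketch below. The learn_message function and the SPAMC override variable are names made up for illustration; only 'spamc -L spam|ham' comes from the advice above. As far as I know, spamd also has to be started with --allow-tell (-l) before it will accept learn requests, so check that first.

```shell
#!/bin/sh
# Sketch: hand one message file to spamd for learning via spamc.
# SPAMC is a hypothetical override hook (useful for dry runs); defaults to spamc.
SPAMC="${SPAMC:-spamc}"

# learn_message CLASS FILE   (CLASS is "spam" or "ham")
learn_message() {
    class=$1
    file=$2
    case "$class" in
        spam|ham) ;;
        *) echo "usage: learn_message spam|ham FILE" >&2; return 2 ;;
    esac
    [ -r "$file" ] || { echo "cannot read: $file" >&2; return 1; }
    # spamc only ships the message to spamd; spamd updates the Bayes DB
    # as its own user, so the caller never touches the DB files.
    "$SPAMC" -L "$class" < "$file"
}
```

Run as any unprivileged user, e.g. learn_message spam path/to/message.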
Well, ever since we stopped using The UNIX® Time-Sharing System back
in '87 generally "users" don't run stuff on their own like this
anymore.
Sure, but if you're using Real Users (i.e. if diverse ownership of
Maildirs is an actual system issue) then maybe you populate a crontab
for each one as well. Or not. My point is that if you have spamd running
as the user spamd, it will only ever operate based on the SpamAssassin
configuration for the user spamd, never as if it were root. No matter
how you run spamc, it can't make spamd break ownership of the DB files
so that spamd can't continue to use them. Because sa-learn running as
root is a root process manipulating files itself (not mediated by spamd),
you need to be careful about how you invoke it, because you MIGHT end up
with something like this:
# ls -l ~spamd/.spamassassin/
total 400461
-rw------- 1 spamd spamd 80642048 Jun 23 19:24 auto-whitelist
-rw------- 1 root spamd 51264 Jun 23 04:29 bayes_journal
-rw------- 1 spamd spamd 324435968 Jun 23 19:24 bayes_seen
-rw------- 1 spamd spamd 5046272 Jun 23 19:24 bayes_toks
-rw-r--r-- 1 spamd spamd 1869 Jul 17 2011 user_prefs
(Sigh.... gotta go spank someone...)
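If you suspect a root-run sa-learn has already done this, a quick check along the following lines will show it. This is only a sketch: list_foreign_files is a made-up helper name, and the directory and user in the usage comment are the ones from the listing above, so adjust them for your own setup.

```shell
#!/bin/sh
# list_foreign_files DIR OWNER
# Prints every regular file under DIR that is NOT owned by OWNER.
# OWNER may be a user name or a numeric uid.
list_foreign_files() {
    find "$1" -type f ! -user "$2" -print
}

# Example, using the directory and user from the listing above:
#   list_foreign_files ~spamd/.spamassassin spamd
# Anything it prints can be handed back to spamd by root:
#   chown spamd <file>
```

No output means spamd owns everything in its own directory.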
But if spamc -L could consume an entire Maildir without requiring an
awk expert, that would be great.
No awk needed. Assuming the Maildir gets cleaned out so you aren't
constantly trying to re-learn an ever-growing pile of old messages:
cd "$HOME/Maildir/.LearnAsSpam/cur"
for x in *; do spamc -L spam < "$x" & done
Replace the '&' with a ';' if you find the concurrency a problem.
A bit fancier: run it hourly for rapid learning of fewer messages, still
no awk:
for x in $( find /home/*/Maildir/.LearnAsSpam/cur/ -type f -cmin -61 ) ; do
spamc -L spam < "$x" & done
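For cron use, the hourly find loop above can be wrapped up a little more defensively. This is a sketch, not a tested production script: learn_recent and the SPAMC override are names invented here, it learns messages one at a time rather than in the background, and it assumes file names contain no newlines (true for normal Maildir names). The -cmin test is the same find extension used above.

```shell
#!/bin/sh
# SPAMC is a hypothetical override hook (useful for dry runs); defaults to spamc.
SPAMC="${SPAMC:-spamc}"

# learn_recent CLASS MINUTES DIR...
# Feeds every file under DIR... whose status changed within the last
# MINUTES minutes to "spamc -L CLASS", one message at a time.
learn_recent() {
    class=$1
    minutes=$2
    shift 2
    # Reading names line by line avoids the word splitting that a
    # "for x in $(find ...)" loop is subject to.
    find "$@" -type f -cmin "-$minutes" -print |
    while IFS= read -r msg; do
        "$SPAMC" -L "$class" < "$msg"
    done
}

# Hourly cron usage, matching the layout in the post:
#   learn_recent spam 61 /home/*/Maildir/.LearnAsSpam/cur/
```

The 61-minute window overlaps slightly with the previous hourly run, so a message may occasionally be submitted twice.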