On 29 Dec 2015, at 20:02, Ian Zimmerman wrote:

On 2015-12-29 19:44 -0500, Bill Cole wrote:

On 29 Dec 2015, at 18:54, Ian Zimmerman wrote:

In fact sa-learn accepts multiple named arguments on the command line, so the alternative I use is to go through the spambox N files at a time
in a shell loop.  (I have N=100 but obviously this depends.)
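
A minimal sketch of such a batching loop, assuming find and xargs support -print0 and -0 (GNU and BSD versions do). The function name learn_batches and the Maildir path in the usage line are hypothetical. The sketch only works when the invoking user can read the files.

```shell
#!/bin/sh
# learn_batches DIR N CMD [ARGS...]
# Runs CMD once per batch of at most N files found under DIR,
# appending the file names as arguments to each invocation.
learn_batches() {
    dir=$1
    n=$2
    shift 2
    # NUL-separated names keep odd Maildir file names intact.
    find "$dir" -type f -print0 | xargs -0 -n "$n" "$@"
}
```

Example call (hypothetical path): learn_batches "$HOME/Maildir/.Spam/cur" 100 sa-learn --spam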

Which successfully ignores the original issue of this thread completely: the user that sa-learn must run as cannot read the files being learned. If you pass unreadable filenames as arguments, sa-learn just whines and fails. Shockingly, that is not the desired result.

Clearly you can do the su magic if needed.

Um, no.

Neither su nor sudo magically changes the permissions or ownership of files. If you pass filenames as arguments, they must be readable by the user actually running sa-learn, which is the *unprivileged* user handling the system-wide BayesDB ("amavis" in the case originating this thread, but "spamd" and "defang" are other common ones). In most reasonably well-secured systems using Maildir message stores, the Maildirs are all owned by individual users or by one user that handles delivery to "virtual users" understood by the MTA and the IMAP or POP server but not by the OS. That is generally NOT the same user running spamd or content filters for a system-wide BayesDB. As a result, relearning has to be done as root, shuttling data from files owned by one user into a process running as another.
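
One way to do that shuttling, as a rough sketch rather than a tested recipe: a root shell opens each message itself and hands it to sa-learn on standard input, so the unprivileged process never has to open the file by name. The function names, the BAYES_USER variable and the path in the usage line are illustrative assumptions; "amavis" is the user from this thread. Depending on the setup, sa-learn may also need options pointing it at the right Bayes database, which are omitted here.

```shell
#!/bin/sh
# Run as root. BAYES_USER is the unprivileged owner of the BayesDB.
BAYES_USER=${BAYES_USER:-amavis}

# Learn one message from stdin as the Bayes user.
learn_one() {
    sudo -u "$BAYES_USER" sa-learn --spam
}

# learn_dir DIR: feed every regular file in DIR to learn_one.
learn_dir() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # The redirection is opened by this (root) shell, so the
        # file's owner and mode do not matter to sa-learn.
        learn_one < "$f" || echo "learning failed: $f" >&2
    done
}
```

Example call (hypothetical path): learn_dir /home/alice/Maildir/.Junk/cur

This starts one sa-learn process per message, so it trades the per-process overhead discussed in this thread for readability of the files.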


The point is that the overhead which you fear is reduced by a factor of N.

And since the sa-learn processes can't read the files they are given as arguments, they run with blinding speed, skipping all that costly parsing and learning stuff...
