https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7890

Bill Cole <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]

--- Comment #9 from Bill Cole <[email protected]> ---
(In reply to [email protected] from comment #8)
> (In reply to RW from comment #7)
[...]
> > spamc can be used to train to spamd if you prefer
>
> Really?

Of course. Feeding spamd is what spamc is for. See spamd/PROTOCOL for how it
does that. RTFM for all the details.

> Can you elaborate?

Use "-L ham" and "-L spam" options. It's all there on the fine man page.

> If spamc can -- without itself loading the
> Bayesian functionality --

Look at the code for yourself: spamc doesn't know anything about any Perl
modules.

> tell spamd to process yet another file (as either
> spam or ham), that will solve a big part of the problem.
>
> One'd still need a daemon, but it can be as simple as inotifyd...

Not even really that. I do this using spamc from a shell script that runs from
cron periodically and figures out what to have spamc pass by maintaining a
'last run' flag file and using find's '-newer' directive. I can't share that
code because it was written for hire, but the basic concept is Not Hard. Yes,
you'll get better performance (probably) with inotify/kqueue in a daemon, but
it's not really a heavy task at all to identify new files and feed them to
spamc.

Also: the real reason to avoid re-submitting messages to spamd for training is
not that you'll skew the data but only that you're wasting the Bayes
subsystem's effort in noticing that it has seen the message before.

FWIW, I am mildly negative on adding this functionality into SpamAssassin
itself. It's feature bloat and scope creep. It would invite and take ownership
of a whole new class of integration problems that we don't have the aggregate
attention to provide support for.
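
For illustration only, here is a rough sketch of the 'last run' flag file idea in Python. It is not the script described above, which is not public. The function names (newer_than, learn, train) and any paths are made up for the example; the only real interface it relies on is spamc's "-L ham" / "-L spam" option. It also assumes spamd is running and has been started with learning allowed (--allow-tell, as far as I know).

```python
import os
import subprocess
import time
from pathlib import Path


def newer_than(directory, flag):
    """Return regular files under directory modified after the flag file.

    Roughly what find's -newer test does. If the flag file does not
    exist yet (first run), every file counts as new.
    """
    flag = Path(flag)
    cutoff = flag.stat().st_mtime if flag.exists() else float("-inf")
    return sorted(
        p for p in Path(directory).rglob("*")
        if p.is_file() and p.stat().st_mtime > cutoff
    )


def learn(path, learn_type, spamc="spamc"):
    """Pipe one message into `spamc -L <learn_type>`.

    Returns spamc's exit status unchanged; this sketch does not try to
    interpret it.
    """
    with open(path, "rb") as msg:
        proc = subprocess.run(
            [spamc, "-L", learn_type],
            stdin=msg,
            stdout=subprocess.DEVNULL,
        )
    return proc.returncode


def train(directory, learn_type, flag, learn_fn=learn):
    """Feed every file newer than the flag to learn_fn, then advance the flag.

    learn_fn is injectable (a hypothetical convenience) so the selection
    logic can be exercised without a running spamd.
    """
    if learn_type not in ("ham", "spam"):
        raise ValueError("learn_type must be 'ham' or 'spam'")
    flag = Path(flag)
    # Remember when the scan started, so files that arrive while we are
    # working are picked up by the next run instead of being skipped.
    started = time.time()
    results = [(p, learn_fn(p, learn_type)) for p in newer_than(directory, flag)]
    flag.touch()
    os.utime(flag, (started, started))
    return results
```

A cron job could then call something like train("/path/to/spam-folder", "spam", "/path/to/last-run-spam"), with one flag file per folder kept outside the scanned directory. A message that arrives mid-scan may occasionally be submitted twice; per the point above, that only costs the Bayes subsystem the effort of noticing it has seen the message before. The sketch advances the flag even if spamc reports a failure, which a real script would probably want to handle.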
