We're currently evaluating how to cluster SpamAssassin without ending
up with a different heuristic database on each node (that is, with
shared knowledge). I guess the best way to go is to put the
spamd/.spamassassin directory on a shared file system so that every
machine connected to it can read and write the same files. I haven't
been able to find anyone who has done this or something similar and
published results showing whether it's feasible, or whether it isn't
worth it and it's better to have independent machines with different
databases, each shaped by the email that server processes. Any
views/URLs?

I'm running SA 2.63 on a cluster, but each machine has its own Bayes database. I initially set it up with a shared database (not really thinking about it) and the database got corrupted, so each node now has its own. They all started from the same corpus. Since each node handles a nearly identical message load and nearly identical message content, I figure the databases haven't diverged very much. When I add new spam/ham, I just do it on all nodes.
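
For what it's worth, the "do it on all nodes" step could be scripted along these lines. This is only a rough sketch, not what I actually run: the node names are made up, it assumes passwordless ssh to each node, and it assumes the new messages have already been copied to the same path on every node. The sa-learn options shown (--spam / --ham followed by a file or directory) may need adjusting for your SA version.

```python
import shlex
import subprocess

# Hypothetical node names; replace with the real cluster hosts.
NODES = ["mail1.example.org", "mail2.example.org"]


def build_learn_command(node, kind, path):
    """Return the ssh argv that runs sa-learn on one node."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    # Quote the path so the remote shell treats it as a single argument.
    remote = "sa-learn --%s %s" % (kind, shlex.quote(path))
    return ["ssh", node, remote]


def learn_everywhere(kind, path, nodes=NODES, dry_run=True):
    """Train every node on the same messages; return the commands used."""
    commands = [build_learn_command(node, kind, path) for node in nodes]
    if not dry_run:
        for argv in commands:
            # check=True stops at the first node that fails, so a partial
            # run is noticed instead of silently leaving nodes out of step.
            subprocess.run(argv, check=True)
    return commands
```

Calling learn_everywhere("spam", "/var/tmp/new-spam") only returns the commands it would run; pass dry_run=False to actually execute them.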


That said, I'm looking forward to trying out SA 3.0 and moving to MySQL for the database. That way it can be shared, and I already have plenty of clustered MySQL experience.

--
_______________________________________________________________________

   Rick Beebe                                            (203) 785-6416
   Manager, Systems & Network Engineering           FAX: (203) 785-3481
   ITS-Med Production Systems                    [EMAIL PROTECTED]
   Yale University School of Medicine
   Suite 124, 100 Church Street South           http://its.med.yale.edu
   New Haven, CT 06519
_______________________________________________________________________
