If you insist on file-based Bayes, at least make sure you use "lock_method flock".
Or maybe try the BDB backend; I don't remember if it's faster.
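
For reference, a minimal local.cf sketch of the two options above. The file
location is an assumption (often /etc/mail/spamassassin/local.cf), the BDB
store needs the BerkeleyDB Perl module, and I have not benchmarked either, so
treat this as a starting point rather than a tuned setup:

    # Use flock() locking for the file-based Bayes DB instead of the
    # default NFS-safe locking; only suitable for a local filesystem.
    lock_method flock

    # Alternative (untested here): switch to the BDB storage module.
    # bayes_store_module Mail::SpamAssassin::BayesStore::BDB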

> On 4/15/21 2:45 PM, Christian Völker wrote:
> > Hi,
> > 
> > well, here it is not I/O bound (running on RAID1-SSDs). I am using the
> > "default" file based backend ~/.spamassassin/bayes*.
> > 
> > 40msg/sec is not really fast enough for me. The number of messages to be
> > processed is really huge.
> > 
> > So, asking again: is it possible with the file-based backend to do this
> > stuff in parallel?
> > 
> > Thanks
> > 
> > /Christian
> > 
> > On 15.04.2021 at 14:38, Axb wrote:
> > > Depending on your Bayes backend, your bottleneck will not be the
> > > CPUs but I/O.
> > > Normally there's no need for running multiple sa-learn instances.
> > > 
> > > My sa-learn is learning +40 msgs/sec from a SSD into a Redis DB.
> > > 
> > > On 4/15/21 2:33 PM, Christian Völker wrote:
> > > > Hi all,
> > > > 
> > > > I am going to add some large spam archives for my Bayes database
> > > > with sa-learn.
> > > > 
> > > > I have a machine with six vCPUs and obviously I would like to
> > > > speed up the learning process. I am thinking of running six
> > > > sa-learn processes in parallel. Is there any issue with this
> > > > like locks for the database?
> > > > 
> > > > Or is sa-learn itself multithreaded and I do not need to run it
> > > > in parallel (does not look so)?
> > > > 
> > > > Next, when running the above in parallel (if possible) should I
> > > > use the "--no-sync" and do the syncing afterwards? But again,
> > > > this is then only single-threaded, right?
> > > > 
> > > > Thanks a lot for your input!
> > > > 
> > > > /Christian
> > > > 
> > > > 
> > > 
> > > 
> > 
> 
