Also, just to be sure that the problem isn't the drive array, I tried migrating the Hyper-V VM to another host (same config). No difference.
I've compared rebuild times against the production server (an older version of ASSP v2) that we're running. That's on much older hardware with slower disks (PassMark shows them to be about 40% slower than what's in the new machines). Cycling through the 40k files takes around 25 minutes on the old hardware, vs 45 on the new hardware. These are the same messages, copied from old to new.

Could there be something with Perl 5.20 (vs 5.08) or the modules that is causing this slowness?

On Mon, Apr 27, 2015 at 2:31 PM, K Post <nntp.p...@gmail.com> wrote:

> Hi all-
>
> I'm having a rough go getting the rebuild process to quickly rebuild
> spamdb. The HMM db, which I have using BerkeleyDB, rebuilds wonderfully, in
> under a minute. However, spamdb, which uses MySQL, is taking over 45
> minutes. That's no good.
>
> The real question is whether there is a downside to using BerkeleyDB for
> everything.
>
> In reality, I'd like to figure out why my installation is so slow
> with MySQL (and I've got another stalled-out thread going on that). I
> worry about the lack of management tools with BerkeleyDB. I'd be
> uncomfortable with the whitelist being in Berkeley.
>
>
> More info:
>
> ASSP and MySQL are running on the same Windows 2012 Hyper-V virtual
> machine. 16 GB RAM. 4 GB RAM disk for c:/assp/tmpDB (using the imdisk
> driver). The VM seems to be running quickly for all other tasks.
>
> I've got a corpus of around 15k spam, 15k not spam, and 5k errors for each
> of error-spam and error-notspam (so about 40k total). It takes about 45
> minutes to go through all of these messages, and I'm okay with that.
>
> MySQL is using the settings suggested here:
> http://sourceforge.net/p/assp/mailman/message/29893302/ by Thomas, though
> net_buffer_length is limited to 1M according to the documentation.
>
> Apr-27-15 13:23:47 start populating Spamdb with 1,140,905 records - Bayesian check is now disabled!
> Apr-27-15 14:07:09 Finished populating Spamdb with 1,140,905 records - Bayesian check is now enabled!
>
>
> I'd really like to stick with MySQL for spamdb and the other databases,
> but use BerkeleyDB, as recommended, for HMM. I just can't see doing that if
> the rebuild of spamdb will be so slow.
>
> What kind of speeds is everyone else seeing for the spamdb rebuild portion
> of the rebuild?
>
> I'd love some suggestions on speeding up MySQL or anything else. Thank you.
>
> Ken
>

_______________________________________________
Assp-test mailing list
Assp-test@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/assp-test