Hi folks,

Further to the issues I posted last week, I have moved my backup server from CentOS with perl v5.10 to Ubuntu with perl v5.12, and the problem of threads getting stuck has cleared up. Since I have reapplied all of my specific configs to Ubuntu, the problem must lie either in a CentOS config or in perl v5.10. I know there was a link to getting perl v5.12 running on CentOS, but the general advice from CentOS is not to change the base perl version, as so much relies on it.

Things seem to be staying up mostly; however, I am having problems with rebuildspamdb. It takes absolutely ages to run and generates a system load of 30-40. It can take up to five minutes just to run su on the machine, and SMTP connections quite often time out. The current run has been going for 52229s (about 14.5 hours). It has gotten through the error folders and the spam folder and seems to be chugging away very slowly through the notspam folder.

I can't see a way to get any extra information out of rebuildspamdb other than turning on general debug mode, which generates a lot of extra output. When I do turn on debug, all I see for Worker_10001 is the following, repeated many times with different percentages:

>2012-02-09 10:26:21 [Worker_10001] <ASSP_WordStem - set_active_languages
>2012-02-09 10:26:21 [Worker_10001] <ASSP_WordStem - cleanup HTML Tags
>2012-02-09 10:26:21 [Worker_10001] <ASSP_WordStem - cleanup exception words
>2012-02-09 10:26:21 [Worker_10001] <ASSP_WordStem language detection
>2012-02-09 10:26:21 [Worker_10001] <language en detected to 35.54 percent
>2012-02-09 10:26:21 [Worker_10001] <language da detected to 8.57 percent
>2012-02-09 10:26:21 [Worker_10001] <language fr detected to 8.35 percent
>2012-02-09 10:26:21 [Worker_10001] <language it detected to 7.81 percent
>2012-02-09 10:26:21 [Worker_10001] <language ro detected to 7.78 percent
>2012-02-09 10:26:21 [Worker_10001] <language sv detected to 6.48 percent
>2012-02-09 10:26:21 [Worker_10001] <language nl detected to 6.22 percent
>2012-02-09 10:26:21 [Worker_10001] <language es detected to 4.97 percent
>2012-02-09 10:26:21 [Worker_10001] <language pt detected to 4.56 percent
>2012-02-09 10:26:21 [Worker_10001] <language de detected to 3.11 percent
>2012-02-09 10:26:21 [Worker_10001] <language fi detected to 2.78 percent
>2012-02-09 10:26:21 [Worker_10001] <language tr detected to 1.97 percent
>2012-02-09 10:26:21 [Worker_10001] <language hu detected to 1.84 percent
>2012-02-09 10:26:21 [Worker_10001] <ASSP_WordStem start word stemming
>2012-02-09 10:26:21 [Worker_10001] <ASSP_WordStem process word stem - with StopWords cleanup
>2012-02-09 10:26:21 [Worker_10001] <ASSP_WordStem finished

I presume this is WordStem being used in the generation of Bayes pairs, but I have no idea how to make it cause less load. CPU usage on the server is actually quite low, so I suspect I/O issues.

If I turn off the WordStem module, rebuildspamdb runs as it used to: higher CPU usage, but a load between one and two, and it gets through the corpus much, much quicker.

So, any ideas on how to stop WordStem being a resource hog?

All the best,

Colin.

_______________________________________________
Assp-test mailing list
Assp-test@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/assp-test
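
A note on the I/O suspicion: on Linux the load average counts processes blocked in uninterruptible (usually disk) wait as well as runnable ones, so a load of 30-40 with low CPU usage fits an I/O bottleneck. One way to check this would be to sample the aggregate "cpu" line of /proc/stat twice while rebuildspamdb is running and compare the share of time spent in iowait. The following is only a minimal Python sketch, assuming a Linux host; the function names are illustrative and have nothing to do with ASSP itself.

```python
import time

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line as a dict."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_shares(before, after):
    """Percentage of elapsed CPU time spent in each state between two samples."""
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def sample_shares(interval=5.0, path="/proc/stat"):
    """Read /proc/stat twice, 'interval' seconds apart, and return the shares."""
    with open(path) as fh:
        before = parse_cpu_line(fh.read())
    time.sleep(interval)
    with open(path) as fh:
        after = parse_cpu_line(fh.read())
    return cpu_shares(before, after)
```

Calling `sample_shares()` during a rebuild and looking at the `"iowait"` entry next to `"user"` and `"system"` would show whether the machine is mostly waiting on disk rather than computing. A large iowait share would support the I/O theory; it would not by itself say which files WordStem is hitting.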