Re: Parallelizing Spam Assassin

2009-08-03 Thread Dan Schaefer
This whole time I thought the subject line was Paralyzing Spam Assassin and the original poster was having trouble with SA locking up. Oops. ;-) -- Dan Schaefer Web Developer/Systems Analyst Performance Administration Corp.

Re: Parallelizing Spam Assassin

2009-08-03 Thread jp
I would run a tcpdump on the ethernet interface while doing this, just in case there are network tests happening that you are not aware of. On Thu, Jul 30, 2009 at 11:55:21PM -0700, poifgh wrote: Hi I was measuring how quickly could SA [spam assassin] process spams when several SA

Re: Parallelizing Spam Assassin

2009-08-03 Thread poifgh
I did that - with DNSBL off there are no port 53 communications from SA -- Jason Philbrook wrote: I would run a tcpdump on the ethernet interface while doing this, just in case there are network tests happening that you are not aware of. On Thu, Jul 30, 2009 at 11:55:21PM -0700, poifgh

Re: Parallelizing Spam Assassin

2009-08-01 Thread Linda Walsh
It's an American thing. Things that are normal speech for UK blokes, get Americans all disturbed. Funny, used to be the other way around...but well...times change. Justin Mason wrote: On Fri, Jul 31, 2009 at 09:32, rich...@buzzhost.co.uk <rich...@buzzhost.co.uk> wrote: Imagine what Barracuda

Re: Parallelizing Spam Assassin

2009-08-01 Thread Patrick Ben Koetter
* Linda Walsh <sa-u...@tlinx.org>: It's an American thing. Things that are normal speech for UK blokes, get Americans all disturbed. Sloppy language is sloppy language everywhere! I took offense at the message, too, and I am neither American nor am I from the UK. But what annoys me the most is

Re: Parallelizing Spam Assassin

2009-08-01 Thread Linda Walsh
May I point out, that while you may find the language crude -- it isn't language that would violate FTC standards in that it used any of the 7 or so 'unmentionable words'... People -- these standards of 'crude language' really need to be strongly held 'in check' -- the US is 'supposed' to be

Re: Parallelizing Spam Assassin

2009-08-01 Thread Linda Walsh
Well -- it's not just the cores -- what was the usage of the cores that were being used? were 3 out the 8 'pegged'? Are these 'real' cores, or HT cores? In the Core2 and P4 archs, HT's actually slowed down a good many workloads unless they were tightly constructed to work on the same data in

Re: Parallelizing Spam Assassin

2009-08-01 Thread rich...@buzzhost.co.uk
On Fri, 2009-07-31 at 23:40 -0700, Linda Walsh wrote: It's an American thing. Things that are normal speech for UK blokes, get Americans all disturbed. I'm sure that is mostly it, Linda. They don't seem to 'get' it. Two things I observe in this whole 'barracuda-gate' posting; 1. Being

Re: Parallelizing Spam Assassin

2009-08-01 Thread Henrik K
On Sat, Aug 01, 2009 at 12:04:08AM -0700, Linda Walsh wrote: Well -- it's not just the cores -- what was the usage of the cores that were being used? were 3 out the 8 'pegged'? Are these 'real' cores, or HT cores? In the Core2 and P4 archs, HT's actually slowed down a good many workloads

Re: Parallelizing Spam Assassin

2009-08-01 Thread Per Jessen
Henrik K wrote: On Sat, Aug 01, 2009 at 12:04:08AM -0700, Linda Walsh wrote: Well -- it's not just the cores -- what was the usage of the cores that were being used? were 3 out the 8 'pegged'? Are these 'real' cores, or HT cores? In the Core2 and P4 archs, HT's actually slowed down a

Re: Parallelizing Spam Assassin

2009-08-01 Thread Justin Mason
On Sat, Aug 1, 2009 at 10:04, Henrik K <h...@hege.li> wrote: On Sat, Aug 01, 2009 at 12:04:08AM -0700, Linda Walsh wrote: Well -- it's not just the cores -- what was the usage of the cores that were being used? were 3 out the 8 'pegged'? Are these 'real' cores, or HT cores? In the Core2 and

Re: Parallelizing Spam Assassin

2009-08-01 Thread Henrik K
On Sat, Aug 01, 2009 at 11:46:57AM +0200, Per Jessen wrote: Henrik K wrote: On Sat, Aug 01, 2009 at 12:04:08AM -0700, Linda Walsh wrote: Well -- it's not just the cores -- what was the usage of the cores that were being used? were 3 out the 8 'pegged'? Are these 'real' cores, or

Re: Parallelizing Spam Assassin

2009-08-01 Thread Per Jessen
Henrik K wrote: On Sat, Aug 01, 2009 at 11:46:57AM +0200, Per Jessen wrote: Henrik K wrote: On Sat, Aug 01, 2009 at 12:04:08AM -0700, Linda Walsh wrote: Well -- it's not just the cores -- what was the usage of the cores that were being used? were 3 out the 8 'pegged'? Are these

Re: Parallelizing Spam Assassin

2009-08-01 Thread Karsten Bräckelmann
On Fri, 2009-07-31 at 23:56 -0700, Linda Walsh wrote: May I point out, that while you may find the language crude -- it isn't language that would violate FTC standards in that in used any of the 7 or so 'unmentionable words'... It's not about words on their own -- it's about how they are

Re: Parallelizing Spam Assassin

2009-08-01 Thread Matt Kettler
Um, Linda.. I'm pretty positive Justin is Irish, not American. Linda Walsh wrote: It's an American thing. Things that are normal speech for UK blokes, get Americans all disturbed. Funny, used to be the other way around...but well...times change. Justin Mason wrote: On Fri, Jul 31, 2009

Some benchmarks (Re: Parallelizing Spam Assassin)

2009-08-01 Thread Henrik K
On Sat, Aug 01, 2009 at 01:34:34PM +0300, Henrik K wrote: That reminds me, gotta test how SA runs on a Sun T5240 with 16 core 128 cores.. Well not that impressive for SA, price/speed wise.. T2+ 2x8x1.4Ghz, 144 msgs/sec @ 128 processes AMD X4 4x3Ghz, 43 msgs/sec @ 4 processes Note that this
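The two figures quoted above can be put on a per-process footing with a little arithmetic. This is only a rough Python sketch using the numbers from the post; the dictionary labels and the function name are my own shorthand, not anything from the thread:

```python
# Throughput figures as quoted in the post: (msgs/sec, number of processes).
benchmarks = {
    "Sun T2+ 2x8x1.4GHz": (144, 128),
    "AMD X4 4x3GHz": (43, 4),
}


def per_process_rate(msgs_per_sec, processes):
    """Average messages per second handled by each process."""
    return msgs_per_sec / processes


for name, (rate, procs) in benchmarks.items():
    print(f"{name}: {per_process_rate(rate, procs):.3f} msgs/sec per process")
```

In aggregate the T2+ box is faster, but each of its processes handles far fewer messages per second than a process on the AMD box, which fits the "not that impressive" remark.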

Re: Parallelizing Spam Assassin

2009-07-31 Thread Justin Mason
hi -- turn off Bayes and AWL. On Fri, Jul 31, 2009 at 07:55, poifgh <abhinav.pat...@gmail.com> wrote: Hi I was measuring how quickly could SA [spam assassin] process spams when several SA processes are run in parallel over separate mbox files. I used a 8 core machine. Below are the numbers

Re: Parallelizing Spam Assassin

2009-07-31 Thread Christian Recktenwald
On Thu, Jul 30, 2009 at 11:55:21PM -0700, poifgh wrote: Why am I not seeing a linear increase in the throughput? Is a file locking creating the bottleneck? Maybe the auto white list. --

Re: Parallelizing Spam Assassin

2009-07-31 Thread rich...@buzzhost.co.uk
On Thu, 2009-07-30 at 23:55 -0700, poifgh wrote: Hi I was measuring how quickly could SA [spam assassin] process spams when several SA processes are run in parallel over separate mbox files. I used a 8 core machine. Below are the numbers when I forked different number of processes. Fork

Re: Parallelizing Spam Assassin

2009-07-31 Thread Justin Mason
On Fri, Jul 31, 2009 at 09:32, rich...@buzzhost.co.uk <rich...@buzzhost.co.uk> wrote: Imagine what Barracuda Networks could do with that if they did not fill their gay little boxes with hardware rubbish from the floors of MSI and supermicro. Jesus, try and process that many messages with a $30,000

Re: Parallelizing Spam Assassin

2009-07-31 Thread Henrik K
On Fri, Jul 31, 2009 at 09:32:42AM +0100, rich...@buzzhost.co.uk wrote: On Thu, 2009-07-30 at 23:55 -0700, poifgh wrote: Hi I was measuring how quickly could SA [spam assassin] process spams when several SA processes are run in parallel over separate mbox files. I used a 8 core

Re: Parallelizing Spam Assassin

2009-07-31 Thread rich...@buzzhost.co.uk
On Fri, 2009-07-31 at 09:53 +0100, Justin Mason wrote: On Fri, Jul 31, 2009 at 09:32, rich...@buzzhost.co.uk <rich...@buzzhost.co.uk> wrote: Imagine what Barracuda Networks could do with that if they did not fill their gay little boxes with hardware rubbish from the floors of MSI and

Re: Parallelizing Spam Assassin

2009-07-31 Thread Bernd Petrovitsch
On Thu, 2009-07-30 at 23:55 -0700, poifgh wrote: [...] I was measuring how quickly could SA [spam assassin] process spams when several SA processes are run in parallel over separate mbox files. I used a 8 core machine. Below are the numbers when I forked different number of processes. Fork

Re: Parallelizing Spam Assassin

2009-07-31 Thread Matt Kettler
rich...@buzzhost.co.uk wrote: On Fri, 2009-07-31 at 09:53 +0100, Justin Mason wrote: On Fri, Jul 31, 2009 at 09:32, rich...@buzzhost.co.uk <rich...@buzzhost.co.uk> wrote: Imagine what Barracuda Networks could do with that if they did not fill their gay little boxes with hardware

Re: Parallelizing Spam Assassin

2009-07-31 Thread rich...@buzzhost.co.uk
On Fri, 2009-07-31 at 07:26 -0400, Matt Kettler wrote: rich...@buzzhost.co.uk wrote: On Fri, 2009-07-31 at 09:53 +0100, Justin Mason wrote: On Fri, Jul 31, 2009 at 09:32, rich...@buzzhost.co.uk <rich...@buzzhost.co.uk> wrote: Imagine what Barracuda Networks could do with that if

Re: Parallelizing Spam Assassin

2009-07-31 Thread John Hardin
On Fri, 31 Jul 2009, rich...@buzzhost.co.uk wrote: ... dropping in here and making jokes at such low hanging fruit. Make all the jokes at Barracuda's expense that you like, complain about them all you like, just avoid offensive language. Vitriol is more impressive if you are creative enough

Re: Parallelizing Spam Assassin

2009-07-31 Thread rich...@buzzhost.co.uk
On Fri, 2009-07-31 at 08:25 -0700, John Hardin wrote: On Fri, 31 Jul 2009, rich...@buzzhost.co.uk wrote: ... dropping in here and making jokes at such low hanging fruit. Make all the jokes at Barracuda's expense that you like, complain about them all you like, just avoid offensive

Re: Parallelizing Spam Assassin

2009-07-31 Thread poifgh
Henrik K wrote: Yeah, given that my 4x3Ghz box masscheck peaks at 22 msgs/sec, without Net/AWL/Bayes. But that's the 3.3 SVN ruleset.. wonder what version was used and any nondefault rules/settings? Certainly sounds strange that 1 core could top out the same. Anyone else have figures?

Re: Parallelizing Spam Assassin

2009-07-31 Thread poifgh
Bernd Petrovitsch wrote: On Thu, 2009-07-30 at 23:55 -0700, poifgh wrote: [...] I ran freshly built SA with Bayes and DNSBL turned off. Why am I not seeing a linear increase in the throughput? Is a file locking creating the Because the bottleneck is not (only) the CPUs? Run `vmstat 1`

Re: Parallelizing Spam Assassin

2009-07-31 Thread poifgh
c. r. wrote: On Thu, Jul 30, 2009 at 11:55:21PM -0700, poifgh wrote: Why am I not seeing a linear increase in the throughput? Is a file locking creating the bottleneck? Maybe the auto white list. -- I can try turning off AWL and get back here.. Thnx -- View this message in

Re: Parallelizing Spam Assassin

2009-07-31 Thread poifgh
Henrik K wrote: Yeah, given that my 4x3Ghz box masscheck peaks at 22 msgs/sec, without Net/AWL/Bayes. But that's the 3.3 SVN ruleset.. wonder what version was used and any nondefault rules/settings? Certainly sounds strange that 1 core could top out the same. Anyone else have figures?

Re: Parallelizing Spam Assassin

2009-07-31 Thread Nigel Frankcom
I'm assuming you run a tad more messages than I, but on a quad with a failover I have never seen the failover kick in 4 years. This is not disputing your observations, just noting mine. I claim absolutely no knowledge about the core processing/stacking though I would assume (perhaps incorrectly)

Re: Parallelizing Spam Assassin

2009-07-31 Thread poifgh
In my tests - there was no MTA. The mails/spam were collected from some server in mbox format and fed to SA using the --mbox switch. The size of the msgs was not altered in any fashion - just the usual size of incoming spam/mails. There is no AV [you mean Anti Virus right?] running on the machine

Re: Parallelizing Spam Assassin

2009-07-31 Thread Nigel Frankcom
OK - I can see what metrics you are trying to ascertain - I think. I'm not sure that your test and real life are 'right'. For obvious reasons I don't want to carry this one on via list - I would suggest you ask Justin and I will be happy to give info on my local setup (this assumes Justin can grab

Re: Parallelizing Spam Assassin

2009-07-31 Thread Paweł Sasin
In my tests - there was no MTA. The mails/spam were collected from some server in mbox format and fed to SA using the --mbox switch. The size of the msgs was not altered in any fashion - just the usual size of incoming spam/mails. If you're interested in testing/tuning spamassassin for heavy loads

Re: Parallelizing Spam Assassin

2009-07-31 Thread Michael Parker
On Jul 31, 2009, at 1:55 AM, poifgh wrote: I ran freshly built SA with Bayes and DNSBL turned off. Why am I not seeing a linear increase in the throughput? Is a file locking creating the bottleneck? If yes, which particular file is being locked? If no, what could be the reason for this?

Re: Parallelizing Spam Assassin

2009-07-31 Thread LuKreme
On Jul 31, 2009, at 2:53 AM, Justin Mason wrote: On Fri, Jul 31, 2009 at 09:32, rich...@buzzhost.co.uk <rich...@buzzhost.co.uk> wrote: Imagine what Barracuda Networks could do with that if they did not fill their gay little boxes with hardware rubbish from the floors of MSI and supermicro.

Re: Parallelizing Spam Assassin

2009-07-31 Thread LuKreme
On Jul 31, 2009, at 9:25 AM, John Hardin wrote: On Fri, 31 Jul 2009, rich...@buzzhost.co.uk wrote: ... dropping in here and making jokes at such low hanging fruit. Make all the jokes at Barracuda's expense that you like, complain about them all you like, just avoid offensive language.

Re: Parallelizing Spam Assassin

2009-07-31 Thread jdow
From: Matt Kettler <mkettler...@verizon.net> Sent: Friday, 2009/July/31 04:26 rich...@buzzhost.co.uk wrote: On Fri, 2009-07-31 at 09:53 +0100, Justin Mason wrote: On Fri, Jul 31, 2009 at 09:32, rich...@buzzhost.co.uk <rich...@buzzhost.co.uk> wrote: ... Richard -- please watch

Re: Parallelizing Spam Assassin

2009-07-31 Thread LuKreme
On Jul 31, 2009, at 1:33 PM, jdow wrote: Given that profanity is the effort of a small mind to express itself I have a feeling he's going to receive his third and final warning any time now, Matt Given that nothing that richard said is not anything I've heard on, say, prime time TV or... a

Re: Parallelizing Spam Assassin

2009-07-31 Thread John Rudd
On Fri, Jul 31, 2009 at 12:37, LuKreme <krem...@kreme.com> wrote: On Jul 31, 2009, at 1:33 PM, jdow wrote: Given that profanity is the effort of a small mind to express itself I have a feeling he's going to receive his third and final warning any time now, Matt Given that nothing that richard

Re: Parallelizing Spam Assassin

2009-07-31 Thread Glenn Sieb
LuKreme said the following on 7/31/09 3:27 PM: Richard -- please watch your language. This is a public mailing list, and offensive language here is inappropriate. I dunno, 'gay' isn't that offensive. Gay is *not* a synonym for stupid. I do take offense to the term being used in that

Re: Parallelizing Spam Assassin

2009-07-31 Thread Matt Kettler
rich...@buzzhost.co.uk wrote: email me off list as I've just been banned for upsetting a sponsor LOL Richard, this has nothing to do with Barracuda. They have no influence over my opinions whatsoever. I don't work for Apache or Barracuda, or any company sponsored by either. Neither Apache

Re: Parallelizing Spam Assassin

2009-07-31 Thread Henrik K
On Fri, Jul 31, 2009 at 10:41:47AM -0700, poifgh wrote: Henrik K wrote: Yeah, given that my 4x3Ghz box masscheck peaks at 22 msgs/sec, without Net/AWL/Bayes. But that's the 3.3 SVN ruleset.. wonder what version was used and any nondefault rules/settings? Certainly sounds strange that

Re: Parallelizing Spam Assassin

2009-07-31 Thread poifgh
I am sorry, I did not provide any statistics of the machine involved. CPU - 8 cores with each core 2327 MHz RAM - 16GB AFAIR it has a 7200RPM disk - 2TB. Yes, people were right in indicating AWL could be the problem. Turning off AWL results in near linear scaling of SA as we increase number of
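"Near linear scaling" can be made concrete as parallel efficiency: the speedup over the single-process rate, divided by the number of processes. This is a minimal Python sketch; the rates in the example are hypothetical placeholders, since the actual measurements are not included in this excerpt, and the function names are my own:

```python
def speedup(base_rate, parallel_rate):
    """How many times faster the parallel run is than the 1-process run."""
    return parallel_rate / base_rate


def efficiency(base_rate, parallel_rate, processes):
    """Speedup per process; 1.0 would be perfectly linear scaling."""
    return speedup(base_rate, parallel_rate) / processes


# Hypothetical numbers for illustration only (not from the thread):
# 1 process at 10 msgs/sec, 8 processes at 76 msgs/sec combined.
print(f"efficiency: {efficiency(10.0, 76.0, 8):.2f}")
```

An efficiency that stays close to 1.0 as processes are added is what near-linear scaling looks like. A value that drops sharply points at a shared bottleneck, such as the AWL locking suspected earlier in the thread.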

Re: Parallelizing Spam Assassin

2009-07-31 Thread poifgh
I haven't tried with sa-compile yet - I can give it a shot -- Henrik K wrote: On Fri, Jul 31, 2009 at 10:41:47AM -0700, poifgh wrote: Henrik K wrote: Yeah, given that my 4x3Ghz box masscheck peaks at 22 msgs/sec, without Net/AWL/Bayes. But that's the 3.3 SVN ruleset.. wonder what

Re: Parallelizing Spam Assassin

2009-07-31 Thread rich...@buzzhost.co.uk
On Fri, 2009-07-31 at 17:37 -0400, Glenn Sieb wrote: LuKreme said the following on 7/31/09 3:27 PM: Richard -- please watch your language. This is a public mailing list, and offensive language here is inappropriate. I dunno, 'gay' isn't that offensive. Gay is *not* a synonym for

Re: Parallelizing Spam Assassin

2009-07-31 Thread jdow
From: LuKreme <krem...@kreme.com> Sent: Friday, 2009/July/31 12:30 On Jul 31, 2009, at 9:25 AM, John Hardin wrote: On Fri, 31 Jul 2009, rich...@buzzhost.co.uk wrote: ... dropping in here and making jokes at such low hanging fruit. Make all the jokes at Barracuda's expense that you like,

Re: Parallelizing Spam Assassin

2009-07-31 Thread jdow
From: LuKreme <krem...@kreme.com> Sent: Friday, 2009/July/31 12:37 On Jul 31, 2009, at 1:33 PM, jdow wrote: Given that profanity is the effort of a small mind to express itself I have a feeling he's going to receive his third and final warning any time now, Matt Given that nothing that

Re: Parallelizing Spam Assassin

2009-07-31 Thread jdow
From: poifgh <abhinav.pat...@gmail.com> Sent: Friday, 2009/July/31 19:47 I am sorry, I did not provide any statistics of the machine involved. CPU - 8 cores with each core 2327 MHz RAM - 16GB AFAIR it has a 7200RPM disk - 2TB. One disk you might consider a striped array to get disk speed.