RE: questions about performance and setup

2000-07-18 Thread Austad, Jay

I did some benchmarking using a standard 7200 RPM disk and a 128MB ramdisk.
The machine was not using any swap, so there was no chance of the ramdisk
accidentally making it to disk.  

In short, performance on it sucked.  The throughput was about 10% less than
IDE, but seeks/sec were 5-10 times more.  However, the CPU was maxed at 100%
during tests to the ramdisk.  

Jay  

-Original Message-
From: Oliver White [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, July 18, 2000 12:27 AM
To: [EMAIL PROTECTED]
Subject: Re: questions about performance and setup


Steve Wolfe wrote:

  With all of the emails I received, I get the impression that I'm going
  to be I/O bound instead of processor or memory bound.  How much disk
  will be sufficient for the queue?  1GB?  More?

   It's not so much a matter of disk size (I don't think you'll have a
 1 gig queue!),

You could quite easily get a 1 Gig queue, even if you don't run into the
obvious problem of temporary loss of network connectivity.  Say you've
got 200,000 subscribers and you generate your messages twice as fast
as qmail can send them, then when you've finished generating the
messages you've still got 100,000 in the queue.  If the messages are
10 KB each, that's 1 GB.

  (I can put 2GB of ram in the box)?  Linux has support for making a disk
  in memory, putting a filesystem on it and mounting it.  Wouldn't this
  take care of I/O problems?

 That's about as good of I/O as you can get, I would imagine. ; )  As
 another author stated, the largest gain would be in writes, but that's
 where the largest expenditure is anyway.   Just make dang, dang sure that
 your machine is NOT going to have any hiccups or lose power while the
 queue is full, or you'll instantly lose it all.

What if you put the 2 GB of RAM in the box, but let Linux use it as a disk
cache?  I'm not sure how the disk caching under Linux works, but if you
create a file and then delete it before it actually gets written to disk,
is there any disk activity required?
Sure, the disks will be thrashing away, trying to keep up, but would the I/O
actually block if there was still room in the disk cache?

 - Oliver.




Re: questions about performance and setup

2000-07-18 Thread markd

 What if you put the 2 Gb RAM in the box, but let Linux use it as a disk cache?
 I'm not sure how the disk caching under Linux works, but if you create a file
 and then delete it before it actually gets written to disk, is there any disk
 activity required?
 Sure, the disks will be thrashing away, trying to keep up, but would the I/O
 actually block if there was still room in the disk cache?

Yes it will block. That's the whole point of the fsync() calls embedded within
qmail. The code wants to be sure that data is on disk before proceeding. The
only caveat is that some file systems may *lie* about the results of their
fsync() and tell the process that the data has been placed on disk when it
still sits in memory. In that sort of scenario you may well gain, especially
if the I/O queue is subsequently sorted by cylinder prior to sending to the
disk.
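
To make the blocking behaviour concrete, here is a minimal sketch of the
write-then-fsync pattern described above.  It is not qmail's code (qmail
is written in C); the helper name durable_write and the file name are
made up for illustration, and whether the data is truly on the platter
after fsync() still depends on the file system and drive being honest.

```python
import os
import tempfile


def durable_write(path, data):
    """Write data to path and block until the OS reports it flushed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:
            # write() may return as soon as the data is in the page cache
            written = os.write(fd, view)
            view = view[written:]
        # fsync() does not return until the kernel says the data is on disk
        os.fsync(fd)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "msg")  # hypothetical queue file name
    durable_write(path, b"queued message\n")
    with open(path, "rb") as f:
        stored = f.read()
```

The point of the sketch is only that the caller cannot proceed past
os.fsync() while the disk is busy, no matter how much free cache there is.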

As others have said, it's the cost of seeking - the amount of data is
often trivial. Thus the concept of zeroseek which is pretty similar to
what a journalling file system is trying to do on a more general level.


Regards.



Re: questions about performance and setup

2000-07-18 Thread Clint Bullock

"Hubbard, David" wrote:

 know what you're getting into on the Dell boxes if you choose
 to run linux.  I've got a Dell PE2400 dual that runs linux
 and you're going to be at the mercy of Dell and Adaptec on
 when you upgrade your kernel because they have some sorry
 proprietary drivers for their RAID controllers that are
 tailored to a specific kernel version and redhat sub-revision.
 If you can put up with that, then Redhat Linux/Qmail on a
 Dell runs very fast, I'm happy with mine.  But at the same

Just for the record, it depends on the RAID controller that you purchase
from Dell.  The PERC 2/DC and 2/SC (Dual Channel and Single Channel) are
just AMI MegaRAID controllers with open source drivers included in the
standard kernels.

The PERC 3/Si (and maybe the PERC 2/Si?) are the Adaptec RAIDPort
controllers with closed-source drivers.  You have to wait for Adaptec/Dell
to release new precompiled modules that can only be used with specific
kernels that Redhat releases.  But, the PERC 3/Si is much cheaper than the
2/DC if you are on a budget, need RAID, and don't care about having the
latest kernel.  You can probably get a better DPT card for around the same
price, though.  (Note:  Adaptec now owns DPT)

I have a Dell 2450 with PERC 2/DC controller and 18GB mirrored disks
running linux with Qmail.  Compiled latest standard kernel with no problem,
and the machine runs like a champ.

Later ;)

--

S. Clint Bullock
Network Administrator
University of Georgia
Office of the Vice President for Research
626 Boyd GSRC
Athens, GA 30602-7411
(706) 542-5936
(706) 542-5638 FAX






Re: questions about performance and setup

2000-07-18 Thread Michael T. Babcock

Nothing wrong with 100% CPU usage.  It just means that the kernel was able to
soak the CPU with work ... which is good.  Maxing out your performance on a RAM
disk at 75% CPU usage means your system has a problem somewhere.

As for performance though, I'd be interested in seeing the actual numbers from
the ramdisk test to check against my 10k RPM disk stats.

"Austad, Jay" wrote:

 I did some benchmarking using a standard 7200 RPM disk and a 128MB ramdisk.
 The machine was not using any swap, so there was no chance of the ramdisk
 accidentally making it to disk.

 In short, performance on it sucked.  The throughput was about 10% less than
 IDE, but seeks/sec were 5-10 times more.  However, the CPU was maxed at 100%
 during tests to the ramdisk.




RE: questions about performance and setup

2000-07-18 Thread Austad, Jay

As for performance though, I'd be interested in seeing the actual numbers
from the ramdisk test to check against my 10k RPM disk stats.

I used bonnie++ to test it.  I'll post the results sometime today, when I
get some time.

Jay
-Original Message-
From: Michael T. Babcock [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, July 18, 2000 10:41 AM
To: Austad, Jay; Qmail Mailing List
Subject: Re: questions about performance and setup


Nothing wrong with 100% CPU usage.  It just means that the kernel was able
to soak the CPU with work ... which is good.  Maxing out your performance
on a RAM disk at 75% CPU usage means your system has a problem somewhere.

As for performance though, I'd be interested in seeing the actual numbers
from the ramdisk test to check against my 10k RPM disk stats.

"Austad, Jay" wrote:

 I did some benchmarking using a standard 7200 RPM disk and a 128MB
 ramdisk.  The machine was not using any swap, so there was no chance of
 the ramdisk accidentally making it to disk.

 In short, performance on it sucked.  The throughput was about 10% less
 than IDE, but seeks/sec were 5-10 times more.  However, the CPU was maxed
 at 100% during tests to the ramdisk.



Re: questions about performance and setup

2000-07-18 Thread Michael T. Babcock

Is UTIME necessary in a mail queue?  If a logging filesystem were mounted on a
separate disk (or network array, etc.) specifically for the mail queue,
shouldn't it be mounted without UTIME?

Bruce Guenter wrote:

 The only way to get truly zero seek performance is to use a
 log-structured file system on a clean disk.  Otherwise, you will seek
 occasionally to write out some dirty metadata.  Even if you pre-allocate
 your log file on a regular filesystem, you will seek occasionally (once
 a second, AFAICT) to update the utime in the inode.




Re: questions about performance and setup

2000-07-18 Thread Bruce Guenter

On Tue, Jul 18, 2000 at 01:25:36PM -0400, Michael T. Babcock wrote:
 Is UTIME necessary in a mail queue?  If a logging filesystem were mounted on a
 separate disk (or network array, etc.) specifically for the mail queue,
 shouldn't it be mounted without UTIME?

You cannot mount without mtime (I misspelt it -- utime is the syscall)
AFAIK.  You can mount without atime (access time).  mtime is changed
every time the file is modified.  ctime is changed every time the inode
is modified (file size change, permissions, etc.)  atime is changed
every time the file is accessed.
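
The three timestamps can be inspected from a script.  The sketch below
assumes a POSIX system (on Windows st_ctime means something else) and
uses a throwaway temporary file; it sets atime and mtime explicitly with
utime() and then reads all three back with stat().

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "msg")  # hypothetical file name
    with open(path, "w") as f:
        f.write("hello\n")

    # utime() sets atime and mtime; the inode change itself bumps ctime
    os.utime(path, (1_000_000_000, 1_000_000_100))

    st = os.stat(path)
    atime, mtime, ctime = st.st_atime, st.st_mtime, st.st_ctime

print("atime (last access):      ", atime)
print("mtime (last modification):", mtime)
print("ctime (last inode change):", ctime)
```

Note that ctime cannot be set by the caller: it records when the inode
was last changed, which here is the moment utime() ran.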
-- 
Bruce Guenter [EMAIL PROTECTED]   http://em.ca/~bruceg/



Re: questions about performance and setup

2000-07-18 Thread markd

On Tue, Jul 18, 2000 at 01:25:36PM -0400, Michael T. Babcock wrote:
 Is UTIME necessary in a mail queue?  If a logging filesystem were mounted on a
 separate disk (or network array, etc.) specifically for the mail queue,
 shouldn't it be mounted without UTIME?

Do you mean atime or mtime? In either case, not all Unixen allow such
mount options. Specifically Solaris only has noatime. I'd be surprised
though if the OS wants to update the directory once a second to get
an atime/mtime on disk for an opened file. Maybe once a minute which
is not an unreasonable cost for zeroseek.

This is probably something that's more appropriately discussed on
the zeroseek list. The bottom line though is that when qmail-queue
exits(0), the email must be physically on disk which means there
must be at least one fsync() - no choice whatsoever.

The zeroseek question is all about how you minimize the number
of fsyncs and how you structure the queue so that the fsync() incurs
a minimal seek on disk. Oh and combine that with appropriate security
access to that queue structure and you're done!


Regards.

 
 Bruce Guenter wrote:
 
  The only way to get truly zero seek performance is to use a
  log-structured file system on a clean disk.  Otherwise, you will seek
  occasionally to write out some dirty metadata.  Even if you pre-allocate
  your log file on a regular filesystem, you will seek occasionally (once
  a second, AFAICT) to update the utime in the inode.
 



Re: questions about performance and setup

2000-07-18 Thread Michael T. Babcock

To be honest, I'm not aware of being able to disable UTIME either, although NOATIME
is an option on Linux as well.  I asked because it occurred to me that this meta data
is not terribly useful to mail servers (as the times necessary are stored in the
data files themselves).  Being able to shut these off may or may not reduce
performance penalties of fsync()'s.  Might be an issue for the ReiserFS or EXT3
people to think about.

[EMAIL PROTECTED] wrote:

 On Tue, Jul 18, 2000 at 01:25:36PM -0400, Michael T. Babcock wrote:
  Is UTIME necessary in a mail queue?  If a logging filesystem were mounted on a
  separate disk (or network array, etc.) specifically for the mail queue,
  shouldn't it be mounted without UTIME?

 Do you mean atime or mtime? In either case, not all Unixen allow such
 mount options. Specifically Solaris only has noatime. I'd be surprised
 though if the OS wants to update the directory once a second to get
 an atime/mtime on disk for an opened file. Maybe once a minute which
 is not an unreasonable cost for zeroseek.




Re: questions about performance and setup

2000-07-18 Thread Michael T. Babcock

Yes, sorry ... utime.

But as I said in the other message ... it would be nice.

Bruce Guenter wrote:

 You cannot mount without mtime (I misspelt it -- utime is the syscall)
 AFAIK.  You can mount without atime (access time).  mtime is changed
 every time the file is modified.  ctime is changed every time the inode
 is modified (file size change, permissions, etc.)  atime is changed
 every time the file is accessed.




Re: questions about performance and setup

2000-07-17 Thread markd

On Mon, Jul 17, 2000 at 03:33:54PM +1000, Oliver White wrote:
 We're in a similar situation at the moment.  However, we want to send out
 100,000 UNIQUE emails per day, expanding to 500,000 or more in the near
 future.  Also, our send window is only actually a couple of hours.

Is that for your TV stuff?

500K queue insertions and delivery (reliably) within 2-3 hours is a lot. I
would not rely on one server, nor one point of internet connection.

One thing that you may want to think about is the amount of
bandwidth you will need. Let's see now, assuming a 10Kbyte message
size (which is pretty close to the current average, especially if it's
HTML)...

500,000 x 10,000 x 8 =
40000000000 bits. In two hours, that makes
20000000000 bits per hour, that makes
333333333 bits per minute, that makes
5555555 bits per second.

Let's put some commas in to make this obvious:

5,555,555

In other words you'll need to pump out 5+ megabits per second, which
means a connection of around double that, say 10 Mbits per second.

Is that what you have available?
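
The same arithmetic as a small script, for anyone who wants to plug in
their own numbers.  The inputs are the figures used above (500,000
messages, 10,000 bytes each, a two hour window); the function name and
the 2x headroom factor as a parameter are just illustrative choices.

```python
def required_bits_per_second(messages, message_bytes, window_hours):
    """Sustained rate needed to push all messages out within the window."""
    total_bits = messages * message_bytes * 8
    return total_bits / (window_hours * 3600)


bps = required_bits_per_second(500_000, 10_000, 2)
headroom = 2  # "a connection of around double that"

print(f"sustained rate:   {bps / 1e6:.2f} Mbit/s")
print(f"with 2x headroom: {headroom * bps / 1e6:.2f} Mbit/s")
```

This ignores SMTP and TCP overhead and retries, so treat it as a lower
bound rather than a sizing guide.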


Looking at the disk I/O, 500K queue insertions and deletions implies
1Million fsynced I/Os (one for insertion, one for delivery) in 2 hours,
which means:

500000 fsynched I/Os per hour, that makes
8333 fsynched I/Os per minute, that makes
138 fsynched I/Os per second, that means that a 7ms access disk
will be flat out. Again doubling it to make a safety margin means
that you're looking at a disk subsystem that will give you an
fsynced I/O rate of 3ms.
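
The disk side of the estimate can be checked the same way.  This sketch
assumes, as above, two fsynced I/Os per message (one on insertion, one on
delivery); the function name is made up, and a real qmail queue may well
issue more syncs per message than that.

```python
def fsync_budget(messages, window_hours, syncs_per_message=2):
    """Return (fsyncs per second, milliseconds available per fsync)."""
    total_syncs = messages * syncs_per_message
    per_second = total_syncs / (window_hours * 3600)
    ms_per_sync = 1000 / per_second
    return per_second, ms_per_sync


per_second, ms_per_sync = fsync_budget(500_000, 2)
print(f"{per_second:.0f} fsyncs/s, {ms_per_sync:.1f} ms available per fsync")
print(f"with a 2x safety margin: {ms_per_sync / 2:.1f} ms per fsync")
```

About 7 ms per fsync is what a single disk of that access time can
manage, which is why the safety margin pushes you to a faster subsystem.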


Regards.


 I'm trying to work out the best settings for the concurrencyremote and
 conf-split parameters.  Our system is a HP Netserver 2000r PIII-667 RAID5
 running Linux.  Are there any problems in setting conf-split to a very large
 value?  Is it necessary on a Linux system, assuming a queue size of, say
 100,000?  Any information appreciated.
 
  - Oliver.
 
 "Austad, Jay" wrote:
 
  Non-unique emails will most likely be generated by other machines and sent
  to the box running mini-qmail via smtp.  Non-unique emails will be a small
  percentage of what gets sent out, for now.
 
  Jay
 
  -Original Message-
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
  Sent: Saturday, July 15, 2000 12:10 AM
  To: '[EMAIL PROTECTED]'
  Subject: Re: questions about performance and setup
 
  On Fri, Jul 14, 2000 at 07:01:46PM -0500, Austad, Jay wrote:
   Then have the script that does the mailing call randomly
   one of the /var/qmail#/bin/qmail-inject. This will emulate round robin
   without any patching.
  
   Won't this way be a performance hit though?  I admit, it is an easy
   solution
 
  No. My experience is that the cost of running a script to inject the mail
  in a way similar to that mentioned above, is pretty small compared to the
  queue injection cost and the delivery cost. sh or perl will be fine.
 
   and would work excellent, but I have to think about efficiency also.  C
   code is much faster than shell or perl, and I'd like to set it up once
   and not have to ever worry about it again, or at least for a long, long
   time.
  
   As I said, we're doing 50 million emails a month right now, but this is
   increasing substantially each month, and as we rollout new subscription
   services, we'll have even more load.  Sending 10 times this amount by the
   same time next year is a good possibility, possibly sooner as we seem to
   underestimate the rate at which we're growing much of the time...
 
  You may also need to look at the scalability of the generation of the
  emails. One system I recently looked at claimed to be able to generate
  nicely unique emails at a targeted database, but it burned CPU like
  it was free - just in generating the content.
 
  Mark.
 
  
   Jay
  
   -Original Message-
   From: JuanE [mailto:[EMAIL PROTECTED]]
   Sent: Friday, July 14, 2000 5:55 PM
   To: '[EMAIL PROTECTED]'
   Subject: Re: questions about performance and setup
  
  
  
   Jay,
  
   That's the beauty of having multiple instances, not having to patch qmail.
   All you need to do is install qmail once per machine (ie, /var/qmail1,
   /var/qmail2,...). Then have the script that does the mailing call randomly
   one of the /var/qmail#/bin/qmail-inject. This will emulate round robin
   without any patching.
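
A minimal sketch of that idea, assuming three installs under the
hypothetical prefixes /var/qmail1 through /var/qmail3.  The function
names are made up; the only thing taken from qmail itself is that
qmail-inject reads the message on standard input.

```python
import random
import subprocess

# Hypothetical install prefixes, one per qmail instance
INSTANCES = ["/var/qmail1", "/var/qmail2", "/var/qmail3"]


def pick_injector(instances=INSTANCES, rng=random):
    """Pick one instance at random and return its qmail-inject path."""
    return f"{rng.choice(instances)}/bin/qmail-inject"


def inject(message_bytes, instances=INSTANCES):
    """Hand one message to a randomly chosen instance via stdin."""
    subprocess.run([pick_injector(instances)], input=message_bytes, check=True)
```

Random choice only approximates round robin over many messages; a
counter modulo the number of instances would give strict rotation.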
  
   JES
  
   Austad, Jay writes:
  
Where would I start in the code to modify the QMQP servers list so that it
would load balance between all of the servers in the list instead of just
using the first one it can contact?  This would be very useful to me.  I
assume qmail-qmqpc.c is one of them, are there others I would need to play
around with?
   
Jay
   
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Friday, July 14, 2000 3:55 PM
To: '[EMAIL PROTECTED]'
    Subject: Re: questions about performance and setup
   
   
On Fri, Jul 14, 2000 at 02:29:06PM -0500, Austad, Jay wrote:
 I already have Mandrake

Re: questions about performance and setup

2000-07-17 Thread Steve Wolfe

 We're in a similar situation at the moment.  However, we want to send out
 100,000 UNIQUE emails per day, expanding to 500,000 or more in the near
 future.  Also, our send window is only actually a couple of hours.

  That shouldn't be too hard.  With a Pentium 233 (not a P-II, a regular
Pentium) attached to a 512k dsl line, using an IDE hard drive, I sent out
1,000 unique emails from a Perl script, the script took about 30 seconds to
run, and all (deliverable) remote messages were delivered in about 45
seconds.  That was with a concurrencyremote of 60.

  So, with equal hardware, 500,000 would take about 6 hours to run.
Considering that you have about 10 times the CPU of the machine I used, and
a much better disk, if you have a large enough pipe, you can turn the
concurrencyremote to 200 (or even more), and it should work out in a couple
of hours.
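
For reference, the straight-line extrapolation behind the "about 6 hours"
figure, using only the numbers above (1,000 messages delivered in about
45 seconds).  It assumes delivery time scales linearly with message
count, which is a rough guess at best; the function name is made up.

```python
def extrapolated_hours(target_messages, sample_messages, sample_seconds):
    """Scale a measured delivery time linearly to a larger batch."""
    seconds = target_messages / sample_messages * sample_seconds
    return seconds / 3600


hours = extrapolated_hours(500_000, 1_000, 45)
print(f"about {hours:.2f} hours on equal hardware")
```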

steve




RE: questions about performance and setup

2000-07-17 Thread Austad, Jay

With all of the emails I received, I get the impression that I'm going to be
I/O bound instead of processor or memory bound.  How much disk will be
sufficient for the queue?  1GB?  More?

I'm just grasping here to figure out the best solution, so bear with me...
What if I only needed a 1GB queue, and what if that queue was a 1GB ramdisk
(I can put 2GB of ram in the box)?  Linux has support for making a disk in
memory, putting a filesystem on it and mounting it.  Wouldn't this take care
of I/O problems?

Jay

 

-Original Message-
From: Oliver White [mailto:[EMAIL PROTECTED]]
Sent: Monday, July 17, 2000 12:34 AM
To: [EMAIL PROTECTED]
Subject: Re: questions about performance and setup


We're in a similar situation at the moment.  However, we want to send out
100,000 UNIQUE emails per day, expanding to 500,000 or more in the near
future.  Also, our send window is only actually a couple of hours.

I'm trying to work out the best settings for the concurrencyremote and
conf-split parameters.  Our system is a HP Netserver 2000r PIII-667 RAID5
running Linux.  Are there any problems in setting conf-split to a very large
value?  Is it necessary on a Linux system, assuming a queue size of, say
100,000?  Any information appreciated.

 - Oliver.

"Austad, Jay" wrote:

 Non-unique emails will most likely be generated by other machines and sent
 to the box running mini-qmail via smtp.  Non-unique emails will be a small
 percentage of what gets sent out, for now.

 Jay

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
 Sent: Saturday, July 15, 2000 12:10 AM
 To: '[EMAIL PROTECTED]'
 Subject: Re: questions about performance and setup

 On Fri, Jul 14, 2000 at 07:01:46PM -0500, Austad, Jay wrote:
  Then have the script that does the mailing call randomly
  one of the /var/qmail#/bin/qmail-inject. This will emulate round robin
  without any patching.
 
  Won't this way be a performance hit though?  I admit, it is an easy
  solution

 No. My experience is that the cost of running a script to inject the mail
 in a way similar to that mentioned above, is pretty small compared to the
 queue injection cost and the delivery cost. sh or perl will be fine.

  and would work excellent, but I have to think about efficiency also.  C
  code is much faster than shell or perl, and I'd like to set it up once
  and not have to ever worry about it again, or at least for a long, long
  time.
 
  As I said, we're doing 50 million emails a month right now, but this is
  increasing substantially each month, and as we rollout new subscription
  services, we'll have even more load.  Sending 10 times this amount by
  the same time next year is a good possibility, possibly sooner as we seem
  to underestimate the rate at which we're growing much of the time...

 You may also need to look at the scalability of the generation of the
 emails. One system I recently looked at claimed to be able to generate
 nicely unique emails at a targeted database, but it burned CPU like
 it was free - just in generating the content.

 Mark.

 
  Jay
 
  -Original Message-
  From: JuanE [mailto:[EMAIL PROTECTED]]
  Sent: Friday, July 14, 2000 5:55 PM
  To: '[EMAIL PROTECTED]'
  Subject: Re: questions about performance and setup
 
 
 
  Jay,
 
  That's the beauty of having multiple instances, not having to patch
  qmail.  All you need to do is install qmail once per machine (ie,
  /var/qmail1, /var/qmail2,...). Then have the script that does the mailing
  call randomly one of the /var/qmail#/bin/qmail-inject. This will emulate
  round robin without any patching.
 
  JES
 
  Austad, Jay writes:
 
   Where would I start in the code to modify the QMQP servers list so that
   it would load balance between all of the servers in the list instead of
   just using the first one it can contact?  This would be very useful to
   me.  I assume qmail-qmqpc.c is one of them, are there others I would
   need to play around with?
  
   Jay
  
   -Original Message-
   From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
   Sent: Friday, July 14, 2000 3:55 PM
   To: '[EMAIL PROTECTED]'
   Subject: Re: questions about performance and setup
  
  
   On Fri, Jul 14, 2000 at 02:29:06PM -0500, Austad, Jay wrote:
I already have Mandrake Linux 7.0 and 7.1 running on multiple Dell boxes
with no trouble, some of them took work to get going, but it runs well.  I
have a few Crystal PC's here also that I may use instead, dual PIII 550's
with 512MB ram and 9 or 18GB 10000rpm drives.  I'll probably use these for
testing.
  
   I agree with the earlier poster that more spindles for your queue
   (c/- raid) is a good thing in general.
  
The bulk of the messages will be the same content to many rcpt's.  However,
once in awhile we'll have 100,000 different messages go out to 100,000
different people.

Since the QMQP support under mini-qmail doesn't load balance, can I feed it
a hostname with mu

Re: questions about performance and setup

2000-07-17 Thread markd

  In other words you'll need to pump out 5+ megabits per second, which
  means a connection of around double that, say 10 Mbits per second.
 
  Is that what you have available?
 
 In theory, yes.  In practice... remains to be seen.
 I did some similar calculations and came up with a similar result, which
 shocked me at first, but it must be possible, because there are
 companies out there that do it!

Right. One I was helping a little while ago had 20+ systems
dedicated to the task and they were co-lo'd with plenty of
connectivity. Once you start getting into large scale you need
to consider multiple systems to at least ensure that you have some
sort of redundancy strategy.


Regards.



Re: questions about performance and setup

2000-07-17 Thread markd

On Mon, Jul 17, 2000 at 10:29:03AM -0500, Austad, Jay wrote:
 With all of the emails I received, I get the impression that I'm going to be
 I/O bound instead of processor or memory bound.  How much disk will be
 sufficient for the queue?  1GB?  More?
 
 I'm just grasping here to figure out the best solution, so bear with me...
 What if I only needed a 1GB queue, and what if that queue was a 1GB ramdisk
 (I can put 2GB of ram in the box)?  Linux has support for making a disk in
 memory, putting a filesystem on it and mounting it.  Wouldn't this take care
 of I/O problems?

The I/O cost is simply there to protect against machine failure, reboots,
power-loss, OS bugs, that sort of thing. Your memory file system will
be ok if it's battery-backed up and running on a system as reliable as
a hard-disk. Otherwise you increase the chances of losing part of your
queue at some point.


Having said that, you may find the trade-off acceptable. That is putting
the queue in a memory file system and accepting a total queue loss once
every now and again. If, e.g., it's advertising email and occasional
losses are tolerable, then this may be a perfectly acceptable
cost/reliability trade-off for you.


Regards.



Re: questions about performance and setup

2000-07-17 Thread John White

On Mon, Jul 17, 2000 at 10:29:03AM -0500, Austad, Jay wrote:
 With all of the emails I received, I get the impression that I'm going to be
 I/O bound instead of processor or memory bound.  

Ahhh... someone who gets it.

 How much disk will be sufficient for the queue?  1GB?  More?

What if you did your entire queue injection with your network
connection down?  I'd budget for a significant portion of that 
plus growth and safety.
 
 I'm just grasping here to figure out the best solution, so bear with me...
 What if I only needed a 1GB queue, and what if that queue was a 1GB ramdisk
 (I can put 2GB of ram in the box)?  Linux has support for making a disk in
 memory, putting a filesystem on it and mounting it.  Wouldn't this take care
 of I/O problems?
 
I'd read up on ramdisks first.  They aren't instant i/o.

Alternatively, Quantum has a line of solid state disks which might do
the trick for you.  Pretty pricey, though.

http://www.zdnet.com/etestinglabs/stories/main/0,8829,2352381,00.html

John



RE: questions about performance and setup

2000-07-17 Thread Jason Murphy


I overlooked that when I posted this message; I totally forgot about the
write penalty. Sorry about that.



-Original Message-
From: John White [mailto:[EMAIL PROTECTED]]
Sent: Friday, July 14, 2000 7:08 PM
To: qmail mailing list
Subject: Re: questions about performance and setup


On Fri, Jul 14, 2000 at 12:21:57PM -0700, Jason Murphy wrote:
  The machine I built contains a DPT SmartRAID V SCSI RAID 0/1/5
 controller with 5 10000RPM 9.1 gig drives. The thing I notice about RAID 5
 in the right configuration is that you can throw tons of IO at it and you
 will see little decrease in performance. Our Database server (Ya, I know,
 it's not a MAIL SERVER) gets tons of IO and it's nothing to it; just eats
 it up and continues on its way.

A massive mail injection, especially if the content is unique to the
user, can overwhelm a disk subsystem.

This is recommending the exact -wrong- kind of disk system.  RAID 5
has a write penalty, as it has to calculate parity for each write,
and write to multiple spindles.
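
A toy illustration of where that penalty comes from, using the usual
textbook XOR-parity description of RAID 5.  Real controllers differ in
detail (caching, full-stripe writes), and the block contents and names
below are invented for the example.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)


# One stripe: three data blocks plus their parity block
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xff\x00"
parity = xor_blocks(d0, d1, d2)

# A small write to d1 costs two reads (old data, old parity)
# and two writes (new data, new parity) on different spindles.
new_d1 = b"\xaa\xbb"
new_parity = xor_blocks(parity, d1, new_d1)

# If the disk holding d1 fails, the block is rebuilt from the survivors
rebuilt_d1 = xor_blocks(new_parity, d0, d2)
```

So every small queue write turns into roughly four disk operations, which
is the cost that a mirrored layout avoids.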

The best type of RAID for small block writes is RAID 10 or RAID 1+0
(not to be confused with RAID 0+1).  Even better is to use a disk
system with write-back cache.  Ideally, you need at least seven
spindles.

I've seen great things with the Infortrend controller.

A great setup would be 1U pc's connected to an external RAID.

John



Re: questions about performance and setup

2000-07-17 Thread Jason Haar

On Mon, Jul 17, 2000 at 09:51:22AM -0700, John White wrote:
  (I can put 2GB of ram in the box)?  Linux has support for making a disk in
  memory, putting a filesystem on it and mounting it.  Wouldn't this take care
  of I/O problems?
  
 I'd read up on ramdisks first.  They aren't instant i/o.

Indeed. I haven't played around with ramdisks for a couple of years now, but
last time I benchmarked them, they didn't appear to run much faster than a
harddisk FOR READS as buffer caches on harddisks made them act very
similarly.  Writes would be a different prospect of course...

-- 
Cheers

Jason Haar

Unix/Network Specialist, Trimble NZ
Phone: +64 3 9635 377 Fax: +64 3 9635 417
   



Re: questions about performance and setup

2000-07-17 Thread Steve Wolfe



 With all of the emails I received, I get the impression that I'm going to be
 I/O bound instead of processor or memory bound.  How much disk will be
 sufficient for the queue?  1GB?  More?

  It's not so much a matter of disk size (I don't think you'll have a 1 gig
queue!), but of throughput.  For example, a single IDE drive will get you a
couple of megabytes of throughput per second, at a very high CPU cost.  SCSI
will yield more, with a lower CPU utilization, and with RAID arrays, you can
move up to hundreds of megabytes per second if you want to.

 I'm just grasping here to figure out the best solution, so bear with me...
 What if I only needed a 1GB queue, and what if that queue was a 1GB ramdisk
 (I can put 2GB of ram in the box)?  Linux has support for making a disk in
 memory, putting a filesystem on it and mounting it.  Wouldn't this take care
 of I/O problems?

   That's about as good of I/O as you can get, I would imagine. ; )  As
another author stated, the largest gain would be in writes, but that's where
the largest expenditure is anyway.   Just make dang, dang sure that your
machine is NOT going to have any hiccups or lose power while the queue is
full, or you'll instantly lose it all.

steve




Re: questions about performance and setup

2000-07-17 Thread Bruce Guenter

On Mon, Jul 17, 2000 at 10:24:53PM -0600, Steve Wolfe wrote:
  With all of the emails I received, I get the impression that I'm going to be
  I/O bound instead of processor or memory bound.  How much disk will be
  sufficient for the queue?  1GB?  More?
   It's not so much a matter of disk size (I don't think you'll have a 1 gig
 queue!), but of throughput.  For example, a single IDE drive will get you a
 couple of megabytes of throughput per second, at a very high CPU cost.  SCSI
 will yield more, with a lower CPU utilization, and with RAID arrays, you can
 move up to hundreds of megabytes per second if you want to.

Not entirely true.  With UDMA mode, modern IDE drives get high
throughput with low CPU utilization.  On my Celeron PC, I could get well
over 10MB/sec at well under 20% CPU, and it's hardly performance
hardware (5400RPM spindle).  With a 10K RPM spindle and a faster chipset
(mine's a VIA) this will rival or beat fast SCSI disks in raw streaming
bandwidth.  However, the majority of mail queues are not even bandwidth
bound -- they're seek bound, which is where SCSI disks still beat IDE.
The faster seek time, the better (which is the motivation behind DJB's
ingenious zeroseek proposal).  Also, RAID5 arrays (the most common one
for large capacities) suffer a significant write penalty due to
recalculation and rewriting of the parity, and the mail queue is mostly
written (and subsequently cached).  A RAID1+0 array works better, but
uses more disks.
-- 
Bruce Guenter [EMAIL PROTECTED]   http://em.ca/~bruceg/



Re: questions about performance and setup

2000-07-17 Thread Oliver White

Steve Wolfe wrote:

  With all of the emails I received, I get the impression that I'm going to be
  I/O bound instead of processor or memory bound.  How much disk will be
  sufficient for the queue?  1GB?  More?

   It's not so much a matter of disk size (I don't think you'll have a 1 gig
 queue!),

You could quite easily get a 1 Gig queue, even if you don't run into the
obvious problem of temporary loss of network connectivity.  Say you've
got 200,000 subscribers and you generate your messages twice as fast
as qmail can send them, then when you've finished generating the
messages you've still got 100,000 in the queue.  If the messages are
10 KB each, that's 1 GB.
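
Spelled out as a quick calculation, under the same assumptions (200,000
messages, generation running twice as fast as delivery, 10 KB taken as
10,000 bytes per message).  The function name and the rate units are
arbitrary; only the ratio between the two rates matters.

```python
def backlog_when_generation_ends(total_messages, generate_rate, send_rate):
    """Messages still queued at the moment the generator finishes."""
    time_to_generate = total_messages / generate_rate
    sent_meanwhile = send_rate * time_to_generate
    return total_messages - sent_meanwhile


backlog = backlog_when_generation_ends(200_000, generate_rate=2, send_rate=1)
queue_bytes = backlog * 10_000

print(f"{backlog:.0f} messages queued, about {queue_bytes / 1e9:.1f} GB")
```

File system overhead per message (inodes, block rounding) comes on top of
this, so the real queue would be somewhat larger.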

  (I can put 2GB of ram in the box)?  Linux has support for making a disk in
  memory, putting a filesystem on it and mounting it.  Wouldn't this take
  care of I/O problems?

That's about as good of I/O as you can get, I would imagine. ; )  As
 another author stated, the largest gain would be in writes, but that's where
 the largest expenditure is anyway.   Just make dang, dang sure that your
 machine is NOT going to have any hiccups or lose power while the queue is
 full, or you'll instantly lose it all.

What if you put the 2 GB of RAM in the box, but let Linux use it as a disk cache?
I'm not sure how the disk caching under Linux works, but if you create a file
and then delete it before it actually gets written to disk, is there any disk
activity required?
Sure, the disks will be thrashing away, trying to keep up, but would the I/O
actually block if there was still room in the disk cache?

 - Oliver.





Re: questions about performance and setup

2000-07-16 Thread Oliver White

We're in a similar situation at the moment.  However, we want to send out
100,000 UNIQUE emails per day, expanding to 500,000 or more in the near
future.  Also, our send window is only actually a couple of hours.

I'm trying to work out the best settings for the concurrencyremote and
conf-split parameters.  Our system is a HP Netserver 2000r PIII-667 RAID5
running Linux.  Are there any problems in setting conf-split to a very large
value?  Is it necessary on a Linux system, assuming a queue size of, say
100,000?  Any information appreciated.
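
For a feel of what conf-split buys, here is a rough sketch in C.  It assumes
the stock conf-split of 23 and that queue files spread roughly evenly over
the split subdirectories; the function name is hypothetical and 401 is just
an example of a larger prime.

```c
/* Average number of queue files that land in each split subdirectory,
 * rounded up.  Assumes an even spread across subdirectories. */
unsigned long files_per_subdir(unsigned long queued, unsigned long split)
{
    if (split == 0)
        return 0;               /* avoid dividing by zero */
    return (queued + split - 1) / split;
}
```

files_per_subdir(100000, 23) gives 4348, while files_per_subdir(100000, 401)
gives 250.  Whether directories of a few thousand entries hurt depends on
the filesystem.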

 - Oliver.

"Austad, Jay" wrote:

 Non-unique emails will most likely be generated by other machines and sent
 to the box running mini-qmail via SMTP.  Non-unique emails will be a small
 percentage of what gets sent out, for now.

 Jay

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
 Sent: Saturday, July 15, 2000 12:10 AM
 To: '[EMAIL PROTECTED]'
 Subject: Re: questions about performance and setup

 On Fri, Jul 14, 2000 at 07:01:46PM -0500, Austad, Jay wrote:
  Then have the script that does the mailing call randomly
  one of the /var/qmail#/bin/qmail-inject. This will emulate round robin
  without any patching.
 
  Won't this way be a performance hit though?  I admit, it is an easy
  solution

 No. My experience is that the cost of running a script to inject the mail
 in a way similar to that mentioned above is pretty small compared to the
 queue injection cost and the delivery cost. sh or perl will be fine.

  and would work excellently, but I have to think about efficiency also.  C
  code is much faster than shell or perl, and I'd like to set it up once and
  not have to ever worry about it again, or at least for a long, long time.
 
  As I said, we're doing 50 million emails a month right now, but this is
  increasing substantially each month, and as we roll out new subscription
  services, we'll have even more load.  Sending 10 times this amount by the
  same time next year is a good possibility, possibly sooner as we seem to
  underestimate the rate at which we're growing much of the time...

 You may also need to look at the scalability of the generation of the
 emails. One system I recently looked at claimed to be able to generate
 nicely unique emails at a targeted database, but it burned CPU like
 it was free - just in generating the content.

 Mark.

 

RE: questions about performance and setup

2000-07-14 Thread Hubbard, David

Hey Jay,
   I don't know much about setting that type of thing up in
qmail, but I would like to give you some ideas on the
hardware.  I'm not sure how much load qmail would generate
in a scenario like that, but you may want to consider
Solaris x86 for the superior SMP performance.  Also, you should
know what you're getting into on the Dell boxes if you choose
to run Linux.  I've got a Dell PE2400 dual that runs Linux,
and you're going to be at the mercy of Dell and Adaptec
when you upgrade your kernel because they have some sorry
proprietary drivers for their RAID controllers that are
tailored to a specific kernel version and redhat sub-revision.
If you can put up with that, then Redhat Linux/Qmail on a
Dell runs very fast, I'm happy with mine.  But at the same
time, I'm sitting on a kernel with a known suid exploit
hoping Dell will release newer drivers soon...  It is much
nicer running Linux on an older Dell server of mine that has
an AMI MegaRaid card with drivers built into the kernel.

Dave

-Original Message-
From: Austad, Jay
To: '[EMAIL PROTECTED]'
Sent: 7/14/00 2:18 PM
Subject: questions about performance and setup

I've been given the task of setting up our own "blaster" for sending
out emails of our financial news and charts to our subscribers.  We
outsource this right now, and it's abysmally expensive.  Basically,
we want 3 boxes (or so) that run in parallel and blast out the emails,
about 50 million per month, but the subscription rate is growing
rapidly each month.  It needs to handle bounced mail by dumping the
addresses into a file for later retrieval so they can be removed
from the database, or by running an external script for each bounced
address.

I'm looking at getting 3 Dell dual PIII 750's, with an 18 or 36GB
10000rpm disk, and 512M or 1G of mem each.  Each will run Linux or
BSD.  

Here's what I need to know:

1.  How well does qmail take advantage of multiple processors?  How
much memory and disk will I need?  (we're at 50 million messages per
month now, and we only send out Monday-Friday, so that's over 2
million messages per day, and it's only going up)

2.  How many messages per day would one estimate that each of these
servers could do?

3. I read about mini-qmail and how it's about 100 times faster blasting
out email to QMQP servers.  Since you can specify multiple QMQP
servers, if I have a fourth machine running mini-qmail and managing
the actual mailing list, can I add the other 3 as QMQP servers and
have it load balance between all 3 for sending out mail?  (this way
I could add more servers easily if I needed to)

4. Can I easily make qmail run an external script for each bounced
mail?

5.  Anything else I should know?
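
The volume in question 1 can be turned into a per-server rate with a few
lines of C.  This is only a sketch: the function names are hypothetical, and
the 22 sending days per month and the 8 hour send window used in the note
below are assumptions, not figures from this message.

```c
/* Messages per sending day, given a monthly total and the number of
 * days in the month on which mail actually goes out. */
unsigned long messages_per_day(unsigned long per_month,
                               unsigned int sending_days)
{
    if (sending_days == 0)
        return 0;
    return per_month / sending_days;
}

/* Sustained messages per second each server must handle if the daily
 * volume is split evenly and sent within `window_seconds`. */
unsigned long per_server_per_second(unsigned long per_day,
                                    unsigned int servers,
                                    unsigned long window_seconds)
{
    if (servers == 0 || window_seconds == 0)
        return 0;
    return per_day / ((unsigned long)servers * window_seconds);
}
```

messages_per_day(50000000, 22) gives 2272727, which matches "over 2
million messages per day".  Spread over 3 servers and an 8 hour window,
per_server_per_second(2272727, 3, 28800) gives 26.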

Thanks.

--
Jay Austad
Network Administrator
CBS Marketwatch
612.817.1271
[EMAIL PROTECTED]
http://cbs.marketwatch.com
http://www.bigcharts.com




RE: questions about performance and setup

2000-07-14 Thread Jason Murphy


 I might as well jump into this since I just built a RAID 5 system for a
database.

 The machine I built contains a DPT SmartRAID V SCSI RAID 0/1/5 controller
with 5 10000RPM 9.1 gig drives. The thing I notice about RAID 5 in the
right configuration is that you can throw tons of IO at it and you will
see little decrease in performance. Our Database server (Ya, I know, it's
not a MAIL SERVER) gets tons of IO and it's nothing to it; it just eats it up
and continues on its way.

I gotta say that you can't go wrong with this controller. It's an I2O
controller and thus supported in FreeBSD and Linux.
As Dave stated, you will get stuck with Dell and their proprietary
drivers, this I would avoid like the plague.




RE: questions about performance and setup

2000-07-14 Thread Austad, Jay

I already have Mandrake Linux 7.0 and 7.1 running on multiple Dell boxes
with no trouble, some of them took work to get going, but it runs well.  I
have a few Crystal PC's here also that I may use instead, dual PIII 550's
with 512MB ram and 9 or 18GB 10000rpm drives.  I'll probably use these for
testing.

The bulk of the messages will be the same content to many rcpt's.  However,
once in a while we'll have 100,000 different messages go out to 100,000
different people.

Since the QMQP support under mini-qmail doesn't load balance, can I feed it
a hostname with multiple dns entries (round-robin dns)?  Or better yet, how
easy would it be to modify the qmail code to just load balance between them?

Jay



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Friday, July 14, 2000 2:09 PM
To: '[EMAIL PROTECTED]'
Subject: Re: questions about performance and setup


 Here's what I need to know:
 
 1.  How well does qmail take advantage of multiple processors?  How much

Indirectly, quite well as it forks many processes, thus if the OS takes
good advantage of your CPUs, then qmail inherits that advantage.

 memory and disk will I need?  (we're at 50 million messages per month now,

Are these messages unique per target address or the same?  If unique, your
requirements are vastly different and very queue/disk intensive. If they
are the same and you take advantage of VERP support on qmail, then
your load will mainly be sending related which will benefit from
more memory, multiple instances, etc.

 and we only send out monday-friday, so that's over 2 million messages per
 day, and it's only going up)
 
 2.  How many messages per day would one estimate that each of these
 servers could do?
 
 3. I read about mini-qmail and how it's about 100 times faster blasting
 out email to QMQP servers.  Since you can specify multiple QMQP servers,
 if I have a fourth machine running mini-qmail and managing the actual
 mailing list, can I add the other 3 as QMQP servers and have it load
 balance between all 3 for sending out mail?  (this way I could add more
 servers easily if I needed to)

The qmqp support doesn't load balance. It simply takes the first one
it can connect to.
 
 4. Can I easily make qmail run an external script for each bounced mail?

Absolutely.

 5.  Anything else I should know?

That all hinges on whether your emails are unique for each recipient or
not. Or more importantly, the average number of recipients per unique
email.


Regards.



Re: questions about performance and setup

2000-07-14 Thread markd

On Fri, Jul 14, 2000 at 02:29:06PM -0500, Austad, Jay wrote:
 I already have Mandrake Linux 7.0 and 7.1 running on multiple Dell boxes
 with no trouble, some of them took work to get going, but it runs well.  I
 have a few Crystal PC's here also that I may use instead, dual PIII 550's
 with 512MB ram and 9 or 18GB 10000rpm drives.  I'll probably use these for
 testing.

I agree with the earlier poster that more spindles for your queue
(c/- raid) is a good thing in general.

 The bulk of the messages will be the same content to many rcpt's.  However,
 once in a while we'll have 100,000 different messages go out to 100,000
 different people.
 
 Since the QMQP support under mini-qmail doesn't load balance, can I feed it
 a hostname with multiple dns entries (round-robin dns)?  Or better yet, how
 easy would it be to modify the qmail code to just load balance between them?

The manpage for qmail-qmqpc tells us that they have to be IP addresses
in qmqpservers so a RR DNS won't help. If all of the messages are generated
on one machine, then I'd be inclined to go for a much simpler solution
than modifying qmail. I'd have an instance of qmail for each outbound
server with the appropriate qmqpservers entry, then have your queue
insertion script do a round-robin itself by simply cycling thru
the qmail-inject command associated with each instance.

for instance in 1 2 3 4 5
do
  getnext_message_details
  /var/qmail${instance}/bin/qmail-inject currentmessage  details
done

Or some such.
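
If the injector ends up in C rather than shell, the instance selection is
only a few lines.  A minimal sketch, assuming the instances live in
/var/qmail1, /var/qmail2, ... as described in this thread; the function
name inject_path and the idea of numbering messages with a counter are
hypothetical.

```c
#include <stdio.h>

/* Build the path of the qmail-inject binary to use for the n-th message,
 * cycling through instances 1..instances (/var/qmail1, /var/qmail2, ...).
 * Returns 0 on success, -1 if instances is 0 or the buffer is too small. */
int inject_path(char *buf, size_t len, unsigned long n,
                unsigned int instances)
{
    int written;

    if (instances == 0)
        return -1;
    written = snprintf(buf, len, "/var/qmail%lu/bin/qmail-inject",
                       n % instances + 1);
    if (written < 0 || (size_t)written >= len)
        return -1;              /* output error or truncated path */
    return 0;
}
```

The caller would keep a message counter, call inject_path() for each
message, and then fork and exec the resulting path with the message on
stdin.  With 3 instances, messages 0, 1, 2, 3 go to /var/qmail1,
/var/qmail2, /var/qmail3, /var/qmail1.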


Alternatively, if you have money to burn, maybe a layer four switch
with load-balancing skills.


Mark.


 



RE: questions about performance and setup

2000-07-14 Thread Austad, Jay

Where would I start in the code to modify the QMQP servers list so that it
would load balance between all of the servers in the list instead of just
using the first one it can contact?  This would be very useful to me.  I
assume qmail-qmqpc.c is one of them, are there others I would need to play
around with?

Jay




Re: questions about performance and setup

2000-07-14 Thread markd

Line 153 of qmail-qmqpc.c is a good place to start. It's a trivial
loop that would benefit from something like adjusting the starting
point by some random value. Eg:


  randj = rand() % servers.len;
  i = 0;
  for (j = randj;j < servers.len;++j)
    if (!servers.s[j]) {
      doit(servers.s + i);
      i = j + 1;
    }

Then repeat the loop from zero to randj - 1

  i = 0;
  for (j = 0;j < randj;++j)
...
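
For anyone wanting to try the idea outside of qmail first, here is a
self-contained sketch in C.  It is not the qmail-qmqpc code: rotated_order
is a hypothetical helper that walks a NUL-separated list (the layout the
loop above steps through) and hands back the entries starting from a chosen
entry index and wrapping around.  It rotates by entry number rather than by
byte offset, since as far as I can tell a random byte offset combined with
i = 0 would still hand the first server to doit() first.

```c
#include <stddef.h>

/* Fill `out` with pointers to the entries of a NUL-separated list,
 * beginning at entry number `start` and wrapping around to entry 0.
 * `s` holds `len` bytes and every entry ends with a '\0'.
 * Returns the number of pointers stored (at most `max`). */
size_t rotated_order(const char *s, size_t len, size_t start,
                     const char **out, size_t max)
{
    size_t i, j, k, n = 0, filled = 0;
    int pass;

    for (j = 0; j < len; ++j)           /* count the entries */
        if (!s[j])
            ++n;
    if (n == 0)
        return 0;
    start %= n;

    /* pass 0 emits entries start..n-1, pass 1 emits entries 0..start-1 */
    for (pass = 0; pass < 2; ++pass) {
        i = 0;
        k = 0;
        for (j = 0; j < len; ++j)
            if (!s[j]) {
                int wanted = (pass == 0) ? (k >= start) : (k < start);
                if (wanted && filled < max)
                    out[filled++] = s + i;
                i = j + 1;              /* next entry starts after the NUL */
                ++k;
            }
    }
    return filled;
}
```

The caller would pick start from something like rand() after seeding it, or
from the process id, and then try each returned entry in order until one
connects.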



Mark.




Re: questions about performance and setup

2000-07-14 Thread JuanE


Jay,

That's the beauty of having multiple instances, not having to patch qmail.
All you need to do is install qmail once per machine (ie, /var/qmail1,
/var/qmail2,...). Then have the script that does the mailing call randomly
one of the /var/qmail#/bin/qmail-inject. This will emulate round robin
without any patching.

JES

 






Re: questions about performance and setup

2000-07-14 Thread markd

Of course there is at least one bug in here, but you get
the idea.


Mark.

On Fri, Jul 14, 2000 at 04:00:43PM -0700, [EMAIL PROTECTED] wrote:
 Line 153 of qmail-qmqpc.c is a good place to start. It's a trivial
 loop that would benefit from something like adjusting the starting
 point by some random value. Eg:
 
 
   randj = rand() % servers.len;
   i = 0;
   for (j = randj;j < servers.len;++j)
     if (!servers.s[j]) {
       doit(servers.s + i);
       i = j + 1;
     }
 
 Then repeat the loop from zero to randj - 1
 
   i = 0;
   for (j = 0;j < randj;++j)
   ...
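
The visiting order that the two loops above are aiming for can be sketched in shell. This is only an illustration of the rotation, not a patch to qmail-qmqpc.c: the function name and the addresses are made up, and it rotates by list entry, whereas the quoted C loop appears to index bytes (it tests servers.s[j] for a zero byte), which may be where the bug mentioned above comes from.

```shell
#!/bin/sh
# Print the entries of a server list in rotated order: start at position
# $1 (0-based), run to the end, then wrap around to cover 0 .. start-1.
# The addresses passed in are placeholders, not real QMQP servers.
rotated_order() {
    start=$1; shift
    n=$#
    [ "$n" -gt 0 ] || return 0
    start=$((start % n))
    i=0
    tail_part=""
    for server in "$@"; do
        if [ "$i" -ge "$start" ]; then
            printf '%s\n' "$server"      # first pass: start .. end
        else
            tail_part="$tail_part$server
"
        fi
        i=$((i + 1))
    done
    printf '%s' "$tail_part"             # second pass: 0 .. start-1
}

# Example: rotated_order 1 10.0.0.1 10.0.0.2 10.0.0.3
# prints 10.0.0.2, 10.0.0.3, 10.0.0.1 (one per line)
```

Picking the start position at random for each message would spread first-connection attempts across all servers while still falling back to the others if one is down.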
 



RE: questions about performance and setup

2000-07-14 Thread Austad, Jay

Then have the script that does the mailing call randomly
one of the /var/qmail#/bin/qmail-inject. This will emulate round robin
without any patching.

Won't this way be a performance hit though?  I admit, it is an easy solution
and would work well, but I have to think about efficiency also.  C code
is much faster than shell or perl, and I'd like to set it up once and not
have to ever worry about it again, or at least for a long, long time.

As I said, we're doing 50 million emails a month right now, but this is
increasing substantially each month, and as we roll out new subscription
services, we'll have even more load.  Sending 10 times this amount by the
same time next year is a good possibility, possibly sooner as we seem to
underestimate the rate at which we're growing much of the time...
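
To put those volumes in per-second terms, here is a rough arithmetic sketch. The 22 sending days per month and the 24-hour sending window are assumptions, not figures from this thread, and the result is an average; peaks will be higher.

```shell
#!/bin/sh
# Average delivery rate needed to clear a monthly volume.
#   $1 = messages per month
#   $2 = sending days per month (22 weekdays is an assumption)
#   $3 = hours per day available for sending (assumption)
msgs_per_second() {
    echo $(( $1 / $2 / ($3 * 3600) ))
}

msgs_per_second 50000000 22 24     # current volume: prints 26
msgs_per_second 500000000 22 24    # ten times that: prints 263
```

Shrinking the sending window (say, to 8 hours) raises the required rate proportionally, which is worth keeping in mind when sizing the queue disks.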

Jay

-Original Message-
From: JuanE [mailto:[EMAIL PROTECTED]]
Sent: Friday, July 14, 2000 5:55 PM
To: '[EMAIL PROTECTED]'
Subject: Re: questions about performance and setup



Jay,

That's the beauty of having multiple instances: not having to patch qmail.
All you need to do is install qmail once per machine (i.e., /var/qmail1,
/var/qmail2,...). Then have the script that does the mailing randomly call
one of the /var/qmail#/bin/qmail-inject binaries. This will emulate round
robin without any patching.
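
A minimal sketch of the random pick, assuming the /var/qmail1, /var/qmail2, ... layout above and a system with /dev/urandom. The function names are made up for illustration.

```shell
#!/bin/sh
# Pick a random instance number between 1 and $1 (inclusive).
# Reads two bytes from /dev/urandom, so the modulo bias is negligible
# for a handful of instances.
random_instance() {
    count=$1
    r=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
    echo $(( r % count + 1 ))
}

# Path of the qmail-inject binary for a given instance, following the
# /var/qmail1, /var/qmail2, ... layout described above.
inject_path() {
    echo "/var/qmail$1/bin/qmail-inject"
}

# Example (not run here): inject one message through a random instance
# out of three.
#   "$(inject_path "$(random_instance 3)")" < message.txt
```

Over a large run the random pick spreads messages roughly evenly, which is all that "emulating round robin" needs here.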

JES

Austad, Jay writes:

 Where would I start in the code to modify the QMQP servers list so that it
 would load balance between all of the servers in the list instead of just
 using the first one it can contact?  This would be very useful to me.  I
 assume qmail-qmqpc.c is one of them, are there others I would need to play
 around with?
 
 Jay
 
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
 Sent: Friday, July 14, 2000 3:55 PM
 To: '[EMAIL PROTECTED]'
 Subject: Re: questions about performance and setup
 
 
 On Fri, Jul 14, 2000 at 02:29:06PM -0500, Austad, Jay wrote:
  I already have Mandrake Linux 7.0 and 7.1 running on multiple Dell boxes
  with no trouble, some of them took work to get going, but it runs well.  I
  have a few Crystal PC's here also that I may use instead, dual PIII 550's
  with 512MB ram and 9 or 18GB 1rpm drives.  I'll probably use these for
  testing.
 
 I agree with the earlier poster that more spindles for your queue
 (c/- raid) is a good thing in general.
 
  The bulk of the messages will be the same content to many rcpt's.  However,
  once in awhile we'll have 100,000 different messages go out to 100,000
  different people.
  
  Since the QMQP support under mini-qmail doesn't load balance, can I feed it
  a hostname with multiple dns entries (round-robin dns)?  Or better yet, how
  easy would it be to modify the qmail code to just load balance between them?
 
 The manpage for qmail-qmqpc tells us that they have to be IP addresses
 in qmqpservers so a RR DNS won't help. If all of the messages are generated
 on one machine, then I'd be inclined to go for a much simpler solution
 than modifying qmail. I'd have an instance of qmail for each outbound
 server with the appropriate qmqpservers entry, then have your queue
 insertion script do a round-robin itself by simply cycling thru
 the qmail-inject command associated with each instance.
 
 for instance in 1 2 3 4 5
 do
   getnext_message_details   # placeholder: fetch the next message to send
   /var/qmail${instance}/bin/qmail-inject currentmessage  details
 done
 
 Or some such.
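
A slightly more concrete sketch of that cycling, under stated assumptions: three instances installed as /var/qmail1 to /var/qmail3, and an ./outgoing directory (hypothetical) holding one complete message per file. The counter must be advanced in the main shell, not inside $(...), or the state is lost.

```shell
#!/bin/sh
# Round-robin over N qmail instances. INSTANCES and the /var/qmailN
# layout are assumptions taken from the example above.
INSTANCES=${INSTANCES:-3}
next=1

# Set $instance to the current slot and advance the counter, wrapping
# back to 1 after the last instance.
pick_instance() {
    instance=$next
    next=$(( next % INSTANCES + 1 ))
}

# Example loop (not run here): each file in ./outgoing is a complete
# message with headers; qmail-inject reads it on standard input.
#   for msg in ./outgoing/*; do
#       pick_instance
#       "/var/qmail${instance}/bin/qmail-inject" < "$msg"
#   done
```

Adding a server then only means installing one more instance and raising INSTANCES.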
 
 
 Alternatively, if you have money to burn, maybe a layer four switch
 with load-balancing skills.
 
 
 Mark.
 
 
  
  Jay
  
  
  

Re: questions about performance and setup

2000-07-14 Thread John White

On Fri, Jul 14, 2000 at 12:21:57PM -0700, Jason Murphy wrote:
  The machine I built contains a DPT SmartRAID V SCSI RAID 0/1/5 controller
 with 5 1RPM 9.1 gig drives. The thing I notice about RAID 5 in the
 right configuration is that you can throw tons of IO at it and you will
 see little decrease in performance. Our Database server (Ya, I know, it's
 not a MAIL SERVER) gets tons of IO and it's nothing to it; just eats it up
 and continues on its way.

A massive mail injection, especially if the content is unique to the
user, can overwhelm a disk subsystem.

This is recommending the exact -wrong- kind of disk system.  RAID 5
has a write penalty, as it has to calculate parity for each write,
and write to multiple spindles.

The best type of RAID for small block writes is RAID 10 or RAID 1+0
(not to be confused with RAID 0+1).  Even better is to use a disk
system with write-back cache.  Ideally, you need at least seven
spindles.
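
To make the write penalty concrete, here is a back-of-envelope sketch. It uses the usual rule of thumb that a small RAID 5 write costs four back-end I/Os (read old data, read old parity, write new data, write new parity) while a RAID 10 write costs two (one per mirror). The 100 IOPS per disk and the 8-disk array are assumed numbers for illustration only, and write-back cache is ignored.

```shell
#!/bin/sh
# Rough small-write IOPS estimate for an array.
#   $1 = number of spindles
#   $2 = random IOPS one disk can sustain (an assumed figure, measure your own)
#   $3 = back-end I/Os per logical write: 4 for RAID 5, 2 for RAID 10
write_iops() {
    echo $(( $1 * $2 / $3 ))
}

write_iops 8 100 4    # RAID 5:  prints 200
write_iops 8 100 2    # RAID 10: prints 400
```

On the same spindles, the estimate gives RAID 10 about twice the small-write rate of RAID 5, which is the case that matters for a busy qmail queue.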

I've seen great things with the Infortrend controller.

A great setup would be 1U pc's connected to an external RAID.
 
John



RE: questions about performance and setup

2000-07-14 Thread Austad, Jay

Non-unique emails will most likely be generated by other machines and sent
to the box running mini-qmail via smtp.  Non-unique emails will be a small
percentage of what gets sent out, for now.

Jay

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Saturday, July 15, 2000 12:10 AM
To: '[EMAIL PROTECTED]'
Subject: Re: questions about performance and setup


On Fri, Jul 14, 2000 at 07:01:46PM -0500, Austad, Jay wrote:
 Then have the script that does the mailing call randomly
 one of the /var/qmail#/bin/qmail-inject. This will emulate round robin
 without any patching.
 
 Won't this way be a performance hit though?  I admit, it is an easy solution

No. My experience is that the cost of running a script to inject the mail
in a way similar to that mentioned above, is pretty small compared to the
queue injection cost and the delivery cost. sh or perl will be fine.


 and would work well, but I have to think about efficiency also.  C code
 is much faster than shell or perl, and I'd like to set it up once and not
 have to ever worry about it again, or at least for a long, long time.
 
 As I said, we're doing 50 million emails a month right now, but this is
 increasing substantially each month, and as we roll out new subscription
 services, we'll have even more load.  Sending 10 times this amount by the
 same time next year is a good possibility, possibly sooner as we seem to
 underestimate the rate at which we're growing much of the time...

You may also need to look at the scalability of the generation of the
emails. One system I recently looked at claimed to be able to generate
nicely unique emails at a targeted database, but it burned CPU like
it was free - just in generating the content.


Mark.


 
 Jay
 

Re: questions about performance and setup

2000-07-14 Thread markd

 Here's what I need to know:
 
 1.  How well does qmail take advantage of multiple processors?  How much

Indirectly, quite well: it forks many processes, so if the OS takes
good advantage of your CPUs, qmail inherits that advantage.

 memory and disk will I need?  (we're at 50 million messages per month now,

Are these messages unique per target address, or the same? If unique, your
requirements are vastly different and very queue/disk intensive. If they
are the same and you take advantage of VERP support in qmail, then
your load will mainly be sending-related, which will benefit from
more memory, multiple instances, etc.

 and we only send out monday-friday, so that's over 2 million messages per
 day, and it's only going up)
 
 2.  How many messages per day would one estimate that each of these servers
 could do?
 
 3. I read about mini-qmail and how it's about 100 times faster blasting out
 email to QMQP servers.  Since you can specify multiple QMQP servers, if I
 have a fourth machine running mini-qmail and managing the actual mailing
 list, can I add the other 3 as QMQP servers and have it load balance between
 all 3 for sending out mail?  (this way I could add more servers easily if I
 needed to)

The qmqp support doesn't load balance. It simply takes the first one
it can connect to.
 
 4. Can I easily make qmail run an external script for each bounced mail?

Absolutely.

 5.  Anything else I should know?

That all hinges on whether your emails are unique for each recipient or
not. Or more importantly, the average number of recipients per unique
email.
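
On question 4, one way this is commonly done is a .qmail file whose line starts with '|', which hands the message to a program on standard input. Combined with VERP, the failing address can be read back out of the address the bounce was sent to. The sketch below assumes a hypothetical "bounces-" prefix and a user=domain encoding; the script path, log file and function name are made up, so adjust them to however your list actually encodes senders.

```shell
#!/bin/sh
# Recover the original recipient from a VERP-style local part.
#   $1 = local part the bounce was addressed to, e.g. bounces-joe=example.com
#   $2 = prefix to strip (hypothetical, depends on how the list encodes senders)
verp_recipient() {
    encoded=${1#"$2"}            # drop the prefix
    user=${encoded%=*}           # everything before the last '='
    domain=${encoded##*=}        # everything after the last '='
    echo "$user@$domain"
}

# In a .qmail file, a line beginning with '|' runs a command with the
# message on standard input, for example (path is hypothetical):
#   |/usr/local/bin/handle-bounce
# Inside that script, qmail sets $LOCAL to the local part of the address,
# so the failing address could be logged with:
#   verp_recipient "$LOCAL" "bounces-" >> /var/log/bounced-addresses
```

Because the address is recovered from the envelope rather than parsed out of the bounce body, the handler does not depend on the format of the remote server's bounce message.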


Regards.