I would just RAID 5 the whole setup.  With your 6 drives, you get the read performance of 4 drives on any partition in this setup, plus you have a hot spare, and the write performance of close to 4 drives as well.  This is a lot better than your config with a mirrored set of drives and a RAID 5 array that reads and writes at about the capacity of two.  You will definitely get more bang for your buck with it all being one RAID 5 array, and you get much better redundancy that way as well.

As far as splitting things up, there are only 4 things that I feel need to be separate:

    1) OS and Executables
    2) Spool
    3) Mail Boxes
    4) Log File Archive

Some may split them up further, but I don't think it would have any real effect.  The reason for splitting things up is primarily to control fragmentation.  Your OS and Executables should be mostly static so long as you don't log to that partition, and it might be more advantageous to have all executables on one partition rather than two or more in terms of seek time.  The fact that Windows caches frequently accessed files mitigates the effect here so that you probably won't notice the improvement of having everything all together on one partition, so you can do that to taste if you wish.  The Spool will get very heavily fragmented, and it is the best place to log to since IMail can't be made to log elsewhere, but Declude and other applications can be made to log there instead of under the IMail tree.  Only Sniffer can't be told to log elsewhere at this time, but thankfully the log is not as heavily fragmented as others due to its comparatively small size.  The Mail Boxes should also go on their own partition because of fragmentation, and also for personal preference of organization.  They don't get heavily fragmented, but you also don't want to have the spool on the same partition contributing to the fragmentation.  The Log File Archive is simply better on its own partition since you want to move the files from one partition to another in order to remove the fragmentation.

Sizing the partitions is also very important.  The outer edges of the disks will run at twice the read and write rates of the inner edges, and if all of your most frequently accessed files are closest to the outer edge there is less seek time, so having your most active partitions nearest the outer edge will improve performance.  You do this by not being wasteful with the partition sizes.  I put all applications, including IMail and associated programs, along with the page file and OS on the C drive and this is the first logical drive in the array.  I am now sizing the primary partition at 20 GB, though you could probably get away with 10 GB.  Take note of the page file's max size and the fact that the temp directory is on this drive and things like zipping and unzipping files make use of the temp directory, so you need some extra space.  I make the D drive my Web drive (might not be necessary for your setup) and that is quite small.  The E drive is for Mail Boxes and I gave it 5 times the actual space in use at this time for growth.  The F drive is the Spool, which is 5 GB on my server, and allows for some play in the event that my log archiving mechanism goes bust without me noticing.  Then I also have separate small partitions, G and H, for both viruses and held spam, primarily due to the number of files and the size of the MFT, and you don't want this mixed with things on an active partition IMO.  Then I have a Log File Archive partition on I which is essentially all the left over space on the array.  Here's a summary:

    C: OS and Executables - 10 GB to 20 GB
    D: Web - 1 GB
    E: Mail Boxes - 5 GB
    F: Spool + Logging - 5 GB
    G: Spam - 2 GB
    H: Viruses - 2 GB
    I: Log Archive - Everything that is left if you wish.
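The sizing above can be sanity-checked with a little arithmetic.  This is only a sketch under stated assumptions: the six 36 GB drives mentioned in this thread, one of them held back as a hot spare, nominal sizes with no formatting overhead, and the larger 20 GB figure for C.  The function and variable names are made up for illustration.

```python
# Rough partition budget for a single RAID 5 array with a hot spare.
# Assumptions (not measurements): 6 x 36 GB drives, 1 hot spare,
# nominal capacities, and C sized at the upper 20 GB figure.

DRIVE_GB = 36
DRIVES = 6
HOT_SPARES = 1


def raid5_usable_gb(drives: int, hot_spares: int, drive_gb: int) -> int:
    """RAID 5 gives up one drive's worth of space to parity."""
    active = drives - hot_spares
    if active < 3:
        raise ValueError("RAID 5 needs at least 3 active drives")
    return (active - 1) * drive_gb


# Fixed-size partitions from the summary, in GB.
FIXED_PARTITIONS_GB = {"C": 20, "D": 1, "E": 5, "F": 5, "G": 2, "H": 2}

usable = raid5_usable_gb(DRIVES, HOT_SPARES, DRIVE_GB)
allocated = sum(FIXED_PARTITIONS_GB.values())
log_archive_gb = usable - allocated  # what is left for I:

print(f"usable={usable} GB, active partitions={allocated} GB, "
      f"log archive={log_archive_gb} GB")
```

Under those assumptions the active partitions take up only about a quarter of the array, which is what keeps them near the outer edge of the disks.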

I also of course have other devices for storing software and backups that aren't part of the array.

With a RAID 5 + Hot spare config, assuming 36 GB drives, you can pack all of the most active stuff into the outer edges of the disks and get screaming performance.  Having a large log file archive nearest the center of the platters will have virtually no impact since they would be rarely accessed.  Besides the performance gains, you really must consider the redundancy.  The fact that in such a setup, you could lose 1 drive and still run at full speed after a period of automatic rebuilding, and lose 2 drives and run at half speed, which would still be plenty, and it would take 3 drives failing to bring your box down...well, that's the reliability that one should want.

Like I said earlier, I've done over 125,000 unique messages a day with a sizable Declude setup, Sniffer, two virus scanners, and some Declude helper apps written in VBScript on a server with 3 x 10,000 RPM drives on a zero-channel Adaptec controller with 48 MB of read cache (no write cache) and it had absolutely no issues.  That server also ran MS SMTP as the primary gateway along with VamSoft's ORF doing address validation, so in effect, each message was being received twice by the same machine.  If I can do this off of three slower drives and a bottom of the line mainstream controller, you can surely do the same off of 5 faster drives and a better controller and reap the benefits of the extra redundancy as well as the simplicity of managing one array.  There's no reason to get any more complicated than a single array in this case.  If your system is not heavily fragmented (and it won't be under my config if you move the logs periodically), then it would be near impossible for you to run into a wall with disk I/O before you ran out of CPU.  I have seen over 100,000 messages a day run on a single IDE hard drive, single partition, with IMail/Declude/F-Prot and about 7,000 accounts, though it was starting to stress the server at that point and needed to be addressed.

Matt




Goran Jovanovic wrote:

Matt,

 

I think that you sort of answered the question that I did not really ask. I was really trying to get information on the different performance levels of S/W vs H/W RAID for an “ideal” scanning only box. So let me try this out and people can comment.

 

All SCSI 15K drives with HW RAID controller

 

2 x 36 GB drives R1 on first channel (36 GB usable)

    C – Windows 10 GB

    D – IMAIL/Smartermail/Declude files/Declude filters & per domain configs/banned files (5 days only) 20 GB

    P – Page volume 3 GB

 

3 x 36 GB drives R5 on second channel (72 GB usable)

    L – Logs for JM, Virus, IMAIL/SmarterMail, Sniffer, invURIBL, et al 10 GB

    S – Storage for all daily logs 60 GB

 

1 x 36 GB Hot Spare drive

 

From what we have discussed here, drive L will get hit a lot. If you create a process like the one Matt is describing to move the active logs from L to S, you should not have to worry about running out of space on the L drive.

 

Now looking back I am not sure if I have crafted this well since the SPOOL files for IMAIL will end up on D. Is there a way to move them for Smartermail as there does not seem to be a way to move them in IMail? The good part of this config is that the spool files which have a lot of read/write are on a different volume/channel from the other log files. I am not sure what amount of space you should allocate to a server that would process 100,000+ messages a day?

 

Does anyone have comments on this config?

 

Thanx

 

 

 

 

     Goran Jovanovic

     The LAN Shoppe

 

 


From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Matt
Sent: Wednesday, March 16, 2005 3:49 PM
To: sniffer@SortMonster.com
Subject: Re: [sniffer] RAID Levels for Spool Folder

 

IMO, Software RAID is not the way to go on a busy machine.  You will save a measurable amount of overhead by going with hardware based RAID of any sort since the controller should handle the processes associated with the RAID.  Note that this isn't the case with inexpensive RAID controllers such as the cheaper IDE and SATA controllers which still place a fair burden on the OS/processor.  True RAID cards also offer additional cache which can speed up the performance on reads, and also on writes if you are battery backed up (otherwise don't use write caching because you could lose or corrupt data during a power outage).

There are also several common misconceptions about what is proper to do for a mail server.  RAID 5 is the best choice under almost all conditions.  The trick here is that while RAID 10 offers both redundancy in mirroring and speed in striping, most servers have a limited amount of space for disks.  So a server with 6 disks will operate with the speed of 3 disks spanned in a RAID 10 configuration, but 6 disks in RAID 5 will operate as 5 disks spanned plus a little bit of overhead, though not nearly enough so that it falls short of the performance of just 3 disks in a simple span.  Therefore RAID 5 should be the default choice for speed in such an environment.
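To make the disk counting above concrete, here is a minimal sketch that counts how many disks carry striped data at each RAID level.  It ignores parity and controller overhead entirely, so treat it as back-of-the-envelope arithmetic and not a benchmark; the function name is hypothetical.

```python
def data_disks(level: str, disks: int, hot_spares: int = 0) -> int:
    """Return how many disks' worth of striping a layout provides.

    Simplified model: RAID 5 loses one disk to parity, RAID 10 loses
    half of its disks to mirroring. Hot spares sit idle until a failure.
    """
    active = disks - hot_spares
    if level == "raid5":
        if active < 3:
            raise ValueError("RAID 5 needs at least 3 active disks")
        return active - 1
    if level == "raid10":
        if active < 4 or active % 2:
            raise ValueError("RAID 10 needs an even number of disks, 4 or more")
        return active // 2
    raise ValueError(f"unsupported RAID level: {level}")


print(data_disks("raid10", 6))               # 6 disks mirrored and striped
print(data_disks("raid5", 6))                # 6 disks, all in the array
print(data_disks("raid5", 6, hot_spares=1))  # 6 disks, one kept as hot spare
```

With 6 disks this gives 3 for RAID 10, 5 for RAID 5, and 4 for RAID 5 with a hot spare.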

Another misconception is that data is always striped in RAID 0 or RAID 5.  This depends on the file size and the stripe size.  Most stripes are 64 KB (configurable in most setups).  If you have some form of striping for your spool drive, most messages fall far under 64 KB and will only get written to one disk (parity will also get written in RAID 5).  Therefore for a spool folder, RAID 5 with 3 drives (the minimum) will perform rather closely to RAID 5 with 10 drives since most files will only land on one disk (with the other corresponding stripes containing no data).  The MFT however for a drive with a lot of files will grow to be quite large and benefits from having multiple disks, and opening very large files such as logs will also benefit from having many disks.  There is also an advantage to seek times when having multiple disks, especially if you keep your partitions sized small for performance.
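The stripe-size point can be sketched as follows.  This is a simplified model, assuming a 64 KB stripe unit and a file that starts exactly on a stripe boundary (real files may straddle one); parity writes are not counted, and the function name is made up for illustration.

```python
STRIPE_BYTES = 64 * 1024  # the common default mentioned above


def data_disks_touched(file_bytes: int, data_disks: int,
                       stripe_bytes: int = STRIPE_BYTES) -> int:
    """Estimate how many data disks hold a piece of one file."""
    if file_bytes <= 0 or data_disks <= 0 or stripe_bytes <= 0:
        raise ValueError("sizes and disk counts must be positive")
    stripe_units = -(-file_bytes // stripe_bytes)  # ceiling division
    return min(stripe_units, data_disks)


small_message = 8 * 1024        # a typical small spool file (assumed size)
big_log = 500 * 1024 * 1024     # a 500 MB log file

# 3-drive RAID 5 has 2 data disks per stripe; 10-drive RAID 5 has 9.
for drives in (3, 10):
    n = drives - 1
    print(drives, data_disks_touched(small_message, n),
          data_disks_touched(big_log, n))
```

In this model the small message lands on one disk whether the array has 3 drives or 10, while the large log spreads across every data disk available.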

I've run a dual processor 3.06 GHz server with both 6 Seagate 15,000 RPM drives in RAID 5 and the same with 3 Seagate 10,000 RPM drives in RAID 5 running on a less capable controller, and there was no impact on performance while the server was handling over 125,000 unique messages a day.  The only noticeable difference was the time it would take to open a 500 MB log file, or the time it would take to enumerate the file names from the MFT on a partition that contained tens of thousands of files in the root.  It seems quite apparent that with modern processors, even in dual processor configurations, you will run out of CPU cycles long before you run out of disk I/O in a well managed RAID 5, 3 drive configuration on an IMail/Declude/Sniffer server.

Take note that the log files for Declude, Sniffer and IMail all become massively fragmented, and if you don't have a process to remove these from active partitions on your server or defragment them, then performance will be severely impacted.  I run a job hourly that copies all such logs to a different partition and combines them with older chunks from that day and then zips them nightly.  The process of moving them to another logical drive removes the fragmentation and that helps to ensure that the spool or mailbox partitions/folders don't also become heavily fragmented, which is a big performance hit.  Opening up a heavily fragmented 500 MB Declude log file is excruciatingly slow.  If I kept the logs in my spool folder without taking action to handle the fragmentation, each Declude log file would reach over 100,000 fragments a day, and that's a lot of seek time.
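A minimal sketch of what such an hourly job might look like is below.  It is not the actual script described above; the function name, the ".chunk" suffix, and the paths in the usage note are all hypothetical.  It assumes the logging application reopens its log file on each write, so the active file can be renamed out from under it (on Windows the rename will fail if another process holds the file open), and it leaves the nightly zip step out.

```python
import shutil
from pathlib import Path


def archive_log_chunk(active_log: Path, archive_dir: Path) -> Path:
    """Move the current contents of a log to the archive partition.

    The active log is renamed to a temporary chunk, the chunk is appended
    to the same-named file in archive_dir, and the chunk is deleted.
    Appending on the archive partition writes the data out again, so the
    fragmented copy no longer lives on the active partition.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    chunk = active_log.with_name(active_log.name + ".chunk")
    active_log.replace(chunk)  # rename on the same volume
    target = archive_dir / active_log.name
    with chunk.open("rb") as src, target.open("ab") as dst:
        shutil.copyfileobj(src, dst)  # combine with earlier chunks
    chunk.unlink()
    return target
```

Run hourly from a scheduler, it would be called once per active log, for example `archive_log_chunk(Path(r"F:\spool\some.log"), Path(r"I:\logs"))`, where both paths are placeholders for your own layout.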

I would recommend just going to RAID 5 for everything, and buying an LSI or Adaptec card if doing SCSI, or one of the same including 3Ware if doing SATA or IDE.  Based on my personal experience, I don't believe that you need to go over the top with anything; just the cheapest brand name card that can handle RAID 5 will do, even in a zero-channel configuration if your motherboard supports it.  Ultra 160 SCSI RAID cards will also work just fine for all but the most demanding applications these days, so don't be afraid to pick one of the older models up from eBay.  Also pay attention to the drives themselves.  SCSI drives are made to be better and more dependable than most IDE or SATA drives, the faster RPMs mean faster seek times as well, and different brands also require less CPU overhead.  Only Western Digital seems willing to produce a server-class SATA drive, and this is because they are the only SATA drive maker that doesn't have a SCSI line that might be impacted.  SCSI as a protocol also offers some things that still aren't commonly implemented in SATA that will improve performance in a RAID configuration, though that will change over time.  Essentially you should think of SCSI as both a protocol as well as a mark of component quality.

With that said, if performance isn't an issue with a single drive, mirroring it in Windows might be a perfectly fine solution.  I would still lean towards a cheap RAID card for this however.

Matt






Andy Schmidt wrote:

Uh, sorry, I had thought that discussion was RAID-5 vs. RAID-1?
 
If someone is running RAID-5, I assume that it's hardware based. If so, then
that person could use the same hardware to configure a RAID-1 array instead
- so why even bother with software RAID then?
 
If the discussion is software RAID-1 vs. no-raid, then the answer is: Sure,
software RAID is a cost effective solution if the system has sufficient
head-room to deal with whatever possible overhead that may cause. However,
if we are talking about a machine that is already taxed, then I would
suggest plugging in a RAID controller instead of adding software RAID to the
mix.
 
I have several (older) systems running Windows 2000 RAID-1. At least ONE of
the servers I later upgraded to Hardware RAID.  I can't say that I've
noticed any difference (but then again, I have not run benchmarks and the
server was not really taxed before either.)
 
From what I understand, there are many factors involved and it very much depends
on your system's configuration. CPU availability is critical. A server that
is already CPU taxed may suffer if software RAID is added.  Having the
drives split on two SCSI controllers should also help with software RAID-1.
Doing software RAID-1 with a master/slave ATA drive, however, may slow
things down.
 
There may not be a single answer...
 
Best Regards
Andy 
 
 
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
On Behalf Of Goran Jovanovic
Sent: Wednesday, March 16, 2005 02:05 PM
To: sniffer@SortMonster.com
Subject: RE: Re[2]: [sniffer] Moving Sniffer to Declude/SmarterMail
 
 
OK that is for hardware level RAID. I had thought that you would offset the
extra processing time by being able to write less to each drive.
 
Now does anyone know how much overhead Windows 2000/2003 software RAID 1 on
dynamic disks produces over hardware level RAID 1?
 
 
This E-Mail came from the Message Sniffer mailing list. For information and (un)subscription instructions go to http://www.sortmonster.com/MessageSniffer/Help/Help.html
 
 
  



-- 
=====================================================
MailPure custom filters for Declude JunkMail Pro.
http://www.mailpure.com/software/
=====================================================
