Seth Vidal wrote:
> 
> Hi folks,
>  I have an odd question. Where I work we will, in the next year, be in a
> position to have to process about a terabyte or more of data. The data is
> probably going to be shipped on tapes to us but then it needs to be read
> from disks and analyzed. The process is segmentable so it's reasonable to
> be able to break it down into 2-4 sections for processing so arguably only
> 500gb per machine will be needed. I'd like to get the fastest possible
> access rates from a single machine to the data. Ideally 90MB/s+
> 
> So we're considering the following:
> 
> Dual Processor P3 something.
> ~1gb ram.
> multiple 75gb ultra 160 drives - probably ibm's 10krpm drives

Something to think about regarding IBM 10K drives:

http://www8.zdnet.com/eweek/stories/general/0,11011,2573067,00.html

> Adaptec's best 160 controller that is supported by linux.
> 
> The data does not have to be redundant or stable - since it can be
> restored from tape at almost any time.
> 
> so I'd like to put this in a software raid 0 array for the speed.
> 
> So my questions are these:
>  Is 90MB/s a reasonable speed to be able to achieve in a raid0 array
> across say 5-8 drives?
> What controllers/drives should I be looking at?
> 
> And has anyone worked with gigabit connections to an array of this size
> for nfs access? What sort of speeds can I optimally (figuring nfsv3 in
> async mode from the 2.2 patches or 2.4 kernels) expect to achieve for
> network access.
> 
> thanks
> -sv

Lots of discussion on this already, so I will just touch on a few
points.

Jon Lewis mentioned that you should put a value on how long it
takes to read your data sets back in when weighing the value of RAID.
Unfortunately, RAID write throughput can be relatively slow for
HW RAID compared to SW RAID. I've appended some numbers for
reference.

Also, reading in from tape will double the load on a single PCI bus,
since the data crosses the bus once coming in from the tape controller
and again going out to the disk controller. I think you will be happier
with a dual PCI bus motherboard.
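Back-of-the-envelope sketch of that doubling effect. The 90MB/s figure is
just the target rate from the original question and the 133MB/s figure is
the theoretical peak of a 32-bit/33MHz PCI bus; both are assumptions for
illustration, not measurements.

# Sketch of the "tape-to-disk restore doubles the bus load" point.
# Assumed figures only: 133 MB/s is the theoretical peak of a
# 32-bit/33 MHz PCI bus; the stream rates are illustrative.
PCI_PEAK_MB_S = 133.0

def bus_load(stream_mb_s):
    # Data comes in from the tape controller and goes back out to the
    # disk controller, so it crosses the same bus twice.
    load = 2 * stream_mb_s
    return load, 100.0 * load / PCI_PEAK_MB_S

for rate in (30.0, 60.0, 90.0):
    load, pct = bus_load(rate)
    print("restore at %5.1f MB/s -> %5.1f MB/s on the bus (%.0f%% of peak)"
          % (rate, load, pct))
# At 90 MB/s the single bus is already past its theoretical peak, which
# is why a second PCI bus is attractive.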

One of the eternal unknowns is how well a particular Intel (or
clone) chipset will work for a particular I/O load. Alpha and SPARC
motherboards are designed with I/O as a higher-priority goal than
Intel motherboards. Intel is getting better at this, but contention
between the PCI busses for memory access can be a problem as well.

Brian Pomerantz at LLNL has gotten more than 90MB/s streaming to a
Ciprico RAID system, but he went to a fair amount of work to get
there, e.g. 2MB block sizes. You probably want to talk to him, and
you should be able to find posts from him in the raid archives.
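For a rough feel of what that kind of large-block streaming access looks
like, here is a minimal sketch. The device path is hypothetical and the
2MB read size just mirrors the block size mentioned above; treat it as an
illustration, not Brian's setup.

# Minimal sketch of large-block sequential reads from a RAID device.
# /dev/md0 is a hypothetical device path; the 2 MB read size mirrors
# the block size mentioned above.  Needs read access to the device.
import time

BLOCK = 2 * 1024 * 1024      # 2 MB per read()
DEVICE = "/dev/md0"          # hypothetical software RAID device

def stream(device=DEVICE, seconds=10):
    done = 0
    start = time.time()
    with open(device, "rb", buffering=0) as f:
        while time.time() - start < seconds:
            buf = f.read(BLOCK)
            if not buf:              # end of device
                break
            done += len(buf)
    elapsed = time.time() - start or 1e-9
    print("%.1f MB/s sustained" % (done / (1 << 20) / elapsed))

if __name__ == "__main__":
    stream()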

------------------------

Hardware: dual PIII 600MHz/Lancewood/128MB and 1GB
Mylex: 150, 1100, & 352
Mylex cache: writethru
Disks: 5 Atlas V

Some caveats:

1. The focus is on sequential performance. YMMV

2. The write number for SW RAID5 is surprisingly good. It indicates
either excellent cache management and reuse of parity blocks or some
exploitation of the sequential nature of the bonnie benchmark.
A RAID5 update should run at approximately 25% of raw write performance
with no caching assistance (see the sketch after the tables below).

3. I am a little bothered by the very strong correlation between CPU%
and MB/s for all of the Mylex controllers in the bonnie tests. I
guess that is the I/O service overhead, but it still seems high to me.

4. The HW RAID numbers are for 5 drives. The SW RAID numbers are for
8 drives.

5. The effect of host memory size on bonnie (AcceleRAID 150) read
performance is 15-20%. See below:

--------------------------------------------
AcceleRAID 150 
DRAM=256MB
                Read            Write
                BW MB/s CPU     BW MB/s CPU

RAID3           42.3    34%      4.6     3%

RAID5           43.0    38%      4.5     3%

RAID6(0+1)      37.5    33%     12.7    11%

DRAM=1GB
                Read            Write
                BW MB/s CPU     BW MB/s CPU

RAID3           48.4    50%      4.6     3%

RAID5           49.1    51%      4.5     3%

RAID6(0+1)      45.2    39%     12.7    10%

6. The Mylex eXtremeRAID1100 does not show much difference in
   write performance between RAID5 and RAID6. See below:

---------------------------------------------

ExtremeRAID 1100 1GB
                Read            Write
                BW MB/s CPU     BW MB/s CPU

RAID3           48.3    50%     14.7    13%

RAID5           52.7    55%     15.1    13%

RAID6(0+1)      48.1    49%     14.7*   13%

RAID0           56.3    60%     40.8    37%

* This should be better
---------------------------------------------

AcceleRAID 352  1GB
                Read            Write
                BW MB/s CPU     BW MB/s CPU

RAID3           45.2    44%      6.8     5%

RAID5           46.2    45%      6.6     5%

RAID6(0+1)      39.6    39%     16.7*   14%

RAID0           50.5    50%     36.7    30%

* This is better
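As a footnote to caveat 2 above, the ~25% expectation falls out of the
RAID5 small-write read-modify-write cycle: read old data, read old parity,
write new data, write new parity, i.e. four disk operations per payload
write. A quick sketch; the 40MB/s raw rate is an assumed example, not one
of the measurements above.

# Sketch of the RAID5 small-write penalty (read-modify-write), assuming
# no cache assistance.  The 40 MB/s raw figure is illustrative only.
raw_write_mb_s = 40.0     # what the same spindles might do with plain striping
ops_per_update = 4        # read old data + read old parity + write data + write parity
expected_raid5_mb_s = raw_write_mb_s / ops_per_update
print("expected RAID5 write: %.1f MB/s (~%d%% of raw)"
      % (expected_raid5_mb_s, 100 // ops_per_update))
# -> expected RAID5 write: 10.0 MB/s (~25% of raw)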


> 
> After talking to Dan Jones and the figures he was getting on Mylex cards, I
> decided to do some simple software raid benchmarks.
> 
> Hardware: dual PIII 500Mhz/Nightshade/128MB
> SCSI: NCR 53c895 (ultra 2 lvd 80MB/s)
> Disks: 8 18G disks (FAST-40 WIDE SCSI 80.0 MB/s hdwr sector= 512 bytes.
>                         Sectors= 35885168 [17522 MB] [17.5 GB])
> 
> Raid 0/Striping:
> 
> ----------------------------------------------------------------------
> Start Date:   Thu Feb 17 13:51:26 PST 2000
> 
> File Size:    500
> Block Size:   4096
> ---------- Output ----------    ----------- Input ----------
>    50521 KB/sec    58% CPU         64563 KB/sec    47% CPU
> 
> End Date:   Thu Feb 17 13:51:45 PST 2000
> ----------------------------------------------------------------------
> 
> Raid 5:
> ----------------------------------------------------------------------
> Start Date:   Thu Feb 17 13:00:13 PST 2000
> 
> File Size:    500
> Block Size:   4096
> ---------- Output ----------    ----------- Input ----------
>    35619 KB/sec    71% CPU         46090 KB/sec    32% CPU
> 
> End Date:   Thu Feb 17 13:00:40 PST 2000
> ----------------------------------------------------------------------
> 
> Yeah, I know, software  raid is not as versatile as HW raid  and there is no
> battery backup (although when  you have a big UPS, you  don't care), yet the
> results are compelling.
>  

-- 
Dan Jones, Manager, Storage Products          VA Linux Systems
V:(408)542-5737 F:(408)745-9911               1382 Bordeaux Drive
[EMAIL PROTECTED]                            Sunnyvale, CA 94089
