I'm resending this as it appears not to have made it to the list.

At 10:54 AM 8/21/2005, Jeremiah Jahn wrote:
On Sat, 2005-08-20 at 21:32 -0500, John A Meinel wrote:
> Ron wrote:
> Well, since you can get a read of the RAID at 150MB/s, that means that
> it is actual I/O speed. It may not be cached in RAM. Perhaps you could
> try the same test, only using say 1G, which should be cached.

[EMAIL PROTECTED] pgsql]# time dd if=/dev/zero of=testfile bs=1024 count=1000000
1000000+0 records in
1000000+0 records out

real    0m8.885s
user    0m0.299s
sys     0m6.998s

This is abysmally slow.

[EMAIL PROTECTED] pgsql]# time dd of=/dev/null if=testfile bs=1024 count=1000000
1000000+0 records in
1000000+0 records out

real    0m1.654s
user    0m0.232s
sys     0m1.415s

This transfer rate is the only one out of the 4 you have posted that is in the vicinity of where it should be.
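
For reference, here is a rough sketch of the arithmetic behind the two dd runs above. The helper name is mine, and I am assuming MB means 10^6 bytes and that the "real" time is the right denominator.

```python
# Throughput implied by the two dd runs above (bs=1024, count=1000000).
# The helper name and the use of decimal megabytes (10**6 bytes) are
# assumptions made for this illustration.

def dd_rate_mb_per_s(block_size: int, count: int, seconds: float) -> float:
    """Return bytes transferred divided by wall-clock time, in MB/s."""
    total_bytes = block_size * count
    return total_bytes / seconds / 1e6

write_rate = dd_rate_mb_per_s(1024, 1_000_000, 8.885)  # "real" time of the write
read_rate = dd_rate_mb_per_s(1024, 1_000_000, 1.654)   # "real" time of the read

print(f"write: {write_rate:.0f} MB/s")
print(f"read:  {read_rate:.0f} MB/s")
```

That works out to roughly 115 MB/s for the write and roughly 619 MB/s for the read.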

The raid array I have is currently set up to use a single channel. But I
have dual controllers in the array. And dual external slots on the card.
The machine is brand new and has pci-e backplane.

So you have 2 controllers each with 2 external slots? But you are currently only using 1 controller and only one external slot on that controller?

> > Assuming these are U320 15Krpm 147GB HDs, a RAID 10 array of 14 of them
> > doing raw sequential IO like this should be capable of
> >  ~7*75MB/s= 525MB/s using Seagate Cheetah 15K.4's
BTW I'm using Seagate Cheetah 15K.4's

OK, now we have that nailed down.

> > AFAICT, the Dell PERC4 controllers use various flavors of the LSI Logic
> > MegaRAID controllers.  What I don't know is which exact one yours is,
> > nor do I know if it (or any of the MegaRAID controllers) are high
> > powered enough.

PERC4eDC-PCI Express, 128MB Cache, 2-External Channels

Looks like they are using the LSI Logic MegaRAID SCSI 320-2E controller. IIUC, you have 2 of these, each with 2 external channels?

The specs on these appear a bit strange. They are listed as being a PCI-Ex8 card, which means they should have a max raw bandwidth of 20Gb/s, or 2GB/s usable after 8b/10b encoding, yet they are also listed as only supporting dual channel U320= 640MB/s when they could easily support quad channel U320= 1.28GB/s. Why bother building a PCI-Ex8 card when a PCI-Ex4 card (which is a more standard physical format) would've been enough? Or if you are going to build a PCI-Ex8 card, why not support quad channel U320? This smells like there's a problem with LSI's design.
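
The bus comparison above can be sketched as a few lines of arithmetic. I am assuming first-generation PCI Express (2.5Gb/s raw per lane, 8b/10b encoding) and 320MB/s per U320 channel; the names are purely illustrative.

```python
# Bus arithmetic behind the PCI-E versus U320 comparison above.
# Assumptions: first-generation PCI Express (2.5 Gb/s raw per lane, 8b/10b
# encoding, so 250 MB/s usable per lane per direction) and 320 MB/s per
# Ultra320 SCSI channel. All names are illustrative only.

PCIE_RAW_GBIT_PER_LANE = 2.5
ENCODING_EFFICIENCY = 8 / 10      # 8 data bits carried in every 10 line bits
U320_MB_PER_CHANNEL = 320

def pcie_mb_per_s(lanes: int) -> float:
    """Usable one-direction bandwidth of a PCIe 1.x link, in MB/s."""
    usable_gbit = lanes * PCIE_RAW_GBIT_PER_LANE * ENCODING_EFFICIENCY
    return usable_gbit * 1000 / 8  # Gb/s -> MB/s

def u320_mb_per_s(channels: int) -> int:
    """Aggregate peak bandwidth of the given number of U320 channels."""
    return channels * U320_MB_PER_CHANNEL

for lanes in (4, 8):
    print(f"PCIe x{lanes}: {pcie_mb_per_s(lanes):.0f} MB/s")
for channels in (2, 4):
    print(f"{channels} x U320: {u320_mb_per_s(channels)} MB/s")
```

Under those assumptions an x4 link already exceeds what two U320 channels can deliver, and an x8 link has room for four.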

The 128MB buffer also looks suspiciously small, and I do not see any upgrade path for it on LSI Logic's site. "Serious" RAID controllers from companies like Xyratex, Engenio, and Dot Hill can have up to 1-2GB of buffer, and there are sound technical reasons for it. See if there's a buffer upgrade available or if you can get controllers that have larger buffer capabilities.

Regardless of the above, each of these controllers should still be good for about 80-85% of 640MB/s, or ~510-540 MB/s apiece when doing raw sequential IO if you plug 3-4 fast enough HD's into each SCSI channel. Cheetah 15K.4's certainly are fast enough. Optimal setup is probably to split each RAID 1 pair so that one HD is on each of the SCSI channels, and then RAID 0 those pairs. That will also protect you from losing the entire disk subsystem if one of the SCSI channels dies.
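
To make the suggested layout concrete, here is a minimal sketch. It assumes the 14 drives mentioned earlier in the thread and two channels that I have labelled "A" and "B"; none of these names come from the controller's own configuration tools.

```python
# Sketch of the layout suggested above: mirror each RAID 1 pair across the
# two SCSI channels, then stripe (RAID 0) over the pairs.
# Assumptions: 14 drives and channel labels "A" and "B"; all names here are
# hypothetical and not taken from any controller's configuration tool.

from typing import List, Tuple

Disk = Tuple[str, int]            # (channel label, slot on that channel)
MirrorPair = Tuple[Disk, Disk]

def build_raid10_layout(n_disks: int) -> List[MirrorPair]:
    """Pair slot i on channel A with slot i on channel B."""
    if n_disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return [(("A", slot), ("B", slot)) for slot in range(n_disks // 2)]

def survives_channel_loss(layout: List[MirrorPair], dead_channel: str) -> bool:
    """The stripe survives if every mirror pair keeps at least one disk."""
    return all(
        any(channel != dead_channel for channel, _slot in pair)
        for pair in layout
    )

layout = build_raid10_layout(14)
print(len(layout), "mirror pairs")
print("survives loss of channel A:", survives_channel_loss(layout, "A"))
print("survives loss of channel B:", survives_channel_loss(layout, "B"))

# Counter-example: both halves of one mirror sit on the same channel.
bad_layout = [(("A", 0), ("A", 1))] + layout[1:]
print("bad layout survives loss of channel A:",
      survives_channel_loss(bad_layout, "A"))
```

The counter-example shows why the split matters: a single mirror pair confined to one channel is enough to take the whole stripe down when that channel fails.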

That 128MB of buffer cache may very well be too small to keep the IO rate up, and/or there may be a more subtle problem with the LSI card, and/or you may have a configuration problem, but _something(s)_ need fixing since you are only getting raw sequential IO of ~100-150MB/s when it should be above 500MB/s.
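
To put a number on the shortfall, here is a quick sketch using only the figures quoted in this message. The variable names are mine.

```python
# Expected versus observed raw sequential throughput, using the figures
# quoted in this message. Names are illustrative.

DUAL_U320_MB_S = 640              # two U320 channels at 320 MB/s each
EFFICIENCY = (0.80, 0.85)         # the "about 80-85%" rule of thumb above
OBSERVED_MB_S = (100, 150)        # reported raw sequential rates

expected = tuple(DUAL_U320_MB_S * e for e in EFFICIENCY)
print(f"expected: {expected[0]:.0f}-{expected[1]:.0f} MB/s per controller")

# Most pessimistic and most optimistic ratios of observed to expected.
worst_case = OBSERVED_MB_S[0] / expected[1]
best_case = OBSERVED_MB_S[1] / expected[0]
print(f"observed is {worst_case:.0%} to {best_case:.0%} of expected")
```

So even on the most generous reading, the array is delivering less than a third of what one controller should manage.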

This will make the most difference for initial reads (first time you load a table, first time you make a given query, etc) and for any writes.

Your HW provider should be able to help you, even if some of the HW in question needs to be changed. You paid for a solution. As long as this stuff is performing at so much less than what it is supposed to, you have not received the solution you paid for.

BTW, on the subject of RAID stripes, IME the sweet spot tends to be in the 64KB to 256KB range (very large, very read-heavy data mines can want larger RAID stripes). Only experimentation will tell you what results in the best performance for your application.

I'm not really worried about the writing, it's the reading
that needs to be faster.

Initial reads are only going to be as fast as your HD subsystem, so there's a reason for making the HD subsystem faster even if all you care about is reads. In addition, I'll repeat my previous advice that upgrading to 16GB of RAM would be well worth it for you.

Hope this helps,
Ron Peacetree
