> Subject: Re: performance bottleneck in Linux MD RAID-1
> To: "Paul M. Dyer" <[email protected]>
> Cc: linux-poweredge <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset="UTF-8"
>
> Thanks for the suggestions. Yes, we are aware of those other
> parameters, but we now know the bottleneck is in the MD RAID-1 layer.
> This is RHEL 5.5 with the latest updated kernel (I don't have the
> version with me right now).
>
> We've tried all the schedulers, a variety of read-ahead buffers, etc.
> The only thing that has allowed us to break the 200MB/s sequential
> write limit is getting rid of the MD RAID-1 layer.
>
> Even if we don't use the file system (XFS in this case): if we build
> the MD RAID-1 with a missing half, and then add the 2nd half to let it
> re-sync, the fastest the re-sync will go (with all else pretty much
> idle) is about 200MB/s. So this is the MD RAID-1 layer doing its own
> block copying, with no LVM2 or XFS or anything else involved.
>
> -Bond
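One plausible explanation for the ~200MB/s re-sync ceiling described above (an assumption on my part, not something confirmed in this thread): MD throttles resync/recovery via the `dev.raid` sysctls, and on kernels of this era the default maximum was 200000 KiB/s, suspiciously close to the observed cap. A quick check, sketched with an arbitrary raised value:

```shell
# MD throttles resync/recovery speed via two sysctls (values in KiB/s).
# Default maximum is 200000 KiB/s (~195 MiB/s) on RHEL 5-era kernels.
cat /proc/sys/dev/raid/speed_limit_min   # default: 1000
cat /proc/sys/dev/raid/speed_limit_max   # default: 200000

# Raise the cap (root required; 500000 is an arbitrary illustrative value)
# and watch whether the resync in /proc/mdstat speeds up:
echo 500000 > /proc/sys/dev/raid/speed_limit_max
cat /proc/mdstat
```

Note that this throttle governs only resync/recovery traffic, so even if it explains the re-sync observation, it would not by itself explain the iozone sequential-write numbers.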
We use the MD layer on JBODs on a PERC6 and routinely get 350 Mbytes/sec
write and 650 Mbytes/sec read (bonnie++ test). Hardware is an R900; the
OS was SLES10SP2. (The JBODs are individual disks, 450GB x 15krpm, set
up as RAID "0" although they have only one pdisk per vdisk.)

You are getting more throughput without MD, so I have to ask: why bother
with MD? If it seems to be the bottleneck, then do without it. Although
I have to say your expectations

> We were expecting with 7x effective
> spindles on the RAID-10, to get about ~350MBytes/sec sustained writes
> for sequential access.

are not very realistic. You have SATA disks, but you want speed. While
individual spindles can do 50 to 70 Mbytes/sec (SATA-2, Seagate disks),
when you put those behind several layers of Linux software, and a PERC
card with only-Dell-knows-what CPU speed, firmware version and memory
bandwidth, plus numerous buffering and command
decode/schedule/execute/reply loops, you might be lucky to get 50% of
the drives' maximum throughput once the whole system is assembled.

The only configuration you have not mentioned is MD (or LVM) across all
the physical spindles -- rather than layering LVM upon MD upon PERC
RAID-10 for a chosen, uh, RAID-110 configuration, choose just MD (or
LVM), and skip the PERC's implementation of RAID anything.

You don't mention why RAID-1 upon RAID-10. Is this intended for high
data availability / system reliability?

--John

> > On Thu, 2010-07-15 at 10:34 -0500, Paul M. Dyer wrote:
> > Hi,
> >
> > which IO elevator are you using? Are you using RHEL4 or RHEL5?
> >
> > In RHEL5, you could try the deadline or noop elevator to see if that
> > works better. Implement by using this example for sda; change for your
> > particular device:
> >
> > cat /sys/block/sda/queue/scheduler
> >
> > echo "deadline" > /sys/block/sda/queue/scheduler
> >
> > or use noop:
> > echo "noop" > /sys/block/sda/queue/scheduler
> >
> > Here is a link from RHEL4 days about the schedulers.
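John's suggestion of MD across the raw spindles, skipping the PERC's RAID, might look like the following sketch. This is only an illustration: the device names (/dev/sdb through /dev/sdo) and the chunk size are assumptions, and the MD1000 drives would first need to be exported as single-disk vdisks (or true JBOD) so Linux sees them individually:

```shell
# Hypothetical sketch: build a 14-drive MD RAID-10 directly on the
# disks behind one controller, instead of RAID-1 over PERC RAID-10.
# Device names and the 256K chunk size are illustrative assumptions.
mdadm --create /dev/md0 --level=10 --raid-devices=14 \
      --chunk=256 /dev/sd[b-o]

# Watch the initial build/resync progress:
cat /proc/mdstat
```

The design point is that a single MD layer then owns both the striping and the redundancy, removing one of the stacked RAID implementations from the write path.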
> > http://www.redhat.com/magazine/008jun05/features/schedulers/
> >
> > Paul
> >
> >
> > ----- Original Message -----
> > From: "Bond Masuda" <[email protected]>
> > To: "linux-poweredge" <[email protected]>
> > Sent: Wednesday, July 14, 2010 10:32:57 PM
> > Subject: performance bottleneck in Linux MD RAID-1
> >
> > Hi Everyone,
> >
> > I'm wondering if some of the gurus around here might be able to help
> > me out. We have a PE2970 with two PERC 6/E cards; each PERC 6/E is
> > connected via a single SAS cable to an MD1000 with 15x 1TB Hitachi
> > SATA 7.2K drives. We have each MD1000 set up in RAID-10 with 14
> > drives and 1 hot spare. Within Linux, we mirror the two MD1000s with
> > Linux MD RAID-1 as /dev/md0. On top of /dev/md0 we have LVM2, and
> > then XFS on the LV. The reason for the LVM2 is to take snapshots (we
> > reserve about 10% of the space in the VG for this).
> >
> > We're seeing a performance bottleneck of about 200MBytes/sec
> > sequential writes when testing with iozone. We were expecting with
> > 7x effective spindles on the RAID-10, to get about ~350MBytes/sec
> > sustained writes for sequential access.
> >
> > After trying out several combinations of things, we found that if we
> > remove the Linux MD software RAID layer, and just put LVM2 on top of
> > /dev/sdc (the vdisk as presented by the PERC 6/E RAID-10), we get
> > about 340MBytes/sec sequential writes. If we put XFS directly on top
> > of /dev/sdc1, we get about the same 340MBytes/sec. So, we can get
> > our anticipated performance of about 350MB/s only when we don't use
> > the MD RAID-1.
> >
> > Since both MD1000s are connected via separate PERC 6/E cards, we
> > didn't think the MD RAID-1 would cause a >40% performance loss...
> >
> > We even tried degrading the MD RAID-1 to see if writing to only one
> > of the mirrors would improve performance. It did NOT... still
> > 200MB/s. It almost seems like the Linux MD layer has a performance
> > cap at around 200MB/s.
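The layer-by-layer elimination Bond describes can be made systematic with a raw sequential-write probe at each level of the stack, taking the file system out of the picture entirely. A hedged sketch only: the /dev/sdc and /dev/md0 names come from the thread, the LV path is a hypothetical placeholder, and writing to these devices directly destroys whatever is on them, so this is strictly for scratch configurations:

```shell
# Measure sequential write throughput at each layer of the stack.
# WARNING: these writes are destructive -- scratch devices only.
# oflag=direct bypasses the page cache so the numbers reflect the
# device path rather than RAM.
dd if=/dev/zero of=/dev/sdc bs=1M count=4096 oflag=direct    # PERC vdisk alone
dd if=/dev/zero of=/dev/md0 bs=1M count=4096 oflag=direct    # plus MD RAID-1
dd if=/dev/zero of=/dev/vg0/lv0 bs=1M count=4096 oflag=direct  # plus LVM2 (path assumed)
```

Whichever step the throughput drops at is the layer adding the cost; per the numbers in this thread, that would be the /dev/md0 step.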
> >
> > Has anyone encountered this and have suggestions to remove this
> > bottleneck? Any advice would be appreciated.
> >
> > Thanks,
> > -Bond
> >
> > _______________________________________________
> > Linux-PowerEdge mailing list
> > [email protected]
> > https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> > Please read the FAQ at http://lists.us.dell.com/faq
>
>
> ------------------------------
>
> End of Linux-PowerEdge Digest, Vol 73, Issue 27
> ***********************************************
