Thanks for the suggestions. Yes, we are aware of those other parameters, but we now know the bottleneck is in the MD RAID-1 layer. This is RHEL 5.5 with the latest updated kernel (I don't have the exact version with me right now).
We've tried all the schedulers, a variety of read-ahead buffer sizes, etc. The only
thing that has allowed us to break the 200MB/s sequential write limit is getting rid
of the MD RAID-1 layer. Even if we don't use the file system (XFS in this case), if
we build the MD RAID-1 with a missing half and then add the 2nd half to let it
re-sync, the fastest the re-sync will go (with everything else pretty much idle) is
about 200MB/s. So this is the MD RAID-1 layer doing its own block copying, with no
LVM2 or XFS or anything else involved (a sketch of the commands for this test is
appended below the quoted thread).

-Bond

On Thu, 2010-07-15 at 10:34 -0500, Paul M. Dyer wrote:
> Hi,
>
> Which IO elevator are you using? Are you using RHEL4 or RHEL5?
>
> In RHEL5, you could try the deadline or noop elevator to see if that works
> better. Implement it using this example for sda, changing it for your
> particular device:
>
> cat /sys/block/sda/queue/scheduler
>
> echo "deadline" > /sys/block/sda/queue/scheduler
>
> or use noop:
> echo "noop" > /sys/block/sda/queue/scheduler
>
> Here is a link from RHEL4 days about the schedulers:
> http://www.redhat.com/magazine/008jun05/features/schedulers/
>
> Paul
>
> ----- Original Message -----
> From: "Bond Masuda" <[email protected]>
> To: "linux-poweredge" <[email protected]>
> Sent: Wednesday, July 14, 2010 10:32:57 PM
> Subject: performance bottleneck in Linux MD RAID-1
>
> Hi Everyone,
>
> I'm wondering if some of the gurus around here might be able to help me
> out. We have a PE2970 with two PERC 6/E controllers; each PERC 6/E is
> connected via a single SAS cable to an MD1000 with 15x 1TB Hitachi SATA
> 7.2K drives. We have each MD1000 set up in RAID-10 with 14 drives and 1
> hot spare. Within Linux, we mirror the two MD1000s with Linux MD RAID-1
> as /dev/md0. On top of /dev/md0 we have LVM2, and then XFS on the LV.
> The reason for the LVM2 is to take snapshots (we reserve about 10% of
> the space in the VG for that).
>
> We're seeing a performance bottleneck of about 200 MBytes/sec sequential
> writes when testing with iozone. We were expecting, with 7 effective
> spindles on the RAID-10, to get about ~350 MBytes/sec sustained writes
> for sequential access.
>
> After trying several combinations, we found that if we remove the Linux
> MD software RAID layer and put LVM2 directly on top of /dev/sdc (the
> vdisk as presented by the PERC 6/E RAID-10), we get about 340 MBytes/sec
> sequential writes. If we put XFS directly on top of /dev/sdc1, we get
> about the same 340 MBytes/sec. So we can get our anticipated performance
> of about 350MB/s only when we don't use the MD RAID-1.
>
> Since both MD1000s are connected via separate PERC 6/E controllers, we
> didn't think the MD RAID-1 would cause a >40% performance loss...
>
> We even tried degrading the MD RAID-1 to see if writing to only one of
> the mirrors would improve performance. It did NOT... still 200MB/s. It
> almost seems like the Linux MD layer has a performance cap at around
> 200MB/s.
>
> Has anyone encountered this, and does anyone have suggestions for
> removing this bottleneck? Any advice would be appreciated.
>
> Thanks,
> -Bond

_______________________________________________
Linux-PowerEdge mailing list
[email protected]
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
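For reference, a minimal sketch of the degraded-mirror re-sync test described above,
assuming the two PERC 6/E vdisks show up as /dev/sdc and /dev/sdd (the device names
are placeholders, not the actual layout used in the thread):

    # create the RAID-1 with the second member deliberately missing
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdc missing

    # add the second half so the array starts re-syncing
    mdadm --manage /dev/md0 --add /dev/sdd

    # watch the re-sync rate reported by the md driver
    watch -n 5 cat /proc/mdstat

The speed printed in /proc/mdstat during the re-sync is the raw MD RAID-1 copy rate,
with no LVM2 or XFS in the path.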
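The sequential-write figures quoted in the thread came from iozone; an invocation
along these lines exercises sequential write and rewrite (the file size, record size,
and mount point here are assumptions, not the exact parameters used):

    # 16 GB sequential write/rewrite with 1 MB records on the XFS mount
    iozone -i 0 -r 1m -s 16g -f /mnt/data/iozone.tmp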
