On Sat, 16 Mar 2013 21:56:43 -0700 Brian Buhrow <[email protected]> wrote:
> On Mar 14, 8:47am, Greg Oster wrote:
> } Subject: Re: Where is the component queue depth actually used in the raidf
> } On Thu, 14 Mar 2013 10:32:26 -0400
> } Thor Lancelot Simon <[email protected]> wrote:
> }
> } > On Wed, Mar 13, 2013 at 09:36:07PM -0400, Thor Lancelot Simon wrote:
> } > > On Wed, Mar 13, 2013 at 03:32:02PM -0700, Brian Buhrow wrote:
> } > > >   hello.  What I'm seeing is that the underlying disks
> } > > > under both a raid1 set and a raid5 set are not seeing any more
> } > > > than 8 active requests at once across the entire bus of disks.
> } > > > This leaves a lot of disk bandwidth unused, not to mention less
> } > > > than stellar disk performance.  I see that RAIDOUTSTANDING is
> } > > > defined as 6 if not otherwise defined, and this suggests that
> } > > > this is the limiting factor, rather than the actual number of
> } > > > requests allowed to be sent to a component's queue.
> } > >
> } > > It should be the sum of the number of openings on the underlying
> } > > components, divided by the number of data disks in the set.  Well,
> } > > roughly.  Getting it just right is a little harder than that, but I
> } > > think it's obvious how.
> } >
> } > Actually, I think the simplest correct answer is that it should be
> } > the minimum number of openings presented by any individual underlying
> } > component.  I cannot see any good reason why it should be either more
> } > or less than that value.
> }
> } Consider the case when a read spans two stripes... Unfortunately, each
> } of those reads will be done independently, requiring two IOs for a
> } given disk, even though there is only one request.
> }
> } The reason '6' was picked back in the day was that it seemed to offer
> } reasonable performance while not requiring a huge amount of memory to
> } be reserved for the kernel.  And part of the issue there was that
> } RAIDframe had no way to stop new requests from coming in and consuming
> } all kernel resources :(  '6' is probably a reasonable hack for older
> } machines, but if we can come up with something self-tuning I'm all for
> } it...  (Having this self-tuning is going to be even more critical when
> } MAXPHYS gets sent to the bitbucket and the amount of memory needed for
> } a given IO increases...)
> }
> } Later...
> }
> } Greg Oster
>
>   Hello.  If I understand Thor's formula right, then for a raid set
> I have (raid5) with 4 components, each on a wd(ata) disk, the correct
> number of outstanding requests should be limited to 4, because it looks
> like our ata drivers only present 1 opening per channel.  However,
> increasing the outstanding requests on this box from 6, which is
> already too high according to the formula as I understand it, to 20
> increases the disk throughput on this machine by almost 50% for many
> of the work loads I put on it.

Yum! :)

>   I imagine there is a point of diminishing returns in terms of
> how much of a queue I should allow on the outstanding requests limit,

Yes...

> but right now, it's unclear to me how to figure out what the optimal
> setting is for this number based on any underlying capacity indicators
> there may be.  It seems like a better heuristic might be to be able to
> specify a maximum amount of memory the raidframe driver would be
> allowed to use, and then have it set the outstanding request count
> accordingly.

I think that is the preferred approach.  At least, that is where the
'6' number came from back in the day...
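For concreteness, the two openings-based heuristics Thor describes above
work out to roughly the following.  This is a plain user-level sketch,
with invented function names and numbers, not anything taken from the
RAIDframe sources:

/*
 * Sketch of the two queue-depth heuristics discussed above:
 * sum-of-openings / data disks, vs. minimum openings of any component.
 * All names and numbers here are made up for illustration.
 */
#include <stdio.h>

static int
depth_sum_over_data(const int *openings, int ncomp, int ndata)
{
	int i, sum = 0;

	for (i = 0; i < ncomp; i++)
		sum += openings[i];
	return sum / ndata;	/* Thor's first suggestion */
}

static int
depth_min_openings(const int *openings, int ncomp)
{
	int i, min = openings[0];

	for (i = 1; i < ncomp; i++)
		if (openings[i] < min)
			min = openings[i];
	return min;		/* "minimum of any individual component" */
}

int
main(void)
{
	/* 4 wd(ata) components, 1 opening each; RAID 5 => 3 data columns */
	int openings[4] = { 1, 1, 1, 1 };

	printf("sum/ndata = %d\n", depth_sum_over_data(openings, 4, 3));
	printf("min       = %d\n", depth_min_openings(openings, 4));
	return 0;
}

With only 1 opening per ata channel, either variant comes out far below
the 20 that measurably helped above, which is part of why the
memory-based approach looks more attractive.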
>   In the case of the machine I refer to above, I have 2 raid sets,
> the stripe size is set to 64 blocks (32K) with 4 stripes per raid set.
> With one of the raid sets running in degraded mode, the maximum amount
> of memory used by the raidframe subsystem is 10.4MB.  That's not an
> insignificant amount of memory, but it's certainly not a profligate
> amount.  Further thoughts?

10MB is reasonable today, but not so much on a 32MB or 64MB machine :)
I'm not sure what the magic number should be... whether we say 5% of
kernel memory per RAID set, and then scale that by the size of the RAID
set to produce the number of openings (minimum remains at 6?).

An alternative to self-tuning would be to introduce a sysctl to allow
setting the value on the fly...

According to my notes I was attempting to do memory calculations on
this back in 2003/2004, but it doesn't look like I came up with a firm
formula back then either...  According to those notes, the number of
nodes in the IO graph is bounded by:

 (2 * raidPtr->Layout->numDataCol) + (1 * layoutPtr->numParityCol) +
 (1 * 2 * layoutPtr->numParityCol) + 3

Multiplying that by the stripe width we get a bound on the memory
requirements for the data -- I think it overestimates the requirement
per IO, but that's fine.  For a 5-disk RAID 5 set with a stripe width
of 32 (16K/component, 64K data for the entire stripe) what we end up
with is a memory requirement of:

 (2*4 + 1*1 + 1*2*1 + 3) * 16K = 224K

per IO.  It's just a matter of scaling the number of openings to match
some reasonable use of kernel memory...

Later...

Greg Oster
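That arithmetic scales out to an openings count fairly directly.  Here
is a stand-alone version of the same calculation; the struct, its field
names and the 4MB budget are invented for illustration and are not the
actual RAIDframe layout structure or any real tuning default:

/*
 * Stand-alone version of the bound above: nodes in the IO graph times
 * the per-component stripe unit size gives an (over)estimate of memory
 * per IO; dividing a memory budget by that gives a number of openings.
 * Struct, field names and budget are made up for illustration.
 */
#include <stdio.h>

struct layout {
	int numDataCol;		/* data columns in the set */
	int numParityCol;	/* parity columns in the set */
	int sectPerStripeUnit;	/* stripe unit, in 512-byte sectors */
};

int
main(void)
{
	struct layout l = { 4, 1, 32 };		/* 5-disk RAID 5, 16K units */
	long budget = 4L * 1024 * 1024;		/* hypothetical 4MB budget */

	/* upper bound on nodes in the IO graph, per the notes above */
	int nodes = (2 * l.numDataCol) + (1 * l.numParityCol) +
	    (1 * 2 * l.numParityCol) + 3;

	/* per-IO memory bound: nodes * stripe unit size in bytes */
	long per_io = (long)nodes * l.sectPerStripeUnit * 512;

	printf("nodes = %d, per-IO bound = %ldK, openings for %ldM = %ld\n",
	    nodes, per_io / 1024, budget / (1024 * 1024), budget / per_io);
	return 0;
}

For the 5-disk example this reproduces the 224K-per-IO figure, and the
hypothetical 4MB budget would translate to 18 openings; the budget
itself is the knob that a sysctl, or a percentage-of-kernel-memory
rule, would set.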
