Re: chunk size (was Re: Time to deprecate old RAID formats?)

2007-10-23 Thread Doug Ledford
On Tue, 2007-10-23 at 21:21 +0200, Michal Soltys wrote:
 Doug Ledford wrote:
  
  Well, first I was thinking of files in the few hundreds of megabytes
  each to gigabytes each, and when they are streamed, they are streamed at
  a rate much lower than the full speed of the array, but still at a fast
  rate.  How parallel the reads are then would tend to be a function of
  chunk size versus streaming rate. 
 
 Ahh, I see now. Thanks for explanation.
 
 I wonder, though, whether setting a large readahead would help if you used a
 larger chunk size, assuming other options are not possible (i.e. streaming
 from a larger buffer while reading into it at least a full stripe width at a time).

Probably not.  All my trial and error in the past with raid5 arrays and
various situations that would cause pathological worst case behavior
showed that once reads themselves reach 16k in size, and are sequential
in nature, then the disk firmware's read ahead kicks in and your
performance stays about the same regardless of increasing your OS read
ahead.  In a nutshell, once you've convinced the disk firmware that you
are going to be reading some data sequentially, it does the rest.  With
a large stripe size (say 256k+), you'll trigger this firmware read ahead
fairly early on in reading any given stripe, so you really don't buy
much by reading the next stripe before you need it, and in fact can end
up wasting a lot of RAM trying to do so, hurting overall performance.
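
The RAM cost mentioned above can be put into rough numbers.  The following
is only a back-of-the-envelope sketch, not a measurement: the 10-disk raid5
layout is borrowed from elsewhere in this thread, while the stream count and
the "one full stripe of readahead per stream" policy are assumptions made
purely for illustration.  The sector figure is what `blockdev --setra` would
expect, since it takes 512-byte sectors.

```python
KIB = 1024
MIB = 1024 * KIB
SECTOR = 512  # `blockdev --setra` counts readahead in 512-byte sectors


def full_stripe_bytes(chunk_bytes, total_disks, parity_disks=1):
    """Data bytes in one full stripe (parity chunks excluded)."""
    return chunk_bytes * (total_disks - parity_disks)


def readahead_ram_bytes(chunk_bytes, total_disks, streams, stripes_ahead=1):
    """Memory held if every stream reads `stripes_ahead` full stripes early."""
    return full_stripe_bytes(chunk_bytes, total_disks) * stripes_ahead * streams


if __name__ == "__main__":
    TOTAL_DISKS = 10  # assumed: 10-disk raid5, one parity chunk per stripe
    STREAMS = 200     # assumed number of concurrent streams (hypothetical)
    for chunk in (64 * KIB, 256 * KIB, 2 * MIB):
        stripe = full_stripe_bytes(chunk, TOTAL_DISKS)
        ram = readahead_ram_bytes(chunk, TOTAL_DISKS, STREAMS)
        print(f"chunk {chunk // KIB:>5} KiB: stripe {stripe / MIB:6.2f} MiB, "
              f"readahead {stripe // SECTOR:>6} sectors, "
              f"RAM for {STREAMS} streams {ram / MIB:8.1f} MiB")
```

Under those assumed numbers, a 2MB chunk gives an 18 MiB stripe, so 200
streams each reading one stripe ahead would hold about 3600 MiB of buffers.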

  
  I'm not familiar with the benchmark you are referring to.
  
 
 I was thinking about 
 http://www.mail-archive.com/linux-raid@vger.kernel.org/msg08461.html
 
 and the short discussion that happened after it.
-- 
Doug Ledford [EMAIL PROTECTED]
  GPG KeyID: CFBFF194
  http://people.redhat.com/dledford

Infiniband specific RPMs available at
  http://people.redhat.com/dledford/Infiniband




Re: chunk size (was Re: Time to deprecate old RAID formats?)

2007-10-20 Thread Doug Ledford
On Sat, 2007-10-20 at 00:43 +0200, Michal Soltys wrote:
 Doug Ledford wrote:
  course, this comes at the expense of peak throughput on the device.
  Let's say you were building a mondo movie server, where you were
  streaming out digital movie files.  In that case, you very well may care
  more about throughput than seek performance since I suspect you wouldn't
  have many small, random reads.  Then I would use a small chunk size,
  sacrifice the seek performance, and get the throughput bonus of parallel
  reads from the same stripe on multiple disks.  On the other hand, if I
  
 
 Out of curiosity though, why wouldn't a large chunk work well here? If you
 stream video (I assume large files, so a good few MBs at least), the
 reads are parallel either way.

Well, first I was thinking of files in the few hundreds of megabytes
each to gigabytes each, and when they are streamed, they are streamed at
a rate much lower than the full speed of the array, but still at a fast
rate.  How parallel the reads are then would tend to be a function of
chunk size versus streaming rate.  I guess I should clarify what I'm
talking about anyway.  To me, a large chunk size is 1 to 2MB or so, and a
small chunk size is in the 64k to 256k range.  If you have a 10-disk
raid5 array with a 2MB chunk size, and you aren't just copying files
around, then it's hard to ever get that to do full speed parallel reads
because you simply won't access the data fast enough.
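
To make "chunk size versus streaming rate" concrete, here is a rough sketch.
The stream rate (1 MiB/s) and the size of each read request (128 KiB) are
assumptions chosen only for illustration, and the mapping of chunks to disks
is simplified: it just counts how many consecutive chunks a request spans,
capped at the 9 data chunks per stripe of a 10-disk raid5.

```python
KIB = 1024
MIB = 1024 * KIB


def chunk_dwell_seconds(chunk_bytes, stream_bytes_per_s):
    """How long a single stream keeps reading from the same chunk (one disk)."""
    return chunk_bytes / stream_bytes_per_s


def disks_touched(offset, request_bytes, chunk_bytes, data_disks):
    """Chunks (hence disks) spanned by one request, capped at one stripe."""
    first = offset // chunk_bytes
    last = (offset + request_bytes - 1) // chunk_bytes
    return min(data_disks, last - first + 1)


DATA_DISKS = 9            # 10-disk raid5: 9 data chunks per stripe
STREAM_RATE = 1 * MIB     # assumed per-stream rate in bytes/s (hypothetical)
REQUEST = 128 * KIB       # assumed size of each read request (hypothetical)

for chunk in (64 * KIB, 256 * KIB, 2 * MIB):
    dwell = chunk_dwell_seconds(chunk, STREAM_RATE)
    disks = disks_touched(0, REQUEST, chunk, DATA_DISKS)
    print(f"chunk {chunk // KIB:>5} KiB: {dwell:6.4f} s per chunk, "
          f"{disks} disk(s) per request")
```

With these assumed figures, a 64k chunk spreads each request over two disks
and moves to the next disk every 1/16 of a second, while a 2MB chunk keeps a
stream on a single disk for two full seconds.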

 Yes, the amount of data read from each of the disks will be in less perfect
 proportion than in the small chunk size scenario, but the difference is pretty negligible.
 Benchmarks I've seen (like Justin's) seem not to care much about chunk
 size in sequential read/write scenarios (and often favor larger chunks).
 Some of my own tests from a few months ago confirmed that as well.

I'm not familiar with the benchmark you are referring to.



chunk size (was Re: Time to deprecate old RAID formats?)

2007-10-19 Thread Michal Soltys

Doug Ledford wrote:

course, this comes at the expense of peak throughput on the device.
Let's say you were building a mondo movie server, where you were
streaming out digital movie files.  In that case, you very well may care
more about throughput than seek performance since I suspect you wouldn't
have many small, random reads.  Then I would use a small chunk size,
sacrifice the seek performance, and get the throughput bonus of parallel
reads from the same stripe on multiple disks.  On the other hand, if I



Out of curiosity though, why wouldn't a large chunk work well here? If you
stream video (I assume large files, so a good few MBs at least), the
reads are parallel either way.


Yes, the amount of data read from each of the disks will be in less perfect
proportion than in the small chunk size scenario, but the difference is pretty negligible.
Benchmarks I've seen (like Justin's) seem not to care much about chunk
size in sequential read/write scenarios (and often favor larger chunks).
Some of my own tests from a few months ago confirmed that as well.
