Re: RAID5/10 chunk size and ext2/3 stride parameter

2006-11-04 Thread martin f krafft
also sprach dean gaudet [EMAIL PROTECTED] [2006.11.03.2019 +0100]:
  I cannot find authoritative information about the relation between
  the RAID chunk size and the correct stride parameter to use when
  creating an ext2/3 filesystem.
 
 you know, it's interesting -- mkfs.xfs somehow gets the right sunit/swidth 
 automatically from the underlying md device.

i don't know enough about xfs to be able to agree or disagree with
you on that.

 # mdadm --create --level=5 --raid-devices=4 --assume-clean --auto=yes 
 /dev/md0 /dev/sd[abcd]1
 mdadm: array /dev/md0 started.

with 64k chunks i assume...

 # mkfs.xfs /dev/md0
 meta-data=/dev/md0   isize=256    agcount=32, agsize=9157232 blks
  =   sectsz=4096  attr=0
 data =   bsize=4096   blocks=293031424, imaxpct=25
  =   sunit=16 swidth=48 blks, unwritten=1

sunit seems like the stride width i determined (64k chunks / 4k
blocksize), but what is swidth? Is it 64 * 3/4 because of the
four-device RAID5?
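
(a quick sanity check on my own arithmetic, in shell -- the variable
names are just my shorthand, nothing mkfs knows about:

# chunk=65536; bsize=4096; disks=4
# echo $(( chunk / bsize ))                 # 16, matches sunit
# echo $(( chunk / bsize * (disks - 1) ))   # 48, matches swidth (3 data disks)

so the parity guess at least fits the numbers above.)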

 # mdadm --create --level=10 --layout=f2 --raid-devices=4 --assume-clean 
 --auto=yes /dev/md0 /dev/sd[abcd]1
 mdadm: array /dev/md0 started.
 # mkfs.xfs -f /dev/md0
 meta-data=/dev/md0   isize=256    agcount=32, agsize=6104816 blks
  =   sectsz=512   attr=0
 data =   bsize=4096   blocks=195354112, imaxpct=25
  =   sunit=16 swidth=64 blks, unwritten=1

okay, so the stride size is 16 as before, but now the stripe width
is 64, because we're dealing with mirrors.

 # mdadm --create --level=10 --layout=n2 --raid-devices=4 --assume-clean 
 --auto=yes /dev/md0 /dev/sd[abcd]1
 mdadm: array /dev/md0 started.
 # mkfs.xfs -f /dev/md0
 meta-data=/dev/md0   isize=256    agcount=32, agsize=6104816 blks
  =   sectsz=512   attr=0
 data =   bsize=4096   blocks=195354112, imaxpct=25
  =   sunit=16 swidth=64 blks, unwritten=1

why not? in this case, -n2 and -f2 aren't any different, are they?

 in a near 2 layout i would expect sunit=16, swidth=32 ...  but swidth=64
 probably doesn't hurt.

why?

 that's how i think it works -- i don't think ext[23] have a concept of stripe
 width like xfs does.  they just want to know how to avoid putting all the
 critical data on one disk (which needs only the chunk size).  but you should
 probably ask on the linux-ext4 mailing list.

once i understand everything...

-- 
martin;  (greetings from the heart of the sun.)

Re: RAID5/10 chunk size and ext2/3 stride parameter

2006-11-04 Thread dean gaudet
On Sat, 4 Nov 2006, martin f krafft wrote:

 also sprach dean gaudet [EMAIL PROTECTED] [2006.11.03.2019 +0100]:
   I cannot find authoritative information about the relation between
   the RAID chunk size and the correct stride parameter to use when
   creating an ext2/3 filesystem.
  
  you know, it's interesting -- mkfs.xfs somehow gets the right sunit/swidth 
  automatically from the underlying md device.
 
 i don't know enough about xfs to be able to agree or disagree with
 you on that.
 
  # mdadm --create --level=5 --raid-devices=4 --assume-clean --auto=yes 
  /dev/md0 /dev/sd[abcd]1
  mdadm: array /dev/md0 started.
 
 with 64k chunks i assume...

yup.


  # mkfs.xfs /dev/md0
  meta-data=/dev/md0   isize=256    agcount=32, agsize=9157232 blks
   =   sectsz=4096  attr=0
  data =   bsize=4096   blocks=293031424, imaxpct=25
   =   sunit=16 swidth=48 blks, unwritten=1
 
 sunit seems like the stride width i determined (64k chunks / 4k
 blocksize), but what is swidth? Is it 64 * 3/4 because of the
 four-device RAID5?

yup.

and for a raid6 mkfs.xfs correctly gets sunit=16 swidth=32.
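
(the rule i'm assuming, which fits both outputs: sunit is the chunk
size in filesystem blocks, and swidth is sunit times the number of
data-bearing disks.  a rough sketch with 64k chunks and 4k blocks:

# echo $(( 65536 / 4096 ))    # sunit = 16
# echo $(( 16 * (4 - 1) ))    # 4-disk raid5: 3 data disks, swidth = 48
# echo $(( 16 * (4 - 2) ))    # 4-disk raid6: 2 data disks, swidth = 32
)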


  # mdadm --create --level=10 --layout=f2 --raid-devices=4 --assume-clean 
  --auto=yes /dev/md0 /dev/sd[abcd]1
  mdadm: array /dev/md0 started.
  # mkfs.xfs -f /dev/md0
  meta-data=/dev/md0   isize=256    agcount=32, agsize=6104816 blks
   =   sectsz=512   attr=0
  data =   bsize=4096   blocks=195354112, imaxpct=25
   =   sunit=16 swidth=64 blks, unwritten=1
 
 okay, so the stride size is 16 as before, but now the stripe width
 is 64, because we're dealing with mirrors.
 
  # mdadm --create --level=10 --layout=n2 --raid-devices=4 --assume-clean 
  --auto=yes /dev/md0 /dev/sd[abcd]1
  mdadm: array /dev/md0 started.
  # mkfs.xfs -f /dev/md0
  meta-data=/dev/md0   isize=256    agcount=32, agsize=6104816 blks
   =   sectsz=512   attr=0
  data =   bsize=4096   blocks=195354112, imaxpct=25
   =   sunit=16 swidth=64 blks, unwritten=1
 
 why not? in this case, -n2 and -f2 aren't any different, are they?

they're different in that with f2 you get essentially 4 disk raid0 read 
performance because the copies of each byte are half a disk away... so it 
looks like a raid0 on the first half of the disks, and another raid0 on 
the second half.

in n2 the two copies are at the same offset... so it looks more like a 2 
disk raid0 for reading and writing.

i'm not 100% certain what xfs uses them for -- you can actually change the 
values at mount time.  so it probably uses them for either read scheduling 
or write layout or both.
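
(as a rough worked example for a 4-disk array with 64k chunks --
these are my expectations, not something mkfs.xfs printed:

# echo $(( 16 * 4 ))    # f2: reads stripe over all 4 disks, swidth = 64
# echo $(( 16 * 2 ))    # n2: a stripe spans 2 mirror pairs, swidth = 32

which is why i said i'd expect swidth=32 for the near-2 case.)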

-dean


Re: RAID5/10 chunk size and ext2/3 stride parameter

2006-11-03 Thread dean gaudet
On Tue, 24 Oct 2006, martin f krafft wrote:

 Hi,
 
 I cannot find authoritative information about the relation between
 the RAID chunk size and the correct stride parameter to use when
 creating an ext2/3 filesystem.

you know, it's interesting -- mkfs.xfs somehow gets the right sunit/swidth 
automatically from the underlying md device.

for example, on a box i'm testing:

# mdadm --create --level=5 --raid-devices=4 --assume-clean --auto=yes /dev/md0 
/dev/sd[abcd]1
mdadm: array /dev/md0 started.
# mkfs.xfs /dev/md0
meta-data=/dev/md0   isize=256    agcount=32, agsize=9157232 blks
 =   sectsz=4096  attr=0
data =   bsize=4096   blocks=293031424, imaxpct=25
 =   sunit=16 swidth=48 blks, unwritten=1
naming   =version 2  bsize=4096
log  =internal log   bsize=4096   blocks=32768, version=2
 =   sectsz=4096  sunit=1 blks
realtime =none   extsz=196608 blocks=0, rtextents=0

# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
# mdadm --zero-superblock /dev/sd[abcd]1
# mdadm --create --level=10 --layout=f2 --raid-devices=4 --assume-clean 
--auto=yes /dev/md0 /dev/sd[abcd]1
mdadm: array /dev/md0 started.
# mkfs.xfs -f /dev/md0
meta-data=/dev/md0   isize=256    agcount=32, agsize=6104816 blks
 =   sectsz=512   attr=0
data =   bsize=4096   blocks=195354112, imaxpct=25
 =   sunit=16 swidth=64 blks, unwritten=1
naming   =version 2  bsize=4096
log  =internal log   bsize=4096   blocks=32768, version=1
 =   sectsz=512   sunit=0 blks
realtime =none   extsz=262144 blocks=0, rtextents=0


i wonder if the code could be copied into mkfs.ext3?

although hmm, i don't think it gets raid10 n2 correct:

# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
# mdadm --zero-superblock /dev/sd[abcd]1
# mdadm --create --level=10 --layout=n2 --raid-devices=4 --assume-clean 
--auto=yes /dev/md0 /dev/sd[abcd]1
mdadm: array /dev/md0 started.
# mkfs.xfs -f /dev/md0
meta-data=/dev/md0   isize=256    agcount=32, agsize=6104816 blks
 =   sectsz=512   attr=0
data =   bsize=4096   blocks=195354112, imaxpct=25
 =   sunit=16 swidth=64 blks, unwritten=1
naming   =version 2  bsize=4096
log  =internal log   bsize=4096   blocks=32768, version=1
 =   sectsz=512   sunit=0 blks
realtime =none   extsz=262144 blocks=0, rtextents=0


in a near 2 layout i would expect sunit=16, swidth=32 ...  but swidth=64
probably doesn't hurt.


 My understanding is that (block * stride) == (chunk). So if I create
 a default RAID5/10 with 64k chunks, and create a filesystem with 4k
 blocks on it, I should choose stride 64k/4k = 16.

that's how i think it works -- i don't think ext[23] have a concept of stripe
width like xfs does.  they just want to know how to avoid putting all the
critical data on one disk (which needs only the chunk size).  but you should
probably ask on the linux-ext4 mailing list.
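
(for your 64k chunk / 4k block example that would be something like
the following -- untested here, and note that stride is counted in
filesystem blocks, not bytes:

# mke2fs -j -b 4096 -E stride=16 /dev/md0

on older e2fsprogs the same thing is spelled -R stride=16.)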

 Is the chunk size of an array equal to the stripe size? Or is it
 (n-1)*chunk size for RAID5 and (n/2)*chunk size for a plain near=2
 RAID10?

 Also, I understand that it makes no sense to use stride for RAID1 as
 there are no stripes in that sense. But for RAID10 it makes sense,
 right?

yep.

-dean