Wojciech Puchar wrote:
i read the graid3 manual and http://www.acnc.com/04_01_03.html to make sure i know what's RAID3 and i don't understand few things.

1)

"The number of components must be equal to 3, 5, 9, 17, etc.
                (2^n + 1)."

why it can't be say 5 disks+parity?

The reason is in the definition of "RAID 3", which says that updates to the RAID device must be atomic. In some ideal universe, RAID 3 would be implemented in hardware and operate on individual bytes, but here we cannot write to the drives in units other than the sector size, and the sector size is 512 bytes.

Parity has to be calculated per sector, so at the sector level the minimum is three sectors: two for data and one for parity. This means the high-level atomic sector size is 2*512 = 1024 bytes. If you inspect your RAID 3 devices, you'll see just that:

# diskinfo -v /dev/raid3/homes
/dev/raid3/homes
        1024            # sectorsize
        107374181376    # mediasize in bytes (100G)
        104857599       # mediasize in sectors

But each drive has a normal sectorsize of 512:

# diskinfo -v /dev/ad4
/dev/ad4
        512             # sectorsize
        80026361856     # mediasize in bytes (75G)
        156301488       # mediasize in sectors

Sector sizes cannot be arbitrary for various reasons, mostly dealing with how memory pages and virtual memory are managed. In short, they need to be powers of two. This restricts us to high-level ("big") sector sizes that can be exactly one of the following values: 1024, 2048, 4096, 8192, etc. Since drive sectors are fixed to 512 bytes, this means that the number of *data* drives must also be a power of two: 2, 4, 8, 16, etc. Add one more drive for the parity and you get the starting sequence: 3, 5, 9, 17.
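The arithmetic above can be checked with a short sketch. The function names here are made up for illustration and are not part of graid3; the only inputs taken from this thread are the 512-byte drive sector and the 2^n + 1 rule.

```python
DRIVE_SECTOR = 512  # bytes per physical sector, as reported by diskinfo for the drive


def raid3_layout(n):
    """Return (component count, array sector size) for 2^n data drives plus one parity drive."""
    data_drives = 2 ** n
    return data_drives + 1, data_drives * DRIVE_SECTOR


def is_valid_component_count(count):
    """True if count equals 2^n + 1 for some n >= 1, i.e. the data drives are a power of two."""
    data_drives = count - 1
    return data_drives >= 2 and (data_drives & (data_drives - 1)) == 0


for n in range(1, 5):
    components, sector_size = raid3_layout(n)
    print(components, "components ->", sector_size, "byte array sectors")

# The "5 disks + parity" case from the question: 5 * 512 = 2560 is not a power of two.
print(is_valid_component_count(6))
```

The loop lists 3, 5, 9 and 17 components with array sector sizes of 1024, 2048, 4096 and 8192 bytes, and the last line rejects six components.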

In practice, this means that if you have 17 drives in RAID 3, the sector size of the array itself will be 16*512 = 8192 bytes. Each write to the array will update all 17 drives before returning (one sector on each drive, ensuring an atomic operation). Note that a file system created on such an array will also have its parameters adapted to the sector size (the fragment size will be the sector size).
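To make the "one sector on each drive" idea concrete, here is a minimal sketch of splitting one array sector into per-drive sectors plus an XOR parity sector, and rebuilding a lost one. It assumes plain XOR parity and a simple contiguous split of the array sector; how graid3 actually lays the bytes out across the components is not stated in this thread, and the function names are hypothetical.

```python
import os

DRIVE_SECTOR = 512  # bytes per physical sector


def split_with_parity(array_sector, data_drives):
    """Split one array sector into data_drives chunks of 512 bytes and append an XOR parity chunk."""
    assert len(array_sector) == data_drives * DRIVE_SECTOR
    chunks = [array_sector[i * DRIVE_SECTOR:(i + 1) * DRIVE_SECTOR]
              for i in range(data_drives)]
    parity = bytearray(DRIVE_SECTOR)
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return chunks + [bytes(parity)]


def rebuild_missing(sectors, missing):
    """Recompute the sector at index `missing` by XOR-ing all the other sectors."""
    rebuilt = bytearray(DRIVE_SECTOR)
    for index, sector in enumerate(sectors):
        if index != missing:
            for i, byte in enumerate(sector):
                rebuilt[i] ^= byte
    return bytes(rebuilt)


# One 8192-byte array sector spread over 16 data drives + 1 parity drive.
array_sector = os.urandom(16 * DRIVE_SECTOR)
sectors = split_with_parity(array_sector, 16)
print(len(sectors), "sectors of", len(sectors[0]), "bytes")
print(rebuild_missing(sectors, 3) == sectors[3])
```

Because the parity is the XOR of all data chunks, any single sector (data or parity) can be recomputed from the remaining sixteen.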

2) "-r  Use parity component for reading in round-robin fashion.
Without this option the parity component is not used at
all for reading operations when the device is in a complete state.
With this option specified random I/O read operations are even 40% faster,
but sequential reads are slower. One cannot use this option if the -w option is also specified."


how parity disk could speed up random I/O?

It will work well only when the number of drives is small (i.e. three drives): the parity drive is used as a valid source of data, which avoids some of the seeks on each drive. I think that, theoretically, you can save at most one third (1/3) of all seeks. I don't know where the 40% number comes from.
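The 1/3 estimate can be illustrated with a toy count of seeks per drive on a three-component array. This is only a model under an assumed policy (each read rests one of the three components in turn and rebuilds its data from the other two); it is not taken from the graid3 source, and the function name is made up.

```python
def seeks_per_drive(reads, use_parity_round_robin):
    """Count how many reads touch each of three components: data0, data1, parity."""
    seeks = [0, 0, 0]
    for read_number in range(reads):
        if use_parity_round_robin:
            # Assumed policy: skip one component per read, in turn.
            skipped = read_number % 3
        else:
            # Without -r the parity component (index 2) is never read on a complete array.
            skipped = 2
        for drive in range(3):
            if drive != skipped:
                seeks[drive] += 1
    return seeks


print(seeks_per_drive(300, False))  # [300, 300, 0]
print(seeks_per_drive(300, True))   # [200, 200, 200]
```

In this model the total number of seeks stays the same, but each data drive serves 200 reads instead of 300, which is the one-third saving per drive mentioned above.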

