Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "HDFS-RAID" page has been changed by ScottChen.
The comment on this change is: Add contents about ErasureCode.
http://wiki.apache.org/hadoop/HDFS-RAID?action=diff&rev1=3&rev2=4

--------------------------------------------------

   * the RaidNode, a daemon that creates and maintains parity files for all data files stored in the DRFS,
   * the BlockFixer, which periodically recomputes blocks that have been lost or corrupted,
   * the RaidShell utility, which allows the administrator to manually trigger the recomputation of missing or corrupt blocks and to check for files that have become irrecoverably corrupted.
+  * the ErasureCode, which encodes and decodes the bytes in blocks

  === DRFS client ===

@@ -70, +71 @@

  recomputation of bad data blocks and also allows the administrator to display a list of irrecoverable files (i.e., files for which too many data or parity blocks have been lost).

+ === ErasureCode ===
+
+ (currently under development)
+
+ ErasureCode is the underlying component used by BlockFixer and RaidNode to generate parity blocks and to fix parity/source blocks.
+ ErasureCode performs encoding and decoding. When encoding, ErasureCode takes
+ several source bytes and generates some parity bytes. When decoding, ErasureCode regenerates
+ the missing bytes (which can be parity or source bytes) from the remaining source bytes and parity bytes.
+
+ The number of missing bytes that can be recovered is equal to the number of parity bytes created. For example, if we encode 10 source bytes into 3 parity
+ bytes, we can recover any 3 missing bytes from the 10 remaining bytes.
+
+ There are two kinds of erasure codes implemented in Raid: XOR code and Reed-Solomon code. The difference between them is that XOR only allows creating one parity
+ byte, whereas Reed-Solomon code allows creating any given number of parity bytes.

  == Using HDFS RAID ==
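
The XOR case described in the ErasureCode section can be illustrated with a minimal sketch. This is not the HDFS RAID implementation: the class name `XorCodeSketch` and the methods `encode` and `decode` are hypothetical names chosen for illustration, and the sketch assumes a single parity byte protecting one group of source bytes.

```java
// Illustrative sketch only; names are hypothetical, not the HDFS RAID API.
class XorCodeSketch {

    // Encode: XOR all source bytes together into a single parity byte.
    static byte encode(byte[] source) {
        byte parity = 0;
        for (byte b : source) {
            parity ^= b;
        }
        return parity;
    }

    // Decode: rebuild the one missing source byte by XOR-ing the parity
    // byte with every source byte that survived. The value stored at
    // erasedIndex is ignored.
    static byte decode(byte[] source, int erasedIndex, byte parity) {
        byte recovered = parity;
        for (int i = 0; i < source.length; i++) {
            if (i != erasedIndex) {
                recovered ^= source[i];
            }
        }
        return recovered;
    }

    public static void main(String[] args) {
        byte[] source = {0x12, 0x34, 0x56, 0x78};
        byte parity = encode(source);

        byte original = source[2];
        source[2] = 0; // simulate losing one source byte

        byte recovered = decode(source, 2, parity);
        System.out.println(recovered == original); // prints true
    }
}
```

Because XOR yields only one parity byte, this sketch can rebuild only one missing byte per group. Recovering several missing bytes requires a code that produces several parity bytes, which is the role the page assigns to Reed-Solomon.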
