Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "HDFS-RAID" page has been changed by RamkumarVadali.
http://wiki.apache.org/hadoop/HDFS-RAID?action=diff&rev1=5&rev2=6

--------------------------------------------------

  bytes. We can recover any 3 missing bytes by the other 10 remaining bytes.

  There are two kinds of erasure codes implemented in Raid: XOR code and Reed-Solomon code.
  The difference between them is that XOR only allows creating one parity
- bytes but Reed-Solomon code allows creating any given number of parity bytes.
+ bytes but Reed-Solomon code allows creating any given number of parity bytes. As a result, the replication on the source file can be reduce to 1 when using Reed-Solomon
+ without losing data safety. The downside of having only one replica of a block is that reads of a block have to go to a single machine, reducing parallelism. Thus
+ Reed-Solomon should be used on data that is not supposed to be used frequently.

  == Using HDFS RAID ==

@@ -99, +101 @@

  === Configuration ===

  There is a single configuration file named `raid.xml` that describes the HDFS
- paths for which RAID should be used. A sample of this file can be found in
- `src/contrib/raid/conf/raid.xml`. To apply the policies defined in `raid.xml`,
- a reference has to be added to `hdfs-site.xml`:
+ paths for which RAID should be used. This provides a list of directory/file patterns
+ that need to be RAIDed. There are quite a few options that can be specified for
+ each pattern. A sample of this file can be found in`src/contrib/raid/conf/raid.xml`.
+ To apply the policies defined in `raid.xml`, a reference has to be added to `hdfs-site.xml`:
  {{{
  <property>
    <name>raid.config.file</name>
