[Hadoop Wiki] Update of "HDFS-RAID" by RamkumarVadali

Apache Wiki Wed, 27 Oct 2010 22:58:14 -0700

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.


The "HDFS-RAID" page has been changed by RamkumarVadali.
http://wiki.apache.org/hadoop/HDFS-RAID?action=diff&rev1=5&rev2=6

--------------------------------------------------

  bytes. We can recover any 3 missing bytes from the 10 remaining bytes.
  
  There are two kinds of erasure codes implemented in Raid: XOR code and Reed-Solomon code. The difference between them is that XOR only allows creating one parity
- byte but Reed-Solomon code allows creating any given number of parity bytes.
+ byte but Reed-Solomon code allows creating any given number of parity bytes. As a result, the replication on the source file can be reduced to 1 when using Reed-Solomon
+ without losing data safety. The downside of having only one replica of a block is that reads of a block have to go to a single machine, reducing parallelism. Thus
+ Reed-Solomon should be used on data that is not supposed to be used frequently.
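
  The single-parity behaviour of the XOR code described above can be illustrated with a short sketch. This is not the HDFS RAID implementation; the function names `xor_parity` and `recover_missing` and the sample blocks are hypothetical and chosen only for illustration. It assumes all blocks have equal length and that exactly one block is lost.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equally sized byte blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) walks the blocks column by column; XOR each column.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def recover_missing(surviving_blocks, parity):
    """Rebuild the one missing block from the survivors plus the parity block."""
    # XOR-ing the parity with every surviving block cancels them out,
    # leaving only the contents of the missing block.
    return xor_parity(list(surviving_blocks) + [parity])


# Hypothetical sample data: three 4-byte source blocks.
data = [b"hdfs", b"raid", b"demo"]
parity = xor_parity(data)

# Pretend the second block was lost and rebuild it.
rebuilt = recover_missing([data[0], data[2]], parity)
```

  With only one parity block, a second lost block cannot be rebuilt this way, which is why tolerating several missing bytes needs a code such as Reed-Solomon that can produce more than one parity byte.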
  
  == Using HDFS RAID ==
  
@@ -99, +101 @@

  === Configuration ===
  
  There is a single configuration file named `raid.xml` that describes the HDFS
- paths for which RAID should be used. A sample of this file can be found in
- `src/contrib/raid/conf/raid.xml`. To apply the policies defined in `raid.xml`,
- a reference has to be added to `hdfs-site.xml`:
+ paths for which RAID should be used. This provides a list of directory/file patterns
+ that need to be RAIDed. There are quite a few options that can be specified for
+ each pattern. A sample of this file can be found in `src/contrib/raid/conf/raid.xml`.
+ To apply the policies defined in `raid.xml`, a reference has to be added to `hdfs-site.xml`:
  {{{
  <property>
    <name>raid.config.file</name>
