[
https://issues.apache.org/jira/browse/MAPREDUCE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901709#action_12901709
]
dhruba borthakur commented on MAPREDUCE-1969:
---------------------------------------------
For all of these proposals, the unstated assumption is that all the blocks in a
stripe belong to the same HDFS file. In that case, when the data file is
deleted, the parity file can be deleted too.
> Allow raid to use Reed-Solomon erasure codes
> --------------------------------------------
>
> Key: MAPREDUCE-1969
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1969
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: contrib/raid
> Reporter: Scott Chen
> Fix For: 0.22.0
>
>
> Currently raid uses one parity block per stripe, which can correct one missing
> block per stripe.
> Using a Reed-Solomon code, we can add any number of parity blocks to tolerate
> more missing blocks.
> This way we can get a good file corruption probability even if we set the
> replication to 1.
> Here are some simple comparisons:
> 1. No raid, replication = 3:
> File corruption probability = O(p^3), Storage space = 3x
> 2. Single parity raid with stripe size = 10, replication = 2:
> File corruption probability = O(p^4), Storage space = 2.2x
> 3. Reed-Solomon raid with parity size = 4 and stripe size = 10,
> replication = 1:
> File corruption probability = O(p^5), Storage space = 1.4x
> where p is the missing block probability.
> Reed-Solomon code can save lots of space without compromising the corruption
> probability.
> To achieve this, we need some changes to raid:
> 1. Add a block placement policy that knows about raid logic and does not put
> blocks from the same stripe on the same node.
> 2. Add an automatic block fixing mechanism. The block fixing will replace the
> replication of under-replicated blocks.
> 3. Allow raid to use a general erasure code. It is currently hard-coded to use XOR.
> 4. Add a Reed-Solomon code implementation.
> We are planning to use it on the older data only,
> because setting replication = 1 hurts data locality.
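The storage and corruption figures in the comparison above can be reproduced with a short calculation. The following is only a sketch, not code from the raid contrib module: the `Scheme` class and its method names are hypothetical, and it assumes that blocks go missing independently with probability p and that only the leading-order term matters. Under those assumptions a block is lost when all of its replicas are missing, and a stripe is unrecoverable once more blocks are lost than there are parity blocks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    """Hypothetical description of one configuration from the comparison."""
    name: str
    stripe_size: int   # data blocks per stripe
    parity_size: int   # parity blocks per stripe (0 means no raid)
    replication: int   # replicas kept of every block

    def storage_factor(self) -> float:
        # Every data and parity block is stored `replication` times.
        total_blocks = self.stripe_size + self.parity_size
        return self.replication * total_blocks / self.stripe_size

    def loss_exponent(self) -> int:
        # A block is lost with probability p^replication; the stripe is
        # unrecoverable once parity_size + 1 of its blocks are lost.
        return self.replication * (self.parity_size + 1)


schemes = [
    # Without raid there are no stripes, so the stripe size is irrelevant.
    Scheme("No raid", stripe_size=1, parity_size=0, replication=3),
    Scheme("Single parity", stripe_size=10, parity_size=1, replication=2),
    Scheme("Reed-Solomon", stripe_size=10, parity_size=4, replication=1),
]

for s in schemes:
    print(f"{s.name}: O(p^{s.loss_exponent()}), "
          f"storage = {s.storage_factor():.1f}x")
```

Under these assumptions the sketch gives O(p^3) at 3.0x, O(p^4) at 2.2x and O(p^5) at 1.4x, matching the three cases listed in the issue description.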
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.