[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171463#comment-13171463
 ] 

Maheswaran Sathiamoorthy commented on MAPREDUCE-3361:
-----------------------------------------------------

There is another way of doing it:
I will add a new erasure code type called SRC to ErasureCodeType (which 
currently has XOR and RS) and store SRC-coded files in /raidsrc (RS files are 
stored in /raidrs, XOR files in /raid). When a file corruption is detected 
and recoverBlockToFile is called, the first step is to check whether the 
file is a parity file or a source file. Its location alone shows whether it 
is a parity file and, if so, of which type. If it is not a parity file, then 
it is a source file and we need to find its corresponding parity file. This 
can be done by looking for a parity file first in /raidsrc, then in /raidrs, 
and finally in /raid. That lookup gives us both the parity file and its type. 
The same thing can be done by determining the file size, but for that we still 
need to search for the parity file in /raidrs or /raid, so I think the 
approach above is a little cleaner. 
To reconstruct the file, in either approach, we need to pass the 
ErasureCodeType all the way down to the decoder and encoder. 
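
A minimal sketch of the location-based lookup described above, in Java. 
This is an illustration only, not the contrib/raid code: the class and method 
names are hypothetical, an in-memory set stands in for the filesystem, and it 
assumes the parity file lives at <parity dir> + <source path>, which may not 
match the real naming. Note that /raid is a string prefix of /raidrs and 
/raidsrc, so the check has to match on a path-component boundary.

```java
import java.util.Optional;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch; names and path layout are assumptions, not the real API.
class ParityLookupSketch {

    // Declared in lookup order: /raidsrc first, then /raidrs, then /raid.
    enum ErasureCodeType {
        SRC("/raidsrc"), RS("/raidrs"), XOR("/raid");

        final String parityDir;

        ErasureCodeType(String parityDir) {
            this.parityDir = parityDir;
        }
    }

    // Returns the code type if the path lies under a parity directory,
    // or empty if the path is a source file.
    static Optional<ErasureCodeType> parityTypeOf(String path) {
        for (ErasureCodeType type : ErasureCodeType.values()) {
            // Match whole path components so "/raid" does not match "/raidrs/...".
            if (path.equals(type.parityDir) || path.startsWith(type.parityDir + "/")) {
                return Optional.of(type);
            }
        }
        return Optional.empty();
    }

    // For a source file, probes each parity directory in order and returns
    // the type of the first one that holds a parity file for it.
    // ASSUMPTION: the parity file path is parityDir + sourcePath.
    static Optional<ErasureCodeType> findParityType(String sourcePath,
                                                    Predicate<String> exists) {
        for (ErasureCodeType type : ErasureCodeType.values()) {
            if (exists.test(type.parityDir + sourcePath)) {
                return Optional.of(type);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for the filesystem: the set of parity files that exist.
        Set<String> parityFiles = Set.of("/raidrs/user/a/file1", "/raid/user/b/file2");

        System.out.println(parityTypeOf("/raidrs/user/a/file1"));
        System.out.println(parityTypeOf("/user/a/file1"));
        System.out.println(findParityType("/user/a/file1", parityFiles::contains));
        System.out.println(findParityType("/user/b/file2", parityFiles::contains));
    }
}
```

The type returned by either method is the value that would then be passed 
down to the decoder and encoder.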
                
> Ability to use SimpleRegeratingCode to fix missing blocks
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-3361
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3361
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/raid
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> ReedSolomon encoding (n, k) has n storage nodes and can tolerate n-k 
> failures. Regenerating a block needs to access k blocks. This is a problem 
> when n and k are large. Instead, we can use simple regenerating codes (n, k, 
> f) that first apply ReedSolomon (n, k) and then XOR with a stripe size of f. 
> Then, a single disk failure needs to access only f nodes, and f can be 
> very small.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
