[jira] [Commented] (HADOOP-11847) Enhance raw coder allowing to read least required inputs in decoding

Kai Zheng (JIRA) Wed, 22 Apr 2015 14:39:55 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507966#comment-14507966
 ]


Kai Zheng commented on HADOOP-11847:
------------------------------------

bq.We try to decode all null slots in the input arrays. I'm not sure if this 
will cause unnecessary computation.
Yes, this is something I want to avoid. In theory it's possible to recover 
and compute only the block(s) that are actually erased and targeted, at least 
for RS code. Currently we have to decode and compute unintended blocks as 
well because of a limitation inherited from HDFS-RAID. Resolving that 
limitation looks like a non-trivial task to me in the short term; would you 
agree to handle it in a follow-on issue? I checked that ISA-L does exactly 
what we want and doesn't have this limitation. 

Regarding the RS->XOR optimization trick, it no longer looks like a good one, 
since with the RS decoder we would read only 6 good blocks to recover the 
erased one, instead of the 8 blocks the XOR decoder has to read. I will remove 
that code. As for the theory behind RS->XOR, you might search for and read 
this paper if you're interested: "Flexible Parameterization of XOR based Codes 
for Distributed Storage".
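
To illustrate why an XOR-style recovery has to read every other unit in its 
group, here is a minimal, self-contained sketch. It is not the Hadoop XOR 
coder; the class and method names are hypothetical, and the group size of 8 
data units plus 1 parity unit is an assumption chosen only to mirror the 
"8 blocks" figure above.

```java
import java.util.Arrays;

// Hypothetical sketch, not Hadoop code: single-parity XOR over a group of units.
public class XorRecoverySketch {

    // XORs together every unit except the one at erasedIndex.
    // Pass erasedIndex = -1 to XOR all units (used here to build the parity).
    static byte[] xorOfOthers(byte[][] units, int erasedIndex, int unitLength) {
        byte[] out = new byte[unitLength];
        for (int i = 0; i < units.length; i++) {
            if (i == erasedIndex) {
                continue; // the erased unit is never read
            }
            for (int j = 0; j < unitLength; j++) {
                out[j] ^= units[i][j];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int unitLength = 4;
        byte[][] data = new byte[8][unitLength]; // assumed group: 8 data units
        for (int i = 0; i < data.length; i++) {
            for (int j = 0; j < unitLength; j++) {
                data[i][j] = (byte) (i * 7 + j);
            }
        }
        byte[] parity = xorOfOthers(data, -1, unitLength);

        // Group of 9 units: 8 data + 1 parity. Unit 2 is erased (left null).
        byte[][] group = new byte[9][];
        for (int i = 0; i < 8; i++) {
            group[i] = (i == 2) ? null : data[i];
        }
        group[8] = parity;

        // Recovery touches all 8 surviving units of the group.
        byte[] recovered = xorOfOthers(group, 2, unitLength);
        System.out.println(Arrays.equals(recovered, data[2]));
    }
}
```

With a single XOR parity there is no way to skip any surviving unit, whereas 
an RS decoder for a (6+3) schema needs any 6 of the 9 units.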

For the other points, I will double check my code later and add more comments 
to explain or clarify it.

> Enhance raw coder allowing to read least required inputs in decoding
> --------------------------------------------------------------------
>
>                 Key: HADOOP-11847
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11847
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: io
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>         Attachments: HADOOP-11847-v1.patch
>
>
> This is to enhance the raw erasure coder so that it only needs to read the 
> least required inputs while decoding. It will also refine and document the 
> relevant APIs for better understanding and usage. Using the least required 
> inputs may add computing overhead, but will possibly perform better overall 
> since less network traffic and disk IO are involved.
> This was already planned, but I was reminded of it by [~zhz]'s question 
> raised in HDFS-7678, also copied here:
> bq.Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2 
> is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should 
> I construct the inputs to RawErasureDecoder#decode?
> With this work, hopefully the answer to the above question will be obvious.
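
One possible shape for the inputs in the quoted question, as a hedged sketch 
rather than the final API: it assumes the inputs array has one slot per unit 
in the (6+3) schema, data units first and parity units after, with null in 
every slot that is erased or simply not read. The class and helper names are 
hypothetical, and the decode call is left as a comment because the exact 
signature is what this issue is meant to settle.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: laying out decode inputs for a (6+3) schema.
public class DecodeInputsSketch {
    static final int NUM_DATA_UNITS = 6;
    static final int NUM_PARITY_UNITS = 3;

    // Returns an array with one slot per unit; only the slots listed in
    // readIndexes get a buffer, every other slot stays null.
    static ByteBuffer[] buildInputs(int[] readIndexes, int chunkSize) {
        ByteBuffer[] inputs = new ByteBuffer[NUM_DATA_UNITS + NUM_PARITY_UNITS];
        for (int idx : readIndexes) {
            // Placeholder buffer; real code would hold the bytes read from block idx.
            inputs[idx] = ByteBuffer.allocate(chunkSize);
        }
        return inputs;
    }

    // Counts the slots that actually carry data.
    static int countNonNull(ByteBuffer[] inputs) {
        int count = 0;
        for (ByteBuffer b : inputs) {
            if (b != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int chunkSize = 1024;
        int[] readIndexes = {0, 1, 3, 4, 5, 8}; // the 6 blocks actually read
        int[] erasedIndexes = {2};               // the block to repair

        ByteBuffer[] inputs = buildInputs(readIndexes, chunkSize);
        ByteBuffer[] outputs = {ByteBuffer.allocate(chunkSize)};

        // Assumed call shape, not confirmed by this issue:
        // decoder.decode(inputs, erasedIndexes, outputs);

        System.out.println("units supplied: " + countNonNull(inputs)
            + ", erased: " + erasedIndexes.length
            + ", outputs: " + outputs.length);
    }
}
```

Slots 6 and 7 are null here even though those blocks are not erased, which is 
why the decoder needs erasedIndexes to tell "not read" apart from "to be 
recovered".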



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
