[
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516205#comment-14516205
]
Kai Zheng commented on HDFS-7678:
---------------------------------
The refined decoding API from HADOOP-11847:
{noformat}
/**
* Decode with inputs and erasedIndexes, generates outputs.
* How to prepare for inputs:
* 1. Create an array containing parity units + data units;
* 2. Set null in the array locations specified via erasedIndexes to indicate
* they're erased and no data are to be read from them;
* 3. Set null in the array locations for extra redundant items, as they're not
* necessary to read when decoding. For example in RS-6-3, if only 1 unit
* is really erased, then we have 2 extra items as redundant. They can be
* set as null to indicate no data will be used from them.
*
* For an example using RS (6, 3), assuming sources (d0, d1, d2, d3, d4, d5)
* and parities (p0, p1, p2), d2 being erased. We can and may want to use only
* 6 units like (d1, d3, d4, d5, p0, p2) to recover d2. We will have:
* inputs = [p0, null(p1), p2, null(d0), d1, null(d2), d3, d4, d5]
* erasedIndexes = [5] // index of d2 into inputs array
* outputs = [a-writable-buffer]
*
* @param inputs inputs to read data from
* @param erasedIndexes indexes of erased units into inputs array
* @param outputs outputs to write into for data generated according to
* erasedIndexes
*/
public void decode(ByteBuffer[] inputs, int[] erasedIndexes,
                   ByteBuffer[] outputs);
{noformat}
The impact from the caller's point of view:
* Input buffers are prepared differently: a null entry marks a unit that is
either erased or not to be read;
* {{erasedIndexes}} and the output buffers are prepared differently: only the
units that are really erased need to be covered.
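As a rough illustration of the caller-side preparation above, here is a minimal Java sketch for the RS (6, 3) example with d2 erased. The class {{DecodeInputSketch}} and its helpers ({{dataIndex}}, {{prepareInputs}}, {{prepareOutputs}}, {{filled}}) are hypothetical names, not part of Hadoop. The sketch only lays out the three arguments and does not call a real decoder.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical helper, not part of Hadoop: lays out decode() arguments for RS (6, 3). */
final class DecodeInputSketch {
    static final int NUM_DATA = 6;
    static final int NUM_PARITY = 3;

    /** Position of data unit d<i> in the inputs array (parity units come first). */
    static int dataIndex(int i) {
        return NUM_PARITY + i;
    }

    /**
     * Builds the inputs array as [p0..p2, d0..d5], then sets to null every
     * position listed in erased (lost units) or unread (redundant units).
     */
    static ByteBuffer[] prepareInputs(ByteBuffer[] parity, ByteBuffer[] data,
                                      int[] erased, int[] unread) {
        ByteBuffer[] inputs = new ByteBuffer[NUM_PARITY + NUM_DATA];
        System.arraycopy(parity, 0, inputs, 0, NUM_PARITY);
        System.arraycopy(data, 0, inputs, NUM_PARITY, NUM_DATA);
        for (int idx : erased) inputs[idx] = null;
        for (int idx : unread) inputs[idx] = null;
        return inputs;
    }

    /** Allocates one writable output buffer per erased unit. */
    static ByteBuffer[] prepareOutputs(int[] erased, int cellSize) {
        ByteBuffer[] outputs = new ByteBuffer[erased.length];
        for (int i = 0; i < outputs.length; i++) {
            outputs[i] = ByteBuffer.allocate(cellSize);
        }
        return outputs;
    }

    /** Creates n placeholder buffers standing in for units read from storage. */
    static ByteBuffer[] filled(int n, int cellSize) {
        ByteBuffer[] units = new ByteBuffer[n];
        for (int i = 0; i < n; i++) {
            units[i] = ByteBuffer.allocate(cellSize);
        }
        return units;
    }

    public static void main(String[] args) {
        int cellSize = 4; // arbitrary size, for illustration only
        ByteBuffer[] parity = filled(NUM_PARITY, cellSize);
        ByteBuffer[] data = filled(NUM_DATA, cellSize);

        int[] erasedIndexes = { dataIndex(2) };  // d2 is really erased
        int[] unread = { 1, dataIndex(0) };      // p1 and d0 are redundant, so skip them

        ByteBuffer[] inputs = prepareInputs(parity, data, erasedIndexes, unread);
        ByteBuffer[] outputs = prepareOutputs(erasedIndexes, cellSize);

        System.out.println("erasedIndexes = " + Arrays.toString(erasedIndexes));
        System.out.println("inputs = " + inputs.length + ", outputs = " + outputs.length);
        // A decoder implementing the refined API would then be invoked as
        // decode(inputs, erasedIndexes, outputs).
    }
}
```

Note that {{erasedIndexes}} holds only the index of d2, while p1 and d0 are null in {{inputs}} without appearing in {{erasedIndexes}}, so no output buffer is allocated for them.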
{{NativeRSRawDecoder}} will be coming out soon, following the refined API,
and it will only compute/recover the items that are really erased. It is used
the same way as {{RSRawDecoder}}.
I discussed this off-line with [~zhz]: it would be good to use the refined API
here if appropriate. We can also follow up separately later if necessary. Thanks.
> Erasure coding: DFSInputStream with decode functionality
> --------------------------------------------------------
>
> Key: HDFS-7678
> URL: https://issues.apache.org/jira/browse/HDFS-7678
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Li Bo
> Assignee: Zhe Zhang
> Attachments: BlockGroupReader.patch, HDFS-7678.000.patch
>
>
> A block group reader will read data from a BlockGroup, whether it is in
> striping layout or contiguous layout. Corrupt blocks can be known before
> reading (reported by the namenode) or discovered during reading. The block
> group reader needs to decode when some blocks are found to be corrupt.