[jira] [Commented] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

Kai Zheng (JIRA) Mon, 27 Apr 2015 19:25:16 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516205#comment-14516205
 ]


Kai Zheng commented on HDFS-7678:
---------------------------------

The refined decoding API from HADOOP-11847:
{noformat}
  /**
   * Decode with inputs and erasedIndexes, generates outputs.
   * How to prepare for inputs:
   * 1. Create an array containing parity units + data units;
   * 2. Set null in the array locations specified via erasedIndexes to indicate
   *    they're erased and no data are to read from;
   * 3. Set null in the array locations for extra redundant items, as they're
   *    not necessary to read when decoding. For example in RS-6-3, if only 1
   *    unit is really erased, then we have 2 extra items as redundant. They can
   *    be set as null to indicate no data will be used from them.
   *
   * For an example using RS (6, 3), assuming sources (d0, d1, d2, d3, d4, d5)
   * and parities (p0, p1, p2), d2 being erased. We can and may want to use only
   * 6 units like (d1, d3, d4, d5, p0, p2) to recover d2. We will have:
   *     inputs = [p0, null(p1), p2, null(d0), d1, null(d2), d3, d4, d5]
   *     erasedIndexes = [5] // index of d2 into inputs array
   *     outputs = [a-writable-buffer]
   *
   * @param inputs inputs to read data from
   * @param erasedIndexes indexes of erased units into inputs array
   * @param outputs outputs to write into for data generated according to
   *                erasedIndexes
   */
  public void decode(ByteBuffer[] inputs, int[] erasedIndexes,
                     ByteBuffer[] outputs);
{noformat}
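
The RS (6, 3) example in the Javadoc above can be sketched from the caller's side roughly as follows. This is only an illustration of the array layout (parity units first, then data units), not code from the patch: the class {{DecodeInputSketch}} and the helpers {{parityIndex}}, {{dataIndex}}, {{prepareInputs}} and {{prepareOutputs}} are hypothetical names, and the actual {{decode}} call is left as a comment because no decoder implementation is included here.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical caller-side helper; none of these names come from the patch.
public class DecodeInputSketch {
  static final int NUM_DATA = 6;    // RS (6, 3): six data units
  static final int NUM_PARITY = 3;  // and three parity units

  // Parity units occupy the first slots of the inputs array.
  static int parityIndex(int i) { return i; }

  // Data units follow the parity units.
  static int dataIndex(int i) { return NUM_PARITY + i; }

  // Copies the unit array and sets null at every erased or skipped position.
  static ByteBuffer[] prepareInputs(ByteBuffer[] allUnits,
                                    int[] erasedIndexes,
                                    int[] skippedIndexes) {
    ByteBuffer[] inputs = Arrays.copyOf(allUnits, allUnits.length);
    for (int idx : erasedIndexes) {
      inputs[idx] = null;   // erased: nothing to read
    }
    for (int idx : skippedIndexes) {
      inputs[idx] = null;   // redundant: deliberately not read
    }
    return inputs;
  }

  // Allocates one writable buffer per really erased unit.
  static ByteBuffer[] prepareOutputs(int[] erasedIndexes, int cellSize) {
    ByteBuffer[] outputs = new ByteBuffer[erasedIndexes.length];
    for (int i = 0; i < outputs.length; i++) {
      outputs[i] = ByteBuffer.allocate(cellSize);
    }
    return outputs;
  }

  public static void main(String[] args) {
    int cellSize = 4;  // arbitrary size chosen for the illustration
    ByteBuffer[] units = new ByteBuffer[NUM_PARITY + NUM_DATA];
    for (int i = 0; i < units.length; i++) {
      units[i] = ByteBuffer.allocate(cellSize);
    }

    int[] erasedIndexes = { dataIndex(2) };                   // d2
    int[] skippedIndexes = { parityIndex(1), dataIndex(0) };  // p1 and d0

    ByteBuffer[] inputs = prepareInputs(units, erasedIndexes, skippedIndexes);
    ByteBuffer[] outputs = prepareOutputs(erasedIndexes, cellSize);

    // A real caller would now invoke something like:
    //   decoder.decode(inputs, erasedIndexes, outputs);
    for (int i = 0; i < inputs.length; i++) {
      System.out.println(i + ": " + (inputs[i] == null ? "null" : "present"));
    }
    System.out.println("erasedIndexes = " + Arrays.toString(erasedIndexes));
    System.out.println("outputs.length = " + outputs.length);
  }
}
```

Only six of the nine slots stay non-null, and {{outputs}} has exactly one buffer because only d2 is really erased.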

The impact from the caller's point of view:
* It prepares the input buffers differently, using null to mark units that are 
erased or that should not be read;
* It prepares {{erasedIndexes}} and the output buffers differently: only the 
units that are really erased need to be covered.

{{NativeRSRawDecoder}} will be coming out soon, following the refined API, and 
it will only compute/recover the really erased items. It is used the same way 
as {{RSRawDecoder}}.

I discussed this offline with [~zhz]: it would be good to use the refined API 
here if appropriate. We can also follow up separately later if necessary. 
Thanks.

> Erasure coding: DFSInputStream with decode functionality
> --------------------------------------------------------
>
>                 Key: HDFS-7678
>                 URL: https://issues.apache.org/jira/browse/HDFS-7678
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: Zhe Zhang
>         Attachments: BlockGroupReader.patch, HDFS-7678.000.patch
>
>
> A block group reader will read data from a BlockGroup whether it is in 
> striping layout or contiguous layout. Corrupt blocks can be known before 
> reading (reported by the namenode) or discovered during reading. The block 
> group reader needs to do decoding work when some blocks are found corrupt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
