[
https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512109#comment-14512109
]
Zhe Zhang commented on HADOOP-11847:
------------------------------------
Thanks Kai for the updated patch. A few quick comments before I finish
reviewing the whole patch:
# The below can be combined into a single {{if (adjustedByteArrayInputsParameter == null)}} check:
{code}
boolean isFirstTime = (adjustedByteArrayInputsParameter == null);
if (isFirstTime) {
{code}
# Based on the Java Language [Specification|http://docs.oracle.com/javase/specs/jls/se7/html/jls-4.html#jls-4.12.5] (§4.12.5), we can assume zero initial values for newly allocated arrays and don't need to zero-fill them again:
{code}
// These are temp buffers for bad inputs.
byteArrayBuffersForInput = new byte[numBadUnitsAtMost][];
for (int i = 0; i < byteArrayBuffersForInput.length; ++i) {
byteArrayBuffersForInput[i] = new byte[getChunkSize()];
}
...
// Ensure only ZERO bytes are read from
for (int i = 0; i < byteArrayBuffersForInput.length; ++i) {
System.arraycopy(ZERO_BYTES, 0, byteArrayBuffersForInput[i],
0, ZERO_BYTES.length);
}
{code}
# The names {{ensureWhenXXX}} are not clear enough; "ensure" should be followed by an object stating what is being ensured.
# Does a "bad" unit mean an erased unit? If so, why are both
{{byteArrayBuffersForInput}} and {{byteArrayBuffersForOutput}} created with
size {{numBadUnitsAtMost}}? Also, {{numBadUnitsAtMost}} could be renamed to
{{maxErasedUnits}}.
# {{erasedIndexes}} -> {{erasedIndices}}
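The simplifications in comments 1 and 2 can be sketched as below. This is a minimal illustration, not the patch itself; the field and size names ({{adjustedByteArrayInputs}}, {{numBadUnitsAtMost}}, {{chunkSize}}) are assumed here for the example.

```java
// Sketch of the simplifications suggested in comments 1 and 2 above.
public class ZeroInitSketch {
    private static byte[][] adjustedByteArrayInputs;
    private static final int numBadUnitsAtMost = 3;
    private static final int chunkSize = 16;

    static void ensureBuffers() {
        // Comment 1: fold the isFirstTime flag into the null check directly.
        if (adjustedByteArrayInputs == null) {
            // Comment 2: per JLS 4.12.5, byte array elements are
            // zero-initialized on allocation, so no explicit zero-fill
            // (e.g. System.arraycopy from a ZERO_BYTES buffer) is needed.
            adjustedByteArrayInputs = new byte[numBadUnitsAtMost][];
            for (int i = 0; i < adjustedByteArrayInputs.length; ++i) {
                adjustedByteArrayInputs[i] = new byte[chunkSize];
            }
        }
    }

    public static void main(String[] args) {
        ensureBuffers();
        // Verify the JLS guarantee: every buffer is already all zeros.
        for (byte[] buf : adjustedByteArrayInputs) {
            for (byte b : buf) {
                if (b != 0) {
                    throw new AssertionError("expected zero-initialized buffer");
                }
            }
        }
        System.out.println("all buffers zero-initialized");
    }
}
```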
> Enhance raw coder allowing to read least required inputs in decoding
> --------------------------------------------------------------------
>
> Key: HADOOP-11847
> URL: https://issues.apache.org/jira/browse/HADOOP-11847
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: io
> Reporter: Kai Zheng
> Assignee: Kai Zheng
> Attachments: HADOOP-11847-v1.patch, HADOOP-11847-v2.patch
>
>
> This is to enhance the raw erasure coder to allow reading only the least
> required inputs while decoding. It will also refine and document the relevant
> APIs for better understanding and usage. Reading only the least required
> inputs may add computing overhead, but will possibly perform better overall
> since less network traffic and disk IO are involved.
> This was already planned, but I was just reminded of it by [~zhz]'s question
> raised in HDFS-7678, also copied here:
> bq.Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2
> is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should
> I construct the inputs to RawErasureDecoder#decode?
> With this work, hopefully the answer to the above question will be obvious.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)