[ https://issues.apache.org/jira/browse/HADOOP-12047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980029#comment-14980029 ]
Walter Su commented on HADOOP-12047:
------------------------------------
Thanks [~drankye] for the explanation. I have some other questions. I think the
options fall into three kinds:
1. Some options can be customized per use case, like ALLOW_CHANGE_INPUTS.
{{DFSStripedOutputStream}} and {{..InputStream}} are 2 different use cases.
2. Some options can be customized per cluster by the user, for example for
performance tuning.
3. Some options are not exactly "options"; they are more like internal
behavior. PREFER_DIRECT_BUFFER, for example, is decided by the coder impl.
These options shouldn't be exposed and should be immutable.
So it looks like {{CoderOption}} can hold the above 3 kinds of options (2 kinds
if the 3rd is not counted)? And what's the relationship with
{{ECSchema.extraOptions}}?
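To make the three kinds concrete, here is a minimal Java sketch of how they could be told apart. It is not the API from the patch: the {{Scope}} enum, the {{setInternal}} / {{set}} split and EXAMPLE_TUNING_KNOB are made up for illustration; only ALLOW_CHANGE_INPUTS and PREFER_DIRECT_BUFFER are names from this discussion.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical sketch; not the API from the HADOOP-12047 patch. */
final class CoderOptionsSketch {

  /** Who is expected to set an option (assumed categories, one per point above). */
  enum Scope { PER_USE_CASE, PER_CLUSTER, INTERNAL }

  enum Option {
    ALLOW_CHANGE_INPUTS(Scope.PER_USE_CASE),   // name taken from the discussion
    EXAMPLE_TUNING_KNOB(Scope.PER_CLUSTER),    // made-up placeholder
    PREFER_DIRECT_BUFFER(Scope.INTERNAL);      // name taken from the discussion

    final Scope scope;

    Option(Scope scope) {
      this.scope = scope;
    }

    /** Internal options are treated as immutable from the caller's side. */
    boolean isReadOnly() {
      return scope == Scope.INTERNAL;
    }
  }

  private final Map<Option, Object> values = new EnumMap<>(Option.class);

  /** Intended for the coder implementation only, e.g. during construction. */
  void setInternal(Option option, Object value) {
    values.put(option, value);
  }

  /** Entry point for callers and cluster configuration; rejects internal options. */
  void set(Option option, Object value) {
    if (option.isReadOnly()) {
      throw new UnsupportedOperationException(option + " is read-only");
    }
    values.put(option, value);
  }

  Object get(Option option) {
    return values.get(option);
  }

  public static void main(String[] args) {
    CoderOptionsSketch options = new CoderOptionsSketch();
    options.setInternal(Option.PREFER_DIRECT_BUFFER, true); // decided by the coder impl
    options.set(Option.ALLOW_CHANGE_INPUTS, false);         // chosen per use case
    System.out.println(options.get(Option.ALLOW_CHANGE_INPUTS));
    try {
      options.set(Option.PREFER_DIRECT_BUFFER, false);      // callers cannot change it
    } catch (UnsupportedOperationException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With a split like this, the 3rd kind stays readable by callers but cannot be changed by them, while the 1st and 2nd kinds share one setter.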
And the patch needs a rebase.
> Indicate preference not to affect input buffers during coding in erasure coder
> ------------------------------------------------------------------------------
>
> Key: HADOOP-12047
> URL: https://issues.apache.org/jira/browse/HADOOP-12047
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Kai Zheng
> Assignee: Kai Zheng
> Fix For: HDFS-7285
>
> Attachments: HADOOP-12047-HDFS-7285-v1.patch, HADOOP-12047-v2.patch,
> HADOOP-12047-v3.patch, initial-poc.patch
>
>
> It's good to define and ensure that input buffers are not affected during the
> coding process in raw erasure coders. The following is copied from the
> discussion with [~jingzhao] in HDFS-8481:
> bq. In that case we cannot reuse the source buffers I guess? Then do we need
> to expose this information in the decoder?
> bq. Good catch Jing! Yes, in this case we can't reuse the source buffers here
> as they need to be passed to callers/applications without being changed. I'm
> planning to re-implement the Java coders in HADOOP-12041 and related issues;
> when done, it will be possible to ensure the input buffers are not affected.
> Benefits of doing this in the coder layer: 1) a clearer contract between coder
> and caller, in a more general sense, for the inputs; 2) a concrete coder may
> have specific tweaks to optimize this aspect: ideally no input data copying
> at all, at worst making a copy, but all transparent to callers; 3) it allows
> new coders (LRC, HH) to be layered on other primitive coders (RS, XOR) more
> easily.
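As an illustration of the "inputs not affected" contract described in the issue, here is a small Java sketch of an XOR encoder over {{ByteBuffer}} inputs. It is a toy under stated assumptions, not the Hadoop raw coder API: the class and method names are hypothetical. {{encode}} shows the ideal case (no input copying, no position change) and {{privateCopy}} shows the fallback for a coder that has to modify its inputs.

```java
import java.nio.ByteBuffer;

/** Hypothetical XOR encoder sketch; not the Hadoop raw coder API. */
final class XorEncodeSketch {

  /**
   * XORs the remaining bytes of all inputs into a newly allocated parity buffer.
   * Only absolute get(int) is used on the inputs, so their position, limit and
   * content stay as the caller left them, and no input data is copied.
   */
  static ByteBuffer encode(ByteBuffer[] inputs) {
    if (inputs.length == 0) {
      throw new IllegalArgumentException("at least one input is required");
    }
    int len = inputs[0].remaining();
    ByteBuffer parity = ByteBuffer.allocate(len); // zero-filled by allocate()
    for (ByteBuffer in : inputs) {
      if (in.remaining() != len) {
        throw new IllegalArgumentException("inputs must have equal remaining length");
      }
      int base = in.position();
      for (int i = 0; i < len; i++) {
        // Absolute put/get: neither buffer's position moves.
        parity.put(i, (byte) (parity.get(i) ^ in.get(base + i)));
      }
    }
    return parity;
  }

  /** Fallback for a coder that must modify its inputs: work on a private copy. */
  static ByteBuffer privateCopy(ByteBuffer in) {
    ByteBuffer copy = ByteBuffer.allocate(in.remaining());
    copy.put(in.duplicate()); // duplicate() has its own position, so 'in' is untouched
    copy.flip();
    return copy;
  }

  public static void main(String[] args) {
    ByteBuffer a = ByteBuffer.wrap(new byte[] {1, 2, 3});
    ByteBuffer b = ByteBuffer.wrap(new byte[] {4, 5, 6});
    ByteBuffer parity = encode(new ByteBuffer[] {a, b});
    // The caller can still read both inputs from where it left them.
    System.out.println("a.remaining=" + a.remaining() + " b.remaining=" + b.remaining());
    System.out.println("parity[0]=" + parity.get(0));
  }
}
```

Either way the caller sees the same contract; whether a copy was made is an internal detail of the coder.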