[
https://issues.apache.org/jira/browse/HDFS-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503731#comment-14503731
]
Zhe Zhang commented on HDFS-8014:
---------------------------------
bq. what is the reason not to simply use DFSInputStream?
Thanks for bringing it up Nicholas. That's another option we've discussed.
Intuitively, {{DFSInputStream}} and {{DFSOutputStream}} carry some additional
overhead to handle file-level logic, slow readers/writers, etc. A DN should be
able to push/pull blocks more efficiently. I haven't looked at that overhead
thoroughly yet, so this is a good chance to brainstorm.
To begin with, how do we create a {{DFSInputStream}} on a DN? Following the
regular path, we'll have an unnecessary RPC to the NN to fetch block locations
(the DN should already have them from the block reconstruction command).
Maybe expose {{actualGetFromOneDataNode}} and just reuse that code? To do that,
we would still need to create a {{DFSClient}} on the DN. That looks a little
weird, but I don't see a real downside.
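To make the point about the extra RPC concrete, here is a minimal sketch. It does
not use any real HDFS classes: {{FakeNameNode}}, {{ReconstructionCommand}} and both
lookup methods are hypothetical stand-ins, and the host names are made up. It only
illustrates that the regular client path costs one NN lookup per block, while a DN
that already holds the source locations from its command needs none.

```java
import java.util.Arrays;
import java.util.List;

public class ReaderPathSketch {

    /** Hypothetical stand-in for the NN; it only counts location lookups. */
    static class FakeNameNode {
        int rpcCount = 0;

        List<String> getBlockLocations(String blockId) {
            rpcCount++; // each call models one RPC to the NN
            return Arrays.asList("dn1.example", "dn2.example");
        }
    }

    /** Hypothetical stand-in for a reconstruction command that already carries source locations. */
    static class ReconstructionCommand {
        final String blockId;
        final List<String> sources;

        ReconstructionCommand(String blockId, List<String> sources) {
            this.blockId = blockId;
            this.sources = sources;
        }
    }

    /** Regular client-style path: ask the NN for locations before reading. */
    static List<String> locationsViaClientPath(FakeNameNode nn, String blockId) {
        return nn.getBlockLocations(blockId);
    }

    /** DN-side path: reuse the locations delivered with the command, no NN lookup. */
    static List<String> locationsViaCommand(ReconstructionCommand cmd) {
        return cmd.sources;
    }

    public static void main(String[] args) {
        FakeNameNode nn = new FakeNameNode();
        ReconstructionCommand cmd = new ReconstructionCommand(
                "blk_1", Arrays.asList("dn1.example", "dn2.example"));

        locationsViaClientPath(nn, cmd.blockId);
        System.out.println("NN lookups after client-style path: " + nn.rpcCount);

        locationsViaCommand(cmd);
        System.out.println("NN lookups after command-based path: " + nn.rpcCount);
    }
}
```

Whether the saved lookup matters in practice depends on how often reconstruction
runs; the sketch only shows where the extra call would come from.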
> Erasure Coding: local and remote block reader for coding work in DataNode
> -------------------------------------------------------------------------
>
> Key: HDFS-8014
> URL: https://issues.apache.org/jira/browse/HDFS-8014
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Kai Zheng
> Assignee: Zhe Zhang
>
> As a task of HDFS-7344 (ECWorker): in either striping or non-striping erasure
> coding, to perform encoding or decoding we first need to be able to read data
> blocks locally or remotely. This task is to come up with a block reader
> facility on the DataNode side. It would be better to consider the similar work
> done on the client side, so that in the future it's possible to unify the two.