[ 
https://issues.apache.org/jira/browse/HDFS-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503731#comment-14503731
 ] 

Zhe Zhang commented on HDFS-8014:
---------------------------------

bq. what is the reason not to simply use DFSInputStream?
Thanks for bringing it up, Nicholas. That's another option we've discussed.

Intuitively, {{DFSInputStream}} and {{DFSOutputStream}} carry some additional 
overhead to handle file-level logic, slow readers/writers, etc. A DN should be 
able to push/pull blocks more efficiently. I haven't looked at that overhead 
thoroughly yet, so this is a good chance to brainstorm here.

To begin with, how do we create a {{DFSInputStream}} on a DN? Following the 
regular path, we'd incur an unnecessary RPC to the NN to fetch block locations 
(the DN should already have them from the block reconstruction command).
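
To make the extra-RPC concern concrete, here is a minimal, self-contained sketch. It does not use any real HDFS classes: {{FakeNameNode}}, {{ReconstructionCommand}}, {{locateViaNameNode}} and {{locateFromCommand}} are hypothetical names invented for illustration, and the addresses are placeholders. It only shows that the regular path costs one NN lookup per block, while a DN holding a reconstruction command could read the source locations straight from it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these types are real HDFS classes.
public class DnBlockReaderSketch {

    /** Simulated NameNode that counts location lookups (stand-in for an RPC). */
    static class FakeNameNode {
        int rpcCount = 0;

        List<String> getBlockLocations(String blockId) {
            rpcCount++; // each call models one round trip to the NN
            return Arrays.asList("dn1:50010", "dn2:50010");
        }
    }

    /** Stand-in for the command a DN receives; it already lists the sources. */
    static class ReconstructionCommand {
        final String blockId;
        final List<String> sourceDatanodes;

        ReconstructionCommand(String blockId, List<String> sourceDatanodes) {
            this.blockId = blockId;
            this.sourceDatanodes = sourceDatanodes;
        }
    }

    /** Regular client path: ask the NN where the block lives. */
    static List<String> locateViaNameNode(FakeNameNode nn, String blockId) {
        return nn.getBlockLocations(blockId);
    }

    /** DN path: reuse the locations carried by the command, no NN round trip. */
    static List<String> locateFromCommand(ReconstructionCommand cmd) {
        return new ArrayList<>(cmd.sourceDatanodes);
    }

    public static void main(String[] args) {
        FakeNameNode nn = new FakeNameNode();
        ReconstructionCommand cmd = new ReconstructionCommand(
                "blk_1", Arrays.asList("dn1:50010", "dn2:50010"));

        List<String> viaCommand = locateFromCommand(cmd);
        System.out.println("from command: " + viaCommand
                + ", NN lookups so far: " + nn.rpcCount);

        List<String> viaNameNode = locateViaNameNode(nn, "blk_1");
        System.out.println("from NN: " + viaNameNode
                + ", NN lookups so far: " + nn.rpcCount);
    }
}
```

Whether the real reader should take its locations this way depends on how the reconstruction command is actually defined, which this sketch does not model.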

Maybe we could expose {{actualGetFromOneDataNode}} and just reuse that code? To 
do that, we'd still need to create a {{DFSClient}} on the DN. That looks a 
little odd, but I don't see a real downside.

> Erasure Coding: local and remote block reader for coding work in DataNode
> -------------------------------------------------------------------------
>
>                 Key: HDFS-8014
>                 URL: https://issues.apache.org/jira/browse/HDFS-8014
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Kai Zheng
>            Assignee: Zhe Zhang
>
> As a task of HDFS-7344 ECWorker, in either striping or non-striping erasure 
> coding, to perform encoding or decoding, we first need to be able to read 
> data blocks locally or remotely. This task is to come up with a block reader 
> facility on the DataNode side. It would be better to consider the similar work 
> done on the client side, so that in the future it's possible to unify the two.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
