[ 
https://issues.apache.org/jira/browse/HDFS-8059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482483#comment-14482483
 ] 

Yi Liu commented on HDFS-8059:
------------------------------

Jing, thanks for your detailed and helpful comments!

I agree with your analysis. However, for contiguous blocks the storage-related 
info (replication, storagePolicy) is currently stored in INodeFile, since it is 
the same for all blocks of a file. So it seems natural to keep dataBlockNum and 
parityBlockNum in INodeFile as well; this can save NN memory for large files, 
as you said.
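
To make the memory argument concrete, here is a rough sketch. The class and 
method names are hypothetical (not real HDFS classes), the two values are 
assumed to be {{short}} fields, and object headers and alignment are ignored:

```java
// Hypothetical illustration only; not part of HDFS.
final class StripedSchemaSketch {
    // Assumption: dataBlockNum and parityBlockNum are two short fields.
    static final int BYTES_PER_COPY = 2 * Short.BYTES;

    /** Bytes spent when every striped block group keeps its own copy. */
    static long perBlockBytes(long blockGroups) {
        return blockGroups * BYTES_PER_COPY;
    }

    /** Bytes spent when the file keeps one copy shared by all its blocks. */
    static long perFileBytes() {
        return BYTES_PER_COPY;
    }
}
```

Under these assumptions the per-block cost grows linearly with the number of 
block groups in the file, while the per-file cost stays constant.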

Furthermore, for NN ops like {{getAdditionalBlock}}, if these two pieces of 
information are available in INodeFile, it is easier and more efficient to 
construct {{BlockInfoStripedUnderConstruction}}, right? Otherwise we would need 
some other way to get them, maybe through the ECZone again? In the current 
branch they are hard-coded.
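
A minimal sketch of what I mean, using hypothetical stand-in classes rather 
than the real INodeFile or {{BlockInfoStripedUnderConstruction}} (the names, 
constructor signatures, and the use of a String array for locations are all 
assumptions for illustration):

```java
// Hypothetical stand-ins for illustration; not the real HDFS classes.
final class StripedFileSketch {
    private final short dataBlockNum;
    private final short parityBlockNum;

    StripedFileSketch(short dataBlockNum, short parityBlockNum) {
        this.dataBlockNum = dataBlockNum;
        this.parityBlockNum = parityBlockNum;
    }

    // Roughly what a block allocation path could do: read the layout from
    // the file and pass it to the new under-construction block group.
    StripedBlockUcSketch allocateBlock(long blockId) {
        return new StripedBlockUcSketch(blockId, dataBlockNum, parityBlockNum);
    }
}

final class StripedBlockUcSketch {
    final long blockId;
    // One expected-location slot per internal block (data + parity).
    final String[] expectedLocations;

    StripedBlockUcSketch(long blockId, short dataBlockNum, short parityBlockNum) {
        this.blockId = blockId;
        this.expectedLocations = new String[dataBlockNum + parityBlockNum];
    }
}
```

With the values on the file, the allocation path needs no extra lookup (such 
as the ECZone) to size the new block group.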

{quote}
Maybe we should wait and see more EC use cases in practice to decide if we want 
to do this optimization?
{quote}
Sure.

> Erasure coding: move dataBlockNum and parityBlockNum from BlockInfoStriped to 
> INodeFile
> ---------------------------------------------------------------------------------------
>
>                 Key: HDFS-8059
>                 URL: https://issues.apache.org/jira/browse/HDFS-8059
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7285
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>         Attachments: HDFS-8059.001.patch
>
>
> Move {{dataBlockNum}} and {{parityBlockNum}} from BlockInfoStriped to 
> INodeFile, and store them in {{FileWithStripedBlocksFeature}}.
> Ideally these two numbers are the same for all striped blocks in a file, so 
> storing them in BlockInfoStriped wastes NN memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
