[
https://issues.apache.org/jira/browse/HDFS-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202419#comment-15202419
]
Zhe Zhang commented on HDFS-8030:
---------------------------------
Thanks [~umamaheswararao] for the helpful feedback!
bq. As this design tries to convert files into EC mode from normal file layout,
Blockgroups needs to be created later when converting. But block groups
generally we allocate continuous blockids, but here how do we make that
continuous blockids when converting?
bq. Does this create overheads on memory as we need to track blockGroups
separately and if the blockids are not continuous as discussed in #1
The current design does not assume continuous block IDs within a block group.
We therefore incur additional memory overhead to store the mapping from each
block group to its blocks, but this overhead is partially offset by the
reduction in replicas.
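To make the trade-off concrete, here is a rough sketch of what such a mapping could look like. The class and method names are hypothetical and are not taken from the design document or the NameNode code, and the 6 data + 3 parity layout and 3-way replication are assumed purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; these are not the NameNode's actual data structures.
class BlockGroupMapSketch {
    // Block group ID -> IDs of its member blocks (data blocks, then parity blocks).
    // The IDs are stored explicitly because they are not assumed to be continuous.
    private final Map<Long, long[]> groupToBlocks = new HashMap<>();

    void addGroup(long groupId, long[] memberBlockIds) {
        groupToBlocks.put(groupId, memberBlockIds.clone());
    }

    long[] blocksOf(long groupId) {
        long[] ids = groupToBlocks.get(groupId);
        return ids == null ? new long[0] : ids.clone();
    }

    // Bytes spent on the stored block IDs alone (8 per long),
    // ignoring array, boxing, and hash map overhead.
    long idPayloadBytes() {
        long total = 0;
        for (long[] ids : groupToBlocks.values()) {
            total += 8L * ids.length;
        }
        return total;
    }

    // Stored block copies to track when every data block is replicated.
    static long replicasWithReplication(int dataBlocks, int replicationFactor) {
        return (long) dataBlocks * replicationFactor;
    }

    // Stored block copies to track when each data and parity block is kept once.
    static long replicasWithEc(int dataBlocks, int parityBlocks) {
        return (long) dataBlocks + parityBlocks;
    }

    public static void main(String[] args) {
        BlockGroupMapSketch map = new BlockGroupMapSketch();
        // Blocks of a converted file keep their existing, arbitrary IDs
        // (six data block IDs followed by three parity block IDs).
        map.addGroup(1L, new long[] {
            1073741825L, 1073749000L, 1073760042L,
            1073771111L, 1073780007L, 1073799999L,
            1073800001L, 1073800002L, 1073800003L});
        System.out.println("ID payload bytes: " + map.idPayloadBytes());
        System.out.println("Copies with replication: " + replicasWithReplication(6, 3));
        System.out.println("Copies with EC: " + replicasWithEc(6, 3));
    }
}
```

The explicit array is the extra cost of dropping the continuous-ID assumption. The two static helpers only count stored block copies under the assumed parameters; they say nothing about the real per-block memory footprint.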
Generating parity data in a streaming fashion sounds like a good idea.
I think contiguous EC will introduce new {{ErasureCodingPolicy}} instances,
which will then be handled by the current {{ErasureCodingPolicy}} design:
{code}
Each individual directory can be configured with an EC policy with command
`hdfs erasurecode -setPolicy`. When a file is created, it will inherit the EC
policy from its nearest ancestor directory to determine how its blocks are
stored.
{code}
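The nearest-ancestor rule quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class, its methods, and the policy names are hypothetical, paths are assumed to be absolute, and the real lookup in HDFS is not implemented this way.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of nearest-ancestor policy lookup; not HDFS code.
class EcPolicyInheritanceSketch {
    // Directory path -> name of the EC policy set directly on that directory.
    private final Map<String, String> dirPolicies = new HashMap<>();

    void setPolicy(String dir, String policyName) {
        dirPolicies.put(normalize(dir), policyName);
    }

    // Walks up from the file's parent directory and returns the first policy
    // found, or null if no ancestor directory has one.
    String resolve(String filePath) {
        String dir = parentOf(normalize(filePath));
        while (dir != null) {
            String policy = dirPolicies.get(dir);
            if (policy != null) {
                return policy;
            }
            dir = parentOf(dir);
        }
        return null;
    }

    // Parent of an absolute path; null for the root or for a relative path.
    static String parentOf(String path) {
        if (path.equals("/")) {
            return null;
        }
        int idx = path.lastIndexOf('/');
        if (idx < 0) {
            return null;
        }
        return idx == 0 ? "/" : path.substring(0, idx);
    }

    // Strips trailing slashes, leaving the root path untouched.
    static String normalize(String path) {
        while (path.length() > 1 && path.endsWith("/")) {
            path = path.substring(0, path.length() - 1);
        }
        return path;
    }

    public static void main(String[] args) {
        EcPolicyInheritanceSketch ns = new EcPolicyInheritanceSketch();
        ns.setPolicy("/cold", "striped-policy");            // hypothetical name
        ns.setPolicy("/cold/archive", "contiguous-policy"); // hypothetical name
        System.out.println(ns.resolve("/cold/archive/2016/part-0"));
        System.out.println(ns.resolve("/cold/tmp/part-1"));
        System.out.println(ns.resolve("/hot/part-2"));
    }
}
```

Under this sketch a contiguous EC policy would simply be one more policy name that a directory can carry, so files created beneath it would pick it up without any change to the lookup.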
> HDFS Erasure Coding Phase II -- EC with contiguous layout
> ---------------------------------------------------------
>
> Key: HDFS-8030
> URL: https://issues.apache.org/jira/browse/HDFS-8030
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: erasure-coding
> Reporter: Zhe Zhang
> Assignee: Zhe Zhang
> Attachments: HDFSErasureCodingPhaseII-20151204.pdf
>
>
> Data redundancy form -- replication or erasure coding, should be orthogonal
> to block layout -- contiguous or striped. This JIRA explores the combination
> of {{Erasure Coding}} + {{Contiguous}} block layout.
> As will be detailed in the design document, key benefits include preserving
> block locality, and easy conversion between hot and cold modes.