[
https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290900#comment-14290900
]
Zhe Zhang commented on HDFS-7339:
---------------------------------
Thanks [~szetszwo]. I like the idea of using the first block to represent the
block group. It could allow us to reuse the block management code, once we go
over more details and make sure it's viable. It seems to me that it should work
for the striping layout: all block groups in a file share the same layout and
schema, both of which can be obtained from the inode.
When we implement EC with the contiguous layout, we will need an explicit
BlockGroup class, but it can be much simpler.
Regarding generation stamps: what if an EC block is lost and recovered? Should
the NN give the recovered block a new stamp?
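To make the first-block idea concrete, here is a rough sketch. It is only an illustration and not code from any of the attached patches. It assumes that the blocks of a group receive consecutive IDs starting at the first block, and that the number of data and parity blocks can be read from a per-file schema; neither assumption has been settled in this discussion. The names FirstBlockGroupSketch, EcSchema, memberBlockIds and isParity are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class FirstBlockGroupSketch {

    /** Hypothetical stand-in for the layout/schema information held by the inode. */
    static final class EcSchema {
        final int dataBlocks;
        final int parityBlocks;

        EcSchema(int dataBlocks, int parityBlocks) {
            this.dataBlocks = dataBlocks;
            this.parityBlocks = parityBlocks;
        }

        int groupSize() {
            return dataBlocks + parityBlocks;
        }
    }

    /**
     * Derives the IDs of all blocks in a group from the first block's ID.
     * Assumption: IDs within a group are consecutive.
     */
    static List<Long> memberBlockIds(long firstBlockId, EcSchema schema) {
        List<Long> ids = new ArrayList<>();
        for (int i = 0; i < schema.groupSize(); i++) {
            ids.add(firstBlockId + i);
        }
        return ids;
    }

    /** Returns true if blockId is a parity block of the group starting at firstBlockId. */
    static boolean isParity(long firstBlockId, long blockId, EcSchema schema) {
        long index = blockId - firstBlockId;
        if (index < 0 || index >= schema.groupSize()) {
            throw new IllegalArgumentException("block " + blockId + " is not in this group");
        }
        // Data blocks come first in this sketch; parity blocks follow them.
        return index >= schema.dataBlocks;
    }

    public static void main(String[] args) {
        EcSchema schema = new EcSchema(6, 3); // example values only
        System.out.println(memberBlockIds(1000L, schema));
        System.out.println(isParity(1000L, 1007L, schema));
    }
}
```

With something like this, the block map would only need to key on the first block, and no per-group object would have to be stored for striped files.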
> Allocating and persisting block groups in NameNode
> --------------------------------------------------
>
> Key: HDFS-7339
> URL: https://issues.apache.org/jira/browse/HDFS-7339
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Zhe Zhang
> Assignee: Zhe Zhang
> Attachments: HDFS-7339-001.patch, HDFS-7339-002.patch,
> HDFS-7339-003.patch, HDFS-7339-004.patch, HDFS-7339-005.patch,
> HDFS-7339-006.patch, Meta-striping.jpg, NN-stripping.jpg
>
>
> All erasure codec operations center around the concept of a _block group_;
> block groups are formed in initial encoding and looked up in recoveries and
> conversions. A
> lightweight class {{BlockGroup}} is created to record the original and parity
> blocks in a coding group, as well as a pointer to the codec schema (pluggable
> codec schemas will be supported in HDFS-7337). With the striping layout, the
> HDFS client needs to operate on all blocks in a {{BlockGroup}} concurrently.
> Therefore we propose to extend a file’s inode to switch between _contiguous_
> and _striping_ modes, with the current mode recorded in a binary flag. An
> array of BlockGroups (or BlockGroup IDs) is added, which remains empty for
> “traditional” HDFS files with contiguous block layout.
> The NameNode creates and maintains {{BlockGroup}} instances through the new
> {{ECManager}} component; the attached figure has an illustration of the
> architecture. As a simple example, when a {_Striping+EC_} file is created and
> written to, {{ECManager}} will serve requests from the client to allocate new
> {{BlockGroups}} and store them under the {{INodeFile}}. In the current phase,
> {{BlockGroups}} are allocated both in initial online encoding and in the
> conversion from replication to EC. {{ECManager}} also facilitates the lookup
> of {{BlockGroup}} information for block recovery work.
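The allocation flow in the description above can be sketched roughly as follows. This is a simplified illustration and not the code in the attached patches. It assumes block IDs come from a simple counter and that the schema is identified by a name. FileInode is a hypothetical stand-in for {{INodeFile}}, and the method names and the schema name "RS-6-3" are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class BlockGroupAllocationSketch {

    /** Records the data and parity blocks of one coding group plus a schema reference. */
    static final class BlockGroup {
        final long[] dataBlockIds;
        final long[] parityBlockIds;
        final String schemaName;

        BlockGroup(long[] dataBlockIds, long[] parityBlockIds, String schemaName) {
            this.dataBlockIds = dataBlockIds;
            this.parityBlockIds = parityBlockIds;
            this.schemaName = schemaName;
        }
    }

    /** Hypothetical stand-in for INodeFile: a binary layout flag plus a list of groups. */
    static final class FileInode {
        private boolean striped;
        private final List<BlockGroup> blockGroups = new ArrayList<>();

        FileInode(boolean striped) {
            this.striped = striped;
        }

        boolean isStriped() {
            return striped;
        }

        void setStriped(boolean striped) {
            this.striped = striped;
        }

        /** Stays empty for files with the traditional contiguous layout. */
        List<BlockGroup> getBlockGroups() {
            return Collections.unmodifiableList(blockGroups);
        }

        void addBlockGroup(BlockGroup group) {
            if (!striped) {
                throw new IllegalStateException("file is in contiguous mode");
            }
            blockGroups.add(group);
        }
    }

    /** Simplified manager that allocates groups and stores them under the file. */
    static final class EcManager {
        private final int dataBlocks;
        private final int parityBlocks;
        private final String schemaName;
        private long nextBlockId = 1L; // assumption: IDs come from a simple counter

        EcManager(int dataBlocks, int parityBlocks, String schemaName) {
            this.dataBlocks = dataBlocks;
            this.parityBlocks = parityBlocks;
            this.schemaName = schemaName;
        }

        BlockGroup allocateBlockGroup(FileInode file) {
            long[] data = new long[dataBlocks];
            long[] parity = new long[parityBlocks];
            for (int i = 0; i < dataBlocks; i++) {
                data[i] = nextBlockId++;
            }
            for (int i = 0; i < parityBlocks; i++) {
                parity[i] = nextBlockId++;
            }
            BlockGroup group = new BlockGroup(data, parity, schemaName);
            file.addBlockGroup(group);
            return group;
        }
    }
}
```

A client write to a striped file would then amount to repeated calls to allocateBlockGroup on the same inode, while a contiguous file never gains a group unless its flag is switched first, as in a conversion from replication to EC.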