[
https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266566#comment-14266566
]
Andrew Wang commented on HDFS-7339:
-----------------------------------
Thanks for working on this Zhe, here are some quick thoughts about the patch:
* Could make this into an INode Feature, like how we do ACLs and XAttrs. I think
we can get rid of isStriped then too.
* Need to wire up getAdditionalBlockGroups. Handling of the previous block also
needs to account for block groups.
* LocatedBlockGroup is also missing a bunch of functionality from LocatedBlock,
which I think we need. Check around a bit in the client for how it uses
LocatedBlock too; we will want comparable functionality for both erasure-coded
and non-erasure-coded files.
* Would prefer to throw UnsupportedOperationException for stubbed methods, to
be very clear that they are not implemented yet.
* Since BlockGroupManager#chooseNewGroupTargets is called without any locks
held, we need to make sure it is thread-safe. Worth adding a comment?
* What's the interaction between the two SequentialBlockIDGenerator classes?
Since they don't use the same count, there will be conflicts.
* Why do we have both BlockGroupInfo and BlockGroup? If we put BlockInfos
rather than Blocks in BlockGroup, wouldn't that fill the need? Could move
BlockGroup to the blockmanagement package then too.
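
To make the points about stubbed methods and ID generators concrete, here is a minimal Java sketch. All class and method names in it (SketchIdGenerator, SketchLocatedBlockGroup, ReviewSketch) are hypothetical stand-ins and are not taken from the patch. It also assumes that both generators hand out IDs from the same ID space and start from the same value, which may not match the actual code.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sequential ID generator; a stand-in, not the HDFS class. */
class SketchIdGenerator {
    private final AtomicLong counter;

    SketchIdGenerator(AtomicLong counter) {
        this.counter = counter;
    }

    /** Returns the next ID; AtomicLong makes this safe without external locks. */
    long nextId() {
        return counter.incrementAndGet();
    }
}

/** Hypothetical stub showing the fail-loudly style for unimplemented methods. */
class SketchLocatedBlockGroup {
    long getStartOffset() {
        // Throw instead of returning a placeholder such as 0 or null.
        throw new UnsupportedOperationException("getStartOffset is not implemented yet");
    }
}

class ReviewSketch {
    /** Two generators with independent counters that start at the same value. */
    static boolean separateCountersCollide() {
        SketchIdGenerator blockIds = new SketchIdGenerator(new AtomicLong(0));
        SketchIdGenerator groupIds = new SketchIdGenerator(new AtomicLong(0));
        return blockIds.nextId() == groupIds.nextId();
    }

    /** Two generators that draw from one shared counter. */
    static boolean sharedCounterCollides() {
        AtomicLong shared = new AtomicLong(0);
        SketchIdGenerator blockIds = new SketchIdGenerator(shared);
        SketchIdGenerator groupIds = new SketchIdGenerator(shared);
        return blockIds.nextId() == groupIds.nextId();
    }

    /** Returns true if calling the stub throws UnsupportedOperationException. */
    static boolean stubFailsLoudly() {
        try {
            new SketchLocatedBlockGroup().getStartOffset();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("separate counters collide: " + separateCountersCollide());
        System.out.println("shared counter collides: " + sharedCounterCollides());
        System.out.println("stub throws: " + stubFailsLoudly());
    }
}
```

Sharing one counter is only one possible way to avoid conflicts; partitioning the ID space between the two generators would be another.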
> NameNode support for erasure coding block groups
> ------------------------------------------------
>
> Key: HDFS-7339
> URL: https://issues.apache.org/jira/browse/HDFS-7339
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Zhe Zhang
> Assignee: Zhe Zhang
> Attachments: HDFS-7339-001.patch, HDFS-7339-002.patch,
> Meta-striping.jpg, NN-stripping.jpg
>
>
> All erasure codec operations center around the concept of a _block group_;
> block groups are formed in initial encoding and looked up in recoveries and
> conversions. A
> lightweight class {{BlockGroup}} is created to record the original and parity
> blocks in a coding group, as well as a pointer to the codec schema (pluggable
> codec schemas will be supported in HDFS-7337). With the striping layout, the
> HDFS client needs to operate on all blocks in a {{BlockGroup}} concurrently.
> Therefore we propose to extend a file’s inode to switch between _contiguous_
> and _striping_ modes, with the current mode recorded in a binary flag. An
> array of BlockGroups (or BlockGroup IDs) is added, which remains empty for
> “traditional” HDFS files with contiguous block layout.
> The NameNode creates and maintains {{BlockGroup}} instances through the new
> {{ECManager}} component; the attached figure has an illustration of the
> architecture. As a simple example, when a {_Striping+EC_} file is created and
> written to, {{ECManager}} will serve requests from the client to allocate new
> {{BlockGroups}} and store them under the {{INodeFile}}. In the current phase,
> {{BlockGroups}} are allocated both in initial online encoding and in the
> conversion from replication to EC. {{ECManager}} also facilitates the lookup
> of {{BlockGroup}} information for block recovery work.
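
The layout described in the issue summary above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch's implementation: every class name here (SketchBlockGroup, SketchINodeFile, SketchECManager) is hypothetical, the codec schema pointer is reduced to a plain string, the mode flag is fixed at construction for brevity, and a single counter is assumed for group and block IDs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical lightweight record of the blocks in one coding group. */
class SketchBlockGroup {
    final long groupId;
    final long[] dataBlockIds;   // original (data) blocks
    final long[] parityBlockIds; // parity blocks
    final String schemaName;     // stands in for the pointer to the codec schema

    SketchBlockGroup(long groupId, long[] dataBlockIds, long[] parityBlockIds,
                     String schemaName) {
        this.groupId = groupId;
        this.dataBlockIds = dataBlockIds;
        this.parityBlockIds = parityBlockIds;
        this.schemaName = schemaName;
    }
}

/** Hypothetical inode with a binary layout flag and a list of block groups. */
class SketchINodeFile {
    private final boolean striped;
    private final List<SketchBlockGroup> blockGroups = new ArrayList<>();

    SketchINodeFile(boolean striped) {
        this.striped = striped;
    }

    boolean isStriped() {
        return striped;
    }

    void addBlockGroup(SketchBlockGroup group) {
        // The list stays empty for files with the contiguous block layout.
        if (!striped) {
            throw new IllegalStateException("contiguous files hold no block groups");
        }
        blockGroups.add(group);
    }

    List<SketchBlockGroup> getBlockGroups() {
        return Collections.unmodifiableList(blockGroups);
    }
}

/** Hypothetical manager that allocates block groups and stores them under an inode. */
class SketchECManager {
    private long nextId = 1; // assumption: one counter for group and block IDs

    synchronized SketchBlockGroup allocateBlockGroup(
            SketchINodeFile file, int numData, int numParity, String schemaName) {
        long groupId = nextId++;
        long[] data = new long[numData];
        for (int i = 0; i < numData; i++) {
            data[i] = nextId++;
        }
        long[] parity = new long[numParity];
        for (int i = 0; i < numParity; i++) {
            parity[i] = nextId++;
        }
        SketchBlockGroup group = new SketchBlockGroup(groupId, data, parity, schemaName);
        file.addBlockGroup(group);
        return group;
    }
}

class BlockGroupSketch {
    public static void main(String[] args) {
        SketchINodeFile file = new SketchINodeFile(true);
        SketchECManager manager = new SketchECManager();
        // 6 data and 3 parity blocks are arbitrary example values.
        SketchBlockGroup group = manager.allocateBlockGroup(file, 6, 3, "example-schema");
        System.out.println("groups under inode: " + file.getBlockGroups().size());
        System.out.println("blocks in group: "
                + (group.dataBlockIds.length + group.parityBlockIds.length));
    }
}
```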