[ 
https://issues.apache.org/jira/browse/HDFS-8655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603806#comment-14603806
 ] 

Zhe Zhang commented on HDFS-8655:
---------------------------------

While working on the patch I had some additional thoughts about handling the 
blocks of an {{INodeFile}} (hence canceling the patch to gather more 
feedback).

# Under HDFS-7285 and HDFS-8058 we've discussed the type-safety concern of 
simply using a {{BlockInfo}} array. Actually, besides contiguous vs. striped 
layout, there are other invariants we should maintain for the blocks list.
#* Only the last block can be under construction (UC)
#* In a striped file, all blocks should have the same schema and same cell size
# Currently the {{blocks}} array is widely exposed to BM and NN classes, which 
is bug-prone (after obtaining the array reference, arbitrary changes can be 
made to the blocks list). This also makes it very hard to ensure the 
invariants. I think we should refactor accesses to a file's blocks with clearly 
defined read/write operations. Something like:
{code}
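class INodeFile {
        private BlockInfo[] blocks;

        // Append a block after checking it against the invariants.
        public void addBlock(BlockInfo b) {
        }

        // Replace the block at idx after checking it against the invariants.
        public void setBlock(int idx, BlockInfo b) {
        }

        // Read-only view; callers never get the array reference itself.
        public Iterable<BlockInfo> getBlocks() {
                return Collections.unmodifiableList(Arrays.asList(blocks));
        }
}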
{code}
In {{addBlock}} and {{setBlock}} we should ensure the new block complies with 
the invariants. I initially thought about creating a separate 
{{INodeFileBlocks}} class but gave up the idea because of memory overhead 
(thanks [~andrew.wang] for the offline discussion).
# I started writing a patch along this direction but found the work quite 
significant. If we agree the above is a reasonable way to ensure 
type/layout/state safety of {{INodeFile#blocks}}, I think we should adopt the 
HDFS-8058 approach in the EC branch, and target this JIRA as a follow-on.
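
To make the proposal above concrete, here is a minimal, self-contained sketch of how {{addBlock}} and {{setBlock}} could enforce the listed invariants. It is an illustration under stated assumptions, not the actual patch: {{SketchBlock}} and {{SketchFile}} are hypothetical stand-ins for {{BlockInfo}} and {{INodeFile}}, the schema is modeled as a plain string, and a {{List}} replaces the array for brevity (the real class would likely keep the array for memory reasons).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

/** Hypothetical stand-in for BlockInfo; carries only what the checks need. */
final class SketchBlock {
    final boolean striped;            // striped vs. contiguous layout
    final String schema;              // assumed null for contiguous blocks
    final int cellSize;               // assumed 0 for contiguous blocks
    final boolean underConstruction;  // the "UC" state

    SketchBlock(boolean striped, String schema, int cellSize,
                boolean underConstruction) {
        this.striped = striped;
        this.schema = schema;
        this.cellSize = cellSize;
        this.underConstruction = underConstruction;
    }
}

/** Hypothetical stand-in for INodeFile; all writes go through checked methods. */
final class SketchFile {
    private final List<SketchBlock> blocks = new ArrayList<>();

    /** Appends a block, rejecting it if it would break an invariant. */
    public void addBlock(SketchBlock b) {
        checkLayout(b, -1);
        // Only the last block may be UC, so the current last block
        // must be complete before another one is appended.
        if (!blocks.isEmpty()
                && blocks.get(blocks.size() - 1).underConstruction) {
            throw new IllegalStateException(
                "last block is still under construction");
        }
        blocks.add(b);
    }

    /** Replaces the block at idx, rejecting it if it would break an invariant. */
    public void setBlock(int idx, SketchBlock b) {
        if (idx < 0 || idx >= blocks.size()) {
            throw new IndexOutOfBoundsException("idx=" + idx);
        }
        checkLayout(b, idx);
        if (b.underConstruction && idx != blocks.size() - 1) {
            throw new IllegalStateException(
                "only the last block can be under construction");
        }
        blocks.set(idx, b);
    }

    /** Read-only view; the backing list is never handed out. */
    public Iterable<SketchBlock> getBlocks() {
        return Collections.unmodifiableList(blocks);
    }

    /** Compares b with every existing block except the one at skipIdx. */
    private void checkLayout(SketchBlock b, int skipIdx) {
        for (int i = 0; i < blocks.size(); i++) {
            if (i == skipIdx) {
                continue;
            }
            SketchBlock other = blocks.get(i);
            if (other.striped != b.striped) {
                throw new IllegalArgumentException(
                    "cannot mix contiguous and striped blocks");
            }
            if (b.striped && (!Objects.equals(b.schema, other.schema)
                    || b.cellSize != other.cellSize)) {
                throw new IllegalArgumentException(
                    "striped blocks must share schema and cell size");
            }
        }
    }
}
```

Because {{getBlocks}} returns an unmodifiable view, callers can iterate but cannot change the list, so every mutation is forced through the two checked methods.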

> Refactor accesses to INodeFile#blocks
> -------------------------------------
>
>                 Key: HDFS-8655
>                 URL: https://issues.apache.org/jira/browse/HDFS-8655
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.7.0
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-8655.00.patch, HDFS-8655.01.patch
>
>
> When enabling INodeFile support for striped blocks (mainly in HDFS-7749), 
> HDFS-7285 branch generalized the concept of blocks under an inode. Now 
> {{INodeFile#blocks}} only contains contiguous blocks of an inode. This JIRA 
> separates out code refactors for this purpose. Two main changes:
> # Rename {{setBlocks}} to {{setContiguousBlocks}}
> # Replace direct accesses to {{INodeFile#blocks}} with {{getBlocks}}
> It also contains some code cleanups introduced in the branch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
