[
https://issues.apache.org/jira/browse/HDFS-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13978286#comment-13978286
]
Yongjun Zhang commented on HDFS-6133:
-------------------------------------
Hi [~zhaoyunjiong], thanks for letting me know about HDFS-6133 at HDFS-4420,
and sorry for getting back late.
I looked at the patch you did, it looks pretty good to me. Thanks for the good
work.
I have some comments on small things:
- Some lines exceed 80 chars. E.g. NamenodeProtocolServerSideTranslatorPB.java,
NameNodeRpcServer.java etc.
- In BlockManager.getBlocksWithLocations
You might consider stripping out empty prefixes in the pathPrefixes array as a
preprocessing step, and then in the loop you don't need to check
prefix.isEmpty(). When you have a lot of blocks to iterate in the loop, it
would save some run time.
Another corner case is, when there are multiple prexies, one prefix might
cover another (e.g., /a/b covers /a/b/c). Preprocessing the pathPrefixes to
remove the prefixes covered by other can be done, but user should be aware of
that /a/b and /a/b/c are duplicates. So this preprocessing is optional,
depending on what the use cases look like.
- In NamenodeProtocol.java, description of newly added parameter pathPrefix is
missing.
- NamenodeProtocol.proto, Nit thing, you might consider modify the comments
// Exclude blocks belongs to files which their path have prefix: pathPrefix
To something like
// Exclude blocks belonging to files with specified path prefix
- NamenodeProtocol.proto
My understanding is that the newly added optional parameter should have no
impact in compatibility. But other more experienced reviewers please confirm.
- testBalancerWithExcludeDirectory
It might help to add another call at the end of the test to do one more round
of balancing without the excludeDir parameter, and assert that blocks are moved.
Thanks.
> Make Balancer support exclude specified path
> --------------------------------------------
>
> Key: HDFS-6133
> URL: https://issues.apache.org/jira/browse/HDFS-6133
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: balancer, namenode
> Reporter: zhaoyunjiong
> Assignee: zhaoyunjiong
> Attachments: HDFS-6133.patch
>
>
> Currently, run Balancer will destroying Regionserver's data locality.
> If getBlocks could exclude blocks belongs to files which have specific path
> prefix, like "/hbase", then we can run Balancer without destroying
> Regionserver's data locality.
--
This message was sent by Atlassian JIRA
(v6.2#6252)