[ 
https://issues.apache.org/jira/browse/HDFS-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13978286#comment-13978286
 ] 

Yongjun Zhang commented on HDFS-6133:
-------------------------------------

Hi [~zhaoyunjiong], thanks for letting me know about HDFS-6133 at HDFS-4420, 
and sorry for getting back late.

I looked at the patch you did, it looks pretty good to me. Thanks for the good 
work.

I have some comments on small things:

- Some lines exceed 80 chars. E.g. NamenodeProtocolServerSideTranslatorPB.java, 
NameNodeRpcServer.java etc.
- In BlockManager.getBlocksWithLocations
You might consider stripping out empty prefixes in the pathPrefixes array as a 
preprocessing step, and then in the loop you don't need to check 
prefix.isEmpty(). When you have a lot of blocks to iterate in the loop, it 
would save some run time.

Another corner case is, when there are multiple prexies, one prefix might  
cover another (e.g., /a/b covers /a/b/c). Preprocessing the pathPrefixes to 
remove the prefixes covered by other can be done, but user should be aware of 
that /a/b and /a/b/c are duplicates. So this preprocessing is optional, 
depending on what the use cases look like. 

- In NamenodeProtocol.java, description of newly added parameter pathPrefix is 
missing. 

- NamenodeProtocol.proto, Nit thing, you might consider modify the comments
  // Exclude blocks belongs to files which their path have prefix: pathPrefix
To something like
  // Exclude blocks belonging to files with specified path prefix

- NamenodeProtocol.proto
My understanding is that the newly added optional parameter should have no 
impact in compatibility. But other more experienced reviewers please confirm.

- testBalancerWithExcludeDirectory 
It might help to add another call at the end of the test to do one more round 
of balancing without the excludeDir parameter, and assert that blocks are moved.

Thanks.



> Make Balancer support exclude specified path
> --------------------------------------------
>
>                 Key: HDFS-6133
>                 URL: https://issues.apache.org/jira/browse/HDFS-6133
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: balancer, namenode
>            Reporter: zhaoyunjiong
>            Assignee: zhaoyunjiong
>         Attachments: HDFS-6133.patch
>
>
> Currently, run Balancer will destroying Regionserver's data locality.
> If getBlocks could exclude blocks belongs to files which have specific path 
> prefix, like "/hbase", then we can run Balancer without destroying 
> Regionserver's data locality.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to