[ 
https://issues.apache.org/jira/browse/HDFS-10551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15344990#comment-15344990
 ] 

Anu Engineer commented on HDFS-10551:
-------------------------------------

 [~liuml07] Thanks for your comments.

bq. In Command constructor, // These arguments are valid for all commands. can 
be deleted.
I left there for the future maintainers. Earlier we had a command like -help on 
each command, but now we have moved it to an independent -help command. 
So you have no reference to how to use this. I think this comment helps with 
someone how wants to add an option to all commands in disk balancer. please let 
me know if you disagree and we must remove this comment.

bq. In TestDiskBalancerCommand other test cases can also use the class filed 
conf (e.g. testReadClusterFromJson()). This will also address the checkstyle 
warning I think.
Agreed, the latest patch fixed that issue and we don't have any more checkstyle 
warnings

bq. For the config key name dfs.disk.balancer.max.disk.throughputInMBperSec, 
I'd suggest dfs.disk.balancer.max.disk.bandwidthPerSec. I think throughput and 
bandwidth have different meanings in different context and bandwidth here is 
better. This also conforms to existing names like 
dfs.image.transfer.bandwidthPerSec, dfs.balance.bandwidthPerSec and 
dfs.datanode.balance.bandwidthPerSec, to name a few.

if my memory serves me correctly it was named something like that because a 
code review comment wanted to make sure that Units are expressed as part of the 
name. Names like bandwithPerSec does not tell you if it is bytes or MB or GB 
that is measured. Also this is setting is exposed to users via --bandwidth on 
command line. 
My understanding about the nomenclature -- Not a native english language 
speaker -- Bandwidth is the maximum channel capacity, throughput represents the 
speed at which things move through a system. if my understanding is correct, 
then the current naming seems to be more apt. I do see that we have used 
bandwidthPerSec without units in earlier config properties. But I am conflicted 
whether we should follow that pattern or use one which is more explicit.

if you feel strongly about this please file a JIRA and I will make a follow up 
fix.


> o.a.h.h.s.diskbalancer.command.Command does not actually verify options as 
> expected.
> ------------------------------------------------------------------------------------
>
>                 Key: HDFS-10551
>                 URL: https://issues.apache.org/jira/browse/HDFS-10551
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>            Reporter: Lei (Eddy) Xu
>            Assignee: Anu Engineer
>            Priority: Critical
>         Attachments: HDFS-10551-HDFS-1312.001.patch, 
> HDFS-10551-HDFS-1312.002.patch, HDFS-10551-HDFS-1312.003.patch
>
>
> In {{diskbalancer.command.Command#verifyCommandOptions}}. The following code 
> does not do what it expected to do:
> {code}
> if (!validArgs.containsKey(opt.getArgName())) {
> {code}
> opt.getArgName() always returns "arg" instead of i.e., {{report}} or {{uri}}, 
> which is the expected parameter to check.
> It should use {{opt.getLongOpt()}} to get the option names. It can pass on 
> the branch because {{opt.getArgName()}} always returns {{"arg"}}, which is 
> accidently in {{validArgs}}. However I don't think it is the intention for 
> this function.
> Additionally, in the following code
> {code}
> validArguments.append("Valid arguments are : %n");
> {code}
> This {{%n}} is not used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to