[ https://issues.apache.org/jira/browse/HBASE-29365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duo Zhang resolved HBASE-29365. ------------------------------- Fix Version/s: 2.7.0 3.0.0-beta-2 2.6.3 2.5.12 Hadoop Flags: Reviewed Resolution: Fixed Pushed to all active branches. Thanks [~junegunn]! > Improve option parser of PerformanceEvaluation > ---------------------------------------------- > > Key: HBASE-29365 > URL: https://issues.apache.org/jira/browse/HBASE-29365 > Project: HBase > Issue Type: Bug > Components: PE > Reporter: Junegunn Choi > Assignee: Junegunn Choi > Priority: Major > Labels: pull-request-available > Fix For: 2.7.0, 3.0.0-beta-2, 2.6.3, 2.5.12 > > > h2. Summary > {{PerformanceEvaluation.parseOpts}} spans over 300 lines with significant > code duplication, and has several usability issues. It needs to be refactored > and improved. > * > [https://github.com/apache/hbase/blob/787f5459883211b6017cee6a9bcb109f3cb66ddb/hbase-diagnostics/src/main/java/org/apache/hadoop/hbase/PerformanceEvaluation.java#L2845-L3158] > h2. Problems > h3. Code duplication > The same code pattern is repeated for each option. > {code:java} > final String rows = "--rows="; > if (cmd.startsWith(rows)) { > opts.perClientRunRows = Long.parseLong(cmd.substring(rows.length())); > continue; > } > {code} > * Define the option prefix as a constant > * Check if the argument starts with the prefix > * Use {{substring}} to extract the value > * Move on to the next argument > We can easily refactor this pattern and reduce the code duplication. > h3. Usability issues > Malformed arguments are not handled gracefully, often resulting in confusing > error messages. Here are some examples. > h4. Confusing error messages > The {{--rows}} option expects a value, but if omitted, it prints a > {{ClassNotFoundException}} exception, followed by an error message that > doesn't clearly indicate the problem. > {noformat} > $ bin/hbase pe --rows > INFO [main {}] hbase.PerformanceEvaluation: No class found for command: > --rows > java.lang.ClassNotFoundException: --rows > at > jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641) > ~[?:?] > at > jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188) > ~[?:?] > at java.lang.ClassLoader.loadClass(ClassLoader.java:525) ~[?:?] > at > org.apache.hadoop.hbase.PerformanceEvaluation.isCustomTestClass(PerformanceEvaluation.java:3258) > ~[hbase-diagnostics-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT] > at > org.apache.hadoop.hbase.PerformanceEvaluation.isCommandClass(PerformanceEvaluation.java:3251) > ~[hbase-diagnostics-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT] > at > org.apache.hadoop.hbase.PerformanceEvaluation.parseOpts(PerformanceEvaluation.java:3139) > ~[hbase-diagnostics-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT] > at > org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:3217) > ~[hbase-diagnostics-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT] > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) > ~[hadoop-common-3.4.1.jar:?] > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:97) > ~[hadoop-common-3.4.1.jar:?] > at > org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:3273) > ~[hbase-diagnostics-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT] > ERROR: Unrecognized option/command: --rows > Usage: hbase pe <OPTIONS> [-D<property=value>]* <command|class> <nclients> > ... > {noformat} > The same holds true for other options that expect a value. > — > The error message for invalid values can also be improved. > {noformat} > $ bin/hbase pe --rows= --table= --compress= > java.lang.NumberFormatException: For input string: "" > at > java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:67) > at java.base/java.lang.Long.parseLong(Long.java:721) > at java.base/java.lang.Long.parseLong(Long.java:836) > at > org.apache.hadoop.hbase.PerformanceEvaluation.parseOpts(PerformanceEvaluation.java:2864) > at > org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:3217) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:97) > at > org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:3273) > {noformat} > It doesn't show which option caused the error, so you would have to guess it > or follow the stack trace to find out. > h4. Inconsistent handling of boolean options > PE supports many boolean options, but their handling is inconsistent. Some > options require {{\-\-flag=true/false}} syntax, while others just accept > {{\-\-flag}}. > * In this case, this fails because {{autoFlush}} requires a value, while > {{valueRandom}} doesn't. > {code:java} > $ bin/hbase pe --valueRandom --autoFlush sequentialWrite 1 > ... > ERROR: Unrecognized option/command: --autoFlush > {code} > * And in this case, {{valueRandom}} is set to true, because {{=false}} after > {{--valueRandom}} is simply ignored. > {code:java} > $ bin/hbase pe --valueRandom=false --autoFlush=true sequentialWrite 1 > SequentialWriteTest test run options={ ... ,"autoFlush":true, ... > "valueRandom":true, ... } > {code} > * Another problem is that they silently ignore illegal values and just treat > them as {{false}}. This can lead to confusion, and we should be more strict > about it. > {code:java} > $ bin/hbase pe --autoFlush=yes sequentialWrite 1 > SequentialWriteTest test run options={ ..., "autoFlush":false, ... } > {code} > h2. Suggestion > Refactor the code, make the behavior more consistent, and improve the error > messages. > h2. Result > * Shaved off over 180 lines of code > * Fixed hidden bugs > * Better error messages > h3. Examples > {code:java} > $ bin/hbase pe --rows > java.lang.IllegalArgumentException: Invalid option: --rows > at > org.apache.hadoop.hbase.PerformanceEvaluation.parseOpts(PerformanceEvaluation.java:2943) > at > org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:3033) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:97) > at > org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:3089) > Caused by: java.lang.IllegalArgumentException: --rows requires a value > at > org.apache.hadoop.hbase.PerformanceEvaluation.parseOpts(PerformanceEvaluation.java:2937) > ... 4 more > {code} > {code:java} > $ bin/hbase pe --rows= --table= --compress= > java.lang.IllegalArgumentException: Invalid option: --rows= > at > org.apache.hadoop.hbase.PerformanceEvaluation.parseOpts(PerformanceEvaluation.java:2943) > at > org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:3033) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:97) > at > org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:3089) > Caused by: java.lang.NumberFormatException: For input string: "" > at > java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:67) > at java.base/java.lang.Long.parseLong(Long.java:721) > at java.base/java.lang.Long.parseLong(Long.java:836) > at > org.apache.hadoop.hbase.PerformanceEvaluation.lambda$parseOpts$14(PerformanceEvaluation.java:2866) > at > org.apache.hadoop.hbase.PerformanceEvaluation.parseOpts(PerformanceEvaluation.java:2939) > ... 4 more > {code} > {code:java} > $ bin/hbase pe sequentialWrite 1 > SequentialWriteTest test run options={ ..., "autoFlush":false, ... > "valueRandom":false, ... } > {code} > {code:java} > $ bin/hbase pe --autoFlush --valueRandom sequentialWrite 1 > SequentialWriteTest test run options={ ..., "autoFlush":true, ... > "valueRandom":true, ...} > ... > {code} > {code:java} > $ bin/hbase pe --autoFlush=true --valueRandom=false sequentialWrite 1 > SequentialWriteTest test run options={ ..., "autoFlush":true, ... > "valueRandom":false, ... } > ... > {code} > {code:java} > $ bin/hbase pe --autoFlush=yes sequentialWrite 1 > java.lang.IllegalArgumentException: Invalid option: --autoFlush=yes > at > org.apache.hadoop.hbase.PerformanceEvaluation.parseOpts(PerformanceEvaluation.java:2943) > at > org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:3033) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:97) > at > org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:3089) > Caused by: java.lang.IllegalArgumentException: Invalid boolean value: yes > at > org.apache.hadoop.hbase.PerformanceEvaluation.parseBoolean(PerformanceEvaluation.java:2973) > at > org.apache.hadoop.hbase.PerformanceEvaluation.parseOpts(PerformanceEvaluation.java:2929) > ... 4 more > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)