[
https://issues.apache.org/jira/browse/HDFS-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566244#comment-13566244
]
Chris Nauroth commented on HDFS-4451:
-------------------------------------
Thanks, Joshua. I applied the patch to trunk and ran these test suites
successfully: TestBalancer, TestBalancerWithEncryptedTransfer,
TestBalancerWithHANameNodes, TestBalancerWithMultipleNameNodes,
TestBalancerWithNodeGroup, TestBalancerBandwidth. Here are a few comments.
1. It appears that it is not possible for the tool to exit with IN_PROGRESS.
We have this code in the static {{Balancer#run}} method:
{code}
boolean done = false;
for(int iteration = 0; !done; iteration++) {
done = true;
Collections.shuffle(connectors);
for(NameNodeConnector nnc : connectors) {
final Balancer b = new Balancer(nnc, p, conf);
final ReturnStatus r = b.run(iteration, formatter, conf);
// clean all lists
b.resetData(conf);
if (r == ReturnStatus.IN_PROGRESS) {
done = false;
} else if (r != ReturnStatus.SUCCESS) {
//must be an error statue, return.
return r.code;
}
}
if (!done) {
Thread.sleep(sleeptime);
}
}
{code}
This loop will not terminate while return status is IN_PROGRESS, so I expect we
would never exit the CLI with IN_PROGRESS. If that's the case, then we
wouldn't need to map IN_PROGRESS to a successful exit, and {{toExitCode}} could
simplify to something like code == 1 ? 0 : 1.
2. Actually, I wonder if the simpler change is just to swap the codes in the
enum, so that it's SUCCESS(0) and IN_PROGRESS(1). Then, we wouldn't need
{{toExitCode}}. If callers never really see the IN_PROGRESS exit code (for the
reasons stated above) and if we're already declaring this change
backwards-incompatible, then this would be simpler.
3. Do you want to add a test that calls {{twoNodeTest(conf, true)}}? I see
that the {{useTool}} flag has been introduced in the method signature, but I
didn't find any tests that set the flag to true.
{code}
- private void twoNodeTest(Configuration conf) throws Exception {
+ private void twoNodeTest(Configuration conf, boolean useTool) throws
Exception {
{code}
> hdfs balancer exits 1 on success, it should exit 0
> --------------------------------------------------
>
> Key: HDFS-4451
> URL: https://issues.apache.org/jira/browse/HDFS-4451
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: balancer
> Affects Versions: 2.0.2-alpha
> Environment: Centos 6.3, JDK 1.6.0_25
> Reporter: Joshua Blatt
> Attachments: HDFS-4451.patch
>
>
> Though the org.apache.hadoop.util.Tool interface javadocs indicate
> implementations should return 0 on success, the
> org.apache.hadoop.hdfs.server.balance.Balancer.Cli implementation returns the
> int values of this enum instead:
> // Exit status
> enum ReturnStatus {
> SUCCESS(1),
> IN_PROGRESS(0),
> ALREADY_RUNNING(-1),
> NO_MOVE_BLOCK(-2),
> NO_MOVE_PROGRESS(-3),
> IO_EXCEPTION(-4),
> ILLEGAL_ARGS(-5),
> INTERRUPTED(-6);
> This created an issue for us when we tried to run the hdfs balancer as a cron
> job. Cron sends emails whenever a executable it runs exits non-zero. We'd
> either have to disable all emails and miss real issues or fix this bug.
> I think both SUCCESS and IN_PROGRESS ReturnStatuses should lead to exit 0.
> Marking this change as incompatible because existing scripts which interpret
> exit 1 as success will be broken (unless they defensively/liberally interpret
> both exit 1 and exit 0 as success).
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira