[
https://issues.apache.org/jira/browse/HDFS-13892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16603237#comment-16603237
]
Anu Engineer commented on HDFS-13892:
-------------------------------------
Hi [~Harsha1206], Thank you for taking the time to file this JIRA. I appreciate
it. From a high level, I do agree with your thought process. In an ideal world,
execute should have returned an error code. However, disk balancer works
slightly differently. Understanding what happens under the covers might help
you see why it behaves this way, though you may not agree with the behavior.
When DiskBalancer executes a plan, it is the beginning of an asynchronous
process that can take a long time. The execute command simply hands the
disk balancer plan file over to the data node, and the data node executes the
plan. Hence a return code from execute indicates merely that the command was
successful in handing over the plan file to the data node, which performs the
balancing at a later point in time.
DiskBalancer also supports a command called query, which gives you
information about the current status. A failure during runtime will be
reflected in the output of the query command and, as you rightly discovered,
in the data node logs.
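As a rough illustration of the execute-then-query flow described above, here
is a minimal Python sketch that polls the query command after a plan has been
handed over. The helper names (query_datanode, wait_for_plan), the polling
intervals, and the status strings it searches for are assumptions made for
illustration; please check them against the query output of your Hadoop
version before relying on them.

```python
import subprocess
import time

# Assumed status tokens in the query output; verify for your Hadoop version.
IN_PROGRESS = "PLAN_UNDER_PROGRESS"
DONE = "PLAN_DONE"


def query_datanode(datanode):
    """Run `hdfs diskbalancer -query <datanode>` and return its stdout."""
    result = subprocess.run(
        ["hdfs", "diskbalancer", "-query", datanode],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def wait_for_plan(datanode, run_query=query_datanode,
                  poll_seconds=30, max_polls=120):
    """Poll the query command until the plan leaves the in-progress state.

    Returns "done" if the done token appears, "other" if the output shows
    neither the done nor the in-progress token, and "timeout" if the plan
    is still in progress after max_polls attempts.
    """
    for _ in range(max_polls):
        output = run_query(datanode)
        if DONE in output:
            return "done"
        if IN_PROGRESS not in output:
            return "other"
        time.sleep(poll_seconds)
    return "timeout"
```

A caller would run the execute command first, treat its exit code 0 only as
"plan accepted", and then call wait_for_plan with the data node's host name.
This sketch does not inspect per-volume error details, so a "done" result
should still be checked against the data node logs.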
If you believe that the Apache Hadoop user documentation does not adequately
explain this process, please feel free to convert this Jira to a doc Jira, and
one of us can get this fixed.
If you would like to peruse the architecture of Disk balancer, there are two
documents, the proposal and the architecture and test plan, in this Jira
https://issues.apache.org/jira/browse/HDFS-1312 which explain how disk
balancer works in great detail.
Thank you very much for taking time out to file the Jira and bringing this
issue to my attention.
> Disk Balancer : Invalid exit code for disk balancer execute command
> -------------------------------------------------------------------
>
> Key: HDFS-13892
> URL: https://issues.apache.org/jira/browse/HDFS-13892
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: diskbalancer
> Reporter: Harshakiran Reddy
> Priority: Major
>
> {{scenario:-}}
> 1. Write about 5GB of data with one DISK
> 2. Add one more non-empty Disk to the above Datanode
> 3. Run the plan command for the above specific datanode
> 4. Run the execute command with the above plan file
> The above execute command did not take effect, as per the datanode log:
> {noformat}
> ERROR org.apache.hadoop.hdfs.server.datanode.DiskBalancer: Destination
> volume: file:/Test_Disk/DISK2/ does not have enough space to accommodate a
> block. Block Size: 268435456 Exiting from copyBlocks.
> {noformat}
> 5. Check the exit code of the execute command; it displays 0
> {{Expected Result :-}}
> 1. Exit code should be 1, because the execution did not happen
> 2. In this type of scenario, print the error message on the console so that
> the customer/user knows the execute did not happen.