[jira] [Created] (HDFS-13727) Log full stack trace if DiskBalancer exits with an unhandled exception

Stephen O'Donnell (JIRA) Wed, 11 Jul 2018 09:23:10 -0700

Stephen O'Donnell created HDFS-13727:
----------------------------------------


             Summary: Log full stack trace if DiskBalancer exits with an 
unhandled exception
                 Key: HDFS-13727
                 URL: https://issues.apache.org/jira/browse/HDFS-13727
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: diskbalancer
    Affects Versions: 3.0.3
            Reporter: Stephen O'Donnell


In HDFS-13175 it was discovered that when a DN reports the usage on a volume to 
be greater than the volume capacity, the disk balancer will fail with an 
unhelpful error:

{code}
$ hdfs diskbalancer -report -top 5

18/06/11 10:19:43 INFO command.Command: Processing report command
18/06/11 10:19:44 INFO balancer.KeyManager: Block token params received from 
NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
18/06/11 10:19:44 INFO block.BlockTokenSecretManager: Setting block keys
18/06/11 10:19:44 INFO balancer.KeyManager: Update block keys every 2hrs, 
30mins, 0sec
18/06/11 10:19:44 ERROR tools.DiskBalancerCLI: 
java.lang.IllegalArgumentException
{code}

In HDFS-13175, a change was made to include more details in the exception 
message, so after the change the code is:

{code}
  public void setUsed(long dfsUsedSpace) {
    Preconditions.checkArgument(dfsUsedSpace < this.getCapacity(),
        "DiskBalancerVolume.setUsed: dfsUsedSpace(%s) < capacity(%s)",
        dfsUsedSpace, getCapacity());
    this.used = dfsUsedSpace;
  }
{code}

There may, however, be other scenarios that cause the balancer to exit with an 
unhandled exception, and it would be helpful if the tool logged the full 
stack trace on error rather than just the exception name.

In DiskBalancerCLI.java, the relevant code is:

{code}
  public static void main(String[] argv) throws Exception {
    DiskBalancerCLI shell = new DiskBalancerCLI(new HdfsConfiguration());
    int res = 0;
    try {
      res = ToolRunner.run(shell, argv);
    } catch (Exception ex) {
      LOG.error(ex.toString());
      res = 1;
    }
    System.exit(res);
  }
{code}

We should change the logging call in the catch block to log the full stack 
trace, which gives more information on all unhandled errors, e.g.:

{code}
LOG.error(ex.toString(), ex);
{code}
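
To illustrate the difference between the two calls, here is a minimal, 
self-contained sketch. It uses java.util.logging from the JDK as a stand-in 
for the logger in DiskBalancerCLI, so the exact output format will differ from 
what the tool prints. The class name StackTraceLoggingSketch and the method 
failingStep() are hypothetical and exist only for this example.

```java
import java.io.ByteArrayOutputStream;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;
import java.util.logging.StreamHandler;

public class StackTraceLoggingSketch {

  // Hypothetical stand-in for a precondition failing deep inside the tool.
  static void failingStep() {
    throw new IllegalArgumentException();
  }

  // Runs failingStep(), logs the failure, and returns what the logger wrote.
  // withThrowable=false mimics LOG.error(ex.toString());
  // withThrowable=true mimics LOG.error(ex.toString(), ex).
  static String logFailure(boolean withThrowable) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    Handler handler = new StreamHandler(out, new SimpleFormatter());
    Logger log = Logger.getAnonymousLogger();
    log.setUseParentHandlers(false); // keep output out of the console handler
    log.addHandler(handler);
    try {
      failingStep();
    } catch (Exception ex) {
      if (withThrowable) {
        // Passing the throwable lets the formatter append the stack trace.
        log.log(Level.SEVERE, ex.toString(), ex);
      } else {
        // Only the exception class name (and message, if any) is recorded.
        log.log(Level.SEVERE, ex.toString());
      }
    }
    handler.flush();
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println("--- message only ---");
    System.out.println(logFailure(false));
    System.out.println("--- message plus throwable ---");
    System.out.println(logFailure(true));
  }
}
```

In the first case the log records only the exception class name, which matches 
the unhelpful output shown earlier. In the second case the log also includes 
the frames leading to failingStep(), which is the information needed to locate 
the failing check.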



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

