[ 
https://issues.apache.org/jira/browse/HDFS-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500770#comment-14500770
 ] 

Xiaoyu Yao commented on HDFS-8179:
----------------------------------

This is a bug in DFSClient#getServerDefaults(). There are two issues here:

1. DFSClient#serverDefaults is null within 1 hour 
(SERVER_DEFAULTS_VALIDITY_PERIOD) of NN start. DFSClient#getServerDefaults() 
returns that null value, which causes the NPE in "hadoop fs -rm -r".

2. HDFS-6841 switched from Time.now() to Time#monotonicNow(). The latter 
returns the time elapsed since an arbitrary origin, so its result should only 
be compared with other values returned from Time#monotonicNow(). However, 
serverDefaultsLastUpdate has an initial value of 0 through default 
initialization. It should be initialized with Time.monotonicNow(). Arpit 
Agarwal found a similar issue in HDFS-8163.

{code}
public FsServerDefaults getServerDefaults() throws IOException {
  long now = Time.monotonicNow();
  if (now - serverDefaultsLastUpdate > SERVER_DEFAULTS_VALIDITY_PERIOD) {
    serverDefaults = namenode.getServerDefaults();
    serverDefaultsLastUpdate = now;
  }
  return serverDefaults;
}
{code}
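
To make the failure mode concrete, here is a minimal standalone sketch of the same expiry comparison. It is not the DFSClient code: the class ExpiryCheckDemo and the method needsRefresh are hypothetical names, the clock reading is passed in as a plain long, and the one-hour constant is an assumed stand-in for SERVER_DEFAULTS_VALIDITY_PERIOD. With a last-update value of 0, a monotonic reading below one hour never satisfies the check, so the cached value is never fetched.

```java
public class ExpiryCheckDemo {
    // Assumed stand-in for SERVER_DEFAULTS_VALIDITY_PERIOD: one hour in milliseconds.
    static final long VALIDITY_PERIOD_MS = 60L * 60L * 1000L;

    // Same comparison as in getServerDefaults(), with the clock reading passed in.
    static boolean needsRefresh(long now, long lastUpdate) {
        return now - lastUpdate > VALIDITY_PERIOD_MS;
    }

    public static void main(String[] args) {
        long lastUpdate = 0L; // what default field initialization gives

        long fiveMinutes = 5L * 60L * 1000L;      // monotonic reading shortly after its origin
        long twoHours = 2L * 60L * 60L * 1000L;   // monotonic reading well past the validity period

        System.out.println(needsRefresh(fiveMinutes, lastUpdate)); // false: nothing is fetched
        System.out.println(needsRefresh(twoHours, lastUpdate));    // true: the fetch finally happens
    }
}
```

This matches the observation in the issue description that the failure goes away after the cluster has run for a while.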


The proposed fix:
{code}
public FsServerDefaults getServerDefaults() throws IOException {
  long now = Time.monotonicNow();
  if ((serverDefaults == null) ||
      (now - serverDefaultsLastUpdate > SERVER_DEFAULTS_VALIDITY_PERIOD)) {
    serverDefaults = namenode.getServerDefaults();
    serverDefaultsLastUpdate = now;
  }
  return serverDefaults;
}
{code}
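
The null guard can also be tried out in isolation. The sketch below is only an illustration under stated assumptions: CachedDefaults is a hypothetical class, the clock and the loader are injected as functional interfaces so that no NameNode is needed, and thread safety is deliberately left out. It shows that the first call always loads the value regardless of the clock reading, and that later calls inside the validity period reuse it.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical helper, not part of Hadoop: a time-bounded cache with a null guard.
public class CachedDefaults<T> {
    private final long validityMs;
    private final LongSupplier clock;   // stands in for Time.monotonicNow()
    private final Supplier<T> loader;   // stands in for namenode.getServerDefaults()
    private T value;                    // null until the first successful load
    private long lastUpdate;            // 0 by default, as in the original field

    public CachedDefaults(long validityMs, LongSupplier clock, Supplier<T> loader) {
        this.validityMs = validityMs;
        this.clock = clock;
        this.loader = loader;
    }

    public T get() {
        long now = clock.getAsLong();
        // Load when nothing is cached yet, or when the cached value has expired.
        if (value == null || now - lastUpdate > validityMs) {
            value = loader.get();
            lastUpdate = now;
        }
        return value;
    }

    public static void main(String[] args) {
        // The clock is pinned to 5 seconds after its origin, well inside one hour.
        CachedDefaults<String> cache =
            new CachedDefaults<>(60L * 60L * 1000L, () -> 5_000L, () -> "defaults");
        System.out.println(cache.get()); // prints "defaults" instead of null
    }
}
```

The guard makes the first call independent of the clock's origin, so a zero-initialized timestamp can no longer cause a null return.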

> DFSClient#getServerDefaults returns null within 1 hour of NN start
> ------------------------------------------------------------------
>
>                 Key: HDFS-8179
>                 URL: https://issues.apache.org/jira/browse/HDFS-8179
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Xiaoyu Yao
>            Assignee: Xiaoyu Yao
>
> We recently hit an NPE during the Ambari Oozie service check. The failed hdfs 
> command is below. It reproduces intermittently and then goes away after the 
> cluster has been running for a while.
> {code}
> [ambari-qa@c6401 ~]$ hadoop --config /etc/hadoop/conf fs -rm -r /user/ambari-qa/mapredsmokeoutput
> rm: Failed to get server trash configuration: null. Consider using -skipTrash option
> {code}
> With additional tracing, the failure was traced to the following stack.
> {code}
> 15/04/17 20:57:12 DEBUG fs.Trash: Failed to get server trash configuration
> java.lang.NullPointerException
>       at org.apache.hadoop.fs.Trash.moveToAppropriateTrash(Trash.java:86)
>       at org.apache.hadoop.fs.shell.Delete$Rm.moveToTrash(Delete.java:117)
>       at org.apache.hadoop.fs.shell.Delete$Rm.processPath(Delete.java:104)
>       at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:321)
>       at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:293)
>       at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:275)
>       at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:259)
>       at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:205)
>       at org.apache.hadoop.fs.shell.Command.run(Command.java:166)
>       at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>       at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
> rm: Failed to get server trash configuration: null. Consider using -skipTrash option
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)