[
https://issues.apache.org/jira/browse/HDFS-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiaoyu Yao updated HDFS-8179:
-----------------------------
Attachment: HDFS-8179.00.patch
Attached a patch that fixes DFSClient#getServerDefaults() so that it
initializes serverDefaults properly, and adds WARN-level tracing for
Trash policy related failures to make future investigation easier.
I built private hadoop-common and hadoop-hdfs jars with the patch and
verified that it fixes the issue on the machine with the original repro. No new
test is added because a physical machine restart is not easy to do in a unit test.
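
For readers unfamiliar with this kind of failure, here is a minimal, self-contained sketch of how a time-guarded cache can return null shortly after boot. It is not the code in the patch. It assumes (the issue text does not state this) that the cached value is refreshed only when a monotonic clock, which may start near zero at system start, has moved more than one hour past a zero-initialized timestamp. The class ServerDefaultsCache, its fields, and the getBuggy/getFixed methods are hypothetical names used only for illustration.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical illustration only; not the actual DFSClient code or patch.
public class ServerDefaultsCache {
    // Assumed validity period of the cached value: one hour, in milliseconds.
    public static final long VALIDITY_MS = 60L * 60 * 1000;

    private final LongSupplier clock;       // stand-in for a monotonic clock
    private final Supplier<String> fetcher; // stand-in for the RPC to the server
    private String cached;                  // null until the first fetch
    private long lastUpdate;                // 0 until the first fetch

    public ServerDefaultsCache(LongSupplier clock, Supplier<String> fetcher) {
        this.clock = clock;
        this.fetcher = fetcher;
    }

    // Refreshes only when the validity period has elapsed. If the clock
    // reads less than VALIDITY_MS (for example, shortly after boot), the
    // first call skips the fetch and returns the uninitialized null.
    public synchronized String getBuggy() {
        long now = clock.getAsLong();
        if (now - lastUpdate > VALIDITY_MS) {
            cached = fetcher.get();
            lastUpdate = now;
        }
        return cached;
    }

    // Also refreshes when nothing has been fetched yet, so the first call
    // never returns null regardless of what the clock reads.
    public synchronized String getFixed() {
        long now = clock.getAsLong();
        if (cached == null || now - lastUpdate > VALIDITY_MS) {
            cached = fetcher.get();
            lastUpdate = now;
        }
        return cached;
    }

    public static void main(String[] args) {
        // Simulate a clock that reads ten minutes after system start.
        long tenMinutesAfterBoot = 10L * 60 * 1000;
        ServerDefaultsCache cache =
            new ServerDefaultsCache(() -> tenMinutesAfterBoot, () -> "defaults");
        System.out.println("buggy: " + cache.getBuggy()); // prints "buggy: null"
        System.out.println("fixed: " + cache.getFixed()); // prints "fixed: defaults"
    }
}
```

Under these assumptions, a caller that dereferences the result of the unguarded variant would hit a NullPointerException until the clock passes the validity period, which would match the symptom going away after the cluster has run for a while.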
> DFSClient#getServerDefaults returns null within 1 hour of system start
> ----------------------------------------------------------------------
>
> Key: HDFS-8179
> URL: https://issues.apache.org/jira/browse/HDFS-8179
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Xiaoyu Yao
> Assignee: Xiaoyu Yao
> Priority: Blocker
> Attachments: HDFS-8179.00.patch
>
>
> We recently hit an NPE during the Ambari Oozie service check. The failing hdfs
> command is shown below. It reproduces intermittently and then goes away after
> the cluster has been running for a while.
> {code}
> [ambari-qa@c6401 ~]$ hadoop --config /etc/hadoop/conf fs -rm -r /user/ambari-qa/mapredsmokeoutput
> rm: Failed to get server trash configuration: null. Consider using -skipTrash option
> {code}
> With additional tracing enabled, the failure was narrowed down to the following stack trace.
> {code}
> 15/04/17 20:57:12 DEBUG fs.Trash: Failed to get server trash configuration
> java.lang.NullPointerException
> at org.apache.hadoop.fs.Trash.moveToAppropriateTrash(Trash.java:86)
> at org.apache.hadoop.fs.shell.Delete$Rm.moveToTrash(Delete.java:117)
> at org.apache.hadoop.fs.shell.Delete$Rm.processPath(Delete.java:104)
> at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:321)
> at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:293)
> at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:275)
> at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:259)
> at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:205)
> at org.apache.hadoop.fs.shell.Command.run(Command.java:166)
> at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
> rm: Failed to get server trash configuration: null. Consider using -skipTrash option
> {code}