Karthik Kambatla created YARN-3793:
--------------------------------------

             Summary: Several NPEs when deleting local files on NM recovery
                 Key: YARN-3793
                 URL: https://issues.apache.org/jira/browse/YARN-3793
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager
    Affects Versions: 2.6.0
            Reporter: Karthik Kambatla
            Assignee: Karthik Kambatla
            Priority: Critical

When NM work-preserving restart is enabled, we see several NPEs on recovery. These seem to correspond to sub-directories that need to be deleted. I wonder if null pointers here mean incorrect tracking of these resources and a potential leak. This JIRA is to investigate and fix anything required.

Logs show:
{noformat}
2015-05-18 07:06:10,225 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : null
2015-05-18 07:06:10,224 ERROR org.apache.hadoop.yarn.server.nodemanager.DeletionService: Exception during execution of task in DeletionService
java.lang.NullPointerException
        at org.apache.hadoop.fs.FileContext.fixRelativePart(FileContext.java:274)
        at org.apache.hadoop.fs.FileContext.delete(FileContext.java:755)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.deleteAsUser(DefaultContainerExecutor.java:458)
        at org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:293)
{noformat}
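
The log line "Deleting absolute path : null" suggests that a deletion task reached the executor with a null sub-directory and no base directories to resolve it against. Below is a minimal, self-contained sketch of a defensive guard for that case. It is not the NodeManager code and not a proposed patch: GuardedDeletion and resolveTargets are hypothetical names, and it uses java.nio.file.Path rather than org.apache.hadoop.fs.Path so that it runs without Hadoop on the classpath.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class GuardedDeletion {

    /**
     * Hypothetical helper: computes the paths a deletion task should remove.
     * It never returns a null entry, so the caller cannot pass null to delete().
     */
    public static List<Path> resolveTargets(Path subDir, List<Path> baseDirs) {
        if (baseDirs == null || baseDirs.isEmpty()) {
            if (subDir == null) {
                // Nothing to delete; this is the case that shows up as
                // "Deleting absolute path : null" in the log above.
                return Collections.emptyList();
            }
            // No base directories: treat subDir as the path to delete.
            return Collections.singletonList(subDir);
        }
        List<Path> targets = new ArrayList<>();
        for (Path baseDir : baseDirs) {
            if (baseDir == null) {
                continue; // skip entries that were not recovered
            }
            // With no subDir, the base directory itself is the target.
            targets.add(subDir == null ? baseDir : baseDir.resolve(subDir));
        }
        return targets;
    }

    public static void main(String[] args) {
        // Null sub-directory and no base directories: no targets.
        System.out.println(resolveTargets(null, null));
        // Relative sub-directory under one (made-up) base directory.
        System.out.println(resolveTargets(
                Paths.get("usercache"),
                Collections.singletonList(Paths.get("nm-local-dir"))));
    }
}
```

A guard like this only avoids the exception. Whether a null path on recovery also means the corresponding resources were tracked incorrectly, and therefore leak, is the open question this JIRA is meant to investigate.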