[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16476534#comment-16476534
 ] 

Miklos Szegedi commented on MAPREDUCE-7094:
-------------------------------------------

Thank you for the patch [~szita] and for the reviews, [~wilfreds] and 
[~pbacsko].

I have some concerns with the singleton implementation. It is indeed true that 
the code is called only once, however it is still external code that calls it 
and there is no guarantee that nobody will add a change in the future to call 
it multiple times breaking the assumption. I would throw, if classLoaderCreated 
is set already but makeClassLoader is called repeatedly.

Close can be called multiple times. Second time it will probably fail on a 
double call on close() The first close should set the field to null.
{code:java}
} finally {
  try {
    fs.delete(systemJobFile.getParent(), true);  // delete submit dir
    localFs.delete(localJobFile, true);              // delete local copy
    // Cleanup distributed cache
    localDistributedCacheManager.close();
  } catch (IOException e) {
    LOG.warn("Error cleaning up "+id+": "+e);
  }
}{code}
This code path above can avoid {{localDistributedCacheManager.close(); }} 
causing a leak, when {{localFs.delete}} fails with an IOException.
{code:java}
public void close() throws IOException {
  for (File symlink : symlinksCreated) {
    if (!symlink.delete()) {
      LOG.warn("Failed to delete symlink created by the local job runner: " +
          symlink);
    }
  }
  FileContext localFSFileContext = FileContext.getLocalFSFileContext();
  for (String archive : localArchives) {
    localFSFileContext.delete(new Path(archive), true);
  }
  for (String file : localFiles) {
    localFSFileContext.delete(new Path(file), true);
  }

  if(classLoaderCreated != null) {
    classLoaderCreated.close();
  }

}{code}
Similarly the one above can also leak not closing, if one of the earlier 
instructions in {{close()}} fail with an IOException like 
{{getLocalFSFileContext}}

Should not close be called as a privileged action as well?

The code is not synchronized. That might be okay for now but it would not hurt 
to put a synchronized around makeClassLoader and close.

> LocalDistributedCacheManager leaves classloaders open, which leaks FDs
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7094
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7094
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Adam Szita
>            Assignee: Adam Szita
>            Priority: Major
>         Attachments: MAPREDUCE-7094.0.patch, MAPREDUCE-7094.1.patch, 
> MAPREDUCE-7094.2.patch
>
>
> When a user starts a local mapred task from Hive's beeline, it will leave 
> open file descriptors on the HS2 process (which runs the mapred task).
> I debugged this and saw that it is caused by LocalDistributedCacheManager 
> class, which creates a new URLClassLoader, with a classpath for the two jars 
> seen below. Somewhere down the line Loaders will be created in this 
> URLClassLoader for these files effectively creating the FD's on the OS level.
> This is never cleaned up after execution, although 
> LocalDistributedCacheManager removes the files, it will not close the 
> ClassLoader, so FDs are left open although they point to deleted files at 
> that time:
> {code:java}
> [root@host-1 ~]# lsof -p 14439 | grep hadoop-hive
> java    14439 hive  DEL       REG                8,1             3348748 
> /tmp/hadoop-hive/mapred/local/1525789796610/hive-exec-core.jar
> java    14439 hive  DEL       REG                8,1             3348750 
> /tmp/hadoop-hive/mapred/local/1525789796609/hive-exec-1.1.0-cdh5.13.4-SNAPSHOT-core.jar
> java    14439 hive  649r      REG                8,1   8112438   3348750 
> /tmp/hadoop-hive/mapred/local/1525789796609/hive-exec-1.1.0-cdh5.13.4-SNAPSHOT-core.jar
>  (deleted)
> java    14439 hive  650r      REG                8,1   8112438   3348748 
> /tmp/hadoop-hive/mapred/local/1525789796610/hive-exec-core.jar (deleted)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org

Reply via email to