[
https://issues.apache.org/jira/browse/MAPREDUCE-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294963#comment-13294963
]
Vinay commented on MAPREDUCE-4340:
----------------------------------
Hi Deva,
The reason a new FileSystem instance is created for every job is as follows.
FileSystem.Cache.Key compares the scheme, authority and ugi.
NodeManager creates a ugi for every submitted job as follows.
{code:java}UserGroupInformation ugi =
    UserGroupInformation.createRemoteUser("user");{code}
UserGroupInformation.createRemoteUser(..) always returns a different ugi
instance with a different hashcode, even for the same username.
So a filesystem requested with this ugi instance misses the cache and results
in a new instance of FileSystem.
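The cache behaviour described above can be sketched with plain JDK classes. This is a simplified model, not Hadoop's code: `CacheKeyDemo`, `User`, `Key` and `get` are hypothetical names, and the assumption (taken from the description above) is that the user part of the key only matches when it is the very same object.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class CacheKeyDemo {
    /** Hypothetical stand-in for a ugi: compared by object identity only. */
    static final class User {
        final String name;
        User(String name) { this.name = name; }
    }

    /** Simplified cache key made of scheme, authority and user. */
    static final class Key {
        final String scheme;
        final String authority;
        final User user;

        Key(String scheme, String authority, User user) {
            this.scheme = scheme;
            this.authority = authority;
            this.user = user;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key k = (Key) o;
            // Assumption: users match only if they are the same object.
            return scheme.equals(k.scheme)
                && authority.equals(k.authority)
                && user == k.user;
        }

        @Override
        public int hashCode() {
            return Objects.hash(scheme, authority, System.identityHashCode(user));
        }
    }

    static final Map<Key, Object> CACHE = new HashMap<>();

    /** Returns the cached instance for the key, creating one on a miss. */
    static Object get(String scheme, String authority, User user) {
        return CACHE.computeIfAbsent(new Key(scheme, authority, user), k -> new Object());
    }

    public static void main(String[] args) {
        User u1 = new User("user");
        User u2 = new User("user"); // same name, different object
        Object a = get("hdfs", "nn:8020", u1);
        Object b = get("hdfs", "nn:8020", u2);
        Object c = get("hdfs", "nn:8020", u1);
        System.out.println(a == b); // false: same name, but a cache miss
        System.out.println(a == c); // true: same user object, cache hit
    }
}
```

In this model every freshly created user object adds one more cache entry, which mirrors one new FileSystem per job.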
If the filesystem is not closed after reading data from it, all sockets held
in DFSClient#socketCache remain open, resulting in a socket leak.
In NodeManager, the FileSystem instance for each ugi needs to be closed once
all operations with that fs are done.
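A minimal sketch of that cleanup pattern, again using only JDK classes rather than Hadoop. `PerJobCleanupDemo`, `Handle`, `open`, `closeAllFor` and `runJobs` are hypothetical names for illustration. In Hadoop itself, `FileSystem.closeAllForUGI(ugi)` plays a comparable role where the version in use provides it.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class PerJobCleanupDemo {
    /** Hypothetical stand-in for a FileSystem that holds sockets. */
    static final class Handle implements AutoCloseable {
        boolean open = true;

        @Override
        public void close() {
            open = false;
        }
    }

    // Handles grouped by owner object, compared by identity (like a ugi).
    static final Map<Object, List<Handle>> CACHE = new IdentityHashMap<>();

    /** Creates a handle for the owner and remembers it in the cache. */
    static Handle open(Object owner) {
        Handle h = new Handle();
        CACHE.computeIfAbsent(owner, k -> new ArrayList<>()).add(h);
        return h;
    }

    /** Closes and evicts every handle created for this owner. */
    static void closeAllFor(Object owner) {
        List<Handle> handles = CACHE.remove(owner);
        if (handles != null) {
            for (Handle h : handles) {
                h.close();
            }
        }
    }

    /** Runs the given number of jobs and returns how many owners stay cached. */
    static int runJobs(int jobs, boolean cleanUp) {
        CACHE.clear();
        for (int i = 0; i < jobs; i++) {
            Object owner = new Object(); // fresh owner per job
            try {
                open(owner); // the job's work would happen here
            } finally {
                if (cleanUp) {
                    closeAllFor(owner);
                }
            }
        }
        return CACHE.size();
    }

    public static void main(String[] args) {
        System.out.println(runJobs(3, false)); // 3: one leaked entry per job
        System.out.println(runJobs(3, true));  // 0: everything released
    }
}
```

Putting the cleanup in a `finally` block means the entry is released even when the job's work throws.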
Below is sample code that obtains different filesystem instances with
different UGIs.
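{code:java}import java.io.IOException;
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.security.UserGroupInformation;

// The class name is arbitrary; it only makes the sample compilable.
public class UgiFileSystemDemo {
  public static void main(String[] args) throws IOException,
      InterruptedException {
    final HdfsConfiguration conf = new HdfsConfiguration();
    final Path path1 = new Path("file:///home");
    final Path path2 = new Path("file:///home2");
    // First ugi for "user": the filesystem is looked up under this ugi.
    UserGroupInformation ugi = UserGroupInformation.createRemoteUser("user");
    FileSystem fs = ugi.doAs(new PrivilegedExceptionAction<FileSystem>() {
      @Override
      public FileSystem run() throws Exception {
        return path1.getFileSystem(conf);
      }
    });
    // Second ugi with the same username but a different ugi instance.
    UserGroupInformation ugi2 = UserGroupInformation.createRemoteUser("user");
    FileSystem fs2 = ugi2.doAs(new PrivilegedExceptionAction<FileSystem>() {
      @Override
      public FileSystem run() throws Exception {
        return path2.getFileSystem(conf);
      }
    });
    // The two lines are expected to show different FileSystem instances.
    System.out.println(ugi + " : " + fs);
    System.out.println(ugi2 + " : " + fs2);
  }
}{code}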
> Node Manager leaks socket connections connected to Data Node
> ------------------------------------------------------------
>
> Key: MAPREDUCE-4340
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4340
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Devaraj K
> Assignee: Devaraj K
> Priority: Critical
>
> I am running the simple wordcount example with default configurations. Every
> job run adds one datanode socket connection, which then stays in
> CLOSE_WAIT state forever.