[
https://issues.apache.org/jira/browse/HADOOP-12707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498370#comment-16498370
]
Xinli Shang commented on HADOOP-12707:
--------------------------------------
We hit this issue as well, and it affects us significantly. Our Hadoop cluster
is large and relies heavily on Hadoop security, so please consider treating
this as high priority.
Yes, disabling the cache or calling closeAll() would prevent the leak, but
then we lose the benefit of the cache. We would like this to be fixed so that
we can run a performant service.
Our use case is that we have to create a proxy user and get the FileSystem
inside doAs(). The code is as follows:

    UserGroupInformation ugi = UserGroupInformation.createProxyUser(
        proxyUser, UserGroupInformation.getCurrentUser());
    fs = ugi.doAs(
        (PrivilegedExceptionAction<FileSystem>) () -> FileSystem.get(conf));

Because each call creates a new ugi object, even for the same proxy user, the
FileSystem#Cache#Key is different on every call for that user, so the cache
never hits and keeps growing.
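
To make the effect concrete, here is a minimal, self-contained sketch. It is
not Hadoop code: ProxyUser and Key are hypothetical stand-ins for
UserGroupInformation and FileSystem#Cache#Key, under the assumption that the
user object is compared by identity. It only models how the proxy-user pattern
above fills the cache.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model, not Hadoop code. All class and method names are hypothetical.
class CacheKeySketch {

    /** Stand-in for a UGI: no equals/hashCode override, so identity semantics. */
    static final class ProxyUser {
        final String name;
        ProxyUser(String name) { this.name = name; }
    }

    /** Stand-in for the cache key: scheme/authority plus the user object. */
    static final class Key {
        final String schemeAndAuthority;
        final ProxyUser user;

        Key(String schemeAndAuthority, ProxyUser user) {
            this.schemeAndAuthority = schemeAndAuthority;
            this.user = user;
        }

        @Override
        public int hashCode() {
            // user.hashCode() is the identity hash here.
            return schemeAndAuthority.hashCode() + user.hashCode();
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return schemeAndAuthority.equals(other.schemeAndAuthority)
                    && user.equals(other.user); // identity comparison
        }
    }

    /** Simulates `requests` lookups and returns the number of cached entries. */
    static int cacheSizeAfter(int requests, boolean reuseUser) {
        Map<Key, Object> cache = new HashMap<>();
        ProxyUser shared = new ProxyUser("alice");
        for (int i = 0; i < requests; i++) {
            // A fresh user object per request mimics creating a proxy user each time.
            ProxyUser user = reuseUser ? shared : new ProxyUser("alice");
            cache.computeIfAbsent(new Key("hdfs://nn", user), k -> new Object());
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(cacheSizeAfter(1000, false)); // one entry per request
        System.out.println(cacheSizeAfter(1000, true));  // a single shared entry
    }
}
```

In this model the cache only hits when the very same user object is reused,
which is why the entries pile up for a service that creates a proxy user per
request.
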
It would be great to fix this. HADOOP-6670 does have a valid reason (the
underlying object is mutable), but simply using identityHashCode() is a bold
decision that affects how UGI can be used as a cache key.
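
For context, the mutability concern can be sketched as follows. This is again
not Hadoop code: MutableSubject is a hypothetical stand-in for an object whose
hashCode depends on its contents (as I understand it,
javax.security.auth.Subject behaves this way). It shows why a content-based
hash on a mutable object is unsafe for a HashMap key, which is the trade-off
that identityHashCode() avoids.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified model, not Hadoop code. All class and method names are hypothetical.
class MutableKeySketch {

    /** A mutable object whose hash and equality depend on its contents. */
    static final class MutableSubject {
        final Set<String> credentials = new HashSet<>();

        @Override
        public int hashCode() {
            return credentials.hashCode();
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof MutableSubject
                    && credentials.equals(((MutableSubject) o).credentials);
        }
    }

    /** Returns whether the cached entry is still reachable through its own key. */
    static boolean found(boolean mutateAfterPut) {
        Map<MutableSubject, String> cache = new HashMap<>();
        MutableSubject key = new MutableSubject();
        key.credentials.add("t1");
        cache.put(key, "cached FileSystem");
        if (mutateAfterPut) {
            // The hash changes while the key sits in the map, so the lookup
            // no longer matches the hash stored with the entry.
            key.credentials.add("t2");
        }
        return cache.containsKey(key);
    }

    public static void main(String[] args) {
        System.out.println(found(false)); // entry is found
        System.out.println(found(true));  // entry is lost, although still stored
    }
}
```

So both extremes have a cost: a content-based hash can strand entries once the
object changes, while an identity hash never matches across separately created
objects for the same user.
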
> key of FileSystem inner class Cache contains UGI.hashCode which uses the
> default hashCode method, leading to the memory leak
> --------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-12707
> URL: https://issues.apache.org/jira/browse/HADOOP-12707
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs
> Affects Versions: 2.7.1
> Reporter: sunhaitao
> Assignee: sunhaitao
> Priority: Major
>
> By default, FileSystem.get(conf) gets the fs object from the CACHE, but the
> key of the CACHE contains ugi.hashCode, which uses the default hashCode
> method of the subject instead of the hashCode method overridden by the
> subject.
>   @Override
>   public int hashCode() {
>     return (scheme + authority).hashCode() + ugi.hashCode() + (int)unique;
>   }
> In this case, even for the same user, calling FileSystem.get(conf) twice
> creates two different keys. Over a long duration, this leads to a memory
> leak.