[ 
https://issues.apache.org/jira/browse/HADOOP-12707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498370#comment-16498370
 ] 

Xinli Shang commented on HADOOP-12707:
--------------------------------------

We have hit this issue as well, and it affects us significantly. Our Hadoop 
cluster is fairly large and Hadoop security plays a big part in it, so please 
consider this a high priority.

Yes, disabling the cache or calling closeAll() would prevent the leak, but 
then we lose the benefit of the cache. We would like this to be fixed so that 
we can run a performant service.

Our use case is that we have to create a proxy user and obtain the FileSystem 
inside doAs(). The code is as follows:

UserGroupInformation ugi = UserGroupInformation.createProxyUser(
    proxyUser, UserGroupInformation.getCurrentUser());

fs = ugi.doAs(
    (PrivilegedExceptionAction<FileSystem>) () -> FileSystem.get(conf));

Because createProxyUser() returns a different ugi object on every call, even 
for the same proxy user, the FileSystem#Cache#Key is different on every call 
for that user, so each call adds a new entry to the cache.
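
The effect can be reproduced with a minimal, self-contained sketch. This is a 
simplified model, not Hadoop's actual classes: ModelUgi, ModelKey and 
CacheKeyLeakDemo are hypothetical names, and the model assumes only that the 
ugi hashes and compares by the identity of its underlying subject object.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a UGI whose hashCode/equals are identity-based. */
final class ModelUgi {
    final String userName;
    // Stands in for the wrapped subject; a new one is created per ModelUgi.
    private final Object subject = new Object();

    ModelUgi(String userName) {
        this.userName = userName;
    }

    @Override
    public int hashCode() {
        return System.identityHashCode(subject);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ModelUgi && ((ModelUgi) o).subject == subject;
    }
}

/** Hypothetical stand-in for the cache key (scheme, authority, ugi). */
final class ModelKey {
    final String scheme;
    final String authority;
    final ModelUgi ugi;

    ModelKey(String scheme, String authority, ModelUgi ugi) {
        this.scheme = scheme;
        this.authority = authority;
        this.ugi = ugi;
    }

    @Override
    public int hashCode() {
        return (scheme + authority).hashCode() + ugi.hashCode();
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ModelKey)) {
            return false;
        }
        ModelKey other = (ModelKey) o;
        return scheme.equals(other.scheme)
            && authority.equals(other.authority)
            && ugi.equals(other.ugi);
    }
}

final class CacheKeyLeakDemo {
    /** A fresh ugi per lookup, as with calling createProxyUser() each time. */
    static int entriesWithFreshUgi(int lookups) {
        Map<ModelKey, Object> cache = new HashMap<>();
        for (int i = 0; i < lookups; i++) {
            ModelUgi ugi = new ModelUgi("alice");
            cache.computeIfAbsent(new ModelKey("hdfs", "nn:8020", ugi), k -> new Object());
        }
        return cache.size();
    }

    /** For contrast: one ugi instance reused across all lookups. */
    static int entriesWithReusedUgi(int lookups) {
        Map<ModelKey, Object> cache = new HashMap<>();
        ModelUgi ugi = new ModelUgi("alice");
        for (int i = 0; i < lookups; i++) {
            cache.computeIfAbsent(new ModelKey("hdfs", "nn:8020", ugi), k -> new Object());
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("fresh ugi per lookup:  " + entriesWithFreshUgi(1000));
        System.out.println("reused ugi instance:   " + entriesWithReusedUgi(1000));
    }
}
```

In this model the cache grows by one entry per lookup when a fresh ugi is 
created each time, and stays at a single entry when the same ugi instance is 
reused, even though the user name is "alice" in both cases.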

It would be great to fix this. HADOOP-6670 does have a valid reason for the 
change (the object is mutable), but simply using identityHashCode() is a bold 
decision, and it limits how the cache can be used.
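
To illustrate the trade-off around mutable keys, here is a small sketch using 
only the JDK. It is a generic illustration of hashing a mutable object, not 
code from HADOOP-6670 or from Hadoop; ValueHashedSubject, IdentityHashedSubject 
and MutableKeyDemo are hypothetical names.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical mutable object whose hashCode depends on its contents. */
final class ValueHashedSubject {
    final Set<String> credentials = new HashSet<>();

    @Override
    public int hashCode() {
        return credentials.hashCode();
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ValueHashedSubject
            && ((ValueHashedSubject) o).credentials.equals(credentials);
    }
}

/** Hypothetical mutable object that keeps Object's identity hashCode/equals. */
final class IdentityHashedSubject {
    final Set<String> credentials = new HashSet<>();
}

final class MutableKeyDemo {
    /** Value-based hashing: the key is mutated after insertion. */
    static boolean valueKeyFoundAfterMutation() {
        Map<ValueHashedSubject, String> cache = new HashMap<>();
        ValueHashedSubject s = new ValueHashedSubject();
        s.credentials.add("principal:alice");
        cache.put(s, "fs");
        s.credentials.add("token:1"); // hashCode changes here
        return cache.containsKey(s);
    }

    /** Identity-based hashing: mutation does not affect the lookup. */
    static boolean identityKeyFoundAfterMutation() {
        Map<IdentityHashedSubject, String> cache = new HashMap<>();
        IdentityHashedSubject s = new IdentityHashedSubject();
        s.credentials.add("principal:alice");
        cache.put(s, "fs");
        s.credentials.add("token:1"); // hashCode is unchanged
        return cache.containsKey(s);
    }

    public static void main(String[] args) {
        System.out.println("value-hashed key found:    " + valueKeyFoundAfterMutation());
        System.out.println("identity-hashed key found: " + identityKeyFoundAfterMutation());
    }
}
```

With value-based hashing the entry can no longer be found once the key is 
mutated, which is the kind of problem identity hashing avoids. The cost is 
that two distinct instances never match, even when they describe the same 
user.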

 

> key of FileSystem inner class Cache contains UGI.hashCode which uses the 
> default hashCode method, leading to the memory leak
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-12707
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12707
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.7.1
>            Reporter: sunhaitao
>            Assignee: sunhaitao
>            Priority: Major
>
> By default, the FileSystem.get(conf) method gets the fs object from the 
> CACHE. But the key of the CACHE contains ugi.hashCode, which uses the 
> default hashCode method of the subject instead of the hashCode method 
> overridden by the subject.
>       @Override
>       public int hashCode() {
>         return (scheme + authority).hashCode() + ugi.hashCode() + (int)unique;
>       }
> In this case, even for the same user, if FileSystem.get(conf) is called 
> twice, two different keys will be created. Over a long duration, this will 
> lead to a memory leak.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
