[ 
https://issues.apache.org/jira/browse/HADOOP-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183617#comment-17183617
 ] 

Haibo Chen commented on HADOOP-17214:
-------------------------------------

The caching inside FileSystem is based on a plain HashMap, where the key is 
partially based on UGI. Whenever UGI.loginUserFromKeytab() is called, the 
underlying UGI object changes, so the previous key-value pair is left unused in 
the cache and new entries are continuously added. Essentially, we have a 
memory leak in the cache.
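
The growth pattern described above can be modeled without Hadoop on the classpath. The following is a minimal sketch, not Hadoop's actual cache code: FakeUgi, Key, and entriesAfterLogins are hypothetical stand-ins, and the sketch assumes the cache key compares the UGI by object identity, so every new login produces a key that never matches an older entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Simplified model of the leak. These classes are illustrative stand-ins,
// not Hadoop's FileSystem cache or UserGroupInformation.
class CacheLeakSketch {

    // Stand-in for a UGI. It does not override equals/hashCode, so two
    // instances for the same principal are still different objects.
    static final class FakeUgi {
        final String principal;
        FakeUgi(String principal) { this.principal = principal; }
    }

    // Stand-in for the cache key: scheme, authority, and the UGI object.
    static final class Key {
        final String scheme;
        final String authority;
        final FakeUgi ugi;

        Key(String scheme, String authority, FakeUgi ugi) {
            this.scheme = scheme;
            this.authority = authority;
            this.ugi = ugi;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            // Assumption: the UGI part of the key is compared by identity.
            return scheme.equals(k.scheme)
                && authority.equals(k.authority)
                && ugi == k.ugi;
        }

        @Override
        public int hashCode() {
            return Objects.hash(scheme, authority, System.identityHashCode(ugi));
        }
    }

    // Simulates `logins` keytab logins, each followed by `lookupsPerLogin`
    // cache lookups for the same URI, and returns the final cache size.
    static int entriesAfterLogins(int logins, int lookupsPerLogin) {
        Map<Key, Object> cache = new HashMap<>();
        for (int i = 0; i < logins; i++) {
            // Each login yields a fresh UGI object for the same principal.
            FakeUgi ugi = new FakeUgi("svc@EXAMPLE.COM");
            for (int j = 0; j < lookupsPerLogin; j++) {
                // A hit only happens when the very same UGI object is reused.
                cache.computeIfAbsent(new Key("hdfs", "nn1:8020", ugi),
                                      k -> new Object());
            }
        }
        return cache.size();
    }
}
```

In this model, one login followed by many lookups keeps a single entry, while every additional login adds one more entry that no later lookup can reach.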

I don't think this subtle behavior is documented anywhere, and we have seen 
many FileSystem users follow a pattern where UGI.loginUserFromKeytab() may be 
called concurrently from multiple threads. Over time, this fills the JVM heap 
with leaked instances in the FileSystem cache.

For most of our internal FileSystem implementations (open source ones too), 
caching is often left enabled (which is the default), so we end up 
discovering this memory leak only in production.

Having a global flag would allow us to avoid such issues in our use cases.
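
One way such a flag could combine with the existing per-scheme setting is sketched below. This is only an illustration under stated assumptions: the global key name "fs.cache.disable.all" is hypothetical (the issue does not name one), a plain Map stands in for Hadoop's Configuration, and the per-scheme key follows the fs.<scheme>.impl.disable.cache pattern.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of how a global switch might be resolved alongside the per-scheme one.
// A Map<String, String> stands in for Hadoop's Configuration object.
class DisableCacheSketch {

    // Hypothetical key for the proposed global switch; not an existing property.
    static final String GLOBAL_KEY = "fs.cache.disable.all";

    // Per-scheme key, following the fs.<scheme>.impl.disable.cache pattern.
    static String perSchemeKey(String scheme) {
        return String.format("fs.%s.impl.disable.cache", scheme);
    }

    // Caching is bypassed if either the global or the per-scheme flag is true.
    // Both default to false, so the cache stays enabled unless asked otherwise.
    static boolean cacheDisabled(Map<String, String> conf, String scheme) {
        boolean global =
            Boolean.parseBoolean(conf.getOrDefault(GLOBAL_KEY, "false"));
        boolean perScheme =
            Boolean.parseBoolean(conf.getOrDefault(perSchemeKey(scheme), "false"));
        return global || perScheme;
    }
}
```

With an empty configuration the sketch keeps caching on for every scheme; setting only the per-scheme key affects that scheme alone, and setting the hypothetical global key turns caching off everywhere.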

> Allow file system caching to be disabled for all file systems
> -------------------------------------------------------------
>
>                 Key: HADOOP-17214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17214
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 3.3.0
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>            Priority: Major
>
> Right now, FileSystem.get(URI uri, Configuration conf) allows caching of file 
> systems to be disabled per scheme.
> We can introduce a new global conf to disable caching for all FileSystems; the 
> default would be false (i.e., do not disable the cache globally).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

