[ 
https://issues.apache.org/jira/browse/HADOOP-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906520#comment-16906520
 ] 

Erik Krogen commented on HADOOP-15565:
--------------------------------------

Thanks for working on this [~LiJinglun]! Seems like a good problem to fix and I 
think the idea is sound.

I didn't get a chance to look at the tests yet, but I have some comments from 
an initial review of the production code:

* I don't think publicly exposing {{cacheSize()}} on {{FileSystem}} is a great 
idea. Can we make it package-private, and if it is needed in non-package-local 
tests, use a test utility to export it publicly?
* Is there a chance the cache will be accessed in a multi-threaded way? If so 
we need to harden it for concurrent access. Looks like it will only work in a 
single-threaded fashion currently. If the FS instances are actually all created 
on startup, then I think we should explicitly populate the cache on startup.
* I agree that swallowing exceptions on child FS close is the right move, but 
probably we should at least put them at INFO level?
* This seems less strict than {{FileSystem.CACHE}} when checking for equality; 
it doesn't use the {{UserGroupInformation}} at all. I think this is safe 
because the cache is local to a single {{ViewFileSystem}}, so all of the inner 
cached instances must share the same UGI, but please help me to confirm.
* We can use {{Objects.hash()}} for the {{hashCode()}} method of {{Key}}.
* On {{ViewFileSystem}} L257, you shouldn't initialize {{fs}} -- you can just 
declare it: {{FileSystem fs;}} (this allows the compiler to help ensure that 
you remember to initialize it later)

> ViewFileSystem.close doesn't close child filesystems and causes FileSystem 
> objects leak.
> ----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-15565
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15565
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Jinglun
>            Assignee: Jinglun
>            Priority: Major
>         Attachments: HADOOP-15565.0001.patch, HADOOP-15565.0002.patch, 
> HADOOP-15565.0003.patch, HADOOP-15565.0004.patch, HADOOP-15565.0005.patch
>
>
> ViewFileSystem.close() does nothing but remove itself from FileSystem.CACHE. 
> It's children filesystems are cached in FileSystem.CACHE and shared by all 
> the ViewFileSystem instances. We could't simply close all the children 
> filesystems because it will break the semantic of FileSystem.newInstance().
> We might add an inner cache to ViewFileSystem, let it cache all the children 
> filesystems. The children filesystems are not shared any more. When 
> ViewFileSystem is closed we close all the children filesystems in the inner 
> cache. The ViewFileSystem is still cached by FileSystem.CACHE so there won't 
> be too many FileSystem instances.
> The FileSystem.CACHE caches the ViewFileSysem instance and the other 
> instances(the children filesystems) are cached in the inner cache.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to