[
https://issues.apache.org/jira/browse/HADOOP-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906520#comment-16906520
]
Erik Krogen commented on HADOOP-15565:
--------------------------------------
Thanks for working on this [~LiJinglun]! Seems like a good problem to fix and I
think the idea is sound.
I didn't get a chance to look at the tests yet, but I have some comments from
an initial review of the production code:
* I don't think publicly exposing {{cacheSize()}} on {{FileSystem}} is a great
idea. Can we make it package-private, and if it is needed in non-package-local
tests, use a test utility to export it publicly?
* Is there a chance the cache will be accessed in a multi-threaded way? If so
we need to harden it for concurrent access. Looks like it will only work in a
single-threaded fashion currently. If the FS instances are actually all created
on startup, then I think we should explicitly populate the cache on startup.
* I agree that swallowing exceptions on child FS close is the right move, but
probably we should at least put them at INFO level?
* This seems less strict than {{FileSystem.CACHE}} when checking for equality;
it doesn't use the {{UserGroupInformation}} at all. I think this is safe
because the cache is local to a single {{ViewFileSystem}}, so all of the inner
cached instances must share the same UGI, but please help me to confirm.
* We can use {{Objects.hash()}} for the {{hashCode()}} method of {{Key}}.
* On {{ViewFileSystem}} L257, you shouldn't initialize {{fs}} -- you can just
declare it: {{FileSystem fs;}} (this allows the compiler to help ensure that
you remember to initialize it later)
> ViewFileSystem.close doesn't close child filesystems and causes FileSystem
> objects leak.
> ----------------------------------------------------------------------------------------
>
> Key: HADOOP-15565
> URL: https://issues.apache.org/jira/browse/HADOOP-15565
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Jinglun
> Assignee: Jinglun
> Priority: Major
> Attachments: HADOOP-15565.0001.patch, HADOOP-15565.0002.patch,
> HADOOP-15565.0003.patch, HADOOP-15565.0004.patch, HADOOP-15565.0005.patch
>
>
> ViewFileSystem.close() does nothing but remove itself from FileSystem.CACHE.
> It's children filesystems are cached in FileSystem.CACHE and shared by all
> the ViewFileSystem instances. We could't simply close all the children
> filesystems because it will break the semantic of FileSystem.newInstance().
> We might add an inner cache to ViewFileSystem, let it cache all the children
> filesystems. The children filesystems are not shared any more. When
> ViewFileSystem is closed we close all the children filesystems in the inner
> cache. The ViewFileSystem is still cached by FileSystem.CACHE so there won't
> be too many FileSystem instances.
> The FileSystem.CACHE caches the ViewFileSysem instance and the other
> instances(the children filesystems) are cached in the inner cache.
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]