[
https://issues.apache.org/jira/browse/HADOOP-17028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127459#comment-17127459
]
Abhishek Das commented on HADOOP-17028:
---------------------------------------
The filesystems for both the non-leaf nodes (InternalDirOfViewFs) and leaf
nodes (ChRootedFileSystem) get constructed during the viewfs initialize phase.
During this phase the fs object gets created through FsGetter. Here is my rough
idea about the implementation.
* Make sure FsGetter.getNewInstance() doesn't initialize the FileSystem
object when a new instance is requested.
* For InternalDirOfViewFs, initialize is called in the constructor, but
that call invokes the base initialize method.
* For ChRootedFileSystem, if the constructor gets invoked through
ChRootedFileSystem(final URI uri, Configuration conf), then it gets the fs
object from FileSystem.get(uri, conf), which will initialize the fs object. But
this constructor is not used except in tests, so we are good. The other
constructor gets the fs object as an argument, so we have to make sure the
caller won't initialize the fs object before invoking this constructor.
* When a FileSystem API gets invoked on ChRootedFileSystem, before calling
the actual implementation of the underlying fs object through FilterFileSystem,
it can check whether the fs object has been initialized, so that the fs object
gets initialized only once. We can tap into ChRootedFileSystem.fullPath(path)
to check the initialization (through a class-level variable).
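The last point could look roughly like the sketch below. It is only an
illustration of the idea and does not use the real Hadoop classes: TargetFs,
CountingFs, LazyChRootedFs and ensureInitialized are hypothetical names, and
the real FileSystem.initialize takes a URI and a Configuration, which are
left out here. It assumes the wrapper is handed an uninitialized fs object
and initializes it on the first call that goes through fullPath(path).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a target FileSystem; NOT the Hadoop API.
interface TargetFs {
    void initialize();            // expensive step, e.g. opening proxy connections
    String open(String path);
}

// Hypothetical target fs that counts initialize() calls so laziness is observable.
class CountingFs implements TargetFs {
    static final AtomicInteger INIT_CALLS = new AtomicInteger();
    private final String scheme;
    private boolean ready;

    CountingFs(String scheme) { this.scheme = scheme; }

    @Override public void initialize() {
        INIT_CALLS.incrementAndGet();
        ready = true;
    }

    @Override public String open(String path) {
        if (!ready) throw new IllegalStateException("fs not initialized");
        return scheme + "://" + path;
    }
}

// Hypothetical wrapper in the spirit of ChRootedFileSystem: the constructor
// does no initialization; fullPath() triggers it exactly once.
class LazyChRootedFs {
    private final TargetFs fs;
    private final String chRoot;
    private volatile boolean initialized;   // the "class level variable"

    LazyChRootedFs(TargetFs uninitializedFs, String chRoot) {
        this.fs = uninitializedFs;
        this.chRoot = chRoot;
    }

    private void ensureInitialized() {
        if (!initialized) {
            synchronized (this) {
                if (!initialized) {          // double-checked: initialize only once
                    fs.initialize();
                    initialized = true;
                }
            }
        }
    }

    // Every path-based API call is assumed to pass through here.
    private String fullPath(String path) {
        ensureInitialized();
        return chRoot + path;
    }

    String open(String path) { return fs.open(fullPath(path)); }
}

class LazyInitDemo {
    public static void main(String[] args) {
        List<LazyChRootedFs> targets = new ArrayList<>();
        for (int i = 0; i < 10; i++) targets.add(new LazyChRootedFs(new CountingFs("hdfs"), "/data" + i));
        for (int i = 0; i < 2; i++) targets.add(new LazyChRootedFs(new CountingFs("s3a"), "/bucket" + i));

        targets.get(10).open("/file");      // the client touches one s3a target only
        System.out.println("initialized targets: " + CountingFs.INIT_CALLS.get()
                + " of " + targets.size());
    }
}
```

With the 10 hdfs + 2 s3a mount setup from the issue description, a client that
touches a single s3a target triggers one initialization instead of twelve.
Whether fullPath(path) really covers every API entry point in
ChRootedFileSystem would still need to be verified against the actual code.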
[~umamaheswararao] let me know your thoughts about the approach. I can start
working on this.
> ViewFS should initialize target filesystems lazily
> --------------------------------------------------
>
> Key: HADOOP-17028
> URL: https://issues.apache.org/jira/browse/HADOOP-17028
> Project: Hadoop Common
> Issue Type: Bug
> Components: client-mounts, fs, viewfs
> Affects Versions: 3.2.1
> Reporter: Uma Maheswara Rao G
> Priority: Major
>
> Currently, ViewFS initializes all configured target filesystems during
> viewfs#init itself.
> Some target filesystem initializations involve creating heavy objects and
> proxy connections. Ex: DistributedFileSystem#initialize creates a DFSClient
> object, which creates proxy connections to the NN, etc.
> For example, suppose ViewFS is configured with 10 target fs with hdfs URIs
> and 2 targets with s3a.
> If a client works only with an s3a target, ViewFS still initializes
> all targets, irrespective of which ones the client is interested in. That
> means the client performs 10 DFS initializations and 2 s3a initializations.
> The DFS initializations are unnecessary here. So it would be a good idea to
> initialize a target fs only when the first usage call comes to that
> particular target fs scheme.