[
https://issues.apache.org/jira/browse/HDFS-15289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091107#comment-17091107
]
Uma Maheswara Rao G commented on HDFS-15289:
--------------------------------------------
Thanks a lot, Virajith, for the comments. Glad to hear that you guys are looking
into similar things.
Pretty much our targeted use cases are similar to what you mentioned.
First and foremost, our goal is to make ViewFSOverloadScheme configurable with
different schemes, and “hdfs” is a priority use case because Hive-like systems
persist “hdfs://nn1” URIs in meta stores.
Coming to tools support, we discussed this at some level of detail and thought
we should first make ViewFS support different schemes (ex: hdfs) and keep the
mount configuration in a central place so that it is easy to manage.
{quote}saveNamespace and other methods in FileSystem all needed to be
implemented in ViewFSOveraloadScheme. Do you have any specific plans around
testing this?
{quote}
I have a question here. In the ViewFSOverloadScheme case, we will have multiple
target file systems.
So, when a user calls ViewFSOverloadScheme#saveNameSpace, do we need to delegate
this to all hdfs-specific target file systems? In reality, users may want to run
this on specific targets only, right?
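To make the two options concrete, here is a rough sketch in plain Java. TargetFs and MountedView are hypothetical stand-ins invented for illustration, not Hadoop classes, and saveNamespace here only counts calls. It assumes each mount path points to exactly one target.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a mounted target file system; not a Hadoop class.
class TargetFs {
    final String uri;           // e.g. "hdfs://nn1" or "s3a://bucket"
    int saveNamespaceCalls = 0; // counts calls so the behaviour is observable

    TargetFs(String uri) { this.uri = uri; }

    boolean isHdfs() { return uri.startsWith("hdfs://"); }

    // Stand-in for an HDFS-only admin operation.
    void saveNamespace() {
        if (!isHdfs()) {
            throw new UnsupportedOperationException("not an hdfs target: " + uri);
        }
        saveNamespaceCalls++;
    }
}

// Hypothetical mount table: mount path -> target file system.
class MountedView {
    private final Map<String, TargetFs> mounts = new LinkedHashMap<>();

    void mount(String mountPath, TargetFs target) { mounts.put(mountPath, target); }

    // Option 1: fan out to every hdfs target, skipping other schemes.
    List<String> saveNamespaceOnAllHdfsTargets() {
        List<String> touched = new ArrayList<>();
        for (TargetFs fs : mounts.values()) {
            if (fs.isHdfs()) {
                fs.saveNamespace();
                touched.add(fs.uri);
            }
        }
        return touched;
    }

    // Option 2: run only on the target behind one mount path.
    void saveNamespaceOn(String mountPath) {
        TargetFs fs = mounts.get(mountPath);
        if (fs == null) {
            throw new IllegalArgumentException("no mount at " + mountPath);
        }
        fs.saveNamespace();
    }
}
```

With option 1 the caller cannot limit the operation to one cluster, while option 2 needs the caller to know the mount path; supporting both may be the safest choice.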
DistributedFileSystem interface tagged with:
{quote}@InterfaceAudience.LimitedPrivate({ "MapReduce", "HBase" })
@InterfaceStability.Unstable
{quote}
Unfortunately, many users use DFS classes directly. But we do have a publicly
exposed class for administration functions:
{quote}/**
* The public API for performing administrative functions on HDFS. Those writing
* applications against HDFS should prefer this interface to directly accessing
* functionality in DistributedFileSystem or DFSClient.
*
* Note that this is distinct from the similarly-named DFSAdmin, which
* is a class that provides the functionality for the CLI `hdfs dfsadmin ...'
* commands.
*/
@InterfaceAudience.Public
@InterfaceStability.Evolving
public class HdfsAdmin {
{quote}
Can we extend this class to support ViewFS functionality for administration
functions?
I mean something like this: currently HdfsAdmin holds a DFS instance and
delegates calls to it. Probably we can modify this class, or extend it, to
support ViewFSOverloadScheme-specific functionality?
If that does not work, sure, we can discuss which APIs need to be added to
ViewFSOverloadScheme. We may also need additional APIs for when users want to
run on specific target child file systems.
Actually, ViewFS already exposes APIs like getChildFileSystems. We can add
more functions there to achieve this.
Example: ViewFSOverloadScheme#getTargetFS("/mountPath"); this would return a DFS
if /mountPath points to a DFS cluster.
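A minimal sketch of what a getTargetFS-style lookup could do is below. MountResolver and its methods are hypothetical names, it returns the target URI rather than a file system object, and it assumes mount paths do not nest and that the root is not mounted. It is not the ViewFS implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of mount-path resolution; not the ViewFS implementation.
class MountResolver {
    // mount path (e.g. "/user") -> target URI (e.g. "hdfs://nn1")
    private final Map<String, String> links = new LinkedHashMap<>();

    void addLink(String mountPath, String targetUri) {
        links.put(mountPath, targetUri);
    }

    // Returns the target URI of the mount that contains the given path,
    // or null if no mount matches. Matching is done on whole path
    // components, so "/user" does not match "/username".
    // Assumes mount paths do not nest and "/" itself is not a mount.
    String getTargetFs(String path) {
        for (Map.Entry<String, String> link : links.entrySet()) {
            String mount = link.getKey();
            if (path.equals(mount) || path.startsWith(mount + "/")) {
                return link.getValue();
            }
        }
        return null;
    }
}
```

A caller could then decide, based on the scheme of the returned target, whether an HDFS-specific admin call makes sense for that path.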
It would be great to hear your thoughts on how you would want to use a
“saveNameSpace”-like API when multiple target hdfs links are mounted.
{quote}Admins will not have a way to directly access HDFS unless admin tooling
explicitly sets the right properties. Is this something you considered? How do
you plan to make admin tools work?{quote}
Yes, I agree. However, supporting a single target DFS (the overloaded-scheme
target fs) would be easy: DFSAdmin gets the FS from ViewFSOverloadScheme, gets
the overloaded-scheme fs from there, and delegates calls to it.
The challenge here is that we will have multiple DFS clusters configured as
targets. We should make the current DFSAdmin get all target file systems
matching the hdfs scheme from ViewFSOverloadScheme and delegate the calls. A
more appropriate way may be to extend DFSAdmin.
I think today, if a user configures defaultFS as “viewfs://” and wants to
connect to some of the child hdfs clusters using DFSAdmin, we have the same
problem. So this problem exists in ViewFS itself, and we should improve it
to provide flexibility in accessing child file systems. Probably we have to
build a ViewDFSAdmin, which would provide access to child file systems via
ViewFSOverloadScheme APIs.
{quote}How to handle cases where DistributedFileSystem is used instead of
FileSystem?{quote}
If users access DFS directly, they may need to get the child file systems from
ViewFSOverloadScheme and check each one with instanceof.
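As an illustration of that check, here is a small self-contained sketch. ChildFs, DfsChild, OtherChild and ChildFilter are hypothetical stand-ins; in real code the types would be FileSystem and DistributedFileSystem, and the array is assumed to come from something like getChildFileSystems.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the common file system type.
interface ChildFs {
    String uri();
}

// Hypothetical stand-in for a DFS-backed child.
class DfsChild implements ChildFs {
    private final String uri;
    DfsChild(String uri) { this.uri = uri; }
    public String uri() { return uri; }
}

// Hypothetical stand-in for any non-DFS child (e.g. an object store).
class OtherChild implements ChildFs {
    private final String uri;
    OtherChild(String uri) { this.uri = uri; }
    public String uri() { return uri; }
}

class ChildFilter {
    // Keeps only the children that are instances of the requested type,
    // preserving their order. Class#isInstance is the reflective form of
    // the instanceof check.
    static <T extends ChildFs> List<T> childrenOfType(ChildFs[] children, Class<T> type) {
        List<T> matches = new ArrayList<>();
        for (ChildFs child : children) {
            if (type.isInstance(child)) {
                matches.add(type.cast(child));
            }
        }
        return matches;
    }
}
```

A caller that only cares about DFS children would call ChildFilter.childrenOfType(children, DfsChild.class) and use the typed results without further casts.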
{quote}Do you plan to make ViewFSOveraloadScheme extend
DistributedFileSystem?{quote}
The plan is to extend the ViewFileSystem class. So we will retain pretty much
all of the ViewFS client-side mount-building logic as is, and we will address
the FS looping issues and remote configuration loading in the extended class.
We can also add more usability functions, such as getting a child file system
by scheme.
> Allow viewfs mounts with hdfs scheme and centralized mount table
> ----------------------------------------------------------------
>
> Key: HDFS-15289
> URL: https://issues.apache.org/jira/browse/HDFS-15289
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: fs
> Affects Versions: 3.2.0
> Reporter: Uma Maheswara Rao G
> Assignee: Uma Maheswara Rao G
> Priority: Major
> Attachments: ViewFSOverloadScheme - V1.0.pdf
>
>
> ViewFS provides flexibility to mount different filesystem types with mount
> points configuration table. Additionally viewFS provides flexibility to
> configure any fs (not only HDFS) scheme in mount table mapping. This approach
> is solving the scalability problems, but users need to reconfigure the
> filesystem to ViewFS and to its scheme. This will be problematic in the case
> of paths persisted in meta stores, ex: Hive. In systems like Hive, it will
> store uris in meta store. So, changing the file system scheme will create a
> burden to upgrade/recreate meta stores. In our experience many users are not
> ready to change that.
> Router based federation is another implementation to provide coordinated
> mount points for HDFS federation clusters. Even though this provides
> flexibility to handle mount points easily, this will not allow
> other(non-HDFS) file systems to mount. So, this does not solve the purpose
> when users want to mount external(non-HDFS) filesystems.
> So, the problem here is: Even though many users want to adapt to the scalable
> fs options available, technical challenges of changing schemes (ex: in meta
> stores) in deployments are obstructing them.
> So, we propose to allow hdfs scheme in ViewFS like client side mount system
> and provision user to create mount links without changing URI paths.
> I will upload detailed design doc shortly.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)