[ 
https://issues.apache.org/jira/browse/HDFS-15289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091107#comment-17091107
 ] 

Uma Maheswara Rao G commented on HDFS-15289:
--------------------------------------------

Thanks a lot for the comments, Virajith. Glad to hear that you are looking at 
similar things.
 Our targeted use cases are pretty much the same as what you mentioned. 
 First and foremost, our goal is to make ViewFSOverloadScheme configurable with 
different schemes, and “hdfs” is the priority use case because systems like 
Hive persist “hdfs://nn1” URIs in their meta stores.

Coming to tools support: we discussed this at some level of detail and thought 
we should first make ViewFS support different schemes (ex: hdfs) and keep the 
mount configuration in a central place so that it is easy to manage.

{quote}saveNamespace and other methods in FileSystem all needed to be 
implemented in ViewFSOveraloadScheme. Do you have any specific plans around 
testing this?
{quote}
I have a question here. In the ViewFSOverloadScheme case, we will have multiple 
target file systems.
 So, when a user calls ViewFSOverloadScheme#saveNameSpace, do we need to 
delegate it to all hdfs-specific target file systems? In reality, users may 
want to run it on specific targets, right?
 The DistributedFileSystem class is tagged with:
{quote}@InterfaceAudience.LimitedPrivate({ "MapReduce", "HBase" })
 @InterfaceStability.Unstable
{quote}
Unfortunately, many users use the DFS classes directly. But we do have a 
publicly exposed class for administration functions:
{quote}/**
 * The public API for performing administrative functions on HDFS. Those writing
 * applications against HDFS should prefer this interface to directly accessing
 * functionality in DistributedFileSystem or DFSClient.
 *
 * Note that this is distinct from the similarly-named DFSAdmin, which
 * is a class that provides the functionality for the CLI `hdfs dfsadmin ...'
 * commands.
 */
 @InterfaceAudience.Public
 @InterfaceStability.Evolving
 public class HdfsAdmin {
{quote}
Can we extend this class to support ViewFS for administration functions?
 I mean something like this: currently HdfsAdmin holds a DFS instance and 
delegates calls to it. Probably we can modify or extend this class to support 
ViewFSOverloadScheme-specific functionality?
 If that does not work, sure, we can discuss which APIs need to be added to 
ViewFSOverloadScheme. We may also need additional APIs for when users want to 
run on specific target child file systems.
 Actually, ViewFS already exposes APIs like getChildFileSystems. We can add 
more functions there to achieve this. 
 Example: ViewFSOverloadScheme#getTargetFS(“/mountPath”) would return DFS if 
/mountPath points to a DFS cluster. 
 It would be great if you have some thoughts on how a “saveNameSpace”-like API 
should be used when multiple target hdfs links are mounted.
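
To make the getTargetFS idea a bit more concrete, here is a rough sketch of resolving a path to its mount target by longest-prefix match. It is plain Java with no Hadoop dependency: MountTableSketch, addLink and getTargetFs are hypothetical names for illustration only, and the real lookup would live in the ViewFS mount-table code rather than in a class like this.

```java
import java.net.URI;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical stand-in for a client-side mount table; not a Hadoop class. */
class MountTableSketch {
  // Mount path -> URI of the target file system it links to.
  private final TreeMap<String, URI> links = new TreeMap<>();

  void addLink(String mountPath, String targetUri) {
    links.put(normalize(mountPath), URI.create(targetUri));
  }

  /** Returns the target of the longest mount point that contains the path. */
  Optional<URI> getTargetFs(String path) {
    String p = normalize(path);
    Optional<URI> best = Optional.empty();
    int bestLength = -1;
    for (Map.Entry<String, URI> link : links.entrySet()) {
      String mount = link.getKey();
      // Match on whole path components so "/data" does not claim "/database".
      boolean matches = p.equals(mount) || p.startsWith(mount + "/");
      if (matches && mount.length() > bestLength) {
        best = Optional.of(link.getValue());
        bestLength = mount.length();
      }
    }
    return best;
  }

  // Drops a trailing slash (except for the root path) so lookups are uniform.
  private static String normalize(String path) {
    boolean trailing = path.length() > 1 && path.endsWith("/");
    return trailing ? path.substring(0, path.length() - 1) : path;
  }
}
```

A caller could then check the scheme of the returned URI (for example “hdfs”) before deciding whether a DFS-specific call makes sense for that mount.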

{quote}Admins will not have a way to directly access HDFS unless admin tooling 
explicitly sets the right properties. Is this something you considered? How do 
you plan to make admin tools work?{quote}
Yes, I agree. However, supporting a single target DFS (the overloaded scheme's 
target fs) would be easy: DFSAdmin gets the FS from ViewFSOverloadScheme, gets 
the overloaded-scheme fs from there, and delegates the calls to it. 
 The challenge is that we will have multiple DFS clusters configured as 
targets. We should make the current DFSAdmin get all target file systems 
matching the hdfs scheme from ViewFSOverloadScheme and delegate the calls to 
them. A more appropriate way may be to extend DFSAdmin. 
 I think that today, if a user configures defaultFS as “viewfs://” and wants to 
connect to some of the child hdfs clusters using DFSAdmin, we have the same 
problem. So this problem exists in ViewFS itself, and we should improve it to 
provide flexibility in accessing child file systems. Probably we have to build 
a ViewDFSAdmin that provides access to child file systems via 
ViewFSOverloadScheme APIs.
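
As a sketch of the two delegation modes discussed above (fan out to every hdfs target, or run on one chosen mount), something like the following could work. Everything here is a hypothetical stand-in written in plain Java: ChildFs, RecordingFs and ViewDfsAdminSketch are not Hadoop types, and saveNamespace() only models the admin call, it does not talk to a NameNode.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a mounted child file system; not a Hadoop type. */
interface ChildFs {
  String scheme();
  void saveNamespace();
}

/** Test double that only counts how often saveNamespace() was called. */
class RecordingFs implements ChildFs {
  private final String scheme;
  int saveNamespaceCalls = 0;

  RecordingFs(String scheme) { this.scheme = scheme; }

  @Override public String scheme() { return scheme; }
  @Override public void saveNamespace() { saveNamespaceCalls++; }
}

/** Hypothetical admin facade over the mounted child file systems. */
class ViewDfsAdminSketch {
  // Mount path -> child file system, kept in insertion order.
  private final Map<String, ChildFs> mounts = new LinkedHashMap<>();

  void mount(String mountPath, ChildFs fs) { mounts.put(mountPath, fs); }

  /** Delegates to every hdfs target and returns the mount paths it touched. */
  List<String> saveNamespaceOnAllHdfsTargets() {
    List<String> touched = new ArrayList<>();
    for (Map.Entry<String, ChildFs> mount : mounts.entrySet()) {
      if ("hdfs".equals(mount.getValue().scheme())) {
        mount.getValue().saveNamespace();
        touched.add(mount.getKey());
      }
    }
    return touched;
  }

  /** Delegates to the single target behind one mount path. */
  void saveNamespaceOn(String mountPath) {
    ChildFs fs = mounts.get(mountPath);
    if (fs == null) {
      throw new IllegalArgumentException("No mount at " + mountPath);
    }
    if (!"hdfs".equals(fs.scheme())) {
      throw new UnsupportedOperationException(mountPath + " is not an hdfs target");
    }
    fs.saveNamespace();
  }
}
```

Note that this sketch calls each mount once; if two mount links point at the same cluster, a real implementation would probably want to de-duplicate the targets first.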

{quote}How to handle cases where DistributedFileSystem is used instead of 
FileSystem?{quote}
If users access DFS directly, they may need to get the child file systems from 
ViewFSOverloadScheme and check each one with instanceof.
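
A minimal sketch of that instanceof check is below. To keep it self-contained, Fs and DfsLike are hypothetical stand-ins for FileSystem and DistributedFileSystem, and the array argument plays the role of whatever getChildFileSystems returns; ChildFsFilter is likewise just an illustrative helper name.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the generic file system base type. */
class Fs {
  final String uri;
  Fs(String uri) { this.uri = uri; }
}

/** Hypothetical stand-in for a DFS-specific subclass. */
class DfsLike extends Fs {
  DfsLike(String uri) { super(uri); }
}

class ChildFsFilter {
  /** Keeps only the children that are instances of the requested type. */
  static <T extends Fs> List<T> childrenOfType(Fs[] children, Class<T> type) {
    List<T> matching = new ArrayList<>();
    for (Fs child : children) {
      // Class#isInstance is the reflective form of the instanceof check.
      if (type.isInstance(child)) {
        matching.add(type.cast(child));
      }
    }
    return matching;
  }
}
```

With this shape, callers that today hold a DFS reference directly would ask for the DfsLike children and then pick the one they need, for example by URI.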

{quote}Do you plan to make ViewFSOveraloadScheme extend 
DistributedFileSystem?{quote}
The plan is to extend the ViewFileSystem class, so we will retain pretty much 
all of the ViewFS client-side mount-building logic as is. We will address the 
FS looping issues and remote configuration loading in the extended class. We 
can also add more usability functions, like getting a child file system by 
scheme.

> Allow viewfs mounts with hdfs scheme and centralized mount table
> ----------------------------------------------------------------
>
>                 Key: HDFS-15289
>                 URL: https://issues.apache.org/jira/browse/HDFS-15289
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: fs
>    Affects Versions: 3.2.0
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>            Priority: Major
>         Attachments: ViewFSOverloadScheme - V1.0.pdf
>
>
> ViewFS provides the flexibility to mount different filesystem types through 
> a mount points configuration table. Additionally, ViewFS allows any fs scheme 
> (not only HDFS) to be configured in the mount table mapping. This approach 
> solves the scalability problems, but users need to reconfigure the 
> filesystem to ViewFS and to its scheme. This is problematic when paths are 
> persisted in meta stores, ex: Hive. Systems like Hive store URIs in the 
> meta store, so changing the file system scheme creates a burden to 
> upgrade/recreate meta stores. In our experience, many users are not ready to 
> change that.
> Router based federation is another implementation that provides coordinated 
> mount points for HDFS federation clusters. Even though it makes mount points 
> easy to handle, it does not allow other (non-HDFS) file systems to be 
> mounted. So, it does not serve the purpose when users want to mount external 
> (non-HDFS) filesystems.
> So, the problem here is: even though many users want to adopt the scalable 
> fs options available, the technical challenges of changing schemes (ex: in 
> meta stores) in deployments are obstructing them. 
> So, we propose to allow the hdfs scheme in a ViewFS-like client side mount 
> system and let users create mount links without changing URI paths. 
> I will upload a detailed design doc shortly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
