[
https://issues.apache.org/jira/browse/HDFS-11058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15631318#comment-15631318
]
Manoj Govindassamy edited comment on HDFS-11058 at 11/3/16 2:45 AM:
--------------------------------------------------------------------
Thanks for the review [~andrew.wang].
# {{ViewFsMountPoint}} : Yes, this replaces _ViewFileSystem#MountPoint_. This
is more than a refactor as the new definition is slightly different from the
older one. So far there has not been any real outside user for
_ViewFileSystem#MountPoint_. As part of HDFS-5684, I added annotations in
MountPoint and extended the test case to print MountPoints. {{DfUsage}} for
ViewFileSystem is the first real outside user for MountPoint, and I thought
this would be the right time to define a proper {{ViewFsMountPoint}}. IMHO,
{{ViewFsMountPoint}} should be abstracted and expose only the needed attributes
-- the MountedOn path and its target FileSystem. The FileSystem could be an
_hdfs://_ one or it could be one for _MergeFs_, but I don't see a need for
exposing all the NameServices, at least for now.
# {{ViewFsUtil}} : This is the new helper class for ViewFileSystem. I have
already added the annotations _InterfaceAudience.Public,
InterfaceStability.Evolving_ to it. The utility functions inside this class are
introduced only for {{DfUsage}}. Is it still worth separating this out into a
new patch when there are no other callers?
# Responsive and non-responsive filesystems: Yes, this is a new behavior for
{{DF}}, but only in the context of ViewFileSystem. I was weighing
_Availability_ against _Consistency_ for the command and leaned towards
Availability. Normally, when any one of the NameNodes or backing filesystems is
not reachable, the DF command errors out and none of the information is
printed. By skipping the unreachable ones, at least the reachable FileSystems
are printed out. I have seen the unix {{df}} command get stuck at times when
NFS servers are not reachable. That said, I am totally ok with removing this
extra feature and erroring out when any of the backing NameServices is not
reachable. Maybe I will propose it as a separate patch and not as part of
this issue.
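
To make the skip-the-unreachable idea in the last item concrete, here is a
minimal Java sketch. It is not the code from the patch: {{DfSketch}},
{{UsageSource}}, {{Report}} and {{collect}} are hypothetical names, and each
mount point is modeled as a callback returning capacity, used and remaining
bytes rather than as a real Hadoop FileSystem.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DfSketch {
    /** Hypothetical stand-in for querying the target filesystem of one mount point. */
    interface UsageSource {
        /** Returns {capacity, used, remaining} in bytes, or throws if unreachable. */
        long[] query() throws IOException;
    }

    /** Result of one df pass: usage for reachable mounts, plus the mounts that were skipped. */
    static final class Report {
        final Map<String, long[]> reachable = new LinkedHashMap<>();
        final List<String> skipped = new ArrayList<>();
    }

    /** Favors availability: an unreachable mount is recorded and skipped, not fatal. */
    static Report collect(Map<String, UsageSource> mounts) {
        Report report = new Report();
        for (Map.Entry<String, UsageSource> e : mounts.entrySet()) {
            try {
                report.reachable.put(e.getKey(), e.getValue().query());
            } catch (IOException ex) {
                report.skipped.add(e.getKey());
            }
        }
        return report;
    }

    public static void main(String[] args) {
        // Illustrative mount table: one reachable mount, one that fails to respond.
        Map<String, UsageSource> mounts = new LinkedHashMap<>();
        mounts.put("/nn0", () -> new long[] {1000L, 250L, 750L});
        mounts.put("/nn1", () -> { throw new IOException("connection refused"); });

        Report report = collect(mounts);
        for (Map.Entry<String, long[]> e : report.reachable.entrySet()) {
            long[] u = e.getValue();
            System.out.println(e.getKey() + " size=" + u[0] + " used=" + u[1] + " avail=" + u[2]);
        }
        System.out.println("skipped: " + report.skipped);
    }
}
```

The consistency-leaning alternative would simply let the first
{{IOException}} propagate out of {{collect}}, so that nothing is printed when
any backing filesystem is down.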
Please let me know your thoughts.
> Implement 'hadoop fs -df' command for ViewFileSystem
> -------------------------------------------------------
>
> Key: HDFS-11058
> URL: https://issues.apache.org/jira/browse/HDFS-11058
> Project: Hadoop HDFS
> Issue Type: Task
> Affects Versions: 3.0.0-alpha1
> Reporter: Manoj Govindassamy
> Assignee: Manoj Govindassamy
> Labels: viewfs
> Attachments: HDFS-11058.01.patch
>
>
> Df command doesn't seem to work well with ViewFileSystem. It always reports
> used data as 0. Here is the client mount table configuration I am using
> against a federated cluster of 2 NameNodes and 2 DataNodes.
> {code}
> <?xml version="1.0" ?>
> <configuration>
>   <property>
>     <name>fs.defaultFS</name>
>     <value>viewfs://ClusterX/</value>
>   </property>
> ..
>   <property>
>     <name>fs.default.name</name>
>     <value>viewfs://ClusterX/</value>
>   </property>
> ..
>   <property>
>     <name>fs.viewfs.mounttable.ClusterX.link./nn0</name>
>     <value>hdfs://127.0.0.1:50001/</value>
>   </property>
>   <property>
>     <name>fs.viewfs.mounttable.ClusterX.link./nn1</name>
>     <value>hdfs://127.0.0.1:51001/</value>
>   </property>
>   <property>
>     <name>fs.viewfs.mounttable.ClusterX.link./nn2</name>
>     <value>hdfs://127.0.0.1:52001/nn2</value>
>   </property>
>   <property>
>     <name>fs.viewfs.mounttable.ClusterX.link./nn3</name>
>     <value>hdfs://127.0.0.1:52001/nn3</value>
>   </property>
>   <property>
>     <name>fs.viewfs.mounttable.ClusterY.linkMergeSlash</name>
>     <value>hdfs://127.0.0.1:50001/</value>
>   </property>
> </configuration>
> {code}
> The {{Df}} command always reports Size/Available as 8.0 E and Used as 0 for
> any federated cluster.
> {noformat}
> # hadoop fs -fs viewfs://ClusterX/ -df /
> Filesystem Size Used Available Use%
> viewfs://ClusterX/ 9223372036854775807 0 9223372036854775807 0%
> # hadoop fs -fs viewfs://ClusterX/ -df -h /
> Filesystem Size Used Available Use%
> viewfs://ClusterX/ 8.0 E 0 8.0 E 0%
> # hadoop fs -fs viewfs://ClusterY/ -df -h /
> Filesystem Size Used Available Use%
> viewfs://ClusterY/ 8.0 E 0 8.0 E 0%
> {noformat}
> The {{Du}} command, by contrast, seems to work as expected even with ViewFileSystem.
> {noformat}
> # hadoop fs -fs viewfs://ClusterY/ -du -h /
> 10.6 K 31.8 K /build.log.16y
> 0 0 /user
> # hadoop fs -fs viewfs://ClusterX/ -du -h /
> 10.6 K 31.8 K /nn0
> 0 0 /nn1
> 20.2 K 35.8 K /nn3
> 40.6 K 34.3 K /nn4
> {noformat}
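
The Size and Available figures in the quoted {{-df}} output look like a
placeholder rather than a measurement: 9223372036854775807 is exactly Java's
{{Long.MAX_VALUE}}. The short check below is only a sketch of that arithmetic;
{{SentinelSize}} and {{toExbibytes}} are hypothetical names, and it assumes the
"E" suffix printed by {{-h}} denotes binary exabytes (2^60 bytes).

```java
import java.util.Locale;

final class SentinelSize {
    /** Converts a byte count to units of 2^60 bytes (assumed meaning of the "E" suffix). */
    static double toExbibytes(long bytes) {
        return bytes / (double) (1L << 60);
    }

    public static void main(String[] args) {
        long reported = 9223372036854775807L; // Size column from the quoted -df output

        // The reported size is the largest value a signed 64-bit long can hold.
        System.out.println(reported == Long.MAX_VALUE); // prints "true"

        // (2^63 - 1) / 2^60 rounds to 8.0, matching the "8.0 E" shown with -h.
        System.out.println(String.format(Locale.ROOT, "%.1f E", toExbibytes(reported))); // prints "8.0 E"
    }
}
```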