[jira] [Commented] (HDFS-6045) A single RPC API: FileStatus[] getFileStatus(Path f) to get status of all path components.

Steve Loughran (JIRA) Tue, 04 Mar 2014 01:49:05 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13919196#comment-13919196
 ]


Steve Loughran commented on HDFS-6045:
--------------------------------------

This can hit scale limits very fast if a directory tree is both deep and wide, 
and if the scan is done in a synchronized block in HDFS, the cost of the scan 
is visible to all.

Oddly enough, object stores may handle this better than inode filesystems, as 
they effectively do deep scans of a simulated hierarchical filesystem, though 
there's usually a limit on the number of entries returned, so this operation 
would need multiple HTTP round trips.
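
To put a rough number on the round-trip point: a minimal sketch, assuming the 
store caps each listing response at a fixed page size. The class name, method 
name and the figures in main are hypothetical, not taken from HDFS or any 
object store client.

```java
// Hypothetical sketch: estimates the requests a paged listing needs.
// It does not call HDFS or any real object store API.
class PagedListingSketch {

    /** Requests needed to list `entries` items when a response holds at most `pageSize`. */
    static long roundTrips(long entries, int pageSize) {
        if (entries < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("entries must be >= 0 and pageSize > 0");
        }
        // Even an empty listing costs one request to learn that it is empty.
        return Math.max(1, (entries + pageSize - 1) / pageSize);
    }

    public static void main(String[] args) {
        // Assumed figures: 2500 entries under a prefix, 1000 entries per response.
        System.out.println(roundTrips(2500, 1000));
    }
}
```

The count grows linearly with the number of entries, so a wide tree pays for 
its width in requests even when no hierarchy has to be walked.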

> A single RPC API: FileStatus[] getFileStatus(Path f) to get status of all 
> path components.
> ------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6045
>                 URL: https://issues.apache.org/jira/browse/HDFS-6045
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs-client, namenode
>            Reporter: Gera Shegalov
>            Assignee: Gera Shegalov
>
> This comes up in YARN-1771/MAPREDUCE-4907 on the server/client side of PUBLIC 
> Distributed Cache. The deeper the path the more beneficial is the feature.
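
For illustration only, here is a sketch of which paths "all path components" 
would cover for a given absolute path. It assumes the proposed call returns 
one status per component from the root down, which the issue title suggests 
but does not spell out. The class and method names are hypothetical and no 
Hadoop classes are used.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: lists the path components a single batched status
// call would need to cover. Plain strings stand in for Hadoop's Path.
class PathComponentsSketch {

    /** Returns "/", "/a", "/a/b", ... for an absolute path, root first. */
    static List<String> components(String absolutePath) {
        if (!absolutePath.startsWith("/")) {
            throw new IllegalArgumentException("absolute path expected");
        }
        List<String> result = new ArrayList<>();
        result.add("/");
        StringBuilder prefix = new StringBuilder();
        for (String name : absolutePath.split("/")) {
            if (name.isEmpty()) {
                continue; // skips the leading empty token and doubled slashes
            }
            prefix.append('/').append(name);
            result.add(prefix.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        // Each printed path would otherwise need its own status lookup.
        for (String p : components("/user/alice/cache/job.jar")) {
            System.out.println(p);
        }
    }
}
```

A client that checks every component one by one pays one lookup per entry in 
that list, which is why the benefit grows with path depth.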



--
This message was sent by Atlassian JIRA
(v6.2#6252)
