[ https://issues.apache.org/jira/browse/HDFS-16081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17366440#comment-17366440 ]

lei w commented on HDFS-16081:
------------------------------

dfs.ls.limit is not configured on our cluster, so it defaults to 1000. In 
many cases a directory contains nearly one million files, so every list 
operation has to wait several minutes.

> When listing a large directory, the client waits a long time
> ------------------------------------------------------------
>
>                 Key: HDFS-16081
>                 URL: https://issues.apache.org/jira/browse/HDFS-16081
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: lei w
>            Priority: Minor
>
> When we list a large directory, we have to wait a long time. This is 
> because the NameNode returns at most dfs.ls.limit entries per call, and 
> the client then iterates to fetch the remaining files. But in many 
> scenarios we only need part of the files in the directory: we can process 
> that part first and fetch the rest afterwards. So could we add a limit on 
> the number of files ls returns, so that the result is sent back to the 
> client as soon as the specified number of files has been obtained?
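
For reference, the client can already consume the listing incrementally 
rather than blocking on the complete result. A minimal sketch using the 
existing FileSystem#listStatusIterator API (the directory argument and the 
per-entry processing below are placeholders, not part of this issue):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class IncrementalList {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path dir = new Path(args[0]); // placeholder: directory to list

    // listStatusIterator fetches entries from the NameNode lazily, one
    // dfs.ls.limit-sized page per getListing RPC, so the first entries
    // are available after a single round trip.
    RemoteIterator<FileStatus> it = fs.listStatusIterator(dir);
    long processed = 0;
    while (it.hasNext()) {
      FileStatus status = it.next();
      // Process each entry as it arrives instead of waiting for the
      // whole million-entry listing to accumulate.
      System.out.println(status.getPath());
      processed++;
    }
    System.out.println("Processed " + processed + " entries");
  }
}

This does not add the requested server-side cap on the number of returned 
files, but it lets the caller stop early (break out of the loop) after 
processing only the entries it needs.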



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
