[ https://issues.apache.org/jira/browse/HDFS-16081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17366440#comment-17366440 ]
lei w edited comment on HDFS-16081 at 6/21/21, 12:46 PM:
---------------------------------------------------------
dfs.ls.limit is not configured on our cluster (the default is 1000). In many cases a directory holds nearly one million files, so every list operation has to wait for several minutes.


was (Author: lei w):
dfs.ls.limit is not configured on our cluster. In many cases a directory holds nearly one million files, so every list operation has to wait for several minutes.

> List a large directory, the client waits for a long time
> --------------------------------------------------------
>
>                 Key: HDFS-16081
>                 URL: https://issues.apache.org/jira/browse/HDFS-16081
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: lei w
>            Priority: Minor
>
> When we list a large directory, the client has to wait a long time. This is because the NameNode returns only dfs.ls.limit entries per call, and the client then iterates to fetch the remaining files. In many scenarios, however, we only need to know about part of the files in the directory, process that part first, and fetch the rest afterwards. So could we add a limit on the number of files an ls returns, so that the result is sent back to the client once the specified number of files has been obtained?
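
For context on the current client-side behaviour, here is a minimal sketch (an illustration, not the NameNode-side change requested above). It uses FileSystem#listStatusIterator, whose RemoteIterator pulls directory entries from the NameNode in batches of dfs.ls.limit, so a caller can stop iterating after handling only the files it needs and continue later. The wanted count and the process() helper are assumptions added for the example.

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class PartialListing {

  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path dir = new Path(args[0]);

    int wanted = 1000;   // assumed: number of entries to handle in the first pass
    int handled = 0;

    // The iterator fetches entries lazily, in NameNode batches of dfs.ls.limit.
    RemoteIterator<FileStatus> it = fs.listStatusIterator(dir);
    while (it.hasNext() && handled < wanted) {
      FileStatus status = it.next();   // pulls from the current batch, refilling as needed
      process(status);                 // hypothetical per-file handling
      handled++;
    }
    // The remaining entries can be fetched later by continuing the iteration.
  }

  private static void process(FileStatus status) {
    System.out.println(status.getPath());
  }
}
{code}

The improvement proposed above would in effect move this early stop into the listing call itself, so the NameNode returns as soon as the requested number of files has been collected instead of the client iterating batch by batch.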