Hi,

The regular HDFS client ({{DistributedFileSystem}}) throttles the workload of listing large directories by dividing the work into batches, something like below:

{code}
// fetch the first batch of entries in the directory
DirectoryListing thisListing = dfs.listPaths(
    src, HdfsFileStatus.EMPTY_NAME);
......
if (!thisListing.hasMore()) { // got all entries of the directory
  FileStatus[] stats = new FileStatus[partialListing.length];
{code}
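For reference, the continuation loop in {{DistributedFileSystem#listStatusInternal}} works roughly like the sketch below (simplified, not the verbatim source; error handling and symlink resolution are omitted, and {{dfs}}/{{src}} come from the surrounding client code). Each {{listPaths()}} call returns at most {{dfs.ls.limit}} entries, and {{getLastName()}} serves as the cursor for the next batch:

{code}
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hdfs.protocol.DirectoryListing;
import org.apache.hadoop.hdfs.protocol.HdfsFileStatus;

// Simplified sketch of the batched listing loop.
List<HdfsFileStatus> stats = new ArrayList<>();
DirectoryListing thisListing =
    dfs.listPaths(src, HdfsFileStatus.EMPTY_NAME);
while (thisListing != null) {
  for (HdfsFileStatus status : thisListing.getPartialListing()) {
    stats.add(status);
  }
  if (!thisListing.hasMore()) {
    break; // got all entries of the directory
  }
  // ask the NameNode for the next batch, starting after the
  // last name returned in the previous batch
  thisListing = dfs.listPaths(src, thisListing.getLastName());
}
{code}

This keeps each NameNode RPC bounded, no matter how many children the directory has.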
However, WebHDFS doesn't seem to have this batching logic:

{code}
@Override
public FileStatus[] listStatus(final Path f) throws IOException {
  final HttpOpParam.Op op = GetOpParam.Op.LISTSTATUS;
  return new FsPathResponseRunner<FileStatus[]>(op, f) {
    @Override
    FileStatus[] decodeResponse(Map<?, ?> json) {
      ....
    }
  }.run();
}
{code}

Am I missing anything? If not, it seems a user could DoS the NameNode simply by running {{hadoop fs -ls -R /}} via WebHDFS, since every directory listing is served in a single response.
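To make the contrast concrete, the snippet below is a minimal way to hit the unbatched path (the NameNode address and directory name are hypothetical, for illustration only):

{code}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WebHdfsListingDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Hypothetical NameNode HTTP address, for illustration only.
    FileSystem webhdfs =
        FileSystem.get(URI.create("webhdfs://namenode:50070"), conf);
    // One HTTP round trip: the NameNode serializes every child of
    // /big-dir into a single JSON response, with no dfs.ls.limit-style
    // batching to bound the size of the payload it builds.
    FileStatus[] all = webhdfs.listStatus(new Path("/big-dir"));
    System.out.println("entries: " + all.length);
  }
}
{code}

A recursive {{-ls -R}} just repeats this for every directory in the tree, so the per-request cost scales with the largest directory rather than with a fixed batch size.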