An additional reason: HDFS does not impose a limit on the number of files in a directory, and some clusters have had millions of files in a single directory. Listing such a directory in one shot produced very large responses, which required a large contiguous memory allocation in the JVM (for the array) and led to unpredictable GC failures.
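For reference, the on-demand API Todd mentions below surfaced in later releases as RemoteIterator-based calls on FileSystem (listLocatedStatus(), and later listStatusIterator()). Here is a minimal client-side sketch, assuming the 2.x-era API; these calls are not in the 1.0.4 code quoted below as far as I know, the ListBigDirectory class is just example scaffolding, and the batch-at-a-time behavior described in the comment is the DistributedFileSystem implementation:

---------------------------------

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class ListBigDirectory {
  public static void main(String[] args) throws IOException {
    Path dir = new Path(args[0]);
    FileSystem fs = dir.getFileSystem(new Configuration());

    // Iterator-style listing: against HDFS, entries are pulled from the
    // NameNode a batch at a time as the iterator advances, so the client
    // never materializes one huge FileStatus[] for the whole directory.
    RemoteIterator<LocatedFileStatus> it = fs.listLocatedStatus(dir);
    long count = 0;
    while (it.hasNext()) {
      LocatedFileStatus status = it.next();
      count++;                 // process each entry as it arrives
    }
    System.out.println(count + " entries under " + dir);
  }
}

---------------------------------

With that style the client only ever holds one batch of entries at a time, which sidesteps the big contiguous allocation described above.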
On Thu, May 2, 2013 at 9:28 AM, Todd Lipcon <t...@cloudera.com> wrote:
> Hi Brad,
>
> The reasoning is that the NameNode locking is somewhat coarse grained. In
> older versions of Hadoop, before it worked this way, we found that listing
> large directories (e.g. with 100k+ files) could end up holding the
> NameNode's lock for quite a long period of time and starve other clients.
>
> Additionally, I believe there is a second API that does the "on-demand"
> fetching of the next set of files from the listing as well, no?
>
> As for the consistency argument, you're correct that you may have a
> non-atomic view of the directory contents, but I can't think of any
> applications where this would be problematic.
>
> -Todd
>
> On Thu, May 2, 2013 at 9:18 AM, Brad Childs <b...@redhat.com> wrote:
>
> > Could someone explain why the DistributedFileSystem's listStatus() method
> > does a piecemeal assembly of a directory listing within the method?
> >
> > Is there a locking issue? What if an element is added to the directory
> > during the operation? What if elements are removed?
> >
> > It would make sense to me that the FileSystem class listStatus() method
> > returned an Iterator, allowing only partial fetching/chatter as needed.
> > But I don't understand why you'd want to assemble a giant array of the
> > listing chunk by chunk.
> >
> > Here's the source of the listStatus() method, and I've linked the entire
> > class below.
> >
> > ---------------------------------
> >
> > public FileStatus[] listStatus(Path p) throws IOException {
> >   String src = getPathName(p);
> >
> >   // fetch the first batch of entries in the directory
> >   DirectoryListing thisListing = dfs.listPaths(
> >       src, HdfsFileStatus.EMPTY_NAME);
> >
> >   if (thisListing == null) { // the directory does not exist
> >     return null;
> >   }
> >
> >   HdfsFileStatus[] partialListing = thisListing.getPartialListing();
> >   if (!thisListing.hasMore()) { // got all entries of the directory
> >     FileStatus[] stats = new FileStatus[partialListing.length];
> >     for (int i = 0; i < partialListing.length; i++) {
> >       stats[i] = makeQualified(partialListing[i], p);
> >     }
> >     statistics.incrementReadOps(1);
> >     return stats;
> >   }
> >
> >   // The directory size is too big that it needs to fetch more
> >   // estimate the total number of entries in the directory
> >   int totalNumEntries =
> >       partialListing.length + thisListing.getRemainingEntries();
> >   ArrayList<FileStatus> listing =
> >       new ArrayList<FileStatus>(totalNumEntries);
> >   // add the first batch of entries to the array list
> >   for (HdfsFileStatus fileStatus : partialListing) {
> >     listing.add(makeQualified(fileStatus, p));
> >   }
> >   statistics.incrementLargeReadOps(1);
> >
> >   // now fetch more entries
> >   do {
> >     thisListing = dfs.listPaths(src, thisListing.getLastName());
> >
> >     if (thisListing == null) {
> >       return null; // the directory is deleted
> >     }
> >
> >     partialListing = thisListing.getPartialListing();
> >     for (HdfsFileStatus fileStatus : partialListing) {
> >       listing.add(makeQualified(fileStatus, p));
> >     }
> >     statistics.incrementLargeReadOps(1);
> >   } while (thisListing.hasMore());
> >
> >   return listing.toArray(new FileStatus[listing.size()]);
> > }
> >
> > --------------------------------------------
> >
> > Ref:
> >
> > https://svn.apache.org/repos/asf/hadoop/common/tags/release-1.0.4/src/hdfs/org/apache/hadoop/hdfs/DistributedFileSystem.java
> > http://docs.oracle.com/javase/6/docs/api/java/util/Iterator.html
> >
> > thanks!
> > -bc
>
> --
> Todd Lipcon
> Software Engineer, Cloudera

--
http://hortonworks.com/download/
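A follow-up on the consistency point in Todd's reply: because the listing is assembled batch by batch, it is not an atomic snapshot, and entries can be created or removed while it is being fetched. Callers that open what they listed should be prepared for entries that have since disappeared. A rough defensive sketch against the 1.x API (the ProcessListing class and the skip-on-FileNotFoundException policy are mine, not from the Hadoop source):

---------------------------------

import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ProcessListing {
  public static void main(String[] args) throws IOException {
    Path dir = new Path(args[0]);
    FileSystem fs = dir.getFileSystem(new Configuration());

    FileStatus[] entries = fs.listStatus(dir);
    if (entries == null) {      // 1.x behavior: the directory does not exist
      return;
    }

    // The array is a non-atomic view of the directory: a file listed here
    // may already be gone (deleted or renamed) by the time we open it.
    for (FileStatus st : entries) {
      if (st.isDir()) {
        continue;               // this sketch only reads plain files
      }
      FSDataInputStream in = null;
      try {
        in = fs.open(st.getPath());
        // ... read/process the stream here ...
      } catch (FileNotFoundException e) {
        // entry vanished between the listing and the open(); skip it
      } finally {
        if (in != null) {
          in.close();
        }
      }
    }
  }
}

---------------------------------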