When listing a directory, for directory entries it may be more useful
to display the number of files in a directory, rather than the number
of bytes used by all the files in the directory and its subdirectories.
This is a subjective opinion -- comments?
(Currently, the value displayed for a subdirectory is "0".)
On Nov 13, 2006, at 3:25 PM, Hairong Kuang (JIRA) wrote:
dfs list operation is too expensive
-----------------------------------
Key: HADOOP-713
URL: http://issues.apache.org/jira/browse/HADOOP-713
Project: Hadoop
Issue Type: Improvement
Components: dfs
Affects Versions: 0.8.0
Reporter: Hairong Kuang
A list request to dfs returns an array of DFSFileInfo. The DFSFileInfo
of a directory contains a field called contentsLen, indicating its
size, which is computed on the namenode side by recursively going
through its subdirs. While this happens, the whole dfs directory tree
is locked.
The list operation is used a lot by DFSClient for listing a directory,
getting a file's size and # of replicas, and getting the size of dfs.
Only the last operation needs the field contentsLen to be computed.
To reduce its cost, we can add a flag to the list request. contentsLen
is computed only if the flag is set. By default, the flag is false.
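
The proposal above could be sketched roughly as follows. This is only an
illustration, not the actual namenode or DFSClient code: Node, FileInfo,
ListSketch, the single treeLock, and the use of -1 for "not computed" are
all hypothetical stand-ins chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical in-memory directory tree node (not a real Hadoop class).
class Node {
    final String name;
    final boolean isDir;
    final long length;                       // bytes; meaningful for files only
    final List<Node> children = new ArrayList<>();

    Node(String name, boolean isDir, long length) {
        this.name = name;
        this.isDir = isDir;
        this.length = length;
    }

    Node add(Node child) {
        children.add(child);
        return this;
    }

    // Recursive walk: cost grows with the size of the whole subtree.
    long contentsLen() {
        if (!isDir) {
            return length;
        }
        long total = 0;
        for (Node c : children) {
            total += c.contentsLen();
        }
        return total;
    }
}

// Hypothetical stand-in for DFSFileInfo.
class FileInfo {
    final String name;
    final boolean isDir;
    final long len;                          // -1 means "not computed" (assumption)

    FileInfo(String name, boolean isDir, long len) {
        this.name = name;
        this.isDir = isDir;
        this.len = len;
    }
}

class ListSketch {
    // One lock for the whole tree, mirroring the behaviour described above.
    private static final ReentrantLock treeLock = new ReentrantLock();

    // List the direct children of dir; recurse into subdirectories
    // only when the caller explicitly asks for contentsLen.
    static List<FileInfo> list(Node dir, boolean computeContentsLen) {
        treeLock.lock();
        try {
            List<FileInfo> out = new ArrayList<>();
            for (Node c : dir.children) {
                long len;
                if (!c.isDir) {
                    len = c.length;          // a file's size is already known
                } else if (computeContentsLen) {
                    len = c.contentsLen();   // expensive, done under the lock
                } else {
                    len = -1;                // skipped: the lock is held only briefly
                }
                out.add(new FileInfo(c.name, c.isDir, len));
            }
            return out;
        } finally {
            treeLock.unlock();
        }
    }

    // Default: the flag is false.
    static List<FileInfo> list(Node dir) {
        return list(dir, false);
    }

    public static void main(String[] args) {
        Node logs = new Node("logs", true, 0)
                .add(new Node("a.log", false, 100))
                .add(new Node("b.log", false, 250));
        Node root = new Node("/", true, 0)
                .add(logs)
                .add(new Node("readme", false, 10));

        for (FileInfo f : list(root)) {
            System.out.println(f.name + " " + f.len);
        }
        for (FileInfo f : list(root, true)) {
            System.out.println(f.name + " " + f.len);
        }
    }
}
```

With the flag off, a plain listing or a lookup of a file's size touches only
the direct children, so the time spent holding the tree lock no longer
depends on how deep the subdirectories are.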