[
https://issues.apache.org/jira/browse/HDFS-16044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352348#comment-17352348
]
ludun commented on HDFS-16044:
------------------------------
After checking the code, I found that getLocatedBlocks is called in
FSDirStatAndListingOp#getListing:
{code:java}
for (int i = 0; i < numOfListing && locationBudget > 0; i++) {
  INode child = contents.get(startChild+i);
  byte childStoragePolicy = (includeStoragePolicy && !child.isSymlink())
      ? getStoragePolicyID(child.getLocalStoragePolicyID(),
          parentStoragePolicy)
      : parentStoragePolicy;
  listing[i] = createFileStatus(fsd, iip, child, childStoragePolicy,
      needLocation, false);
  listingCnt++;
  if (listing[i] instanceof HdfsLocatedFileStatus) {
    // Once we hit lsLimit locations, stop.
    // This helps to prevent excessively large response payloads.
    // Approximate #locations with locatedBlockCount() * repl_factor
    LocatedBlocks blks =
        ((HdfsLocatedFileStatus)listing[i]).getLocatedBlocks();
    locationBudget -= (blks == null) ? 0 :
        blks.locatedBlockCount() * listing[i].getReplication();
  }
}
{code}
Whether it is called depends on the type of the object returned by
createFileStatus, which is created in HdfsFileStatus.Builder#build:
{code:java}
public HdfsFileStatus build() {
  if (null == locations && !isdir && null == symlink && !locatedStatus) {
    return new HdfsNamedFileStatus(length, isdir, replication, blocksize,
        mtime, atime, permission, flags, owner, group, symlink, path,
        fileId, childrenNum, feInfo, storagePolicy, ecPolicy);
  }
  return new HdfsLocatedFileStatus(length, isdir, replication, blocksize,
      mtime, atime, permission, flags, owner, group, symlink, path,
      fileId, childrenNum, feInfo, storagePolicy, ecPolicy, locations);
}
{code}
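Because the condition requires !isdir, a directory always falls through to
the second return and becomes an HdfsLocatedFileStatus, so the instanceof
check in getListing passes for it. When isdir is true, build() should return
HdfsNamedFileStatus, not HdfsLocatedFileStatus.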
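
To make the branch concrete, here is a minimal sketch of the decision only. It
is not the real Hadoop code and not a patch: FileStatusChoice, Kind, current()
and proposed() are hypothetical names, and the statuses are reduced to an enum.
current() copies the condition from build() above; proposed() is one possible
reading of the suggestion, assuming no caller needs a located status for a
directory.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the branch in build(); not Hadoop classes.
final class FileStatusChoice {
    // NAMED stands in for HdfsNamedFileStatus, LOCATED for HdfsLocatedFileStatus.
    enum Kind { NAMED, LOCATED }

    /** Same condition as the build() method quoted above. */
    static Kind current(boolean isdir, boolean hasSymlink,
                        List<String> locations, boolean locatedStatus) {
        if (locations == null && !isdir && !hasSymlink && !locatedStatus) {
            return Kind.NAMED;
        }
        return Kind.LOCATED;
    }

    /** Assumed alternative: a directory is never built as a located status. */
    static Kind proposed(boolean isdir, boolean hasSymlink,
                         List<String> locations, boolean locatedStatus) {
        if (isdir) {
            return Kind.NAMED;
        }
        // Non-directories keep the existing behaviour.
        return current(false, hasSymlink, locations, locatedStatus);
    }

    public static void main(String[] args) {
        // A plain directory: no locations, no symlink, not flagged as located.
        System.out.println("current:  " + current(true, false, null, false));
        System.out.println("proposed: " + proposed(true, false, null, false));
        // A file that carries block locations is unaffected.
        System.out.println("file:     "
            + proposed(false, false, Arrays.asList("dn1", "dn2"), false));
    }
}
```

With the current condition the directory case yields LOCATED, which is why
getListing enters the instanceof branch for every directory child; with the
assumed alternative it yields NAMED and the branch is skipped.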
> getListing call getLocatedBlocks even source is a directory
> -----------------------------------------------------------
>
> Key: HDFS-16044
> URL: https://issues.apache.org/jira/browse/HDFS-16044
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: ludun
> Assignee: ludun
> Priority: Major
>
> In a production cluster where getListing is called very frequently, the
> processing time of the RPC requests is very high, so we are trying to
> optimize the performance of getListing requests.
> After some investigation, we found that getListing calls getLocatedBlocks
> even when the source and its children are directories.
> {code:java}
> `---ts=2021-05-27 14:19:15;thread_name=IPC Server handler 86 on 25000;id=e6;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@5fcfe4b2
>     `---[35.068532ms] org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getListing()
>         +---[0.003542ms] org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathComponents() #214
>         +---[0.003053ms] org.apache.hadoop.hdfs.server.namenode.FSDirectory:isExactReservedName() #95
>         +---[0.002938ms] org.apache.hadoop.hdfs.server.namenode.FSDirectory:readLock() #218
>         +---[0.00252ms] org.apache.hadoop.hdfs.server.namenode.INodesInPath:isDotSnapshotDir() #220
>         +---[0.002788ms] org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathSnapshotId() #223
>         +---[0.002905ms] org.apache.hadoop.hdfs.server.namenode.INodesInPath:getLastINode() #224
>         +---[0.002785ms] org.apache.hadoop.hdfs.server.namenode.INode:getStoragePolicyID() #230
>         +---[0.002236ms] org.apache.hadoop.hdfs.server.namenode.INode:isDirectory() #233
>         +---[0.002919ms] org.apache.hadoop.hdfs.server.namenode.INode:asDirectory() #242
>         +---[0.003408ms] org.apache.hadoop.hdfs.server.namenode.INodeDirectory:getChildrenList() #243
>         +---[0.005942ms] org.apache.hadoop.hdfs.server.namenode.INodeDirectory:nextChild() #244
>         +---[0.002467ms] org.apache.hadoop.hdfs.util.ReadOnlyList:size() #245
>         +---[0.005481ms] org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #247
>         +---[0.002176ms] org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #248
>         +---[min=0.00211ms,max=0.005157ms,total=2.247572ms,count=1000] org.apache.hadoop.hdfs.util.ReadOnlyList:get() #252
>         +---[min=0.001946ms,max=0.005411ms,total=2.041715ms,count=1000] org.apache.hadoop.hdfs.server.namenode.INode:isSymlink() #253
>         +---[min=0.002176ms,max=0.005426ms,total=2.264472ms,count=1000] org.apache.hadoop.hdfs.server.namenode.INode:getLocalStoragePolicyID() #254
>         +---[min=0.002251ms,max=0.006849ms,total=2.351935ms,count=1000] org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getStoragePolicyID() #95
>         +---[min=0.006091ms,max=0.012333ms,total=6.439434ms,count=1000] org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:createFileStatus() #257
>         +---[min=0.00269ms,max=0.004995ms,total=2.788194ms,count=1000] org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus:getLocatedBlocks() #265
>         +---[0.003234ms] org.apache.hadoop.hdfs.server.namenode.DirectoryListing:<init>() #274
>         `---[0.002457ms] org.apache.hadoop.hdfs.server.namenode.FSDirectory:readUnlock() #277
> {code}