[ https://issues.apache.org/jira/browse/HDFS-16245?focusedWorklogId=660093&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-660093 ]
ASF GitHub Bot logged work on HDFS-16245:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 05/Oct/21 04:55
Start Date: 05/Oct/21 04:55
Worklog Time Spent: 10m
Work Description: jianghuazhu commented on a change in pull request #3512:
URL: https://github.com/apache/hadoop/pull/3512#discussion_r721887148
##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatPBINode.java
##########
@@ -279,6 +283,9 @@ void loadINodeDirectorySection(InputStream in) throws IOException {
       INodeDirectory p = dir.getInode(e.getParent()).asDirectory();
       for (long id : e.getChildrenList()) {
         INode child = dir.getInode(id);
+        if (child.isDirectory()) {
Review comment:
Thanks @sodonnel for the comment and review.
When loading the FSImage, we already record the total number of inodes
(INodeFile and INodeDirectory combined) in the log.
For example, here is the specific log entry:
`2021-09-30 19:12:55,034 [15609]-INFO [main:FSImageFormatPBINode$Loader@409]-Loading xxxx INodes.`
This is useful, but it is only a sum: we cannot tell how many INodeFiles or
how many INodeDirectories were loaded. If we record either count, the other
follows by subtracting it from the total, and having both figures would help
us locate the cause when loading fails with an exception.
As for why I count INodeDirectory here: in most clusters, far more files are
created than directories, so counting INodeDirectory should take less time
than counting INodeFile.
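To make the intent concrete, here is a minimal sketch, not the actual patch:
`LoaderSketch`, `dirNum`, and the `INode` stand-in below are hypothetical
names. It bumps a thread-safe counter inside the `if (child.isDirectory())`
branch shown in the diff:

```java
// Sketch only; not the HDFS-16245 patch. All names here are illustrative.
import java.util.concurrent.atomic.AtomicLong;

class LoaderSketch {
  /** Minimal stand-in for org.apache.hadoop.hdfs.server.namenode.INode. */
  interface INode {
    boolean isDirectory();
  }

  // AtomicLong so the count stays correct when the INodeDirectory
  // section is loaded by several threads in parallel.
  private final AtomicLong dirNum = new AtomicLong();

  /** Invoked for each child inode as it is attached to its parent. */
  void recordChild(INode child) {
    if (child.isDirectory()) {
      dirNum.incrementAndGet();
    }
  }

  /** Total to print alongside the existing "Loading xxxx INodes." line. */
  long loadedDirectories() {
    return dirNum.get();
  }
}
```

A thread-safe counter is assumed because, with dfs.image.parallel.load=true,
the section is split into sub-sections that may be loaded concurrently.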
Issue Time Tracking
-------------------
Worklog Id: (was: 660093)
Time Spent: 1h (was: 50m)
> Record the number of INodeDirectory when loading the FSImage file
> -----------------------------------------------------------------
>
> Key: HDFS-16245
> URL: https://issues.apache.org/jira/browse/HDFS-16245
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: JiangHua Zhu
> Assignee: JiangHua Zhu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h
> Remaining Estimate: 0h
>
> When parallel loading of the FSImage is enabled
> (dfs.image.parallel.load=true), the startup log prints the number of loaded
> INodes, but there is no way to know how many of them are INodeDirectory,
> because that count is currently not printed.
> For example, here is some startup log output:
> 2021-09-30 19:12:55,012 [15587]-INFO [main:FSImageFormatProtobuf$Loader@340]-The fsimage will be loaded in parallel using 4 threads
> 2021-09-30 19:12:55,031 [15606]-INFO [main:FSImageFormatPBINode$Loader@418]-Loading the INode section in parallel with 12 sub-sections
> 2021-09-30 19:12:55,034 [15609]-INFO [main:FSImageFormatPBINode$Loader@409]-Loading xxxx INodes.
> ......
> 2021-09-30 19:30:37,080 [1077655]-INFO [main:FSImageFormatPBINode$Loader@465]-Completed loading all INode sections. Loaded xxxx inodes.
> 2021-09-30 19:30:37,086 [1077661]-INFO [main:FSImageFormatPBINode$Loader@222]-Loading the INodeDirectory section in parallel with 12 sub-sections
> ......
> 2021-09-30 19:36:58,074 [1458649]-INFO [main:FSImageFormatPBINode$Loader@261]-Completed loading all INodeDirectory sub-sections
> 2021-09-30 19:36:58,076 [1458651]-INFO [main:FSImageFormatPBINode$Loader@339]-Completed update blocks map and name cache, total waiting duration 1 ms.
> 2021-09-30 19:36:58,111 [1458686]-INFO [main:FSImageFormatProtobuf$Loader@248]-Loaded FSImage in xxxx seconds.
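As a rough illustration of how a per-type count could be aggregated across
the parallel sub-sections mentioned in the log above, here is a hedged
sketch; the names and the executor structure are assumptions and do not
mirror the actual loader in FSImageFormatPBINode:

```java
// Sketch only, with hypothetical names; the real loader splits the
// serialized INodeDirectory section differently.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.LongAdder;

class ParallelDirCountSketch {
  /** One unit of work; reports how many of its inodes were directories. */
  interface SubSection {
    long loadAndCountDirectories();
  }

  static long loadInParallel(List<SubSection> sections, int threads)
      throws Exception {
    LongAdder dirCount = new LongAdder();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (SubSection s : sections) {
        // Each worker adds its local directory count to the shared adder.
        futures.add(pool.submit(() -> dirCount.add(s.loadAndCountDirectories())));
      }
      for (Future<?> f : futures) {
        f.get(); // surface any failure from a sub-section
      }
    } finally {
      pool.shutdown();
    }
    // This total is what a log line such as "Loaded N INodeDirectory"
    // would print once all sub-sections have completed.
    return dirCount.sum();
  }
}
```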