[ https://issues.apache.org/jira/browse/HDFS-16245?focusedWorklogId=659501&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-659501 ]

ASF GitHub Bot logged work on HDFS-16245:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Oct/21 10:56
            Start Date: 04/Oct/21 10:56
    Worklog Time Spent: 10m 
      Work Description: sodonnel commented on a change in pull request #3512:
URL: https://github.com/apache/hadoop/pull/3512#discussion_r721253275



##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatPBINode.java
##########
@@ -279,6 +283,9 @@ void loadINodeDirectorySection(InputStream in) throws IOException {
         INodeDirectory p = dir.getInode(e.getParent()).asDirectory();
         for (long id : e.getChildrenList()) {
           INode child = dir.getInode(id);
+          if (child.isDirectory()) {

Review comment:
       We are only incrementing here if it's a directory. The inode table /
section contains an entry for every file and directory in the system.
   
   The directory section is what links them all together into parent-child
relationships, so it should contain about the same number of entries as there
are inodes.
   
   I am not sure it makes sense to count just the directories here, as we have
already counted them in the inode section.
   
   Why do you want to count only directories? Would it make more sense to count
each entry and child entry, to give an idea of the number of entries processed
by each parallel section?
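
   To make that suggestion concrete, below is a minimal, self-contained sketch
of counting every child entry processed per parallel sub-section, rather than
only directories. The class, method, and field names are hypothetical
illustrations, not the actual FSImageFormatPBINode code:

       import java.util.List;
       import java.util.Map;
       import java.util.concurrent.atomic.AtomicLong;

       // Hypothetical model of the reviewer's suggestion: count each child
       // entry attached while loading the directory section in parallel.
       class DirectorySectionCounter {
         // Shared across loader threads, like the existing inode counter.
         private final AtomicLong entriesLoaded = new AtomicLong(0);

         // Each parallel sub-section calls this with its parent -> children map.
         void loadSubSection(Map<Long, List<Long>> parentToChildren) {
           for (Map.Entry<Long, List<Long>> e : parentToChildren.entrySet()) {
             for (long childId : e.getValue()) {
               // Count every entry, file or directory, so the total reflects
               // the work done by each sub-section.
               entriesLoaded.incrementAndGet();
               // ... link childId under its parent e.getKey() here ...
             }
           }
         }

         long total() {
           return entriesLoaded.get();
         }
       }

   Because every inode except the root appears as some parent's child, the
final total should roughly match the inode count, making it a useful progress
cross-check rather than a new statistic.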




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 659501)
    Time Spent: 50m  (was: 40m)

> Record the number of INodeDirectory when loading the FSImage file
> -----------------------------------------------------------------
>
>                 Key: HDFS-16245
>                 URL: https://issues.apache.org/jira/browse/HDFS-16245
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: JiangHua Zhu
>            Assignee: JiangHua Zhu
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> When parallel loading of the FsImage is enabled (dfs.image.parallel.load=true), 
> the startup log prints the number of INodes loaded, but there is no way to 
> know how many INodeDirectory entries were loaded, because that count is not 
> currently printed.
> For example, here is some of the startup log output:
> 2021-09-30 19:12:55,012 [15587]-INFO 
> [main:FSImageFormatProtobuf$Loader@340]-The fsimage will be loaded in 
> parallel using 4 threads
> 2021-09-30 19:12:55,031 [15606]-INFO 
> [main:FSImageFormatPBINode$Loader@418]-Loading the INode section in parallel 
> with 12 sub-sections
> 2021-09-30 19:12:55,034 [15609]-INFO 
> [main:FSImageFormatPBINode$Loader@409]-Loading xxxx INodes.
> ......
> 2021-09-30 19:30:37,080 [1077655]-INFO 
> [main:FSImageFormatPBINode$Loader@465]-Completed loading all INode sections. 
> Loaded xxxx inodes.
> 2021-09-30 19:30:37,086 [1077661]-INFO 
> [main:FSImageFormatPBINode$Loader@222]-Loading the INodeDirectory section in 
> parallel with 12 sub-sections
> ......
> 2021-09-30 19:36:58,074 [1458649]-INFO 
> [main:FSImageFormatPBINode$Loader@261]-Completed loading all INodeDirectory 
> sub-sections
> 2021-09-30 19:36:58,076 [1458651]-INFO 
> [main:FSImageFormatPBINode$Loader@339]-Completed update blocks map and name 
> cache, total waiting duration 1 ms.
> 2021-09-30 19:36:58,111 [1458686]-INFO 
> [main:FSImageFormatProtobuf$Loader@248]-Loaded FSImage in xxxx seconds.
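> For reference, parallel loading is switched on in hdfs-site.xml. A minimal 
> sketch of the relevant configuration follows; dfs.image.parallel.load is 
> quoted above, while the two tuning property names are my assumption based on 
> the thread and sub-section counts seen in the log:
> <property>
>   <name>dfs.image.parallel.load</name>
>   <value>true</value>
> </property>
> <property>
>   <!-- Assumed property name: loader threads (the log shows 4). -->
>   <name>dfs.image.parallel.threads</name>
>   <value>4</value>
> </property>
> <property>
>   <!-- Assumed property name: target sub-sections (the log shows 12). -->
>   <name>dfs.image.parallel.target.sections</name>
>   <value>12</value>
> </property>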


