sodonnel commented on a change in pull request #1028: HDFS-14617 - Improve
fsimage load time by writing sub-sections to the fsimage index
URL: https://github.com/apache/hadoop/pull/1028#discussion_r313799031
##########
File path:
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatPBINode.java
##########
@@ -217,33 +272,147 @@ void loadINodeDirectorySection(InputStream in) throws IOException {
         INodeDirectory p = dir.getInode(e.getParent()).asDirectory();
         for (long id : e.getChildrenList()) {
           INode child = dir.getInode(id);
-          addToParent(p, child);
+          if (addToParent(p, child)) {
+            if (child.isFile()) {
+              inodeList.add(child);
+            }
+            if (inodeList.size() >= 1000) {
+              addToCacheAndBlockMap(inodeList);
+              inodeList.clear();
+            }
+          }
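For context, the hunk above buffers newly linked file inodes and flushes them to the inode map and block map in batches of 1000 rather than one at a time. Below is a minimal, generic sketch of that accumulate-and-flush pattern; the `BatchedAdder` class and its `Consumer` sink are illustrative names only, not the PR's implementation, and the real loader flushes whatever remains once the section has been fully read.
```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Illustrative accumulate-and-flush helper; not the PR's implementation. */
class BatchedAdder<T> {
  // Mirrors the 1000-entry flush threshold used in the hunk above.
  private static final int BATCH_SIZE = 1000;

  private final List<T> batch = new ArrayList<>(BATCH_SIZE);
  private final Consumer<List<T>> sink; // stands in for addToCacheAndBlockMap

  BatchedAdder(Consumer<List<T>> sink) {
    this.sink = sink;
  }

  /** Buffer one item and flush once the batch reaches the threshold. */
  void add(T item) {
    batch.add(item);
    if (batch.size() >= BATCH_SIZE) {
      flush();
    }
  }

  /** Push any buffered items to the sink; call once more at the end of the load. */
  void flush() {
    if (!batch.isEmpty()) {
      sink.accept(new ArrayList<>(batch));
      batch.clear();
    }
  }
}
```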
Review comment:
I have added a message like this for both adding an inode and adding an inode
reference to the directory:
```
LOG.warn("Failed to add the inode reference {} to the directory {}",
ref.getId(), p.getId());
```
I opted to log only the inode and directory IDs, as I am not sure the system
can resolve the full path of an inode or directory at this stage, while it is
still loading the image. Also, this "should never happen", so hopefully we will
not see these messages in practice; if we do, it will likely require manual
investigation of image corruption, and the ID numbers should be enough to start
with.
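
Below is a self-contained sketch of the logging described above. The `Inode` and `Directory` types are hypothetical stand-ins for HDFS's `INode`/`INodeDirectory`, and `addToParent()` here only mirrors the boolean-return shape from the diff; it is not the PR's code. The inode-reference message is the one quoted above, while the plain-inode message is assumed by analogy. Only numeric IDs are logged, matching the reasoning that full paths may not be resolvable mid-load.
```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** Hypothetical stand-ins for INode/INodeDirectory; illustration only. */
public class AddToParentLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(AddToParentLoggingSketch.class);

  static final class Inode {
    final long id;
    final boolean isReference;
    Inode(long id, boolean isReference) {
      this.id = id;
      this.isReference = isReference;
    }
  }

  static final class Directory {
    final long id;
    Directory(long id) { this.id = id; }
    // Placeholder failure condition; the real directory can reject a child
    // for reasons such as a duplicate name.
    boolean addChild(Inode child) { return child.id > 0; }
  }

  /** Warn with IDs only when linking a child to its parent directory fails. */
  static boolean addToParent(Directory p, Inode child) {
    if (!p.addChild(child)) {
      if (child.isReference) {
        LOG.warn("Failed to add the inode reference {} to the directory {}",
            child.id, p.id);
      } else {
        LOG.warn("Failed to add the inode {} to the directory {}",
            child.id, p.id);
      }
      return false;
    }
    return true;
  }

  public static void main(String[] args) {
    Directory dir = new Directory(16385);
    addToParent(dir, new Inode(16386, false)); // links silently
    addToParent(dir, new Inode(-1, true));     // logs the reference warning
  }
}
```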