Alex Behm has posted comments on this change. Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading ......................................................................
Patch Set 9: (3 comments) http://gerrit.cloudera.org:8080/#/c/5148/9/fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java File fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java: Line 38: private static DiskIdMapper INSTANCE = new DiskIdMapper(); just make it public http://gerrit.cloudera.org:8080/#/c/5148/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: Line 403: for (List<HdfsPartition> partLists: partsByPath.values()) { It seems simpler to move this logic into the while loop above and update the partition files and global stats incrementally. It will probably also be faster (look at partition.getSize()), but my main motivation is to make the code simpler to understand. Since we now all the info we need in every LocatedFileStatus it seems we can map that file to its corresponding partition(s), and incrementally update all stats and auxiliary data structures. Line 417: loadDiskIds(perFsFileBlocks); I think we can get rid of this function and move the logic into the while loop above. We can get rid of the classes FsKey and FileBlocksInfo. Same reason as above: Let's make this code simpler. -- To view, visit http://gerrit.cloudera.org:8080/5148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Mostafa Mokhtar <mmokh...@cloudera.com> Gerrit-HasComments: Yes