Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-4840: Fix REFRESH performance regression. ......................................................................
Patch Set 1: (9 comments) http://gerrit.cloudera.org:8080/#/c/6009/1//COMMIT_MSG Commit Message: Line 20: listStatus() on the partition directory and checking the > Mention that listStatus() is significantly faster than listFiles(), that's Done Line 22: metadata load still fetches the block locations in bulk. > Mention the behavioral change of REFRESH with respect to HDFS block movemen Done http://gerrit.cloudera.org:8080/#/c/6009/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: Line 341: private int computeFdBlockMetadata(FileDescriptor fd, BlockLocation[] locations) > setBlockMetadata() or function/arg names that more clearly indicate the inp Renamed it to setFdBlockMetadata() and added a simple javadoc. Let me know if you think it can be further improved. Line 364: } > newline Done Line 783: * modified files on HDFS. > Mention which case this function is optimized for, e.g., it will be fast if Done Line 785: private void loadMetadataAndDiskIds(HdfsPartition partition) throws CatalogException { > refreshFileMetadata() or something that indicates we are reusing existing f Agreed. Line 804: if (!FileSystemUtil.supportsStorageIds(fs)) { > can move this up (avoid indexing all files in the partition) Done Line 830: if (unknownDiskIdCount > 0 && LOG.isWarnEnabled()) { > Do we really need this warning? It has the potential of log spew because it Now that I think about it, I feel this whole thing is redundant especially given the other disk-id warning improvements patch. I'm removing it here, let me know if you think we should retain it for some reason. Line 835: numHdfsFiles_ += newFileDescs.size(); > where do we subtract the old values? Either dropPartition()+addPartition() or loadPartitionFileMetadata() -- To view, visit http://gerrit.cloudera.org:8080/6009 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I859b9fe93563ba886d0b5db6db42a14c88caada8 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-HasComments: Yes
