Alex Behm has posted comments on this change.

Change subject: IMPALA-4840: Fix REFRESH performance regression.
......................................................................

Patch Set 2:

(4 comments)

Getting close.

http://gerrit.cloudera.org:8080/#/c/6009/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 787: * for modified files on HDFS. This method uses a FileSystem.listStatus() call on the
Suggest something like:

This method is optimized for the case where the files in the partition have not changed dramatically. It first uses FileSystem.listStatus() ...

then you can mention the perf difference between these functions

Line 789: * block locations using FileSystem.getFileBlockLocations() method. The initial table
the FileSystem.getFileBlockLocations() method

Line 793: * (up to ~40x slower in some cases) and hence it is implemented this way to optimize
suggest moving this wording to the top as suggested above

Line 836: if (unknownDiskIdCount > 0 && LOG.isWarnEnabled()) {
remove the disk id warning as you suggested

--
To view, visit http://gerrit.cloudera.org:8080/6009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I859b9fe93563ba886d0b5db6db42a14c88caada8
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-HasComments: Yes
