Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-4840: Fix REFRESH performance regression.
......................................................................


Patch Set 1:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/6009/1//COMMIT_MSG
Commit Message:

Line 20: listStatus() on the partition directory and checking the
> Mention that listStatus() is significantly faster than listFiles(), that's 
Done


Line 22: metadata load still fetches the block locations in bulk.
> Mention the behavioral change of REFRESH with respect to HDFS block movemen
Done


http://gerrit.cloudera.org:8080/#/c/6009/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 341:   private int computeFdBlockMetadata(FileDescriptor fd, 
BlockLocation[] locations)
> setBlockMetadata() or function/arg names that more clearly indicate the inp
Renamed it to setFdBlockMetadata() and added a simple javadoc. Let me know if 
you think it can be further improved.


Line 364:   }
> newline
Done


Line 783:    * modified files on HDFS.
> Mention which case this function is optimized for, e.g., it will be fast if
Done


Line 785:   private void loadMetadataAndDiskIds(HdfsPartition partition) throws 
CatalogException {
> refreshFileMetadata() or something that indicates we are reusing existing f
Agreed.


Line 804:       if (!FileSystemUtil.supportsStorageIds(fs)) {
> can move this up (avoid indexing all files in the partition)
Done


Line 830:       if (unknownDiskIdCount > 0 && LOG.isWarnEnabled()) {
> Do we really need this warning? It has the potential of log spew because it
Now that I think about it, I feel this whole thing is redundant especially 
given the other disk-id warning improvements patch. I'm removing it here, let 
me know if you think we should retain it for some reason.


Line 835:       numHdfsFiles_ += newFileDescs.size();
> where do we subtract the old values?
Either dropPartition()+addPartition() or loadPartitionFileMetadata()


-- 
To view, visit http://gerrit.cloudera.org:8080/6009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I859b9fe93563ba886d0b5db6db42a14c88caada8
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-HasComments: Yes

Reply via email to