Alex Behm has posted comments on this change.

Change subject: IMPALA-5309: Adds TABLESAMPLE clause for HDFS table refs.
......................................................................

Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6868/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

Line 809:   public HdfsPartition cloneNewFds(List<FileDescriptor> newFds) {
> i don't think this is used anymore.
Done


http://gerrit.cloudera.org:8080/#/c/6868/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 1961:     List<Pair<Long, FileDescriptor>> allFiles =
> i'd say it's cleaner than partition cloning, because the way the cloning wo
Now returning a Map<Long, List<FileDescriptor>>.


http://gerrit.cloudera.org:8080/#/c/6868/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 1940:    * The 'percentBytes' parameter must be between 0 and 100.
> i'd also point out that it allocates something like 12+ bytes per existing
That's definitely a legitimate concern; I added a comment. I also made another minor speed improvement that avoids an unnecessary hash lookup to find the partition of each selected file. I'll run some experiments to see how this function performs.


-- 
To view, visit http://gerrit.cloudera.org:8080/6868

Gerrit-MessageType: comment
Gerrit-Change-Id: Ief112cfb1e4983c5d94c08696dc83da9ccf43f70
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-HasComments: Yes
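
For readers following the HdfsTable.java replies above, here is a rough sketch of what sampling files by a byte percentage and returning them grouped by partition id could look like. It is not the code from the patch: FileSampleSketch, FileDesc, sampleFiles and randomSeed are hypothetical names made up for illustration, and FileDesc only stands in for Impala's FileDescriptor. The sketch keeps each file's partition id in a parallel list, so no separate lookup is needed to find the partition of a selected file, and those per-file side arrays are also where the per-file memory overhead raised in the review would come from.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class FileSampleSketch {
  /** Hypothetical stand-in for a file descriptor; only the length matters here. */
  static final class FileDesc {
    final String name;
    final long length;
    FileDesc(String name, long length) {
      this.name = name;
      this.length = length;
    }
  }

  /**
   * Randomly picks files until at least percentBytes of the total bytes are
   * selected. Returns the picked files grouped by partition id.
   */
  static Map<Long, List<FileDesc>> sampleFiles(
      Map<Long, List<FileDesc>> filesByPartition, long percentBytes, long randomSeed) {
    if (percentBytes < 0 || percentBytes > 100) {
      throw new IllegalArgumentException("percentBytes must be between 0 and 100");
    }
    // Flatten into parallel lists so each file carries its partition id.
    List<Long> partIds = new ArrayList<>();
    List<FileDesc> files = new ArrayList<>();
    long totalBytes = 0;
    for (Map.Entry<Long, List<FileDesc>> e : filesByPartition.entrySet()) {
      for (FileDesc fd : e.getValue()) {
        partIds.add(e.getKey());
        files.add(fd);
        totalBytes += fd.length;
      }
    }
    long targetBytes = (long) Math.ceil(totalBytes * percentBytes / 100.0);

    // Index array for selection without replacement (partial Fisher-Yates).
    int remaining = files.size();
    int[] idx = new int[remaining];
    for (int i = 0; i < remaining; i++) idx[i] = i;

    Random rnd = new Random(randomSeed);
    Map<Long, List<FileDesc>> result = new HashMap<>();
    long selectedBytes = 0;
    while (selectedBytes < targetBytes && remaining > 0) {
      int pick = rnd.nextInt(remaining);
      int fileIdx = idx[pick];
      // Move the last unpicked index into the freed slot.
      idx[pick] = idx[remaining - 1];
      remaining--;
      FileDesc fd = files.get(fileIdx);
      result.computeIfAbsent(partIds.get(fileIdx), k -> new ArrayList<>()).add(fd);
      selectedBytes += fd.length;
    }
    return result;
  }

  public static void main(String[] args) {
    Map<Long, List<FileDesc>> input = new HashMap<>();
    List<FileDesc> p1 = new ArrayList<>();
    p1.add(new FileDesc("a", 100));
    p1.add(new FileDesc("b", 200));
    input.put(1L, p1);
    List<FileDesc> p2 = new ArrayList<>();
    p2.add(new FileDesc("c", 300));
    input.put(2L, p2);
    Map<Long, List<FileDesc>> sample = sampleFiles(input, 50, 42L);
    for (Map.Entry<Long, List<FileDesc>> e : sample.entrySet()) {
      for (FileDesc fd : e.getValue()) {
        System.out.println("partition " + e.getKey() + ": " + fd.name);
      }
    }
  }
}
```

Because whole files are picked, the sample can overshoot the requested percentage; a fixed seed makes a given run repeatable.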