Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16723 )
Change subject: IMPALA-10314: Optimize planning time for simple limits ...................................................................... Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/16723/3/fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java File fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java: http://gerrit.cloudera.org:8080/#/c/16723/3/fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java@209 PS3, Line 209: for (FeFsPartition p : partitions) { : numRows += p.getNumFileDescriptors(); : prunedPartitions.add(p); : if (numRows >= analyzer.getSimpleLimitStatus().second) { : // here we only limit the partitions; later in HdfsScanNode we will : // limit the file descriptors within a partition : break; : } > Yes, these are the pruned partitions that will be sent over to the runtime. I think on balance it'd be better to make the partition selection deterministic, just cause it'll be easier to debug, support, etc. Agree that it might be a good optimization for workloads that were mostly comprised of queries like this, but we're not really anticipating such workloads. http://gerrit.cloudera.org:8080/#/c/16723/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/16723/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@845 PS3, Line 845: long numRows = 0; // only used for the simple limit case maybe simpleLimitNumRows to make it clearer what it is? Ok to ignore. -- To view, visit http://gerrit.cloudera.org:8080/16723 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d6a79263bc092e0f3e9a1d72da5618f3cc35574 Gerrit-Change-Number: 16723 Gerrit-PatchSet: 3 Gerrit-Owner: Aman Sinha <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Reviewer: Shant Hovsepian <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Tue, 17 Nov 2020 19:50:42 +0000 Gerrit-HasComments: Yes
