Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/13993 )
Change subject: IMPALA-8834: Short-circuit partition key scan ...................................................................... Patch Set 18: Code-Review+2 (3 comments) Thanks for adding so many tests! The patch looks good to me. Carry on Aman's +1 to +2. http://gerrit.cloudera.org:8080/#/c/13993/16/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/13993/16/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1135 PS16, Line 1135: ty_ > This is a different value - it's an estimate of the relative cost of openin OK. IMO, inputCardinality_ is finally compared with the exec_single_node_rows_threshold query option: https://github.com/apache/impala/blob/a11106e794e793813a86840c2d8599cf149cc7a6/fe/src/main/java/org/apache/impala/planner/Planner.java#L665-L666 So if user set a higher exec_single_node_rows_threshold, this may need to be increased correspondingly. Anyway, I'm ok for the current solution. We can change it later if needed. http://gerrit.cloudera.org:8080/#/c/13993/16/testdata/workloads/functional-planner/queries/PlannerTest/distinct.test File testdata/workloads/functional-planner/queries/PlannerTest/distinct.test: http://gerrit.cloudera.org:8080/#/c/13993/16/testdata/workloads/functional-planner/queries/PlannerTest/distinct.test@526 PS16, Line 526: partition key scan : row-size=0B cardinality=24 > This is what I get locally on my branch. Sorry, I found that I should not set OPTIMIZE_PARTITION_KEY_SCANS to true since this is enabled by default. I get the same plan now. http://gerrit.cloudera.org:8080/#/c/13993/16/testdata/workloads/functional-query/queries/QueryTest/partition-key-scans.test File testdata/workloads/functional-query/queries/QueryTest/partition-key-scans.test: http://gerrit.cloudera.org:8080/#/c/13993/16/testdata/workloads/functional-query/queries/QueryTest/partition-key-scans.test@13 PS16, Line 13: aggregation(SUM, RowsRead): 24 > It's actually the total across all the operators in the plans. I actually r Thanks for the clarification! -- To view, visit http://gerrit.cloudera.org:8080/13993 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I26c87525a4f75ffeb654267b89948653b2e1ff8c Gerrit-Change-Number: 13993 Gerrit-PatchSet: 18 Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Wed, 20 May 2020 08:07:38 +0000 Gerrit-HasComments: Yes
