Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/20040 )
Change subject: IMPALA-12200: Cap stats NDV from SetOperationStmt.createMetadata
......................................................................


Patch Set 2:

(1 comment)

I reran TpcdsPlannerTest after rebasing and found that some query plans change shape with this patch.

http://gerrit.cloudera.org:8080/#/c/20040/2/testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test:

http://gerrit.cloudera.org:8080/#/c/20040/2/testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test@103
PS2, Line 103:   cardinality=87.18K
The join cardinality increases here, I think because JoinNode changes its cardinality estimate due to the lower build-side NDV. In JoinNode.getJoinCardinality, there is this comment:

   * Generic estimation:
   * cardinality = |child(0)| * |child(1)| / max(NDV(L.c), NDV(R.d))
   * - case A: NDV(L.c) <= NDV(R.d)
   *   every row from child(0) joins with |child(1)| / NDV(R.d) rows
   * - case B: NDV(L.c) > NDV(R.d)
   *   every row from child(1) joins with |child(0)| / NDV(L.c) rows
   * - we adjust the NDVs from both sides to account for predicates that may
   *   might have reduce the cardinality and NDVs

The estimate is still in case A, but with a lower NDV(R.d). Since NDV(R.d) is the divisor in that case, a lower value yields a higher join cardinality.


--
To view, visit http://gerrit.cloudera.org:8080/20040
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0bb2eff5005fdfb11adf31499214c63dd552c05
Gerrit-Change-Number: 20040
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Mon, 12 Jun 2023 22:50:03 +0000
Gerrit-HasComments: Yes
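
To illustrate the case A reasoning in the comment quoted from JoinNode.getJoinCardinality, here is a minimal sketch of the generic estimation formula only. The class name, method name, and all input numbers are hypothetical and chosen for illustration; this is not Impala's actual implementation, and it omits the NDV adjustment for predicates that the quoted comment mentions.

```java
public class JoinCardinalitySketch {
  // Hypothetical helper mirroring only the quoted formula:
  //   cardinality = |child(0)| * |child(1)| / max(NDV(L.c), NDV(R.d))
  // This is NOT JoinNode.getJoinCardinality; it skips predicate-based NDV adjustment.
  public static double genericJoinCardinality(
      double lhsCard, double rhsCard, double lhsNdv, double rhsNdv) {
    return lhsCard * rhsCard / Math.max(lhsNdv, rhsNdv);
  }

  public static void main(String[] args) {
    // Illustrative inputs only, not taken from tpcds-q11.
    double lhsCard = 1000;  // |child(0)|
    double rhsCard = 500;   // |child(1)|
    double lhsNdv = 10;     // NDV(L.c)

    // Both calls stay in case A (NDV(L.c) <= NDV(R.d)); only NDV(R.d) is lowered.
    double before = genericJoinCardinality(lhsCard, rhsCard, lhsNdv, 100);
    double after = genericJoinCardinality(lhsCard, rhsCard, lhsNdv, 50);

    System.out.println(before);  // 5000.0
    System.out.println(after);   // 10000.0
  }
}
```

With these made-up inputs, halving NDV(R.d) while staying in case A doubles the estimate, which is the direction of change seen at line 103.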
