[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 11: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 11 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 03 Dec 2024 03:03:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. IMPALA-13535: Add script to restore stats on PlannerTest Impala has several PlannerTest that validate over EXTENDED profile and validate cardinality. In EXTENDED level, profile display stored table stats from HMS like 'numRows' and 'totalSize', which can vary between data loads. They are not validated by PlannerTest. But frequent change of these lines can disturb code review process because they are mostly noise. This patch provides a python script restore-stats-on-planner-tests.py to fix the table stats information in selected .test files. The test files to check and fixed table stats is declared inside the script. It is currently focus on tests under functional-planner/queries/PlannerTest/tpcds/ and some that test against tpcds_partitioned_parquet_snap table. critique-gerrit-review.py is updated to run with python3, trigger restore-stats-on-planner-tests.py, and warn if there is any unnecessary table stats change detected. This patch also fixed table size for tests under functional-planner/queries/PlannerTest/tpcds_cpu_cost/ because all tests there runs with synthetic stats declared in stats-3TB.json. Before the patch, the table stats printed in plan is the real stats from HMS. After this patch, the table stats displayed is calculated from the stats-3TB.json. See IMPALA-12726 for more detail on large scale planner test simulation. Testing: - Manually run the script and confirm that stats line are replaced correctly. - Run affected PlannerTest and all passed. Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Reviewed-on: http://gerrit.cloudera.org:8080/22045 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M bin/jenkins/critique-gerrit-review.py M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A testdata/bin/restore-stats-on-planner-tests.py M testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test M testdata/workloads/functional-planner/queries/PlannerTest/processing-cost-plan-admission-slots.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q54.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q64.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q74.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q84.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-iceberg.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-parquet.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q01.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q02.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q03.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q08.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q10a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14b.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test M testdata/workloads/functional-planner/
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/17613/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 10 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 02 Dec 2024 21:44:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 11: Thank you Michael and Quanlong! -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 11 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 02 Dec 2024 21:37:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 10: Code-Review+2 LGTM. Carry Michael's +1. Thanks for this work to save our future efforts in changing the golden files! -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 10 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 02 Dec 2024 21:36:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 11: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 11 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 02 Dec 2024 21:37:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 11: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/11186/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 11 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 02 Dec 2024 21:37:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/22045/9/bin/jenkins/critique-gerrit-review.py File bin/jenkins/critique-gerrit-review.py: http://gerrit.cloudera.org:8080/#/c/22045/9/bin/jenkins/critique-gerrit-review.py@381 PS9, Line 381: comments[path].append(get_comment_input(UNCOMMITTED_CHANGE)) > At L397 we will discard all changes under testdata/ dir. I think we should Good point. Done. -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 10 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 02 Dec 2024 21:26:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Hello Quanlong Huang, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/22045 to look at the new patch set (#10). Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. IMPALA-13535: Add script to restore stats on PlannerTest Impala has several PlannerTest that validate over EXTENDED profile and validate cardinality. In EXTENDED level, profile display stored table stats from HMS like 'numRows' and 'totalSize', which can vary between data loads. They are not validated by PlannerTest. But frequent change of these lines can disturb code review process because they are mostly noise. This patch provides a python script restore-stats-on-planner-tests.py to fix the table stats information in selected .test files. The test files to check and fixed table stats is declared inside the script. It is currently focus on tests under functional-planner/queries/PlannerTest/tpcds/ and some that test against tpcds_partitioned_parquet_snap table. critique-gerrit-review.py is updated to run with python3, trigger restore-stats-on-planner-tests.py, and warn if there is any unnecessary table stats change detected. This patch also fixed table size for tests under functional-planner/queries/PlannerTest/tpcds_cpu_cost/ because all tests there runs with synthetic stats declared in stats-3TB.json. Before the patch, the table stats printed in plan is the real stats from HMS. After this patch, the table stats displayed is calculated from the stats-3TB.json. See IMPALA-12726 for more detail on large scale planner test simulation. Testing: - Manually run the script and confirm that stats line are replaced correctly. - Run affected PlannerTest and all passed. Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 --- M bin/jenkins/critique-gerrit-review.py M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A testdata/bin/restore-stats-on-planner-tests.py M testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test M testdata/workloads/functional-planner/queries/PlannerTest/processing-cost-plan-admission-slots.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q54.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q64.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q74.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q84.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-iceberg.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-parquet.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q01.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q02.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q03.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q08.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q10a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14b.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/17612/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 9 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 02 Dec 2024 21:25:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 9: (3 comments) http://gerrit.cloudera.org:8080/#/c/22045/9/bin/jenkins/critique-gerrit-review.py File bin/jenkins/critique-gerrit-review.py: http://gerrit.cloudera.org:8080/#/c/22045/9/bin/jenkins/critique-gerrit-review.py@381 PS9, Line 381: if os.path.splitext(path)[1] == ".test": At L397 we will discard all changes under testdata/ dir. I think we should remove this condition to protect changes on other files like txt files under testdata/data/ http://gerrit.cloudera.org:8080/#/c/22045/8/testdata/bin/restore-stats-on-planner-tests.py File testdata/bin/restore-stats-on-planner-tests.py: http://gerrit.cloudera.org:8080/#/c/22045/8/testdata/bin/restore-stats-on-planner-tests.py@37 PS8, Line 37: \d.* > Added requirement such that all tables subject to this script must have sta Ack http://gerrit.cloudera.org:8080/#/c/22045/8/testdata/bin/restore-stats-on-planner-tests.py@128 PS8, Line 128: > Limit to do replacement only if all partitions are selected. Ack -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 9 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 02 Dec 2024 21:10:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 9: (4 comments) http://gerrit.cloudera.org:8080/#/c/22045/8/bin/jenkins/critique-gerrit-review.py File bin/jenkins/critique-gerrit-review.py: http://gerrit.cloudera.org:8080/#/c/22045/8/bin/jenkins/critique-gerrit-review.py@382 PS8, Line 382: comments[path].append(get_comment_input(UNCOMMITTED_CHANGE)) > This discards all changes. To be safe, can we run "git diff" before this me Done http://gerrit.cloudera.org:8080/#/c/22045/8/testdata/bin/restore-stats-on-planner-tests.py File testdata/bin/restore-stats-on-planner-tests.py: http://gerrit.cloudera.org:8080/#/c/22045/8/testdata/bin/restore-stats-on-planner-tests.py@33 PS8, Line 33: \w+).( > Should we accept numbers in the names as well? Maybe use "\w"? Done http://gerrit.cloudera.org:8080/#/c/22045/8/testdata/bin/restore-stats-on-planner-tests.py@37 PS8, Line 37: \d.* > Do we have cases where this is not positive number but something like "unav Added requirement such that all tables subject to this script must have stats collected by default. Thus, there should be no "unavailable" or "-1" case. http://gerrit.cloudera.org:8080/#/c/22045/8/testdata/bin/restore-stats-on-planner-tests.py@128 PS8, Line 128: > In cases where not all the partitions are selected, is it correct to use th Limit to do replacement only if all partitions are selected. -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 9 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 02 Dec 2024 21:00:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Hello Quanlong Huang, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/22045 to look at the new patch set (#9). Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. IMPALA-13535: Add script to restore stats on PlannerTest Impala has several PlannerTest that validate over EXTENDED profile and validate cardinality. In EXTENDED level, profile display stored table stats from HMS like 'numRows' and 'totalSize', which can vary between data loads. They are not validated by PlannerTest. But frequent change of these lines can disturb code review process because they are mostly noise. This patch provides a python script restore-stats-on-planner-tests.py to fix the table stats information in selected .test files. The test files to check and fixed table stats is declared inside the script. It is currently focus on tests under functional-planner/queries/PlannerTest/tpcds/ and some that test against tpcds_partitioned_parquet_snap table. critique-gerrit-review.py is updated to run with python3, trigger restore-stats-on-planner-tests.py, and warn if there is any unnecessary table stats change detected. This patch also fixed table size for tests under functional-planner/queries/PlannerTest/tpcds_cpu_cost/ because all tests there runs with synthetic stats declared in stats-3TB.json. Before the patch, the table stats printed in plan is the real stats from HMS. After this patch, the table stats displayed is calculated from the stats-3TB.json. See IMPALA-12726 for more detail on large scale planner test simulation. Testing: - Manually run the script and confirm that stats line are replaced correctly. - Run affected PlannerTest and all passed. Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 --- M bin/jenkins/critique-gerrit-review.py M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A testdata/bin/restore-stats-on-planner-tests.py M testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test M testdata/workloads/functional-planner/queries/PlannerTest/processing-cost-plan-admission-slots.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q54.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q64.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q74.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q84.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-iceberg.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-parquet.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q01.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q02.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q03.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q08.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q10a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14b.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 8: (4 comments) http://gerrit.cloudera.org:8080/#/c/22045/8/bin/jenkins/critique-gerrit-review.py File bin/jenkins/critique-gerrit-review.py: http://gerrit.cloudera.org:8080/#/c/22045/8/bin/jenkins/critique-gerrit-review.py@382 PS8, Line 382: check_call(['git', 'checkout', '.']) This discards all changes. To be safe, can we run "git diff" before this method and abort if there are already some uncommitted changes? http://gerrit.cloudera.org:8080/#/c/22045/8/testdata/bin/restore-stats-on-planner-tests.py File testdata/bin/restore-stats-on-planner-tests.py: http://gerrit.cloudera.org:8080/#/c/22045/8/testdata/bin/restore-stats-on-planner-tests.py@33 PS8, Line 33: [_a-z] Should we accept numbers in the names as well? Maybe use "\w"? http://gerrit.cloudera.org:8080/#/c/22045/8/testdata/bin/restore-stats-on-planner-tests.py@37 PS8, Line 37: \d.* Do we have cases where this is not positive number but something like "unavailable" or "-1"? http://gerrit.cloudera.org:8080/#/c/22045/8/testdata/bin/restore-stats-on-planner-tests.py@128 PS8, Line 128: stats[1] In cases where not all the partitions are selected, is it correct to use the table level size? -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 8 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 02 Dec 2024 19:03:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 8: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 8 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 25 Nov 2024 18:29:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Michael Smith has uploaded a new patch set (#8) to the change originally created by Riza Suminto. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. IMPALA-13535: Add script to restore stats on PlannerTest Impala has several PlannerTest that validate over EXTENDED profile and validate cardinality. In EXTENDED level, profile display stored table stats from HMS like 'numRows' and 'totalSize', which can vary between data loads. They are not validated by PlannerTest. But frequent change of these lines can disturb code review process because they are mostly noise. This patch provides a python script restore-stats-on-planner-tests.py to fix the table stats information in selected .test files. The test files to check and fixed table stats is declared inside the script. It is currently focus on tests under functional-planner/queries/PlannerTest/tpcds/ and some that test against tpcds_partitioned_parquet_snap table. critique-gerrit-review.py is updated to run with python3, trigger restore-stats-on-planner-tests.py, and warn if there is any unnecessary table stats change detected. This patch also fixed table size for tests under functional-planner/queries/PlannerTest/tpcds_cpu_cost/ because all tests there runs with synthetic stats declared in stats-3TB.json. Before the patch, the table stats printed in plan is the real stats from HMS. After this patch, the table stats displayed is calculated from the stats-3TB.json. See IMPALA-12726 for more detail on large scale planner test simulation. Testing: - Manually run the script and confirm that stats line are replaced correctly. - Run affected PlannerTest and all passed. Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 --- M bin/jenkins/critique-gerrit-review.py M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A testdata/bin/restore-stats-on-planner-tests.py M testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test M testdata/workloads/functional-planner/queries/PlannerTest/processing-cost-plan-admission-slots.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q54.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q64.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q74.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q84.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-iceberg.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-parquet.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q01.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q02.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q03.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q08.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q10a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14b.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q17.test M testdata/workloads/functional-planner
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/17572/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 7 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 23 Nov 2024 19:01:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Hello Quanlong Huang, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/22045 to look at the new patch set (#7). Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. IMPALA-13535: Add script to restore stats on PlannerTest Impala has several PlannerTest that validate over EXTENDED profile and validate cardinality. In EXTENDED level, profile display stored table stats from HMS like 'numRows' and 'totalSize', which can vary between data loads. They are not validated by PlannerTest. But frequent change of these lines can disturb code review process because they are mostly noise. This patch provide a python script restore-stats-on-planner-tests.py to fix the table stats information in selected .test files. The test files to check and fixed table stats is declared inside the sript. It is currently focus on tests under functional-planner/queries/PlannerTest/tpcds/ and some that test against tpcds_partitioned_parquet_snap table. critique-gerrit-review.py is updated to run with python3, trigger restore-stats-on-planner-tests.py, and warn is there is any unnecessary table stats change detected. This patch also fixed table size for tests under functional-planner/queries/PlannerTest/tpcds_cpu_cost/ because all tests there runs with synthetic stats declared in stats-3TB.json. Before the patch, the table stats printed in plan is the real stats from HMS. After this patch, the table stats displayed is calculated from the stats-3TB.json. See IMPALA-12726 for more detail on large scale planner test simulation. Testing: - Manually run the script and confirm that stats line are replaced correctly. - Run affected PlannerTest and all passed. Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 --- M bin/jenkins/critique-gerrit-review.py M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A testdata/bin/restore-stats-on-planner-tests.py M testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test M testdata/workloads/functional-planner/queries/PlannerTest/processing-cost-plan-admission-slots.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q54.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q64.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q74.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q84.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-iceberg.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-parquet.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q01.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q02.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q03.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q08.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q10a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14b.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q17
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/22045/6/testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test File testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test: http://gerrit.cloudera.org:8080/#/c/22045/6/testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test@1217 PS6, Line 1217:HDFS partitions=1824/1824 files=1824 size=200.96MB > gerrit-auto-critic always runs the script from impala#master, not the test I see.. got it. -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 23 Nov 2024 00:38:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/17565/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 23 Nov 2024 00:24:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/22045/6/testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test File testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test: http://gerrit.cloudera.org:8080/#/c/22045/6/testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test@1217 PS6, Line 1217:HDFS partitions=1824/1824 files=1824 size=200.96MB > Not sure why gerrit-auto-critic does not complain about this before. Jenkin gerrit-auto-critic always runs the script from impala#master, not the test branch. -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 23 Nov 2024 00:06:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/22045/6/testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test File testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test: http://gerrit.cloudera.org:8080/#/c/22045/6/testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test@1217 PS6, Line 1217:HDFS partitions=1824/1824 files=1824 size=200.96MB Not sure why gerrit-auto-critic does not complain about this before. Jenkins also persistently running it with python2 even though I already change the shebang to python3. https://jenkins.impala.io/job/gerrit-auto-critic/18475/console https://jenkins.impala.io/job/gerrit-auto-critic/18474/console But I run it in my local machine and it produce comments. ./bin/jenkins/critique-gerrit-review.py --dryrun Requirement already satisfied: flake8==3.9.2 in ./gerrit_critic_venv/lib/python3.8/site-packages (3.9.2) Requirement already satisfied: flake8-diff==0.2.2 in ./gerrit_critic_venv/lib/python3.8/site-packages (0.2.2) Requirement already satisfied: mccabe<0.7.0,>=0.6.0 in ./gerrit_critic_venv/lib/python3.8/site-packages (from flake8==3.9.2) (0.6.1) Requirement already satisfied: pycodestyle<2.8.0,>=2.7.0 in ./gerrit_critic_venv/lib/python3.8/site-packages (from flake8==3.9.2) (2.7.0) Requirement already satisfied: pyflakes<2.4.0,>=2.3.0 in ./gerrit_critic_venv/lib/python3.8/site-packages (from flake8==3.9.2) (2.3.1) Requirement already satisfied: blessings in ./gerrit_critic_venv/lib/python3.8/site-packages (from flake8-diff==0.2.2) (1.7) Requirement already satisfied: six in ./gerrit_critic_venv/lib/python3.8/site-packages (from flake8-diff==0.2.2) (1.16.0) Requirement already satisfied: argparse in ./gerrit_critic_venv/lib/python3.8/site-packages (from flake8-diff==0.2.2) (1.4.0) Updated 1 path from the index { "comments": { "testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test": [ { "message": "Table numRows and totalSize is changed in this planner test. It is unnecessary to change them. Consider running testdata/bin/restore-stats-on-planner-tests.py script to restore them to consistent values.", "line": 0, "side": "REVISION" } ] }, "message": "gerrit-auto-critic failed. You can reproduce it locally using command:\n\n python2 bin/jenkins/critique-gerrit-review.py --dryrun\n\nTo run it, you might need a virtual env with virtualenv installed." } -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 23 Nov 2024 00:01:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Hello Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/22045 to look at the new patch set (#6). Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. IMPALA-13535: Add script to restore stats on PlannerTest Impala has several PlannerTest that validate over EXTENDED profile and validate cardinality. In EXTENDED level, profile display stored table stats from HMS like 'numRows' and 'totalSize', which can vary between data loads. They are not validated by PlannerTest. But frequent change of these lines can disturb code review process because they are mostly noise. This patch provide a python script restore-stats-on-planner-tests.py to fix the table stats information in selected .test files. The test files to check and fixed table stats is declared inside the sript. It is currently focus on tests under functional-planner/queries/PlannerTest/tpcds/ and some that test against tpcds_partitioned_parquet_snap table. critique-gerrit-review.py is updated to run this script and warn is there is any unnecessary table stats change detected. This patch also fixed table size for tests under functional-planner/queries/PlannerTest/tpcds_cpu_cost/ because all tests there runs with synthetic stats declared in stats-3TB.json. Before the patch, the table stats printed in plan is the real stats from HMS. After this patch, the table stats displayed is calculated from the stats-3TB.json. See IMPALA-12726 for more detail on large scale planner test simulation. Testing: - Manually run the script and confirm that stats line are replaced correctly. - Run affected PlannerTest and all passed. Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 --- M bin/jenkins/critique-gerrit-review.py M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A testdata/bin/restore-stats-on-planner-tests.py M testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test M testdata/workloads/functional-planner/queries/PlannerTest/processing-cost-plan-admission-slots.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q54.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q64.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q74.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q84.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-iceberg.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-parquet.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q01.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q02.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q03.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q08.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q10a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14b.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q17.test M testdata/workloads/functional-planner/queries/Planne
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 5: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/17564/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 22 Nov 2024 22:59:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 4: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/17563/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 22 Nov 2024 22:39:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 5: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/22045/5/bin/jenkins/critique-gerrit-review.py File bin/jenkins/critique-gerrit-review.py: http://gerrit.cloudera.org:8080/#/c/22045/5/bin/jenkins/critique-gerrit-review.py@1 PS5, Line 1: #!/usr/bin/python3 I think we have python3 everywhere we'd run this now. -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 22 Nov 2024 22:34:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Hello Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/22045 to look at the new patch set (#5). Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. IMPALA-13535: Add script to restore stats on PlannerTest Impala has several PlannerTest that validate over EXTENDED profile and validate cardinality. In EXTENDED level, profile display stored table stats from HMS like 'numRows' and 'totalSize', which can vary between data loads. They are not validated by PlannerTest. But frequent change of these lines can disturb code review process because they are mostly noise. This patch provide a python script restore-stats-on-planner-tests.py to fix the table stats information in selected .test files. The test files to check and fixed table stats is declared inside the sript. It is currently focus on tests under functional-planner/queries/PlannerTest/tpcds/ and some that test against tpcds_partitioned_parquet_snap table. critique-gerrit-review.py is updated to run this script and warn is there is any unnecessary table stats change detected. This patch also fixed table size for tests under functional-planner/queries/PlannerTest/tpcds_cpu_cost/ because all tests there runs with synthetic stats declared in stats-3TB.json. Before the patch, the table stats printed in plan is the real stats from HMS. After this patch, the table stats displayed is calculated from the stats-3TB.json. See IMPALA-12726 for more detail on large scale planner test simulation. Testing: - Manually run the script and confirm that stats line are replaced correctly. - Run affected PlannerTest and all passed. Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 --- M bin/jenkins/critique-gerrit-review.py M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A testdata/bin/restore-stats-on-planner-tests.py M testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test M testdata/workloads/functional-planner/queries/PlannerTest/processing-cost-plan-admission-slots.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q54.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q64.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q74.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q84.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-iceberg.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-parquet.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q01.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q02.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q03.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q08.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q10a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14b.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q17.test M testdata/workloads/functional-planner/queries/Planne
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Hello Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/22045 to look at the new patch set (#4). Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. IMPALA-13535: Add script to restore stats on PlannerTest Impala has several PlannerTest that validate over EXTENDED profile and validate cardinality. In EXTENDED level, profile display stored table stats from HMS like 'numRows' and 'totalSize', which can vary between data loads. They are not validated by PlannerTest. But frequent change of these lines can disturb code review process because they are mostly noise. This patch provide a python script restore-stats-on-planner-tests.py to fix the table stats information in selected .test files. The test files to check and fixed table stats is declared inside the sript. It is currently focus on tests under functional-planner/queries/PlannerTest/tpcds/ and some that test against tpcds_partitioned_parquet_snap table. critique-gerrit-review.py is updated to run this script and warn is there is any unnecessary table stats change detected. This patch also fixed table size for tests under functional-planner/queries/PlannerTest/tpcds_cpu_cost/ because all tests there runs with synthetic stats declared in stats-3TB.json. Before the patch, the table stats printed in plan is the real stats from HMS. After this patch, the table stats displayed is calculated from the stats-3TB.json. See IMPALA-12726 for more detail on large scale planner test simulation. Testing: - Manually run the script and confirm that stats line are replaced correctly. - Run affected PlannerTest and all passed. Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 --- M bin/jenkins/critique-gerrit-review.py M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A testdata/bin/restore-stats-on-planner-tests.py M testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test M testdata/workloads/functional-planner/queries/PlannerTest/processing-cost-plan-admission-slots.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q54.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q64.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q74.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q84.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-iceberg.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-parquet.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q01.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q02.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q03.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q08.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q10a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14b.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q17.test M testdata/workloads/functional-planner/queries/Planne
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/22045/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/22045/3//COMMIT_MSG@16 PS3, Line 16: This patch provide a python script to fix the table stats information in > Maybe we should have or update a README then. I will try to integrate the script with ./bin/jenkins/critique-gerrit-review.py -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 22 Nov 2024 20:22:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/22045/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/22045/3//COMMIT_MSG@16 PS3, Line 16: This patch provide a python script to fix the table stats information in > It's not clear to me how this script should be used. Do you run it after an I expect contributor to run ./tesdata/bin/restore-stats-on-planner-test.py if their patch changes any test listed in the script. If nothing change after execution, all good. If the table size line change, that change should be included into patch as well. This is because that table size line is specific only to their test data in their local machine. http://gerrit.cloudera.org:8080/#/c/22045/3//COMMIT_MSG@21 PS3, Line 21: This patch also fixed table size for tests under > Is this the only reason stats changed in the test files? Yes, for all changes under tpcds_cpu_cost/. Changes undet tpcds/ comes from executing restore-stats-on-planner-test.py I plan for more tables and tests to check later: https://gerrit.cloudera.org/c/22047/3/testdata/bin/restore-stats-on-planner-tests.py http://gerrit.cloudera.org:8080/#/c/22045/3/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/22045/3/fe/src/main/java/org/apache/impala/catalog/Table.java@223 PS3, Line 223: protected double testMetadataScale_ = -1.0; > I'm not clear why this is changed. Is it called out in the commit message? Not explicitly. I will update the commit message. I change the default to -1 to make distinction whether this is in testing mode (>0) or not (-1). -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 22 Nov 2024 19:00:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/22045/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/22045/3//COMMIT_MSG@16 PS3, Line 16: This patch provide a python script to fix the table stats information in > I expect contributor to run ./tesdata/bin/restore-stats-on-planner-test.py Maybe we should have or update a README then. http://gerrit.cloudera.org:8080/#/c/22045/3//COMMIT_MSG@21 PS3, Line 21: This patch also fixed table size for tests under > Yes, for all changes under tpcds_cpu_cost/. I guess I'm still not clear what the script is setting them to. I see in the script it has some hard-coded, so it'd be nice to call that out in the commit message. -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 22 Nov 2024 19:08:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/22045/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/22045/3//COMMIT_MSG@16 PS3, Line 16: This patch provide a python script to fix the table stats information in It's not clear to me how this script should be used. Do you run it after any changes, and it ensures stats don't change unless something about our estimation changes? http://gerrit.cloudera.org:8080/#/c/22045/3//COMMIT_MSG@21 PS3, Line 21: This patch also fixed table size for tests under Is this the only reason stats changed in the test files? http://gerrit.cloudera.org:8080/#/c/22045/3/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/22045/3/fe/src/main/java/org/apache/impala/catalog/Table.java@223 PS3, Line 223: protected double testMetadataScale_ = -1.0; I'm not clear why this is changed. Is it called out in the commit message? -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Fri, 22 Nov 2024 18:51:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/22045 to look at the new patch set (#3). Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. IMPALA-13535: Add script to restore stats on PlannerTest Impala has several PlannerTest that validate over EXTENDED profile and validate cardinality. In EXTENDED level, profile display stored table stats from HMS like 'numRows' and 'totalSize', which can vary between data loads. They are not validated by PlannerTest. But frequent change of these lines can disturb code review process because they are mostly noise. This patch provide a python script to fix the table stats information in selected .test files. It currently focus on tests under functional-planner/queries/PlannerTest/tpcds/ and some that test against tpcds_partitioned_parquet_snap table. This patch also fixed table size for tests under functional-planner/queries/PlannerTest/tpcds_cpu_cost/ because all tests there runs with synthetic stats declared in stats-3TB.json. Testing: - Manually run the script and confirm that stats line are replaced correctly. - Run affected PlannerTest and all passed. Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 --- M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A testdata/bin/restore-stats-on-planner-tests.py M testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test M testdata/workloads/functional-planner/queries/PlannerTest/processing-cost-plan-admission-slots.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q54.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q64.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q74.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q84.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-iceberg.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-parquet.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q01.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q02.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q03.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q08.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q10a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14b.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q17.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q18.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q19.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q20.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q21.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q22.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23a.test M testdata/workloads/functional
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/17461/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 13 Nov 2024 22:17:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/17448/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 13 Nov 2024 05:25:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/22045 to look at the new patch set (#2). Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. IMPALA-13535: Add script to restore stats on PlannerTest Impala has several PlannerTest that validate over EXTENDED profile and validate cardinality. In EXTENDED level, profile display stored table stats from HMS like 'numRows' and 'totalSize', which can vary between data loads. They are not validated by PlannerTest. But frequent change of these lines can disturb code review process because they are mostly noise. This patch provide a python script to fix the table stats information in selected .test files. It currently focus on tests under functional-planner/queries/PlannerTest/tpcds/ and some that test against tpcds_partitioned_parquet_snap table. This patch also fixed table size for tests under functional-planner/queries/PlannerTest/tpcds_cpu_cost/ because all tests there runs with synthetic stats declared in stats-3TB.json. Testing: - Manually run the script and confirm that stats line are replaced correctly. - Run affected PlannerTest and all passed. Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 --- M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A testdata/bin/restore-stats-on-planner-tests.py M testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test M testdata/workloads/functional-planner/queries/PlannerTest/processing-cost-plan-admission-slots.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q54.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q64.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q74.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q84.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-iceberg.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-parquet.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q01.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q02.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q03.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q08.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q10a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14b.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q17.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q18.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q19.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q20.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q21.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q22.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23a.test M testdata/workloads/functional
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/17420/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 08 Nov 2024 20:39:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/22045 ) Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. Patch Set 1: (1 comment) gerrit-auto-critic failed. You can reproduce it locally using command: python2 bin/jenkins/critique-gerrit-review.py --dryrun To run it, you might need a virtual env with virtualenv installed. http://gerrit.cloudera.org:8080/#/c/22045/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/22045/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@2062 PS1, Line 2062: bytesToDisplay = (long) ((double) partsPerFs.getValue() / table.getPartitions().size() * testTableSize); line too long (116 > 90) -- To view, visit http://gerrit.cloudera.org:8080/22045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 Gerrit-Change-Number: 22045 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 08 Nov 2024 20:18:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13535: Add script to restore stats on PlannerTest
Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/22045 Change subject: IMPALA-13535: Add script to restore stats on PlannerTest .. IMPALA-13535: Add script to restore stats on PlannerTest We have several PlannerTest that validate over EXTENDED profile and validate cardinality. In EXTENDED level, profile display stored table stats from HMS like 'numRows' and 'totalSize', which can vary between data loads. They are not validated by PlannerTest. But frequent change of these lines can disturb code review process because they are mostly noise. This patch provide a python script to fix the table stats information in selected .test files. It currently focus on tests under functional-planner/queries/PlannerTest/tpcds/ and some that test against tpcds_partitioned_parquet_snap table. This patch also fixed table size for tests under functional-planner/queries/PlannerTest/tpcds_cpu_cost/ because all tests there runs with synthetic stats declared in stats-3TB.json. Testing: - Manually run the script and confirm that stats line are replaced correctly. - Run affected PlannerTest and all passed. Change-Id: I27bab7cee93880cd59f01b9c2d1614dfcabdc682 --- M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A testdata/bin/restore-stats-on-planner-tests.py M testdata/workloads/functional-planner/queries/PlannerTest/agg-node-max-mem-estimate.test M testdata/workloads/functional-planner/queries/PlannerTest/processing-cost-plan-admission-slots.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q54.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q74.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q84.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-iceberg.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-parquet.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q01.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q02.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q03.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q08.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q10a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14b.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q17.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q18.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q19.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q20.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q21.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q22.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23b.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q24a.tes