[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
Hello Yida Wu, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21455 to look at the new patch set (#2). Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB .. IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB ExprValuesCache uses BATCH_SIZE as a deciding factor to set its capacity. It bounds the capacity such that expr_values_array_ memory usage stays below 256KB. This patch tightens that limit to include all memory usage from ExprValuesCache::MemUsage() instead of expr_values_array_ only. Therefore, setting a very high BATCH_SIZE will not push the total memory usage of ExprValuesCache beyond 256KB. Testing: - Add test_join_queries.py::TestExprValueCache. - Pass core tests. Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0 --- M be/src/exec/hash-table.cc M be/src/exec/hash-table.h A testdata/workloads/tpcds_partitioned/queries M tests/common/test_dimensions.py M tests/query_test/test_join_queries.py 5 files changed, 44 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/21455/2 -- To view, visit http://gerrit.cloudera.org:8080/21455 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0 Gerrit-Change-Number: 21455 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Yida Wu Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21455 Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB .. IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB ExprValuesCache uses BATCH_SIZE as a deciding factor to set its capacity. It bounds the capacity such that expr_values_array_ memory usage stays below 256KB. This patch tightens that limit to include all memory usage from ExprValuesCache::MemUsage() instead of expr_values_array_ only. Therefore, setting a very high BATCH_SIZE will not push the total memory usage of ExprValuesCache beyond 256KB. Testing: - Add test_join_queries.py::TestExprValueCache. - Pass core tests. Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0 --- M be/src/exec/hash-table.cc M be/src/exec/hash-table.h A testdata/workloads/tpcds_partitioned/queries M tests/common/test_dimensions.py M tests/query_test/test_join_queries.py 5 files changed, 45 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/21455/1 -- To view, visit http://gerrit.cloudera.org:8080/21455 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0 Gerrit-Change-Number: 21455 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto
[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21377 ) Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column .. Patch Set 8: Code-Review+2 > Patch Set 7: Verified-1 > > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10667/ I forgot to update planner tests after making change in patch set 6. Patch set 8 update those planner tests. No plan shape change, only slight reduction in cardinality and cost from dividing over num rows. Carry +2. -- To view, visit http://gerrit.cloudera.org:8080/21377 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10 Gerrit-Change-Number: 21377 Gerrit-PatchSet: 8 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 24 May 2024 06:47:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column
Hello Aman Sinha, Kurt Deschler, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21377 to look at the new patch set (#8). Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column .. IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column Impala frontend can not evaluate BETWEEN/NOT BETWEEN predicate directly. It needs to transform a BetweenPredicate into a CompoundPredicate consisting of upper bound and lower bound BinaryPredicate through BetweenToCompoundRule.java. The BinaryPredicate can then be pushed down or rewritten into other form by another expression rewrite rule. However, the selectivity of BetweenPredicate or its derivatives remains unassigned and often collapses with other unknown selectivity predicates to have collective selectivity equals Expr.DEFAULT_SELECTIVITY (0.1). This patch adds a narrow optimization of BetweenPredicate selectivity when the following criteria are met: 1. The BetweenPredicate is bound to a slot reference of a single column of a table. 2. The column type is discrete, such as INTEGER or DATE. 3. The column stats are available. 4. The column is sufficiently unique based on available stats. 5. The BETWEEN/NOT BETWEEN predicate is in good form (lower bound value <= upper bound value). 6. The final calculated selectivity is less than or equal to Expr.DEFAULT_SELECTIVITY. If these criteria are unmet, the Planner will revert to the old behavior, which is letting the selectivity unassigned. Since this patch only target BetweenPredicate over unique column, the following query will still have the default scan selectivity (0.1): select count(*) from tpch.customer c where c.c_custkey >= 1234 and c.c_custkey <= 2345; While this equivalent query written with BETWEEN predicate will have lower scan selectivity: select count(*) from tpch.customer c where c.c_custkey between 1234 and 2345; This patch calculates the BetweenPredicate selectivity during transformation at BetweenToCompoundRule.java. The selectivity is piggy-backed into the resulting CompoundPredicate and BinaryPredicate as betweenSelectivity_ field, separate from the selectivity_ field. Analyzer.getBoundPredicates() is modified to prioritize the derived BinaryPredicate over ordinary BinaryPredicate in its return value to prevent the derived BinaryPredicate from being eliminated by a matching ordinary BinaryPredicate. Testing: - Add table functional_parquet.unique_with_nulls. - Add FE tests in ExprCardinalityTest#testBetweenSelectivity, ExprCardinalityTest#testNotBetweenSelectivity, and PlannerTest#testScanCardinality. - Pass core tests. Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/catalog/Type.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java M testdata/bin/compute-table-stats.sh M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/card-scan.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q20.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q21.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q32.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q37.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q40.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q77.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q80.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q82.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q92.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q94.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q95.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q98.test 27 files changed, 3,908 insertions(+), 3,502 deletions(-) git
[Impala-ASF-CR] IMPALA-13105: Fix multiple imported query profiles fail to import/clear at once
Riza Suminto has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/21450 ) Change subject: IMPALA-13105: Fix multiple imported query profiles fail to import/clear at once .. IMPALA-13105: Fix multiple imported query profiles fail to import/clear at once On importing multiple query profiles, insertion of the last query in the queue fails as no delay is provided for the insertion. This has been fixed by providing a delay after inserting the final query. On clearing all the imported queries, in some instances page reloads before clearing the IndexedDB object store. This has been fixed by triggering the page reload after clearing the object store succeeds. Change-Id: I42470fecd0cff6e193f080102575e51d86a2d562 Reviewed-on: http://gerrit.cloudera.org:8080/21450 Reviewed-by: Wenzhe Zhou Reviewed-by: Riza Suminto Tested-by: Impala Public Jenkins --- M www/queries.tmpl 1 file changed, 4 insertions(+), 3 deletions(-) Approvals: Wenzhe Zhou: Looks good to me, but someone else must approve Riza Suminto: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/21450 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I42470fecd0cff6e193f080102575e51d86a2d562 Gerrit-Change-Number: 21450 Gerrit-PatchSet: 2 Gerrit-Owner: Surya Hebbar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-12939: Bound IMPALA BUILD THREADS for cgroups and memory
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21200 ) Change subject: IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory .. Patch Set 8: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/21200/8/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/21200/8/bin/impala-config.sh@1067 PS8, Line 1067: AVAILABLE_MEM > Not needed in $(()), as done in the line below. All the others were though, Ack -- To view, visit http://gerrit.cloudera.org:8080/21200 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71 Gerrit-Change-Number: 21200 Gerrit-PatchSet: 8 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 24 May 2024 00:30:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21377 ) Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column .. Patch Set 6: Thank you Aman for your review! I'll run gerrit-verify-dryrun next. -- To view, visit http://gerrit.cloudera.org:8080/21377 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10 Gerrit-Change-Number: 21377 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 24 May 2024 00:22:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13083: Clarify REASON MEM LIMIT TOO LOW FOR RESERVATION
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21436 ) Change subject: IMPALA-13083: Clarify REASON_MEM_LIMIT_TOO_LOW_FOR_RESERVATION .. Patch Set 5: Thank you Andrew for your review! -- To view, visit http://gerrit.cloudera.org:8080/21436 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1ef7fb7e7a194b2036c2948639a06c392590bf66 Gerrit-Change-Number: 21436 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 24 May 2024 00:19:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21430 ) Change subject: IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/21430/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21430/1//COMMIT_MSG@13 PS1, Line 13: processer nit: processor http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/BackendConfig.java File fe/src/main/java/org/apache/impala/service/BackendConfig.java: http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/BackendConfig.java@463 PS1, Line 463: public void setDebugActions(String debugActions) { : backendCfg_.debug_actions = debugActions; : } Is this needed? AFAIK, all debugAction in CatalogOpExecutor comes from the thrift message(either TDdlExecRequest or TUpdateCatalogRequest). http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5303 PS1, Line 5303: @Nullable Map partitionToEventId Can the @Nullable annotation removed here? http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5336 PS1, Line 5336: Preconditions.checkNotNull(partitionToEventId); I think moving this Precondition to the beginning of function is sufficient to test this change. You can stop EP entirely and run testAlterTableWithEpDisabled without specifying new debug action to restart EP. The assertion is that partitionToEventId must never be null. http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@4043 PS1, Line 4043: String prevDebugActions = BackendConfig.INSTANCE.debugActions(); Is this variable necessary? BackendConfig.INSTANCE.setDebugActions() never called with other value. -- To view, visit http://gerrit.cloudera.org:8080/21430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3 Gerrit-Change-Number: 21430 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 24 May 2024 00:17:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12939: Bound IMPALA BUILD THREADS for cgroups and memory
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21200 ) Change subject: IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/21200/8/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/21200/8/bin/impala-config.sh@1067 PS8, Line 1067: AVAILABLE_MEM missing $ sign? -- To view, visit http://gerrit.cloudera.org:8080/21200 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71 Gerrit-Change-Number: 21200 Gerrit-PatchSet: 8 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 24 May 2024 00:06:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21377 ) Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test File testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test: http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test@202 PS6, Line 202: | | tuple-ids=19,21 row-size=32B cardinality=9.65M cost=611185704 > The estimate for the rows in the build side of the hashjoin dropped from 7. Bulk of HashJoin memory is for the builder side. And in this case, I think it does not change possibly because perInstanceBuildMinMemReservation > perBuildInstanceMemEstimate before and after the patch. https://github.com/apache/impala/blob/b975165a0acfe37af302dd7c007360633df54917/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java#L322-L334 http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test File testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test: http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test@123 PS6, Line 123: 40.26M > This scan's cardinality estimate dropped but there's no BETWEEN predicate o Yes, I think it is lower due to selective partition filter RF002 that comes from 02:SCAN. -- To view, visit http://gerrit.cloudera.org:8080/21377 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10 Gerrit-Change-Number: 21377 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 23 May 2024 23:41:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13105: Fix multiple imported query profiles fail to import/clear at once
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21450 ) Change subject: IMPALA-13105: Fix multiple imported query profiles fail to import/clear at once .. Patch Set 1: Code-Review+2 Looks OK in my machine as well. Tested with Chrome, Firefox, and Safari. -- To view, visit http://gerrit.cloudera.org:8080/21450 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I42470fecd0cff6e193f080102575e51d86a2d562 Gerrit-Change-Number: 21450 Gerrit-PatchSet: 1 Gerrit-Owner: Surya Hebbar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 23 May 2024 20:52:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12939: Bound IMPALA BUILD THREADS for cgroups and memory
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21200 ) Change subject: IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory .. Patch Set 7: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21200 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71 Gerrit-Change-Number: 21200 Gerrit-PatchSet: 7 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 23 May 2024 18:32:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12939: Bound IMPALA BUILD THREADS for cgroups and memory
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21200 ) Change subject: IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/21200/5/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/21200/5/bin/impala-config.sh@1042 PS5, Line 1042: AVAILABLE_MEM=$(awk '/MemTotal/{print int($2/1024/1024)}' /proc/meminfo) Just as CORES, can we set reasonable AVAILABLE_MEM by default if /proc/meminfo does not exist (ie., IS_OSX in L577 is true)? Maybe 8 or 16? I know Impala only build in Linux now, but just in case we can do ARM Mac in the future. -- To view, visit http://gerrit.cloudera.org:8080/21200 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71 Gerrit-Change-Number: 21200 Gerrit-PatchSet: 5 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 23 May 2024 16:41:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13083: Clarify REASON MEM LIMIT TOO LOW FOR RESERVATION
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21436 ) Change subject: IMPALA-13083: Clarify REASON_MEM_LIMIT_TOO_LOW_FOR_RESERVATION .. Patch Set 3: > Patch Set 3: Verified-1 > > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10660/ Hit data loading issue at ubuntu-20.04-from-scratch/2654/ Will try rerun precommit job. -- To view, visit http://gerrit.cloudera.org:8080/21436 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1ef7fb7e7a194b2036c2948639a06c392590bf66 Gerrit-Change-Number: 21436 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 22 May 2024 22:52:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21437 ) Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/21437/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/21437/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@508 PS9, Line 508: ((MetastoreEventsProcessor) metastoreEventProcessor_).getStatus() > IEventProcessorStatus.DISABLED is configured when NoOpEventProcessor is req Agree. But what I meant is, MetastoreEventProcessor itself can never transition to EventProcessorStatus.DISABLED. If that is the case, a Precondition.checkState(eventProcessorStatus_ != EventProcessorStatus.DISABLED) can be added inside MetastoreEventProcessor.updateStatus() or MetastoreEventProcessor.getStatus(). $ git grep -n -e "updateStatus" -e "eventProcessorStatus_ =" fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:610: private EventProcessorStatus eventProcessorStatus_ = EventProcessorStatus.STOPPED; fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:770: updateStatus(EventProcessorStatus.ACTIVE); fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:837: if (eventProcessorStatus_ == EventProcessorStatus.PAUSED) { fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:840: updateStatus(EventProcessorStatus.PAUSED); fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:929: updateStatus(EventProcessorStatus.ACTIVE); fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:948: updateStatus(EventProcessorStatus.STOPPED); fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:1085: updateStatus(EventProcessorStatus.NEEDS_INVALIDATE); fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:1095: updateStatus(EventProcessorStatus.ERROR); fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:1414: public synchronized void updateStatus(EventProcessorStatus toStatus) { fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:1415: eventProcessorStatus_ = toStatus; fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:3897: eventsProcessor_.updateStatus(EventProcessorStatus.NEEDS_INVALIDATE); fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:3902: eventsProcessor_.updateStatus(EventProcessorStatus.ERROR); -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 9 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 22 May 2024 20:31:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21377 ) Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/21377/5/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java File fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java: http://gerrit.cloudera.org:8080/#/c/21377/5/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java@127 PS5, Line 127: double sel = Math.max(0.0, (double) diff / table.getNumRows()); > I think test against unique_with_nulls returns wrong cardinality estimate. Changed to divide over table.getNumRows() instead. All checks up to L110 should be sufficient to determine that column is unique. -- To view, visit http://gerrit.cloudera.org:8080/21377 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10 Gerrit-Change-Number: 21377 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 22 May 2024 01:23:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column
Hello Aman Sinha, Kurt Deschler, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21377 to look at the new patch set (#6). Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column .. IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column Impala frontend can not evaluate BETWEEN/NOT BETWEEN predicate directly. It needs to transform a BetweenPredicate into a CompoundPredicate consisting of upper bound and lower bound BinaryPredicate through BetweenToCompoundRule.java. The BinaryPredicate can then be pushed down or rewritten into other form by another expression rewrite rule. However, the selectivity of BetweenPredicate or its derivatives remains unassigned and often collapses with other unknown selectivity predicates to have collective selectivity equals Expr.DEFAULT_SELECTIVITY (0.1). This patch adds a narrow optimization of BetweenPredicate selectivity when the following criteria are met: 1. The BetweenPredicate is bound to a slot reference of a single column of a table. 2. The column type is discrete, such as INTEGER or DATE. 3. The column stats are available. 4. The column is sufficiently unique based on available stats. 5. The BETWEEN/NOT BETWEEN predicate is in good form (lower bound value <= upper bound value). 6. The final calculated selectivity is less than or equal to Expr.DEFAULT_SELECTIVITY. If these criteria are unmet, the Planner will revert to the old behavior, which is letting the selectivity unassigned. Since this patch only target BetweenPredicate over unique column, the following query will still have the default scan selectivity (0.1): select count(*) from tpch.customer c where c.c_custkey >= 1234 and c.c_custkey <= 2345; While this equivalent query written with BETWEEN predicate will have lower scan selectivity: select count(*) from tpch.customer c where c.c_custkey between 1234 and 2345; This patch calculates the BetweenPredicate selectivity during transformation at BetweenToCompoundRule.java. The selectivity is piggy-backed into the resulting CompoundPredicate and BinaryPredicate as betweenSelectivity_ field, separate from the selectivity_ field. Analyzer.getBoundPredicates() is modified to prioritize the derived BinaryPredicate over ordinary BinaryPredicate in its return value to prevent the derived BinaryPredicate from being eliminated by a matching ordinary BinaryPredicate. Testing: - Add table functional_parquet.unique_with_nulls. - Add FE tests in ExprCardinalityTest#testBetweenSelectivity, ExprCardinalityTest#testNotBetweenSelectivity, and PlannerTest#testScanCardinality. - Pass core tests. Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/catalog/Type.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java M testdata/bin/compute-table-stats.sh M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/card-scan.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q20.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q21.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q32.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q37.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q40.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q77.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q80.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q82.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q92.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q94.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q95.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q98.test 27 files changed, 3,908 insertions(+), 3,502 deletions(-) git
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21437 ) Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. Patch Set 9: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 9 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 22 May 2024 00:56:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21437 ) Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. Patch Set 9: (3 comments) http://gerrit.cloudera.org:8080/#/c/21437/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/21437/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2465 PS8, Line 2465: boolean isEpActiveOrDisabled = isEventProcessin > I'm still not sure about this. What if I have a cluster with only Impala as Thanks. I just realize isEventProcessingActive already check for that (metastoreEventProcessor_ instanceof MetastoreEventsProcessor) Done. http://gerrit.cloudera.org:8080/#/c/21437/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2469 PS8, Line 2469: metastoreEventProcessor_ instanceo > I think it is better to be verbose here by printing the actual EventProcess Done http://gerrit.cloudera.org:8080/#/c/21437/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/21437/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@508 PS9, Line 508: ((MetastoreEventsProcessor) metastoreEventProcessor_).getStatus() Just question because I'm a little bit confused. Can this ever return EventProcessorStatus.DISABLED? I think it is impossible, but please correct me if I'm wrong. -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 9 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 22 May 2024 00:56:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21377 ) Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column .. Patch Set 5: (9 comments) http://gerrit.cloudera.org:8080/#/c/21377/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21377/2//COMMIT_MSG@18 PS2, Line 18: This patch adds a narrow optimization of BetweenPredicate selectivity > If the user's query has a predicate unique_col >=5 AND unique_col <= 10 in Correct. This is mainly because the analysis happen at BetweenToCompoundRule.java. A more general approach will require analyzing all range BinaryPredicate and quickly becomes complicated when they are overlaps with each other. I leave TODO at PlanNode.java to think about a more general approach to this problem. http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2626 PS2, Line 2626: > The 'prioritization' part was not clear.. why exactly is it needed ? .. cou Added comment. http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/analysis/Expr.java@1856 PS2, Line 1856:*/ > nit: in the comments above, could you add a note about the acceptDate param Done http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/planner/PlanNode.java File fe/src/main/java/org/apache/impala/planner/PlanNode.java: http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/planner/PlanNode.java@759 PS2, Line 759:*lower selectivity if analyzed as a pair. > nit: this comment needs updating to reflect the between selectivity. Added as issue number 3. http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/planner/PlanNode.java@774 PS2, Line 774: > nit: have 'been' assigned .. Done http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java File fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java: http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java@104 PS2, Line 104: hasNumDist > Looking at the hasStats() method: It is enough to check for hasNumDistinctValues() only. Updated this code accordingly. http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java@107 PS2, Line 107: numNotNulls > Is there any test that covers the case where the null count made a differen Added tests against functional_parquet.unique_with_nulls. http://gerrit.cloudera.org:8080/#/c/21377/5/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java File fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java: http://gerrit.cloudera.org:8080/#/c/21377/5/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java@127 PS5, Line 127: double sel = Math.max(0.0, (double) diff / stats.getNumDistinctValues()); I think test against unique_with_nulls returns wrong cardinality estimate. Should be half of what it is now. I'll check this line again. http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java File fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java: http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java@544 PS2, Line 544:* something like 0.33^2. > nit: this last sentence should be updated for the new behavior. Done -- To view, visit http://gerrit.cloudera.org:8080/21377 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10 Gerrit-Change-Number: 21377 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 22 May 2024 00:44:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column
Hello Aman Sinha, Kurt Deschler, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21377 to look at the new patch set (#5). Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column .. IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column Impala frontend can not evaluate BETWEEN/NOT BETWEEN predicate directly. It needs to transform a BetweenPredicate into a CompoundPredicate consisting of upper bound and lower bound BinaryPredicate through BetweenToCompoundRule.java. The BinaryPredicate can then be pushed down or rewritten into other form by another expression rewrite rule. However, the selectivity of BetweenPredicate or its derivatives remains unassigned and often collapses with other unknown selectivity predicates to have collective selectivity equals Expr.DEFAULT_SELECTIVITY (0.1). This patch adds a narrow optimization of BetweenPredicate selectivity when the following criteria are met: 1. The BetweenPredicate is bound to a slot reference of a single column of a table. 2. The column type is discrete, such as INTEGER or DATE. 3. The column stats are available. 4. The column is sufficiently unique based on available stats. 5. The BETWEEN/NOT BETWEEN predicate is in good form (lower bound value <= upper bound value). 6. The final calculated selectivity is less than or equal to Expr.DEFAULT_SELECTIVITY. If these criteria are unmet, the Planner will revert to the old behavior, which is letting the selectivity unassigned. Since this patch only target BetweenPredicate over unique column, the following query will still have the default scan selectivity (0.1): select count(*) from tpch.customer c where c.c_custkey >= 1234 and c.c_custkey <= 2345; While this equivalent query written with BETWEEN predicate will have lower scan selectivity: select count(*) from tpch.customer c where c.c_custkey between 1234 and 2345; This patch calculates the BetweenPredicate selectivity during transformation at BetweenToCompoundRule.java. The selectivity is piggy-backed into the resulting CompoundPredicate and BinaryPredicate as betweenSelectivity_ field, separate from the selectivity_ field. Analyzer.getBoundPredicates() is modified to prioritize the derived BinaryPredicate over ordinary BinaryPredicate in its return value to prevent the derived BinaryPredicate from being eliminated by a matching ordinary BinaryPredicate. Testing: - Add table functional_parquet.unique_with_nulls. - Add FE tests in ExprCardinalityTest#testBetweenSelectivity, ExprCardinalityTest#testNotBetweenSelectivity, and PlannerTest#testScanCardinality. - Pass core tests. Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/catalog/Type.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java M testdata/bin/compute-table-stats.sh M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/card-scan.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q20.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q21.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q32.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q37.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q40.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q77.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q80.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q82.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q92.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q94.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q95.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q98.test 27 files changed, 3,908 insertions(+), 3,502 deletions(-) git
[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column
Hello Aman Sinha, Kurt Deschler, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21377 to look at the new patch set (#4). Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column .. IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column Impala frontend can not evaluate BETWEEN/NOT BETWEEN predicate directly. It needs to transform a BetweenPredicate into a CompoundPredicate consisting of upper bound and lower bound BinaryPredicate through BetweenToCompoundRule.java. The BinaryPredicate can then be pushed down or rewritten into other form by another expression rewrite rule. However, the selectivity of BetweenPredicate or its derivatives remains unassigned and often collapses with other unknown selectivity predicates to have collective selectivity equals Expr.DEFAULT_SELECTIVITY (0.1). This patch adds a narrow optimization of BetweenPredicate selectivity when the following criteria are met: 1. The BetweenPredicate is bound to a slot reference of a single column of a table. 2. The column type is discrete, such as INTEGER or DATE. 3. The column stats are available. 4. The column is sufficiently unique based on available stats. 5. The BETWEEN/NOT BETWEEN predicate is in good form (lower bound value <= upper bound value). 6. The final calculated selectivity is less than or equal to Expr.DEFAULT_SELECTIVITY. If these criteria are unmet, the Planner will revert to the old behavior, which is letting the selectivity unassigned. Since this patch only target BetweenPredicate over unique column, the following query will still have the default scan selectivity (0.1): select count(*) from tpch.customer c where c.c_custkey >= 1234 and c.c_custkey <= 2345; While this equivalent query written with BETWEEN predicate will have lower scan selectivity: select count(*) from tpch.customer c where c.c_custkey between 1234 and 2345; This patch calculates the BetweenPredicate selectivity during transformation at BetweenToCompoundRule.java. The selectivity is piggy-backed into the resulting CompoundPredicate and BinaryPredicate as betweenSelectivity_ field, separate from the selectivity_ field. Analyzer.getBoundPredicates() is modified to prioritize the derived BinaryPredicate over ordinary BinaryPredicate in its return value to prevent the derived BinaryPredicate from being eliminated by a matching ordinary BinaryPredicate. Testing: - Add table functional_parquet.unique_with_nulls. - Add FE tests in ExprCardinalityTest#testBetweenSelectivity, ExprCardinalityTest#testNotBetweenSelectivity, and PlannerTest#testScanCardinality. - Pass core tests. Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/catalog/Type.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java M testdata/bin/compute-table-stats.sh M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/card-scan.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q20.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q21.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q32.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q37.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q40.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q77.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q80.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q82.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q92.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q94.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q95.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q98.test 27 files changed, 3,908 insertions(+), 3,502 deletions(-) git
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21437 ) Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. Patch Set 8: (3 comments) http://gerrit.cloudera.org:8080/#/c/21437/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/21437/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2471 PS5, Line 2471: f no validWriteIdList is > L2471 and L2483 are two if conditions and I don't see event processor switc Done http://gerrit.cloudera.org:8080/#/c/21437/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/21437/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2465 PS8, Line 2465: boolean isEpActive = isEventProcessingActive(); I'm still not sure about this. What if I have a cluster with only Impala as query engine (no Hive, no Spark), and I turn off Event Processor entirely because I don't need it (metastoreEventProcessor_ is a NoOpEventProcessor)? Will Catalogd forced to reload table all the time? I think it should be: boolean isEpActiveOrDisabled = ... http://gerrit.cloudera.org:8080/#/c/21437/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2469 PS8, Line 2469: isEpActive ? "ACTIVE" : "INACTIVE" I think it is better to be verbose here by printing the actual EventProcessorStatus enum or "NONE" if metastoreEventProcessor_ is not a MetastoreEventsProcessor. -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 8 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 21 May 2024 18:10:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21437 ) Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. Patch Set 5: (5 comments) http://gerrit.cloudera.org:8080/#/c/21437/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21437/1//COMMIT_MSG@11 PS1, Line 11: NullPointerException > Ack. The issue happens with transactional tables as well. Done http://gerrit.cloudera.org:8080/#/c/21437/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/21437/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2465 PS5, Line 2465: LOG.trace("table {} exits in cache, last synced id {}", tbl.getFullName(), : tbl.getLastSyncedEventId()); Add trace LOG about EP status here, whether tbl is loaded, and whether tbl is transactional or not. Also wrap it with if (LOG.isTraceEnabled()) { ... } http://gerrit.cloudera.org:8080/#/c/21437/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2471 PS5, Line 2471: isEventProcessingActive() What happen if isEventProcessingActive() changed between L2471 and L2483? Should it be called once and stored to boolean variable instead? http://gerrit.cloudera.org:8080/#/c/21437/4/tests/custom_cluster/test_events_custom_configs.py File tests/custom_cluster/test_events_custom_configs.py: http://gerrit.cloudera.org:8080/#/c/21437/4/tests/custom_cluster/test_events_custom_configs.py@1266 PS4, Line 1266: test_no_ep_metadata_reload_for_insert > Good catch!! The issue happens with transactional tables as well. Done http://gerrit.cloudera.org:8080/#/c/21437/5/tests/custom_cluster/test_events_custom_configs.py File tests/custom_cluster/test_events_custom_configs.py: http://gerrit.cloudera.org:8080/#/c/21437/5/tests/custom_cluster/test_events_custom_configs.py@1267 PS5, Line 1267: test_table Turn this into a parameter of function verify_partition below. -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 5 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 21 May 2024 16:07:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13091: query test.test iceberg.TestIcebergV2Table.test metadata tables fails on an expected constant
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21440 ) Change subject: IMPALA-13091: query_test.test_iceberg.TestIcebergV2Table.test_metadata_tables fails on an expected constant .. Patch Set 1: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/21440/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21440/1//COMMIT_MSG@16 PS1, Line 16: problematic values > Specifically mention which values are problematic. Is it just "column_size" It seems to be the case. Marked this as Resolved. -- To view, visit http://gerrit.cloudera.org:8080/21440 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic056079eed87a68afa95cd111ce2037314cd9620 Gerrit-Change-Number: 21440 Gerrit-PatchSet: 1 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Peter Rozsa Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 20 May 2024 23:59:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13083: Clarify REASON MEM LIMIT TOO LOW FOR RESERVATION
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21436 ) Change subject: IMPALA-13083: Clarify REASON_MEM_LIMIT_TOO_LOW_FOR_RESERVATION .. Patch Set 2: (15 comments) http://gerrit.cloudera.org:8080/#/c/21436/1/be/src/scheduling/admission-controller-test.cc File be/src/scheduling/admission-controller-test.cc: http://gerrit.cloudera.org:8080/#/c/21436/1/be/src/scheduling/admission-controller-test.cc@422 PS1, Line 422: an > Nit: and Done http://gerrit.cloudera.org:8080/#/c/21436/1/be/src/scheduling/admission-controller-test.cc@443 PS1, Line 443: no > Nit: nor Done http://gerrit.cloudera.org:8080/#/c/21436/1/be/src/scheduling/admission-controller-test.cc@1281 PS1, Line 1281: > Is this comment right as we don't call __set_clamp_mem_limit_query_option() clamp_mem_limit_query_option actually default to True. I removed all pool_config.__set_clamp_mem_limit_query_option(true); and add test DedicatedCoordAdmissionDisabledPoolMemLimit and DedicatedCoordAdmissionIgnoreMemClamp where pool_config.__set_clamp_mem_limit_query_option(false). http://gerrit.cloudera.org:8080/#/c/21436/1/be/src/scheduling/admission-controller-test.cc@1357 PS1, Line 1357: string not_admitted_reason = "--not set--"; > Nit: add ASSERT_NE(nullptr, schedule_state); Done http://gerrit.cloudera.org:8080/#/c/21436/1/be/src/scheduling/admission-controller-test.cc@1386 PS1, Line 1386: pool_config.__set_min_query_mem_limit(MEGABYTE); > Nit: add ASSERT_NE(nullptr, schedule_state); Done http://gerrit.cloudera.org:8080/#/c/21436/1/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/21436/1/be/src/scheduling/admission-controller.cc@206 PS1, Line 206: in > "in the" might be clearer Done http://gerrit.cloudera.org:8080/#/c/21436/1/be/src/scheduling/admission-controller.cc@211 PS1, Line 211: in > "in the" might be clearer Done http://gerrit.cloudera.org:8080/#/c/21436/1/be/src/scheduling/admission-controller.cc@220 PS1, Line 220: in > "in the" might be clearer Done http://gerrit.cloudera.org:8080/#/c/21436/1/be/src/scheduling/schedule-state.h File be/src/scheduling/schedule-state.h: http://gerrit.cloudera.org:8080/#/c/21436/1/be/src/scheduling/schedule-state.h@375 PS1, Line 375: /// Helper functions to update either > Maybe have short comments for these functions. Done http://gerrit.cloudera.org:8080/#/c/21436/1/common/protobuf/admission_control_service.proto File common/protobuf/admission_control_service.proto: http://gerrit.cloudera.org:8080/#/c/21436/1/common/protobuf/admission_control_service.proto@207 PS1, Line 207: coord_backend_mem_to_ad > coord_backend_mem_to_admit? Done http://gerrit.cloudera.org:8080/#/c/21436/1/docs/topics/impala_admission.xml File docs/topics/impala_admission.xml: http://gerrit.cloudera.org:8080/#/c/21436/1/docs/topics/impala_admission.xml@286 PS1, Line 286: is se > Nit: "is set to" Done http://gerrit.cloudera.org:8080/#/c/21436/1/docs/topics/impala_admission.xml@309 PS1, Line 309: recommend > Nit: can you fix this to "recommend" while you are here? Done http://gerrit.cloudera.org:8080/#/c/21436/1/docs/topics/impala_admission.xml@336 PS1, Line 336: E > There is some weird character between "MEM_LIMIT" and "query option" Should be fixed. http://gerrit.cloudera.org:8080/#/c/21436/1/docs/topics/impala_admission.xml@353 PS1, Line 353: > Nit: another weird character here Should be fixed. http://gerrit.cloudera.org:8080/#/c/21436/1/docs/topics/impala_admission.xml@357 PS1, Line 357: nodes. > Nit: contains Done. -- To view, visit http://gerrit.cloudera.org:8080/21436 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1ef7fb7e7a194b2036c2948639a06c392590bf66 Gerrit-Change-Number: 21436 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 20 May 2024 22:12:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13083: Clarify REASON MEM LIMIT TOO LOW FOR RESERVATION
Hello Andrew Sherman, Abhishek Rawat, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21436 to look at the new patch set (#2). Change subject: IMPALA-13083: Clarify REASON_MEM_LIMIT_TOO_LOW_FOR_RESERVATION .. IMPALA-13083: Clarify REASON_MEM_LIMIT_TOO_LOW_FOR_RESERVATION This patch improves REASON_MEM_LIMIT_TOO_LOW_FOR_RESERVATION error message by saying the specific configuration that must be adjusted such that the query can pass the Admission Control. New fields 'per_backend_mem_to_admit_source' and 'coord_backend_mem_to_admit_source' of type MemLimitSourcePB are added into QuerySchedulePB. These fields explain what limiting factor drives final numbers at 'per_backend_mem_to_admit' and 'coord_backend_mem_to_admit' respectively. In turn, Admission Control will use this information to compose a more informative error message that the user can act upon. The new error message pattern also explicitly mentions "Per Host Min Memory Reservation" as a place to look at to investigate memory reservations scheduled for each backend node. Updated documentation with examples of query rejection by Admission Control and how to read the error message. Testing: - Add BE tests at admission-controller-test.cc - Adjust and pass affected EE tests Change-Id: I1ef7fb7e7a194b2036c2948639a06c392590bf66 --- M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/schedule-state.cc M be/src/scheduling/schedule-state.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M common/protobuf/admission_control_service.proto M docs/topics/impala_admission.xml M docs/topics/impala_mem_limit.xml M testdata/workloads/functional-query/queries/QueryTest/admission-max-min-mem-limits.test M testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filter_reservations.test 12 files changed, 689 insertions(+), 127 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/21436/2 -- To view, visit http://gerrit.cloudera.org:8080/21436 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1ef7fb7e7a194b2036c2948639a06c392590bf66 Gerrit-Change-Number: 21436 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12800: Skip O(n^2) ExprSubstitutionMap::verify() for release builds
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21444 ) Change subject: IMPALA-12800: Skip O(n^2) ExprSubstitutionMap::verify() for release builds .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/21444/1/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java File fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java: http://gerrit.cloudera.org:8080/#/c/21444/1/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@178 PS1, Line 178: private void verify() { What if we move the check inside the verify() method? if (!BackendConfig.INSTANCE.isReleaseBuild()) return; -- To view, visit http://gerrit.cloudera.org:8080/21444 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ieeacfec6a5b487076ce5b19747319630616411f0 Gerrit-Change-Number: 21444 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 20 May 2024 21:53:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21437 ) Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. Patch Set 4: (6 comments) http://gerrit.cloudera.org:8080/#/c/21437/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21437/1//COMMIT_MSG@11 PS1, Line 11: NullPointerException > Jira has more details about this. But to explain this in simple terms, with Thanks for your explanation. Please describe those scenario in this commit message as short bullet points. I think the transactional property should also be highlighted? http://gerrit.cloudera.org:8080/#/c/21437/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/21437/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2472 PS4, Line 2472: !AcidUtils.isTransactionalTable(tbl.getMetaStoreTable().getParameters( && : isEventProcessingActive() nit: I think checking isEventProcessingActive() first is better here. http://gerrit.cloudera.org:8080/#/c/21437/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/21437/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1748 PS1, Line 1748: > Discard this change. This is not completely addressing the concern. Done http://gerrit.cloudera.org:8080/#/c/21437/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1751 PS1, Line 1751:debugAction, partitionToEventId, reason, catalogTimeline); > Same as above Done http://gerrit.cloudera.org:8080/#/c/21437/4/fe/src/main/java/org/apache/impala/util/AcidUtils.java File fe/src/main/java/org/apache/impala/util/AcidUtils.java: http://gerrit.cloudera.org:8080/#/c/21437/4/fe/src/main/java/org/apache/impala/util/AcidUtils.java@679 PS4, Line 679: If the tbl metadata is a :* superset of the metadata view represented by the given validWriteIdList this :* method returns a value greater than 0. If they are an exact match of each other, :* it returns 0 and if the table ValidWriteIdList is behind the provided :* validWriteIdList this return -1. nit: update this comment with the new transactional property check. http://gerrit.cloudera.org:8080/#/c/21437/4/tests/custom_cluster/test_events_custom_configs.py File tests/custom_cluster/test_events_custom_configs.py: http://gerrit.cloudera.org:8080/#/c/21437/4/tests/custom_cluster/test_events_custom_configs.py@1266 PS4, Line 1266: test_no_ep_metadata_reload_for_insert Can you test the same scenario over transactional table? Or is there an existing test that already does that? -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 20 May 2024 18:38:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21437 ) Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/21437/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21437/1//COMMIT_MSG@11 PS1, Line 11: NullPointerException What is the source of NPE? Where does it hit / manifest problem? http://gerrit.cloudera.org:8080/#/c/21437/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/21437/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1748 PS1, Line 1748: not active Do you mean "not active or not healthy." ? // possible status of event processor public enum EventProcessorStatus { PAUSED, // event processor is paused because catalog is being reset concurrently ACTIVE, // event processor is scheduled at a given frequency ERROR, // event processor is in error state and event processing has stopped NEEDS_INVALIDATE, // event processor could not resolve certain events and needs a // manual invalidate command to reset the state (See AlterEvent for a example) STOPPED, // event processor is shutdown. No events will be processed DISABLED // event processor is not configured to run } http://gerrit.cloudera.org:8080/#/c/21437/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1751 PS1, Line 1751: (HdfsTable) tbl).load(reuseMetadata, msClient.getHiveClient(), msTbl, This line remains unchanged for a long time since Fri Dec 18 14:31:24 (IMPALA-1480: Slow DDL statements with large number of partitions), presumably even before Impala have HMS Event Processor. What is recently change that makes this cause a problem? Is metadata reuse also not allowed if EventProcessorStatus.DISABLED? -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 17 May 2024 19:24:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13091: query test.test iceberg.TestIcebergV2Table.test metadata tables fails on an expected constant
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21440 ) Change subject: IMPALA-13091: query_test.test_iceberg.TestIcebergV2Table.test_metadata_tables fails on an expected constant .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/21440/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21440/1//COMMIT_MSG@16 PS1, Line 16: problematic values Specifically mention which values are problematic. Is it just "column_size"? -- To view, visit http://gerrit.cloudera.org:8080/21440 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic056079eed87a68afa95cd111ce2037314cd9620 Gerrit-Change-Number: 21440 Gerrit-PatchSet: 1 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Peter Rozsa Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 17 May 2024 14:49:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13040: (addendum) Inject larger delay for sanitized build
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21439 ) Change subject: IMPALA-13040: (addendum) Inject larger delay for sanitized build .. Patch Set 2: Code-Review+2 Thanks Csaba! Carry +2. Will wait for https://gerrit.cloudera.org/c/21440/ to merge first. -- To view, visit http://gerrit.cloudera.org:8080/21439 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09e5ae4646f53632e9a9f519d370a33a5534df19 Gerrit-Change-Number: 21439 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 17 May 2024 14:25:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13040: (addendum) Inject larger delay for sanitized build
Hello Laszlo Gaal, Csaba Ringhofer, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21439 to look at the new patch set (#2). Change subject: IMPALA-13040: (addendum) Inject larger delay for sanitized build .. IMPALA-13040: (addendum) Inject larger delay for sanitized build TestLateQueryStateInit has been flaky in sanitized build because the largest delay injection time is fixed at 3 seconds. This patch fixes the issue by setting largest delay injection time equal to RUNTIME_FILTER_WAIT_TIME_MS, which is 3 second for regular build and 10 seconds for sanitized build. Testing: - Loop and pass test_runtime_filter_aggregation.py 10 times in ASAN build and 50 times in UBSAN build. Change-Id: I09e5ae4646f53632e9a9f519d370a33a5534df19 --- M tests/custom_cluster/test_runtime_filter_aggregation.py 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/21439/2 -- To view, visit http://gerrit.cloudera.org:8080/21439 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I09e5ae4646f53632e9a9f519d370a33a5534df19 Gerrit-Change-Number: 21439 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13020 (part 2): Split out external vs internal Thrift max message size
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21420 ) Change subject: IMPALA-13020 (part 2): Split out external vs internal Thrift max message size .. Patch Set 3: Code-Review+1 > Patch Set 2: > > (1 comment) Thanks for checking this. I just remember now that there is a limit in Thrift-Java coming from Integer.MAX_VALUE somewhere. I can +2 this if no one else has comment. -- To view, visit http://gerrit.cloudera.org:8080/21420 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9a649ef49a8a99c7bd9a1b73c37c4c621661311 Gerrit-Change-Number: 21420 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 17 May 2024 14:22:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13040: (addendum) Inject larger delay for sanitized build
Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21439 Change subject: IMPALA-13040: (addendum) Inject larger delay for sanitized build .. IMPALA-13040: (addendum) Inject larger delay for sanitized build TestLateQueryStateInit has been flaky in sanitized build because the largest delay injection time is fixed at 3 seconds. This patch fix the issue by setting largest delay injection time equal to RUNTIME_FILTER_WAIT_TIME_MS, which is 3 second for regular build and 10 seconds for sanitized build. Testing: - Loop and pass test_runtime_filter_aggregation.py 10 times in ASAN build and 50 times in UBSAN build. Change-Id: I09e5ae4646f53632e9a9f519d370a33a5534df19 --- M tests/custom_cluster/test_runtime_filter_aggregation.py 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/21439/1 -- To view, visit http://gerrit.cloudera.org:8080/21439 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I09e5ae4646f53632e9a9f519d370a33a5534df19 Gerrit-Change-Number: 21439 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto
[Impala-ASF-CR] IMPALA-13083: Clarify REASON MEM LIMIT TOO LOW FOR RESERVATION
Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21436 Change subject: IMPALA-13083: Clarify REASON_MEM_LIMIT_TOO_LOW_FOR_RESERVATION .. IMPALA-13083: Clarify REASON_MEM_LIMIT_TOO_LOW_FOR_RESERVATION This patch improves REASON_MEM_LIMIT_TOO_LOW_FOR_RESERVATION error message by saying the specific configuration that must be adjusted such that the query can pass the Admission Control. New fields 'per_backend_mem_to_admit_source' and 'coord_backend_mem_to_admit_source' of type MemLimitSourcePB are added into QuerySchedulePB. These fields explain what limiting factor drives final numbers at 'per_backend_mem_to_admit' and 'coord_backend_mem_to_admit' respectively. In turn, Admission Control will use this information to compose a more informative error message that the user can act upon. The new error message pattern also explicitly mentions "Per Host Min Memory Reservation" as a place to look at to investigate memory reservations scheduled for each backend node. Updated documentation with examples of query rejection by Admission Control and how to read the error message. Testing: - Add BE tests at admission-controller-test.cc - Adjust and pass affected EE tests Change-Id: I1ef7fb7e7a194b2036c2948639a06c392590bf66 --- M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/schedule-state.cc M be/src/scheduling/schedule-state.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M common/protobuf/admission_control_service.proto M docs/topics/impala_admission.xml M docs/topics/impala_mem_limit.xml M testdata/workloads/functional-query/queries/QueryTest/admission-max-min-mem-limits.test M testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filter_reservations.test 12 files changed, 649 insertions(+), 126 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/21436/1 -- To view, visit http://gerrit.cloudera.org:8080/21436 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I1ef7fb7e7a194b2036c2948639a06c392590bf66 Gerrit-Change-Number: 21436 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto
[Impala-ASF-CR] IMPALA-13020 (part 2): Split out external vs internal Thrift max message size
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21420 ) Change subject: IMPALA-13020 (part 2): Split out external vs internal Thrift max message size .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/21420/2/be/src/rpc/thrift-util.cc File be/src/rpc/thrift-util.cc: http://gerrit.cloudera.org:8080/#/c/21420/2/be/src/rpc/thrift-util.cc@94 PS2, Line 94: : // The ThriftSerializer uses the DefaultInternalTConfiguration() with the higher limit, : // because this is used on our internal Thrift structures. : ThriftSerializer::ThriftSerializer(bool compact, int initial_buffer_size) > If I'm understanding SerDeBuffer100MB, we're making sure we can serialize s I thought of increasing it to over than 2GB. But thinking again, it is probably impractical to do so in BE test. So yeah, maybe we should leave it as it is. -- To view, visit http://gerrit.cloudera.org:8080/21420 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9a649ef49a8a99c7bd9a1b73c37c4c621661311 Gerrit-Change-Number: 21420 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 16 May 2024 17:32:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13020 (part 2): Split out external vs internal Thrift max message size
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21420 ) Change subject: IMPALA-13020 (part 2): Split out external vs internal Thrift max message size .. Patch Set 2: Also, please update testdata/scale_test_metadata/README.md when applicable. -- To view, visit http://gerrit.cloudera.org:8080/21420 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9a649ef49a8a99c7bd9a1b73c37c4c621661311 Gerrit-Change-Number: 21420 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 16 May 2024 02:16:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13020 (part 2): Split out external vs internal Thrift max message size
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21420 ) Change subject: IMPALA-13020 (part 2): Split out external vs internal Thrift max message size .. Patch Set 2: (4 comments) Please also double check with testing methods mentioned in http://gerrit.cloudera.org:8080/19179 (IMPALA-11669: (addendum) Set TConfiguration in TMemoryBuffer). http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/common/init.cc File be/src/common/init.cc: http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/common/init.cc@548 PS1, Line 548:ThriftDefaultMaxMessageSize())); > It's defined as an int64 gflag, so gflags is parsing it with the right data Ok, ThriftDefaultMaxMessageSize() is the minimum value. Make sense to me. http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/rpc/authentication.cc@1350 PS1, Line 1350: : // This function is called immediately prior to sasl_client_ > Changed this into a function that verifies the max message size has been in Done http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/rpc/thrift-server.cc File be/src/rpc/thrift-server.cc: http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/rpc/thrift-server.cc@288 PS1, Line 288: is_external_facing_(is_external_facing) > I coded this up and ended up not liking it very much. ThriftServer is a lib Ack http://gerrit.cloudera.org:8080/#/c/21420/2/be/src/rpc/thrift-util.cc File be/src/rpc/thrift-util.cc: http://gerrit.cloudera.org:8080/#/c/21420/2/be/src/rpc/thrift-util.cc@94 PS2, Line 94: : // The ThriftSerializer uses the DefaultInternalTConfiguration() with the higher limit, : // because this is used on our internal Thrift structures. : ThriftSerializer::ThriftSerializer(bool compact, int initial_buffer_size) Can you update SerDeBuffer100MB test to cover this change? -- To view, visit http://gerrit.cloudera.org:8080/21420 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9a649ef49a8a99c7bd9a1b73c37c4c621661311 Gerrit-Change-Number: 21420 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 16 May 2024 02:14:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13051: Speed up, refactor query log tests
Riza Suminto has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/21358 ) Change subject: IMPALA-13051: Speed up, refactor query log tests .. IMPALA-13051: Speed up, refactor query log tests Sets faster default shutdown_grace_period_s and shutdown_deadline_s when impalad_graceful_shutdown=True in tests. Impala waits until grace period has passed and all queries are stopped (or deadline is exceeded) before flushing the query log, so grace period of 0 is sufficient. Adds them in setup_method to reduce duplication in test declarations. Re-uses TQueryTableColumn Thrift definitions for testing. Moves waiting for query log table to exist to setup_method rather than as a side-effect of get_client. Refactors workload management code to reduce if-clause nesting. Adds functional query workload tests for both the sys.impala_query_log and the sys.impala_query_live tables to assert the names and order of the individual columns within each table. Renames the python tests for the sys.impala_query_log table removing the unnecessary "_query_log_table_" string from the name of each test. Change-Id: I1127ef041a3e024bf2b262767d56ec5f29bf3855 Reviewed-on: http://gerrit.cloudera.org:8080/21358 Tested-by: Impala Public Jenkins Reviewed-by: Riza Suminto --- M be/src/service/workload-management.cc A testdata/workloads/functional-query/queries/QueryTest/workload-management-live.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-log.test M tests/common/custom_cluster_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py M tests/util/workload_management.py 7 files changed, 492 insertions(+), 515 deletions(-) Approvals: Impala Public Jenkins: Verified Riza Suminto: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/21358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I1127ef041a3e024bf2b262767d56ec5f29bf3855 Gerrit-Change-Number: 21358 Gerrit-PatchSet: 11 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13051: Speed up, refactor query log tests
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21358 ) Change subject: IMPALA-13051: Speed up, refactor query log tests .. Patch Set 10: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/21358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1127ef041a3e024bf2b262767d56ec5f29bf3855 Gerrit-Change-Number: 21358 Gerrit-PatchSet: 10 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 13 May 2024 22:46:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13051: Speed up, refactor query log tests
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21358 ) Change subject: IMPALA-13051: Speed up, refactor query log tests .. Patch Set 10: Code-Review+1 Looks good to me. I will give my +2 if no one else has any new comment and DRY_RUN pass. -- To view, visit http://gerrit.cloudera.org:8080/21358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1127ef041a3e024bf2b262767d56ec5f29bf3855 Gerrit-Change-Number: 21358 Gerrit-PatchSet: 10 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 13 May 2024 16:19:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13020 (part 2): Split out external vs internal Thrift max message size
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21420 ) Change subject: IMPALA-13020 (part 2): Split out external vs internal Thrift max message size .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/common/init.cc File be/src/common/init.cc: http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/common/init.cc@548 PS1, Line 548: FLAGS_thrift_external_rpc_max_message_size < ThriftDefaultMaxMessageSize() > should this be allowed up to max value of int64_t? This comment should go for FLAGS_thrift_rpc_max_message_size instead. -- To view, visit http://gerrit.cloudera.org:8080/21420 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9a649ef49a8a99c7bd9a1b73c37c4c621661311 Gerrit-Change-Number: 21420 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 11 May 2024 15:22:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13020 (part 2): Split out external vs internal Thrift max message size
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21420 ) Change subject: IMPALA-13020 (part 2): Split out external vs internal Thrift max message size .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/common/init.cc File be/src/common/init.cc: http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/common/init.cc@548 PS1, Line 548: FLAGS_thrift_external_rpc_max_message_size < ThriftDefaultMaxMessageSize() should this be allowed up to max value of int64_t? http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/rpc/authentication.cc@1350 PS1, Line 1350: ThriftExternalRpcMaxMessageSize() == max_message_size || : ThriftInternalRpcMaxMessageSize() == max_message_size This DCHECK is repeated in three separate places. Maybe consider making an inline function for it? http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/rpc/thrift-server.cc File be/src/rpc/thrift-server.cc: http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/rpc/thrift-server.cc@288 PS1, Line 288: is_external_facing_(is_external_facing) If is_external_facing_ == false, can we DCHECK if it is legal given specified name_ or port_ ? -- To view, visit http://gerrit.cloudera.org:8080/21420 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9a649ef49a8a99c7bd9a1b73c37c4c621661311 Gerrit-Change-Number: 21420 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 11 May 2024 03:30:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13051: Speed up, refactor query log tests
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21358 ) Change subject: IMPALA-13051: Speed up, refactor query log tests .. Patch Set 9: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/21358/9//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21358/9//COMMIT_MSG@19 PS9, Line 19: : Refactors workload management code to reduce if-clause nesting. nit: Mention new tests added since patch set 6. -- To view, visit http://gerrit.cloudera.org:8080/21358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1127ef041a3e024bf2b262767d56ec5f29bf3855 Gerrit-Change-Number: 21358 Gerrit-PatchSet: 9 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 10 May 2024 21:43:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21383 ) Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. Patch Set 12: Thanks for all of your reviews! -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 12 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 10 May 2024 02:08:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13018: Block push down of conjuncts with implicit casting on base columns for jdbc tables
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21409 ) Change subject: IMPALA-13018: Block push down of conjuncts with implicit casting on base columns for jdbc tables .. Patch Set 4: Code-Review+2 LGTM! -- To view, visit http://gerrit.cloudera.org:8080/21409 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iabd7e28b8d5f11f25a000dc4c9ab65895056b572 Gerrit-Change-Number: 21409 Gerrit-PatchSet: 4 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 10 May 2024 02:04:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13018: Block push down of conjuncts with implicit casting on base columns for jdbc tables
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21409 ) Change subject: IMPALA-13018: Block push down of conjuncts with implicit casting on base columns for jdbc tables .. Patch Set 3: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/21409/3/testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test File testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test: http://gerrit.cloudera.org:8080/#/c/21409/3/testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test@135 PS3, Line 135: : select s_store_id as store_id, nit: Can you put some comment for the new tests? What is the expectation of each test. -- To view, visit http://gerrit.cloudera.org:8080/21409 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iabd7e28b8d5f11f25a000dc4c9ab65895056b572 Gerrit-Change-Number: 21409 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 10 May 2024 01:45:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21383 ) Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. Patch Set 10: Code-Review+2 Carry +2 since change in ps10 is small. -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 10 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 09 May 2024 20:58:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13018: Block push down of conjuncts with implicit casting on base columns for jdbc tables
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21409 ) Change subject: IMPALA-13018: Block push down of conjuncts with implicit casting on base columns for jdbc tables .. Patch Set 2: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/21409/2/fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java File fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java: http://gerrit.cloudera.org:8080/#/c/21409/2/fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java@301 PS2, Line 301: castExpr.getType().isDateOrTimeType() : || castExpr.getCompatibility().isUnsafe() nit: Any new planner test for this? -- To view, visit http://gerrit.cloudera.org:8080/21409 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iabd7e28b8d5f11f25a000dc4c9ab65895056b572 Gerrit-Change-Number: 21409 Gerrit-PatchSet: 2 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 09 May 2024 20:38:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21383 ) Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/21383/9/be/src/util/network-util.cc File be/src/util/network-util.cc: http://gerrit.cloudera.org:8080/#/c/21383/9/be/src/util/network-util.cc@243 PS9, Line 243: if (address.has_uds_address()) t_address.__set_uds_address(address.uds_address()); > Found one issue after this change: Done -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 10 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 09 May 2024 20:32:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Hello Kurt Deschler, Jason Fehr, Csaba Ringhofer, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21383 to look at the new patch set (#10). Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote It is possible to have UpdateFilterFromRemote RPC arrive to an impalad executor before QueryState of the destination query is created or complete initialization. This patch add wait mechanism in UpdateFilterFromRemote RPC endpoint to wait for few miliseconds until QueryState exist and complete initialization. The wait time is fixed at 500ms, with exponential sleep period in between. If wait time passed and QueryState still not found or initialized, UpdateFilterFromRemote RPC is deemed fail and query execution move on without complete filter. Testing: - Add BE tests in network-util-test.cc - Add test_runtime_filter_aggregation.py::TestLateQueryStateInit - Pass exhastive runs of test_runtime_filter_aggregation.py, test_query_live.py, and test_query_log.py Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 --- M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-filter-bank.h M be/src/scheduling/scheduler.cc M be/src/service/data-stream-service.cc M be/src/service/query-state-record.cc M be/src/util/network-util-test.cc M be/src/util/network-util.cc M be/src/util/network-util.h M common/protobuf/data_stream_service.proto M tests/custom_cluster/test_runtime_filter_aggregation.py M tests/util/workload_management.py 14 files changed, 303 insertions(+), 37 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/21383/10 -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 10 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21383 ) Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/21383/9/be/src/util/network-util.cc File be/src/util/network-util.cc: http://gerrit.cloudera.org:8080/#/c/21383/9/be/src/util/network-util.cc@243 PS9, Line 243: if (address.has_uds_address()) t_address.__set_uds_address(address.uds_address()); Found one issue after this change: https://github.com/apache/impala/blob/d1d28c0/be/src/service/query-state-record.cc#L224-L227 For consistency, these lines of code should be replaced with: TNetworkAddress host = FromNetworkAddressPB(b.address()); -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 9 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 09 May 2024 20:14:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21383 ) Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. Patch Set 8: (2 comments) http://gerrit.cloudera.org:8080/#/c/21383/6/be/src/service/data-stream-service.cc File be/src/service/data-stream-service.cc: http://gerrit.cloudera.org:8080/#/c/21383/6/be/src/service/data-stream-service.cc@137 PS6, Line 137: few m > The comment was not update to the new timing Done http://gerrit.cloudera.org:8080/#/c/21383/6/be/src/util/network-util.h File be/src/util/network-util.h: http://gerrit.cloudera.org:8080/#/c/21383/6/be/src/util/network-util.h@119 PS6, Line 119: if (lhs.has_uds_address()) { > This is not consistent now with KrpcAddressEqual, as it doesn't consider ha Updated KrpcAddressEqual to call CompareNetworkAddressPB. -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 8 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 09 May 2024 14:14:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Hello Kurt Deschler, Jason Fehr, Csaba Ringhofer, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21383 to look at the new patch set (#8). Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote It is possible to have UpdateFilterFromRemote RPC arrive to an impalad executor before QueryState of the destination query is created or complete initialization. This patch add wait mechanism in UpdateFilterFromRemote RPC endpoint to wait for few miliseconds until QueryState exist and complete initialization. The wait time is fixed at 500ms, with exponential sleep period in between. If wait time passed and QueryState still not found or initialized, UpdateFilterFromRemote RPC is deemed fail and query execution move on without complete filter. Testing: - Add BE tests in network-util-test.cc - Add test_runtime_filter_aggregation.py::TestLateQueryStateInit - Pass exhastive runs of test_runtime_filter_aggregation.py Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 --- M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-filter-bank.h M be/src/scheduling/scheduler.cc M be/src/service/data-stream-service.cc M be/src/util/network-util-test.cc M be/src/util/network-util.cc M be/src/util/network-util.h M common/protobuf/data_stream_service.proto M tests/custom_cluster/test_runtime_filter_aggregation.py 12 files changed, 297 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/21383/8 -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 8 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Hello Kurt Deschler, Jason Fehr, Csaba Ringhofer, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21383 to look at the new patch set (#7). Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote It is possible to have UpdateFilterFromRemote RPC arrive to an impalad executor before QueryState of the destination query is created or complete initialization. This patch add wait mechanism in UpdateFilterFromRemote RPC endpoint to wait for few miliseconds until QueryState exist and complete initialization. The wait time is fixed at 500ms, with exponential sleep period in between. If wait time passed and QueryState still not found or initialized, UpdateFilterFromRemote RPC is deemed fail and query execution move on without complete filter. Testing: - Add BE tests in network-util-test.cc - Add test_runtime_filter_aggregation.py::TestLateQueryStateInit - Pass exhastive runs of test_runtime_filter_aggregation.py Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 --- M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-filter-bank.h M be/src/scheduling/scheduler.cc M be/src/service/data-stream-service.cc M be/src/util/network-util-test.cc M be/src/util/network-util.cc M be/src/util/network-util.h M common/protobuf/data_stream_service.proto M tests/custom_cluster/test_runtime_filter_aggregation.py 12 files changed, 296 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/21383/7 -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 7 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21383 ) Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/21383/5/be/src/util/network-util.h File be/src/util/network-util.h: http://gerrit.cloudera.org:8080/#/c/21383/5/be/src/util/network-util.h@118 PS5, Line 118: /// Compare function for two NetworkAddressPB. > Please add unit tests in network-util-test.cc either in this change or a fo Done -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 08 May 2024 21:47:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Hello Kurt Deschler, Jason Fehr, Csaba Ringhofer, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21383 to look at the new patch set (#6). Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote It is possible to have UpdateFilterFromRemote RPC arrive to an impalad executor before QueryState of the destination query is created or complete initialization. This patch add wait mechanism in UpdateFilterFromRemote RPC endpoint to wait for few miliseconds until QueryState exist and complete initialization. The wait time is fixed at 500ms, with exponential sleep period in between. If wait time passed and QueryState still not found or initialized, UpdateFilterFromRemote RPC is deemed fail and query execution move on without complete filter. Testing: - Add BE tests in network-util-test.cc - Add test_runtime_filter_aggregation.py::TestLateQueryStateInit - Pass exhastive runs of test_runtime_filter_aggregation.py Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 --- M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-filter-bank.h M be/src/scheduling/scheduler.cc M be/src/service/data-stream-service.cc M be/src/util/network-util-test.cc M be/src/util/network-util.cc M be/src/util/network-util.h M common/protobuf/data_stream_service.proto M tests/custom_cluster/test_runtime_filter_aggregation.py 12 files changed, 295 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/21383/6 -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21383 ) Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/21383/5/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/21383/5/be/src/runtime/query-state.h@328 PS5, Line 328: BackendExecState exec_state = backend_exec_state_; > Shouldn't this be atomic if we need to worry about it's value changing? Or backend_exec_state_ can only transition to terminal state once and wont revert back. This is just to ensure that we are comparing 1 constant value for 3 times below rather than checking against backend_exec_state_ that might be modified in between three checks (say, backend_exec_state_lock_ is changing from EXECUTING to FINISHED when in the middle of backend_exec_state_ == ERROR check). I don't think obtaining backend_exec_state_lock_ is necessary because it will release the lock anyway when applying runtime filter update. Existing codes in query-state.cc also call IsTerminalState() without holding lock. -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 08 May 2024 20:35:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21383 ) Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/21383/4/be/src/scheduling/scheduler.cc File be/src/scheduling/scheduler.cc: http://gerrit.cloudera.org:8080/#/c/21383/4/be/src/scheduling/scheduler.cc@341 PS4, Line 341: if (UNLIKELY(FLAGS_sort_runtime_filter_aggregator_candidates)) { > Would it be worthwhile to wrap this condition with UNLIKELY since the flag Done http://gerrit.cloudera.org:8080/#/c/21383/4/be/src/service/data-stream-service.cc File be/src/service/data-stream-service.cc: http://gerrit.cloudera.org:8080/#/c/21383/4/be/src/service/data-stream-service.cc@159 PS4, Line 159: } else if (qs->is_initialized()) { > Is this condition the more likely to happen? If so, please make it the if It is more likely, but if IsTerminalState() is true, there is no need to process filter update at all. Therefore, it is checked first in the earlier branch. Removed IsCancelled() check since IsTerminalState() already include BackendExecState::CANCELLED checking. http://gerrit.cloudera.org:8080/#/c/21383/4/be/src/util/network-util.h File be/src/util/network-util.h: http://gerrit.cloudera.org:8080/#/c/21383/4/be/src/util/network-util.h@124 PS4, Line 124: if (lhs.has_uds_address()) { > What if only one has_uds_address? Seems like they should be unequal then. Done. -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 08 May 2024 20:17:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Hello Kurt Deschler, Jason Fehr, Csaba Ringhofer, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21383 to look at the new patch set (#5). Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote It is possible to have UpdateFilterFromRemote RPC arrive to an impalad executor before QueryState of the destination query is created or complete initialization. This patch add wait mechanism in UpdateFilterFromRemote RPC endpoint to wait for few miliseconds until QueryState exist and complete initialization. The wait time is fixed at 500ms, with exponential sleep period in between. If wait time passed and QueryState still not found or initialized, UpdateFilterFromRemote RPC is deemed fail and query execution move on without complete filter. Testing: - Add test_runtime_filter_aggregation.py::TestLateQueryStateInit - Pass exhastive runs of test_runtime_filter_aggregation.py Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 --- M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-filter-bank.h M be/src/scheduling/scheduler.cc M be/src/service/data-stream-service.cc M be/src/util/network-util.h M common/protobuf/data_stream_service.proto M tests/custom_cluster/test_runtime_filter_aggregation.py 10 files changed, 190 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/21383/5 -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21383 ) Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. Patch Set 4: (6 comments) http://gerrit.cloudera.org:8080/#/c/21383/3/be/src/scheduling/scheduler.cc File be/src/scheduling/scheduler.cc: http://gerrit.cloudera.org:8080/#/c/21383/3/be/src/scheduling/scheduler.cc@59 PS3, Line 59: DEFINE_bool_hidden(sort_runtime_filter_aggregator_candidates, false, > candidates Done http://gerrit.cloudera.org:8080/#/c/21383/3/be/src/scheduling/scheduler.cc@59 PS3, Line 59: sort_runtime_filter_aggregator_candidate > typo: _candidates Done http://gerrit.cloudera.org:8080/#/c/21383/3/be/src/service/data-stream-service.cc File be/src/service/data-stream-service.cc: http://gerrit.cloudera.org:8080/#/c/21383/3/be/src/service/data-stream-service.cc@143 PS3, Line 143: sleep_duration_ms > Maybe even less than 20ms. Especially if these sleeps can stack with comple Changed to begin with 2ms sleep, and then keep doubling for next iteration. http://gerrit.cloudera.org:8080/#/c/21383/3/be/src/util/network-util.h File be/src/util/network-util.h: http://gerrit.cloudera.org:8080/#/c/21383/3/be/src/util/network-util.h@123 PS3, Line 123: if (comp == 0 && lhs.has_uds_address() && rhs.has_uds_address()) { > Check has_uds_address since it is optional field. Done http://gerrit.cloudera.org:8080/#/c/21383/3/tests/custom_cluster/test_runtime_filter_aggregation.py File tests/custom_cluster/test_runtime_filter_aggregation.py: http://gerrit.cloudera.org:8080/#/c/21383/3/tests/custom_cluster/test_runtime_filter_aggregation.py@89 PS3, Line 89: cls._init_delay] > nit: +2 indentations on broken lines Done http://gerrit.cloudera.org:8080/#/c/21383/3/tests/custom_cluster/test_runtime_filter_aggregation.py@121 PS3, Line 121: all_blocked = 'UpdateFilterFromRemote RPC called with remaining wait time' > Could we check in the profile that the runtime filter did not arrive in the Done -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 08 May 2024 15:28:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Hello Kurt Deschler, Csaba Ringhofer, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21383 to look at the new patch set (#4). Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote It is possible to have UpdateFilterFromRemote RPC arrive to an impalad executor before QueryState of the destination query is created or complete initialization. This patch add wait mechanism in UpdateFilterFromRemote RPC endpoint to wait for few miliseconds until QueryState exist and complete initialization. The wait time is fixed at 500ms, with exponential sleep period in between. If wait time passed and QueryState still not found or initialized, UpdateFilterFromRemote RPC is deemed fail and query execution move on without complete filter. Testing: - Add test_runtime_filter_aggregation.py::TestLateQueryStateInit - Pass exhastive runs of test_runtime_filter_aggregation.py Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 --- M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-filter-bank.h M be/src/scheduling/scheduler.cc M be/src/service/data-stream-service.cc M be/src/util/network-util.h M common/protobuf/data_stream_service.proto M tests/custom_cluster/test_runtime_filter_aggregation.py 10 files changed, 180 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/21383/4 -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13058: Init first arrival time and completion time with -1
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21405 ) Change subject: IMPALA-13058: Init first_arrival_time_ and completion_time_ with -1 .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/21405/4/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/21405/4/be/src/service/client-request-state.cc@123 PS4, Line 123: profile_->AddChild(summary_profile_); > I'd like to revert this back to query_events_->Start(), since it already mo Done. This change still pass test_basic_filters in ARM. -- To view, visit http://gerrit.cloudera.org:8080/21405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1176e2118bb03414ab35049f50009ff0e8c63f58 Gerrit-Change-Number: 21405 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 08 May 2024 04:07:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13058: Init first arrival time and completion time with -1
Hello Wenzhe Zhou, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21405 to look at the new patch set (#5). Change subject: IMPALA-13058: Init first_arrival_time_ and completion_time_ with -1 .. IMPALA-13058: Init first_arrival_time_ and completion_time_ with -1 Impala run over ARM machine shows 'arch_sys_counter' clock source being used rather than more precise 'tsc'. This cause MonotonicStopwatch::Now() to use 'CLOCK_MONOTONIC_COARSE' rather than 'CLOCK_MONOTONIC'. This is what printed near the beginning of impalad log: I0506 13:49:15.429359 355337 init.cc:600] OS distribution: Red Hat Enterprise Linux 8.8 (Ootpa) OS version: Linux version 4.18.0-477.15.1.el8_8.aarch64 ... Clock: clocksource: 'arch_sys_counter', clockid_t: CLOCK_MONOTONIC_COARSE This difference in clock source causes test failure in test_runtime_filters.py::TestRuntimeFilters::test_basic_filters. This patch fixes the issue by initializing first_arrival_time_ and completion_time_ fields of Coordinator::FilterState with -1 and accept 0 as valid value for those fields. query_events_ initialization and start are also moved to the beginning of ClientRequestState's contructor. Testing: - Tweak row_regex pattern in runtime_filters.test. - Loop and pass test_runtime_filters.py in exhaustive mode 3 times in ARM machine. Change-Id: I1176e2118bb03414ab35049f50009ff0e8c63f58 --- M be/src/runtime/coordinator-filter-state.h M be/src/runtime/coordinator.cc M be/src/service/client-request-state.cc M testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test 4 files changed, 21 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/21405/5 -- To view, visit http://gerrit.cloudera.org:8080/21405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1176e2118bb03414ab35049f50009ff0e8c63f58 Gerrit-Change-Number: 21405 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-13058: Init first arrival time and completion time with -1
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21405 ) Change subject: IMPALA-13058: Init first_arrival_time_ and completion_time_ with -1 .. Patch Set 4: (1 comment) > Patch Set 4: Verified-1 > > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10615/ ubuntu-20.04-from-scratch job fail doing apt-get install 15:54:13 + sudo apt-get --yes install git 15:54:13 E: Could not get lock /var/lib/dpkg/lock-frontend. It is held by process 2804 (apt-get) 15:54:13 E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), is another process using it? http://gerrit.cloudera.org:8080/#/c/21405/4/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/21405/4/be/src/service/client-request-state.cc@123 PS4, Line 123: query_events_->Start(query_events_start_time); I'd like to revert this back to query_events_->Start(), since it already moved to beginning of constructor and calling MonotonicStopWatch::Now() twice feels overkill. -- To view, visit http://gerrit.cloudera.org:8080/21405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1176e2118bb03414ab35049f50009ff0e8c63f58 Gerrit-Change-Number: 21405 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 08 May 2024 04:04:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13058: Init first arrival time and completion time with -1
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21405 ) Change subject: IMPALA-13058: Init first_arrival_time_ and completion_time_ with -1 .. Patch Set 3: (4 comments) http://gerrit.cloudera.org:8080/#/c/21405/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21405/2//COMMIT_MSG@19 PS2, Line 19: cause > nit: causes Done http://gerrit.cloudera.org:8080/#/c/21405/2//COMMIT_MSG@21 PS2, Line 21: fix > nit: fixes Done http://gerrit.cloudera.org:8080/#/c/21405/2//COMMIT_MSG@22 PS2, Line 22: wi > nit: are Removed. http://gerrit.cloudera.org:8080/#/c/21405/2//COMMIT_MSG@25 PS2, Line 25: ar > nit: are Done -- To view, visit http://gerrit.cloudera.org:8080/21405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1176e2118bb03414ab35049f50009ff0e8c63f58 Gerrit-Change-Number: 21405 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 07 May 2024 22:42:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13058: Init first arrival time and completion time with -1
Hello Wenzhe Zhou, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21405 to look at the new patch set (#3). Change subject: IMPALA-13058: Init first_arrival_time_ and completion_time_ with -1 .. IMPALA-13058: Init first_arrival_time_ and completion_time_ with -1 Impala run over ARM machine shows 'arch_sys_counter' clock source being used rather than more precise 'tsc'. This cause MonotonicStopwatch::Now() to use 'CLOCK_MONOTONIC_COARSE' rather than 'CLOCK_MONOTONIC'. This is what printed near the beginning of impalad log: I0506 13:49:15.429359 355337 init.cc:600] OS distribution: Red Hat Enterprise Linux 8.8 (Ootpa) OS version: Linux version 4.18.0-477.15.1.el8_8.aarch64 ... Clock: clocksource: 'arch_sys_counter', clockid_t: CLOCK_MONOTONIC_COARSE This difference in clock source causes test failure in test_runtime_filters.py::TestRuntimeFilters::test_basic_filters. This patch fixes the issue by initializing first_arrival_time_ and completion_time_ fields of Coordinator::FilterState with -1 and accept 0 as valid value for those fields. query_events_ initialization and start are also moved to the beginning of ClientRequestState's contructor. Testing: - Tweak row_regex pattern in runtime_filters.test. - Loop and pass test_runtime_filters.py in exhaustive mode 3 times in ARM machine. Change-Id: I1176e2118bb03414ab35049f50009ff0e8c63f58 --- M be/src/runtime/coordinator-filter-state.h M be/src/runtime/coordinator.cc M be/src/service/client-request-state.cc M testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test 4 files changed, 22 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/21405/3 -- To view, visit http://gerrit.cloudera.org:8080/21405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1176e2118bb03414ab35049f50009ff0e8c63f58 Gerrit-Change-Number: 21405 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-13058: Init first arrival time and completion time with -1
Hello Wenzhe Zhou, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21405 to look at the new patch set (#2). Change subject: IMPALA-13058: Init first_arrival_time_ and completion_time_ with -1 .. IMPALA-13058: Init first_arrival_time_ and completion_time_ with -1 Impala run over ARM machine shows 'arch_sys_counter' clock source being used rather than more precise 'tsc'. This cause MonotonicStopwatch::Now() to use 'CLOCK_MONOTONIC_COARSE' rather than 'CLOCK_MONOTONIC'. This is what printed near the beginning of impalad log: I0506 13:49:15.429359 355337 init.cc:600] OS distribution: Red Hat Enterprise Linux 8.8 (Ootpa) OS version: Linux version 4.18.0-477.15.1.el8_8.aarch64 ... Clock: clocksource: 'arch_sys_counter', clockid_t: CLOCK_MONOTONIC_COARSE This difference in clock source cause test failure in test_runtime_filters.py::TestRuntimeFilters::test_basic_filters. This patch fix the issue by initializing first_arrival_time_ and completion_time_ fields of Coordinator::FilterState is with -1 and accept 0 as valid value for those fields. query_events_ initialization and start is also moved to the beginning of ClientRequestState's contructor. Testing: - Tweak row_regex pattern in runtime_filters.test. - Loop and pass test_runtime_filters.py in exhaustive mode 3 times in ARM machine. Change-Id: I1176e2118bb03414ab35049f50009ff0e8c63f58 --- M be/src/runtime/coordinator-filter-state.h M be/src/runtime/coordinator.cc M be/src/service/client-request-state.cc M testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test 4 files changed, 22 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/21405/2 -- To view, visit http://gerrit.cloudera.org:8080/21405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1176e2118bb03414ab35049f50009ff0e8c63f58 Gerrit-Change-Number: 21405 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21383 ) Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/21383/2/be/src/service/data-stream-service.cc File be/src/service/data-stream-service.cc: http://gerrit.cloudera.org:8080/#/c/21383/2/be/src/service/data-stream-service.cc@141 PS2, Line 141: bool query_found = false; > nit: could be useful as a hidden config option. Done http://gerrit.cloudera.org:8080/#/c/21383/2/tests/custom_cluster/test_runtime_filter_aggregation.py File tests/custom_cluster/test_runtime_filter_aggregation.py: http://gerrit.cloudera.org:8080/#/c/21383/2/tests/custom_cluster/test_runtime_filter_aggregation.py@115 PS2, Line 115: #The JOIN BUILD fragment in first and third impalads immediately receive EOS > typo: anc -> and Done -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 07 May 2024 21:53:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Hello Csaba Ringhofer, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21383 to look at the new patch set (#3). Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote It is possible to have UpdateFilterFromRemote RPC arrive to an impalad executor before QueryState of the destination query is created or complete initialization. This patch add wait mechanism in UpdateFilterFromRemote RPC endpoint to wait for few miliseconds until QueryState exist and complete initialization. The wait time is fixed at 500ms. If wait time passed and QueryState still not found or initialized, UpdateFilterFromRemote RPC is deemed fail and query execution move on without complete filter. Testing: - Add test_runtime_filter_aggregation.py::TestLateQueryStateInit - Pass exhastive runs of test_runtime_filter_aggregation.py Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 --- M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-filter-bank.h M be/src/scheduling/scheduler.cc M be/src/service/data-stream-service.cc M be/src/util/network-util.h M common/protobuf/data_stream_service.proto M tests/custom_cluster/test_runtime_filter_aggregation.py 10 files changed, 171 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/21383/3 -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13018: Fix test tpcds q80a for JDBC table
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21409 ) Change subject: IMPALA-13018: Fix test_tpcds_q80a for JDBC table .. Patch Set 1: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/21409/1/fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java File fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java: http://gerrit.cloudera.org:8080/#/c/21409/1/fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java@300 PS1, Line 300: castExpr.getType().isDateOrTimeType() nit: Just want to double check, is CastExpr OK for types other than Date/Timestamp? Or should it be avoided entirely regardless of expression type? -- To view, visit http://gerrit.cloudera.org:8080/21409 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iabd7e28b8d5f11f25a000dc4c9ab65895056b572 Gerrit-Change-Number: 21409 Gerrit-PatchSet: 1 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 07 May 2024 19:12:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13061: Create query live as external table
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21401 ) Change subject: IMPALA-13061: Create query live as external table .. Patch Set 4: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/21401/3/tests/custom_cluster/test_query_live.py File tests/custom_cluster/test_query_live.py: http://gerrit.cloudera.org:8080/#/c/21401/3/tests/custom_cluster/test_query_live.py@124 PS3, Line 124: 'select tables_queried from sys.impala_query_live where query_id = "' : + result.query_id + '"') > Can you add assertion for other table properties of impala_query_live here? Done http://gerrit.cloudera.org:8080/#/c/21401/3/tests/custom_cluster/test_query_live.py@175 PS3, Line 175: "--use_local_catalog=true", : catalogd_args="--enable_workload_mgmt " : > Switched to CREATE EXTERNAL TABLE, I think that makes sense for System Tabl Done -- To view, visit http://gerrit.cloudera.org:8080/21401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie60a2bd03fabc63c85bcd9fa2489e9d47cd2aa65 Gerrit-Change-Number: 21401 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 07 May 2024 19:03:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13051: Speed up, refactor query log tests
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21358 ) Change subject: IMPALA-13051: Speed up, refactor query log tests .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/21358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1127ef041a3e024bf2b262767d56ec5f29bf3855 Gerrit-Change-Number: 21358 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 07 May 2024 18:24:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13058: Init first arrival time and completion time with -1
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21405 ) Change subject: IMPALA-13058: Init first_arrival_time_ and completion_time_ with -1 .. Patch Set 1: > Patch Set 1: > > This patch only help bypassing the flaky test, but does not address the > underlying issue with RuntimeProfile::EventSequence::Start() from: > > be/src/service/client-request-state.cc:140: query_events_->Start(); > Maybe a better solution is to Start query_events_ at start_time_us() query_events_->Start(start_time_us() * 1000); -- To view, visit http://gerrit.cloudera.org:8080/21405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1176e2118bb03414ab35049f50009ff0e8c63f58 Gerrit-Change-Number: 21405 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 07 May 2024 17:25:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13061: Set transactional�lse on query live table
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21401 ) Change subject: IMPALA-13061: Set transactional=false on query live table .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/21401/3/tests/custom_cluster/test_query_live.py File tests/custom_cluster/test_query_live.py: http://gerrit.cloudera.org:8080/#/c/21401/3/tests/custom_cluster/test_query_live.py@124 PS3, Line 124: describe_ext_result = self.execute_query('describe extended sys.impala_query_live') : assert len(describe_ext_result.data) == 82 Can you add assertion for other table properties of impala_query_live here? http://gerrit.cloudera.org:8080/#/c/21401/3/tests/custom_cluster/test_query_live.py@175 PS3, Line 175: esult = self.client.execute("select * from functional.alltypes", : fetch_profile_after_close=True) : Can you also assert that 'transactional'='false' in table properties via "DESCRIBE EXTENDED" query? -- To view, visit http://gerrit.cloudera.org:8080/21401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie60a2bd03fabc63c85bcd9fa2489e9d47cd2aa65 Gerrit-Change-Number: 21401 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 07 May 2024 16:51:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13058: Init first arrival time and completion time with -1
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21405 ) Change subject: IMPALA-13058: Init first_arrival_time_ and completion_time_ with -1 .. Patch Set 1: This patch only help bypassing the flaky test, but does not address the underlying issue with RuntimeProfile::EventSequence::Start() from: be/src/service/client-request-state.cc:140: query_events_->Start(); https://issues.apache.org/jira/browse/IMPALA-13058?focusedCommentId=17844060=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17844060 -- To view, visit http://gerrit.cloudera.org:8080/21405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1176e2118bb03414ab35049f50009ff0e8c63f58 Gerrit-Change-Number: 21405 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 07 May 2024 16:05:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13058: Init first arrival time and completion time with -1
Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21405 Change subject: IMPALA-13058: Init first_arrival_time_ and completion_time_ with -1 .. IMPALA-13058: Init first_arrival_time_ and completion_time_ with -1 Before this patch, first_arrival_time_ and completion_time_ of Coordinator::FilterState is initialized with 0. This patch change the initialization value to -1 and accept 0 as valid value for those fields. Testing: - Tweak row_regex pattern in runtime_filters.test. - Pass test_runtime_filters.py in exhaustive mode. Change-Id: I1176e2118bb03414ab35049f50009ff0e8c63f58 --- M be/src/runtime/coordinator-filter-state.h M be/src/runtime/coordinator.cc M testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test 3 files changed, 11 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/21405/1 -- To view, visit http://gerrit.cloudera.org:8080/21405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I1176e2118bb03414ab35049f50009ff0e8c63f58 Gerrit-Change-Number: 21405 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto
[Impala-ASF-CR] IMPALA-13054: Avoid revisiting children in QueryStateExpanded
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21392 ) Change subject: IMPALA-13054: Avoid revisiting children in QueryStateExpanded .. Patch Set 2: Code-Review+2 LGTM -- To view, visit http://gerrit.cloudera.org:8080/21392 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia36d53e52095d487ffb3a162fc723f1e815bec11 Gerrit-Change-Number: 21392 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 06 May 2024 22:54:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13054: Avoid revisiting children in QueryStateExpanded
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21392 ) Change subject: IMPALA-13054: Avoid revisiting children in QueryStateExpanded .. Patch Set 1: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/21392/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21392/1//COMMIT_MSG@12 PS1, Line 12: GitAllChildren nit: GetAllChildren -- To view, visit http://gerrit.cloudera.org:8080/21392 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia36d53e52095d487ffb3a162fc723f1e815bec11 Gerrit-Change-Number: 21392 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 03 May 2024 01:30:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13051: Speed up, refactor query log tests
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21358 ) Change subject: IMPALA-13051: Speed up, refactor query log tests .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/21358/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21358/1//COMMIT_MSG@6 PS1, Line 6: : IMPALA-13051: Speed up, refa > Will do, this refactor isn't urgent but I wanted to toss up some ideas. Done http://gerrit.cloudera.org:8080/#/c/21358/5/tests/common/custom_cluster_test_suite.py File tests/common/custom_cluster_test_suite.py: http://gerrit.cloudera.org:8080/#/c/21358/5/tests/common/custom_cluster_test_suite.py@236 PS5, Line 236: super(CustomClusterTestSuite, self).teardown_class() Should it be called the very last? I see couple custom cluster tests override teardown_class() -- To view, visit http://gerrit.cloudera.org:8080/21358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1127ef041a3e024bf2b262767d56ec5f29bf3855 Gerrit-Change-Number: 21358 Gerrit-PatchSet: 5 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 03 May 2024 01:20:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21383 ) Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. Patch Set 2: (4 comments) http://gerrit.cloudera.org:8080/#/c/21383/1/be/src/runtime/query-exec-mgr.cc File be/src/runtime/query-exec-mgr.cc: http://gerrit.cloudera.org:8080/#/c/21383/1/be/src/runtime/query-exec-mgr.cc@85 PS1, Line 85: RETURN_IF_ERROR(DebugAction(query_ctx.client_request.query_options.debug_action, > Maybe we need to add a query option to fix those for testing so we can have Changed the test and debug action in ps2. http://gerrit.cloudera.org:8080/#/c/21383/1/be/src/service/data-stream-service.cc File be/src/service/data-stream-service.cc: http://gerrit.cloudera.org:8080/#/c/21383/1/be/src/service/data-stream-service.cc@126 PS1, Line 126: const UpdateFilterParamsPB* req, UpdateFilterResultPB* resp, RpcContext* context) { > It share the same RPC queue with TransmitData (KRPC exchanges). There is a I figured the downside of this approach is that RPC thread will blocked until filter update finally applied or wait period pass. I fix wait period to 500ms to not hold RPC thread for too long. http://gerrit.cloudera.org:8080/#/c/21383/1/be/src/service/data-stream-service.cc@166 PS1, Line 166: } while (total_wait_time < min_wait_time_ms); > Ack. "no longer running" sound better. Done http://gerrit.cloudera.org:8080/#/c/21383/1/tests/query_test/test_runtime_filters.py File tests/query_test/test_runtime_filters.py: http://gerrit.cloudera.org:8080/#/c/21383/1/tests/query_test/test_runtime_filters.py@555 PS1, Line 555: > The longer SLEEP debug action should log error. I'll see if I can add that. Done -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 03 May 2024 00:49:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Hello Csaba Ringhofer, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21383 to look at the new patch set (#2). Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote It is possible to have UpdateFilterFromRemote RPC arrive to an impalad executor before QueryState of the destination query is created or complete initialization. This patch add wait mechanism in UpdateFilterFromRemote RPC endpoint to wait for few miliseconds until QueryState exist and complete initialization. The wait time is fixed at 500ms. If wait time passed and QueryState still not found or initialized, UpdateFilterFromRemote RPC is deemed fail and query execution move on without complete filter. Testing: - Add test_runtime_filter_aggregation.py::TestLateQueryStateInit - Pass exhastive runs of test_runtime_filter_aggregation.py Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 --- M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-filter-bank.h M be/src/scheduling/scheduler.cc M be/src/service/data-stream-service.cc M be/src/util/network-util.h M common/protobuf/data_stream_service.proto M tests/custom_cluster/test_runtime_filter_aggregation.py 10 files changed, 169 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/21383/2 -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21383 ) Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/21383/1/be/src/runtime/query-exec-mgr.cc File be/src/runtime/query-exec-mgr.cc: http://gerrit.cloudera.org:8080/#/c/21383/1/be/src/runtime/query-exec-mgr.cc@85 PS1, Line 85: {FLAGS_krpc_port == 27000 ? "coordinator" : "executor"})); > This seems like an odd way to check when is_coordinator and is_executor exi This test is tricky because: - No fragment instances nor RuntimeFilterBank active until QueryState pass initialization. - In 3 nodes impala minicluster, all of them are is_coordinator and is_executor. Only our py.test code enforce that query execution of EE test go to the first impalad, that is the one with krpc_port=27000 - The test scenario ideally just want to delay the randomly-selected intermediate aggregator node, but we won't be able to determine if this impalad is it or not until QueryState is initialized. That is why SLEEP is injected for all but first impalad. - The test rely on JOIN BUILD assigned in first impalad (27000) to send UpdateFilterFromRemote to either the second or the third that is selected as filter aggregator. This is only possible if this DebugAction does not trigger in the first impalad. But even so, I still find undeterminism happen due to data partitioning schedule. That test scenario that I mention about only work if the build scanner fragment and join build fragment both colocated at the first impalad and build scanner does not exchange to the other two impalads. So far, I have not found any way to make scheduling and exchange direction more deterministic. http://gerrit.cloudera.org:8080/#/c/21383/1/be/src/service/data-stream-service.cc File be/src/service/data-stream-service.cc: http://gerrit.cloudera.org:8080/#/c/21383/1/be/src/service/data-stream-service.cc@126 PS1, Line 126: const UpdateFilterParamsPB* req, UpdateFilterResultPB* resp, RpcContext* context) { > What thread does this run in? Do we need to worry about it blocking a queue It share the same RPC queue with TransmitData (KRPC exchanges). There is a possibility of blocked queue, just as it does with TransmitData. http://gerrit.cloudera.org:8080/#/c/21383/1/be/src/service/data-stream-service.cc@166 PS1, Line 166: PrintId(ProtoToQueryId(req->query_id())), query_found ? "has done" : "not found", > I'm not clear what "Query State for query_id=... has done after N ms" would Ack. "no longer running" sound better. http://gerrit.cloudera.org:8080/#/c/21383/1/tests/query_test/test_runtime_filters.py File tests/query_test/test_runtime_filters.py: http://gerrit.cloudera.org:8080/#/c/21383/1/tests/query_test/test_runtime_filters.py@555 PS1, Line 555: """Test that distributed runtime filter aggregation still works > Can we also test for the log message? assert_impalad_log_contains would wor The longer SLEEP debug action should log error. I'll see if I can add that. -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 01 May 2024 23:30:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21383 Change subject: IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote .. IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote It is possible to have UpdateFilterFromRemote RPC arrive to an impalad executor before QueryState of the destination query is created or complete initialization. This patch add wait mechanism in UpdateFilterFromRemote RPC endpoint to wait for few miliseconds until QueryState exist and complete initialization. The maximum wait time is max between 200ms and the remaining runtime filter wait time from caller's perspective. If maximum wait time passed and QueryState still not found or initialized, UpdateFilterFromRemote RPC is deemed fail and query execution move on without complete filter. Testing: - Add test_runtime_filters.py::TestLateQueryStateInit - Pass exhastive runs of test_runtime_filters.py and test_runtime_filter_aggregation.py Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 --- M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-filter-bank.h M be/src/service/data-stream-service.cc M common/protobuf/data_stream_service.proto M tests/query_test/test_runtime_filters.py 8 files changed, 120 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/21383/1 -- To view, visit http://gerrit.cloudera.org:8080/21383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9 Gerrit-Change-Number: 21383 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto
[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column
Hello Aman Sinha, Kurt Deschler, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21377 to look at the new patch set (#2). Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column .. IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column Impala frontend can not evaluate BETWEEN/NOT BETWEEN predicate directly. It needs to transform a BetweenPredicate into a CompoundPredicate consisting of upper bound and lower bound BinaryPredicate through BetweenToCompoundRule.java. The BinaryPredicate can then be pushed down or rewritten into other form by another expression rewrite rule. However, the selectivity of BetweenPredicate or its derivatives remains unassigned and often collapses with other unknown selectivity predicates to have collective selectivity equals Expr.DEFAULT_SELECTIVITY (0.1). This patch adds a narrow optimization of BetweenPredicate selectivity when the following criteria are met: 1. The BetweenPredicate is bound to a slot reference of a single column of a table. 2. The column type is discrete, such as INTEGER or DATE. 3. The column stats are available. 4. The column is sufficiently unique based on available stats. 5. The BETWEEN/NOT BETWEEN predicate is in good form (lower bound value <= upper bound value). 6. The final calculated selectivity is less than or equal to Expr.DEFAULT_SELECTIVITY. If these criteria are unmet, the Planner will revert to the old behavior, which is letting the selectivity unassigned. This patch calculates the BetweenPredicate selectivity during transformation at BetweenToCompoundRule.java. The selectivity is piggy-backed into the resulting CompoundPredicate and BinaryPredicate as betweenSelectivity_ field, separate from the selectivity_ field. Analyzer.getBoundPredicates() is modified to prioritize the derived BinaryPredicate over ordinary BinaryPredicate in its return value to prevent the derived BinaryPredicate from being eliminated by a matching ordinary BinaryPredicate. Testing: - Add FE tests in ExprCardinalityTest#testBetweenSelectivity, ExprCardinalityTest#testNotBetweenSelectivity, and PlannerTest#testScanCardinality. - Pass core tests. Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/catalog/Type.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java M testdata/workloads/functional-planner/queries/PlannerTest/card-scan.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q20.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q21.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q32.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q37.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q40.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q77.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q80.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q82.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q92.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q94.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q95.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q98.test 24 files changed, 3,846 insertions(+), 3,500 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/21377/2 -- To view, visit http://gerrit.cloudera.org:8080/21377 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10 Gerrit-Change-Number: 21377 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21377 ) Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/21377/1/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java File fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java: http://gerrit.cloudera.org:8080/#/c/21377/1/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java@a165 PS1, Line 165: This is mistakenly removed and should be restored. -- To view, visit http://gerrit.cloudera.org:8080/21377 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10 Gerrit-Change-Number: 21377 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 01 May 2024 16:06:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column
Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21377 Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column .. IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column Impala frontend can not evaluate BETWEEN/NOT BETWEEN predicate directly. It needs to transform a BetweenPredicate into a CompoundPredicate consisting of upper bound and lower bound BinaryPredicate through BetweenToCompoundRule.java. The BinaryPredicate can then be pushed down or rewritten into other form by another expression rewrite rule. However, the selectivity of BetweenPredicate or its derivatives remains unassigned and often collapses with other unknown selectivity predicates to have collective selectivity equals Expr.DEFAULT_SELECTIVITY (0.1). This patch adds a narrow optimization of BetweenPredicate selectivity when the following criteria are met: 1. The BetweenPredicate is bound to a slot reference of a single column of a table. 2. The column type is discrete, such as INTEGER or DATE. 3. The column stats are available. 4. The column is sufficiently unique based on available stats. 5. The BETWEEN/NOT BETWEEN predicate is in good form (lower bound value <= upper bound value). 6. The final calculated selectivity is less than or equal to Expr.DEFAULT_SELECTIVITY. If these criteria are unmet, the Planner will revert to the old behavior, which is letting the selectivity unassigned. This patch calculates the BetweenPredicate selectivity during transformation at BetweenToCompoundRule.java. The selectivity is piggy-backed into the resulting CompoundPredicate and BinaryPredicate as betweenSelectivity_ field, separate from the selectivity_ field. Analyzer.getBoundPredicates() is modified to prioritize the derived BinaryPredicate over ordinary BinaryPredicate in its return value to prevent the derived BinaryPredicate from being eliminated by a matching ordinary BinaryPredicate. Testing: - Add FE tests in ExprCardinalityTest#testBetweenSelectivity, ExprCardinalityTest#testNotBetweenSelectivity, and PlannerTest#testScanCardinality. - Pass core tests. Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/catalog/Type.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java M testdata/workloads/functional-planner/queries/PlannerTest/card-scan.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q20.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q21.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q32.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q37.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q40.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q77.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q80.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q82.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q92.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q94.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q95.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q98.test 24 files changed, 3,846 insertions(+), 3,504 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/21377/1 -- To view, visit http://gerrit.cloudera.org:8080/21377 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10 Gerrit-Change-Number: 21377 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto
[Impala-ASF-CR] IMPALA-13045: Wait for impala query live to exist
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21372 ) Change subject: IMPALA-13045: Wait for impala_query_live to exist .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/21372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5cc3fa3c43be7af9a5f097359a0d4f20d057a207 Gerrit-Change-Number: 21372 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 29 Apr 2024 21:08:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13045: Wait for impala query live to exist
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21372 ) Change subject: IMPALA-13045: Wait for impala_query_live to exist .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/21372/1/tests/custom_cluster/test_query_live.py File tests/custom_cluster/test_query_live.py: http://gerrit.cloudera.org:8080/#/c/21372/1/tests/custom_cluster/test_query_live.py@34 PS1, Line 34: def wait_for_create_table(self, table_name): Should this be an override of setup_method() instead, with table_name fixed to 'sys.impala_query_live'? I'm guessing that this is a common requirement for all tests in TestQueryLive. -- To view, visit http://gerrit.cloudera.org:8080/21372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5cc3fa3c43be7af9a5f097359a0d4f20d057a207 Gerrit-Change-Number: 21372 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 29 Apr 2024 20:05:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13012: Lower default query log max queued
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21351 ) Change subject: IMPALA-13012: Lower default query_log_max_queued .. Patch Set 8: Code-Review+2 Did another pass. Looks OK to go. Changed my vote to +2. -- To view, visit http://gerrit.cloudera.org:8080/21351 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6535675307d88cb65ba7d908f3c692e0cf3259d7 Gerrit-Change-Number: 21351 Gerrit-PatchSet: 8 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 26 Apr 2024 23:45:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12997: Use graceful shutdown for query log tests
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21345 ) Change subject: IMPALA-12997: Use graceful shutdown for query log tests .. Patch Set 12: Code-Review+2 Looks good. Thanks for digging into it! -- To view, visit http://gerrit.cloudera.org:8080/21345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia123c53a952a77ff4a9c02736b5717ccaa3566dc Gerrit-Change-Number: 21345 Gerrit-PatchSet: 12 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 26 Apr 2024 22:12:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12997: Use graceful shutdown for query log tests
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21345 ) Change subject: IMPALA-12997: Use graceful shutdown for query log tests .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/21345/10/be/src/scheduling/cluster-membership-mgr.cc File be/src/scheduling/cluster-membership-mgr.cc: http://gerrit.cloudera.org:8080/#/c/21345/10/be/src/scheduling/cluster-membership-mgr.cc@399 PS10, Line 399: // Local backend may not have been added in a dedicated coordinator. What is "Local backend" here? Different impalad in the same host, or This impalad when it is not a dedicated coordinator? Can we add DCHECK to confirm that this is indeed an impalad configured as dedicated coordinator, or set expect_exists param in RemoveExecutorAndGroup to false if and only if this is a dedicated coordinator? -- To view, visit http://gerrit.cloudera.org:8080/21345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia123c53a952a77ff4a9c02736b5717ccaa3566dc Gerrit-Change-Number: 21345 Gerrit-PatchSet: 10 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 26 Apr 2024 21:29:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13012: Lower default query log max queued
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21351 ) Change subject: IMPALA-13012: Lower default query_log_max_queued .. Patch Set 7: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21351 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6535675307d88cb65ba7d908f3c692e0cf3259d7 Gerrit-Change-Number: 21351 Gerrit-PatchSet: 7 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 26 Apr 2024 19:09:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13005: Create Query Live table in HMS
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21302 ) Change subject: IMPALA-13005: Create Query Live table in HMS .. Patch Set 15: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/21302/15/tests/custom_cluster/test_query_live.py File tests/custom_cluster/test_query_live.py: http://gerrit.cloudera.org:8080/#/c/21302/15/tests/custom_cluster/test_query_live.py@154 PS15, Line 154: # Drop table at the end, it's only recreated on impalad startup. nit: I wonder what will happen in multi-coordinator cluster if this table is dropped from one coordinator and accessed from the other. If that is something that we need to address in the future (ie., prevent it from being dropped), please leave TODO comment. Otherwise, LGTM. -- To view, visit http://gerrit.cloudera.org:8080/21302 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idf302ee54a819fdee2db0ae582a5eeddffe4a5b4 Gerrit-Change-Number: 21302 Gerrit-PatchSet: 15 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 26 Apr 2024 19:05:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13024: Ignore slots if using default pool and empty group
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21340 ) Change subject: IMPALA-13024: Ignore slots if using default pool and empty group .. Patch Set 8: Code-Review+2 Carry +2. -- To view, visit http://gerrit.cloudera.org:8080/21340 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0b08dea7ba0c78ac6b98c7a0b148df8fb036c4d0 Gerrit-Change-Number: 21340 Gerrit-PatchSet: 8 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 26 Apr 2024 16:12:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13024: Ignore slots if using default pool and empty group
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21340 ) Change subject: IMPALA-13024: Ignore slots if using default pool and empty group .. Patch Set 8: (2 comments) http://gerrit.cloudera.org:8080/#/c/21340/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21340/7//COMMIT_MSG@13 PS7, Line 13: only)"). This patch adds check to recognize coordinator-only query in > nits: Done http://gerrit.cloudera.org:8080/#/c/21340/7/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/21340/7/be/src/scheduling/admission-controller.cc@998 PS7, Line 998: // Can't admit if: : // (a) There are already queued requests (and this is not admitting from the queue). : // (b) The resource pool is already at the maximum number of requests. : // (c) One of the executors or coordinator in 'schedule' is already at its maximum : // admission slots (when not using the default executor group). : // (d) There are not enough memory resources available for the query. > The comment could probably also be updated: Done -- To view, visit http://gerrit.cloudera.org:8080/21340 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0b08dea7ba0c78ac6b98c7a0b148df8fb036c4d0 Gerrit-Change-Number: 21340 Gerrit-PatchSet: 8 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 26 Apr 2024 16:04:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13024: Ignore slots if using default pool and empty group
Hello Abhishek Rawat, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21340 to look at the new patch set (#8). Change subject: IMPALA-13024: Ignore slots if using default pool and empty group .. IMPALA-13024: Ignore slots if using default pool and empty group Slot based admission should not be enabled when using default pool. There is a bug where coordinator-only query still does slot based admission because executor group name set to ClusterMembershipMgr::EMPTY_GROUP_NAME ("empty group (using coordinator only)"). This patch adds check to recognize coordinator-only query in default pool and skip slot based admission for it. Testing: - Add BE test AdmissionControllerTest.CanAdmitRequestSlotsDefault. - In test_executor_groups.py, split test_coordinator_concurrency to test_coordinator_concurrency_default and test_coordinator_concurrency_two_exec_group_cluster to show the behavior change. - Pass core tests in ASAN build. Change-Id: I0b08dea7ba0c78ac6b98c7a0b148df8fb036c4d0 --- M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/scheduling/cluster-membership-mgr.cc M be/src/scheduling/cluster-membership-mgr.h M be/src/scheduling/request-pool-service.cc M be/src/scheduling/request-pool-service.h M tests/common/impala_connection.py M tests/custom_cluster/test_executor_groups.py 9 files changed, 128 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/21340/8 -- To view, visit http://gerrit.cloudera.org:8080/21340 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0b08dea7ba0c78ac6b98c7a0b148df8fb036c4d0 Gerrit-Change-Number: 21340 Gerrit-PatchSet: 8 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-13012: Lower default query log max queued
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21351 ) Change subject: IMPALA-13012: Lower default query_log_max_queued .. Patch Set 6: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21351 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6535675307d88cb65ba7d908f3c692e0cf3259d7 Gerrit-Change-Number: 21351 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 26 Apr 2024 00:45:23 + Gerrit-HasComments: No