[Impala-ASF-CR] IMPALA-10860: Allow setting mem_limit for coordinators
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/20378 ) Change subject: IMPALA-10860: Allow setting mem_limit for coordinators .. Patch Set 2: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/20378/2/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/20378/2/common/thrift/ImpalaService.thrift@839 PS2, Line 839: unspecified or a limit of 0 nit: Unspecified or a limit of 0 or negative value -- To view, visit http://gerrit.cloudera.org:8080/20378 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2dfc9a735e82dce2fd903bdaf6bc2e46e982ef8c Gerrit-Change-Number: 20378 Gerrit-PatchSet: 2 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 18 Aug 2023 23:09:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12385: Enable Periodic metrics by default
Kurt Deschler has posted comments on this change. ( http://gerrit.cloudera.org:8080/20377 ) Change subject: IMPALA-12385: Enable Periodic metrics by default .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/20377/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20377/1//COMMIT_MSG@12 PS1, Line 12: resource_trace_ratio to 1 > AFAIK, there is a pretty significant overhead on always sampling this metri I didn't see any significant overhead, even with sampling at 10ms. Can you please provide an example of a query that is slower? http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/runtime/query-state.cc@221 PS1, Line 221: AddSamplingTimeSeriesCounter > Will this cause interpretation problem if different host happen to resize i The code appears to handle this already. Note that SamplingTimeSeriesCounter is already being used for Fragment metrics. http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/periodic-counter-updater.cc File be/src/util/periodic-counter-updater.cc: http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/periodic-counter-updater.cc@30 PS1, Line 30: periodic_counter_update_period_ms, 50 > I'm a bit concern about lowering this to 10x. Can the code in PeriodicCount 50ms doesn't appear to create performance issues with single-user queries. I will test with concurrent queries. Even at 100ms, values are too far apart for detailed analysis of short queries. http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/runtime-profile-counters.h@807 PS1, Line 807: typedef StreamingSampler StreamingCounterSampler; > If initial_period = 50ms, and MAX_SAMPLES = 64, that means it will take 320 Queries on the order of 1sec were not affected. I will test more with shorter queries. 
http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/streaming-sampler.h File be/src/util/streaming-sampler.h: http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/streaming-sampler.h@40 PS1, Line 40: int initial_period > I'd rather keep this default to 500, but then add new parameter in AddSampl Memory and thread usage need to use the lower interval for short-running queries. I can understand adding a different switch to preserve the 500ms default for KRPC. -- To view, visit http://gerrit.cloudera.org:8080/20377 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic8e5cbfd4b324081158574ceb8f4b3a062a69fd1 Gerrit-Change-Number: 20377 Gerrit-PatchSet: 1 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Surya Hebbar Gerrit-Comment-Date: Fri, 18 Aug 2023 22:11:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10860: Allow setting mem_limit for coordinators
Abhishek Rawat has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/20378 ) Change subject: IMPALA-10860: Allow setting mem_limit for coordinators .. IMPALA-10860: Allow setting mem_limit for coordinators Added support for MEM_LIMIT_COORDINATORS query option. This is similar to the existing MEM_LIMIT_EXECUTORS, but applies to coordinators. There are cases where Planner generates inaccurate estimates for coordinator fragments, and it would be good to be able to set a mem limit just for the coordinator, since a query's memory requirement on the coordinator tends to be much lower than that on executors. If MEM_LIMIT is set, then MEM_LIMIT_COORDINATORS is ignored. Also updated the documentation for the new query option. Testing: - Added new custom cluster tests which validate that MEM_LIMIT_COORDINATORS applies only on the coordinator. The test also validates that both MEM_LIMIT_EXECUTORS and MEM_LIMIT_COORDINATORS can be set together. - Built docs and made sure that the new changes have proper formatting. Change-Id: I2dfc9a735e82dce2fd903bdaf6bc2e46e982ef8c --- M be/src/scheduling/schedule-state.cc M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M docs/topics/impala_mem_limit.xml M tests/custom_cluster/test_admission_controller.py 8 files changed, 90 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/20378/2 -- To view, visit http://gerrit.cloudera.org:8080/20378 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2dfc9a735e82dce2fd903bdaf6bc2e46e982ef8c Gerrit-Change-Number: 20378 Gerrit-PatchSet: 2 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins
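The precedence described in the commit message above can be illustrated with a small sketch. This is not the code from schedule-state.cc; the function name and signature are hypothetical. It assumes that MEM_LIMIT also takes precedence over MEM_LIMIT_EXECUTORS (by analogy with the stated rule for coordinators), and that an unset, zero, or negative value counts as "not specified", as the review comment on ImpalaService.thrift suggests.

```python
from typing import Optional


def _is_set(value: Optional[int]) -> bool:
    # Assumption: unset, zero, and negative values all mean "no limit given".
    return value is not None and value > 0


def effective_mem_limit(is_coordinator: bool,
                        mem_limit: Optional[int] = None,
                        mem_limit_executors: Optional[int] = None,
                        mem_limit_coordinators: Optional[int] = None) -> Optional[int]:
    """Hypothetical helper: pick the per-backend limit in bytes, or None."""
    # MEM_LIMIT wins whenever it is set; the role-specific options are ignored.
    if _is_set(mem_limit):
        return mem_limit
    # Otherwise each role looks only at its own option.
    role_limit = mem_limit_coordinators if is_coordinator else mem_limit_executors
    return role_limit if _is_set(role_limit) else None
```

With both role-specific options set and MEM_LIMIT unset, a coordinator and an executor each get their own value, which matches the "can be set together" case the new test covers.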
[Impala-ASF-CR] [tools] Simplify local toolchain development
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20267 ) Change subject: [tools] Simplify local toolchain development .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/13783/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20267 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3a9e51b7f54c738d8cc01b32428ac88a344de376 Gerrit-Change-Number: 20267 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Fri, 18 Aug 2023 21:37:30 + Gerrit-HasComments: No
[Impala-ASF-CR] Specify the native toolchain revision exactly
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/19885 ) Change subject: Specify the native toolchain revision exactly .. Patch Set 2: https://gerrit.cloudera.org/c/20267/ now incorporates this change. -- To view, visit http://gerrit.cloudera.org:8080/19885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib1cb3476fb0c0fdc020782fd543abc3aaa2873a9 Gerrit-Change-Number: 19885 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Fri, 18 Aug 2023 21:14:15 + Gerrit-HasComments: No
[Impala-ASF-CR] [tools] Simplify local toolchain development
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/20267 ) Change subject: [tools] Simplify local toolchain development .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/20267/3/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/20267/3/bin/impala-config.sh@319 PS3, Line 319: fi > I was thinking it might be nice to combine this with some of the logic for Done. I also rolled in https://gerrit.cloudera.org/c/19885/ as part of setting up NATIVE_TOOLCHAIN_HOME with the correct ref. -- To view, visit http://gerrit.cloudera.org:8080/20267 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3a9e51b7f54c738d8cc01b32428ac88a344de376 Gerrit-Change-Number: 20267 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Fri, 18 Aug 2023 21:12:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [tools] Simplify local toolchain development
Hello Laszlo Gaal, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20267 to look at the new patch set (#4). Change subject: [tools] Simplify local toolchain development .. [tools] Simplify local toolchain development If NATIVE_TOOLCHAIN_HOME is set, that will be used to provide the native toolchain instead of the default in IMPALA_TOOLCHAIN. Overrides IMPALA_TOOLCHAIN_PACKAGES_HOME and sets SKIP_TOOLCHAIN_BOOTSTRAP=true. Adds IMPALA_TOOLCHAIN_REPO, IMPALA_TOOLCHAIN_BRANCH, and IMPALA_TOOLCHAIN_COMMIT_HASH so everything is clear about what toolchain is used for this Impala commit. Also skips downloading Kudu if SKIP_TOOLCHAIN_BOOTSTRAP is true as Kudu is built from native-toolchain. Normalizes aarch64 logic, which skipped Kudu because it would always build native-toolchain locally. Change-Id: I3a9e51b7f54c738d8cc01b32428ac88a344de376 --- M bin/bootstrap_system.sh M bin/bootstrap_toolchain.py M bin/impala-config.sh M buildall.sh 4 files changed, 40 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/20267/4 -- To view, visit http://gerrit.cloudera.org:8080/20267 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3a9e51b7f54c738d8cc01b32428ac88a344de376 Gerrit-Change-Number: 20267 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith
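The override behaviour described above can be sketched as follows. The real logic lives in the shell script bin/impala-config.sh; this Python version only models the decision. The function names are hypothetical, and the `build` subdirectory used for IMPALA_TOOLCHAIN_PACKAGES_HOME is an assumption, since the message does not say which path is assigned.

```python
import posixpath
from typing import Dict


def resolve_toolchain_env(env: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical model of the NATIVE_TOOLCHAIN_HOME override."""
    resolved = dict(env)
    native_home = env.get("NATIVE_TOOLCHAIN_HOME")
    if native_home:
        # Assumption: packages are taken from a "build" directory under the
        # local native-toolchain checkout; the actual path is not stated.
        resolved["IMPALA_TOOLCHAIN_PACKAGES_HOME"] = posixpath.join(native_home, "build")
        resolved["SKIP_TOOLCHAIN_BOOTSTRAP"] = "true"
    return resolved


def should_download_kudu(env: Dict[str, str]) -> bool:
    # Kudu is built by native-toolchain, so skip the download when the
    # bootstrap step is skipped.
    return env.get("SKIP_TOOLCHAIN_BOOTSTRAP") != "true"
```

Keeping the decision in one place mirrors the intent of the change: a single variable switches the build to a locally built toolchain.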
[Impala-ASF-CR] [tools] Add Dev Container support for Impala development.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20380 ) Change subject: [tools] Add Dev Container support for Impala development. .. Patch Set 1: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/13782/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/20380 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I50508a09710641ec2a299b001fef3e7fefb0b7d5 Gerrit-Change-Number: 20380 Gerrit-PatchSet: 1 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Fri, 18 Aug 2023 20:58:43 + Gerrit-HasComments: No
[Impala-ASF-CR] [tools] Add Dev Container support for Impala development.
Fredy Wijaya has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20380 Change subject: [tools] Add Dev Container support for Impala development. .. [tools] Add Dev Container support for Impala development. Currently only VS Code is supported since IntelliJ/CLion support for Dev Container is still beta at the time of this writing. To use it, simply open Impala source code. $ git clone https://github.com/apache/impala.git $ cd impala $ code . The bootstrap_development.sh will be automatically executed post Docker container creation and all necessary extensions for an IDE-like experience will be automatically installed. For C++, it uses clangd that uses compilation database instead of the Microsoft C++ extension since it works better with Clang related tools. Change-Id: I50508a09710641ec2a299b001fef3e7fefb0b7d5 --- A .devcontainer/Dockerfile A .devcontainer/devcontainer.json 2 files changed, 32 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/20380/1 -- To view, visit http://gerrit.cloudera.org:8080/20380 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I50508a09710641ec2a299b001fef3e7fefb0b7d5 Gerrit-Change-Number: 20380 Gerrit-PatchSet: 1 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal
[Impala-ASF-CR] IMPALA-11957: Implement Regression functions: regr_slope(), regr_intercept() and regr_r2()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19569 ) Change subject: IMPALA-11957: Implement Regression functions: regr_slope(), regr_intercept() and regr_r2() .. Patch Set 19: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/13781/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19569 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iab6bd84ae3e0c02ec924c30183308123b951caa3 Gerrit-Change-Number: 19569 Gerrit-PatchSet: 19 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 18 Aug 2023 20:35:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10860: Allow setting mem_limit for coordinators
Abhishek Rawat has posted comments on this change. ( http://gerrit.cloudera.org:8080/20378 ) Change subject: IMPALA-10860: Allow setting mem_limit for coordinators .. Patch Set 1: (1 comment) Make lint happy. http://gerrit.cloudera.org:8080/#/c/20378/1/be/src/scheduling/schedule-state.cc File be/src/scheduling/schedule-state.cc: http://gerrit.cloudera.org:8080/#/c/20378/1/be/src/scheduling/schedule-state.cc@319 PS1, Line 319: const bool is_mem_limit_coordinators_set = query_options().__isset.mem_limit_coordinators > line too long (91 > 90) Done -- To view, visit http://gerrit.cloudera.org:8080/20378 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2dfc9a735e82dce2fd903bdaf6bc2e46e982ef8c Gerrit-Change-Number: 20378 Gerrit-PatchSet: 1 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 18 Aug 2023 20:14:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11957: Implement Regression functions: regr_slope(), regr_intercept() and regr_r2()
pranav.lo...@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/19569 ) Change subject: IMPALA-11957: Implement Regression functions: regr_slope(), regr_intercept() and regr_r2() .. Patch Set 19: (5 comments) > Patch Set 18: > > (5 comments) http://gerrit.cloudera.org:8080/#/c/19569/16//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19569/16//COMMIT_MSG@20 PS16, Line 20: f > Why is it needed? Why should the line be indented one space? Sorry, I misunderstood; I thought a space would also be needed. Done! http://gerrit.cloudera.org:8080/#/c/19569/18/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/19569/18/be/src/exprs/aggregate-functions-ir.cc@316 PS18, Line 316: AllocBuffer(ctx, dst, dst->len); > You should initialise 'dst' with AllocBuffer(), otherwise 'dst' will not be Done http://gerrit.cloudera.org:8080/#/c/19569/18/be/src/exprs/aggregate-functions-ir.cc@510 PS18, Line 510: AllocBuffer(ctx, dst, dst->len); > See L316. Done http://gerrit.cloudera.org:8080/#/c/19569/18/be/src/exprs/aggregate-functions-ir.cc@516 PS18, Line 516: } > Nit: still empty line. Done http://gerrit.cloudera.org:8080/#/c/19569/18/be/src/exprs/aggregate-functions-ir.cc@748 PS18, Line 748: if (UNLIKELY(dst->is_null)) { > See L316. Done -- To view, visit http://gerrit.cloudera.org:8080/19569 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iab6bd84ae3e0c02ec924c30183308123b951caa3 Gerrit-Change-Number: 19569 Gerrit-PatchSet: 19 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 18 Aug 2023 20:10:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11957: Implement Regression functions: regr_slope(), regr_intercept() and regr_r2()
pranav.lo...@cloudera.com has uploaded a new patch set (#19). ( http://gerrit.cloudera.org:8080/19569 ) Change subject: IMPALA-11957: Implement Regression functions: regr_slope(), regr_intercept() and regr_r2() .. IMPALA-11957: Implement Regression functions: regr_slope(), regr_intercept() and regr_r2() The linear regression functions fit an ordinary-least-squares regression line to a set of number pairs. They can be used both as aggregate and analytic functions. regr_slope() takes two arguments of numeric type and returns the slope of the line. regr_intercept() takes two arguments of numeric type and returns the y-intercept of the regression line. regr_r2() takes two arguments of numeric type and returns the coefficient of determination (also called R-squared or goodness of fit) for the regression. Testing: The functions are extensively tested and cross-checked with Hive. The tests can be found in aggregation.test. Change-Id: Iab6bd84ae3e0c02ec924c30183308123b951caa3 --- M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/aggregate-functions.h M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M testdata/workloads/functional-query/queries/QueryTest/aggregation.test 4 files changed, 988 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/19569/19 -- To view, visit http://gerrit.cloudera.org:8080/19569 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iab6bd84ae3e0c02ec924c30183308123b951caa3 Gerrit-Change-Number: 19569 Gerrit-PatchSet: 19 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang
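As a reference for what the three functions compute, here is a minimal sketch of ordinary-least-squares regression over (y, x) pairs. It is not the Impala implementation in aggregate-functions-ir.cc. It follows the usual SQL convention that the dependent variable comes first, skips pairs where either value is NULL (None here), and uses the edge cases from the Oracle definition that the patch links to; the helper names are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Num = Optional[float]


def _moments(ys: Sequence[Num], xs: Sequence[Num]) -> Optional[Tuple[float, float, float, float, float]]:
    # Keep only pairs where both values are non-NULL.
    pairs = [(y, x) for y, x in zip(ys, xs) if y is not None and x is not None]
    n = len(pairs)
    if n == 0:
        return None
    mean_y = sum(y for y, _ in pairs) / n
    mean_x = sum(x for _, x in pairs) / n
    var_y = sum((y - mean_y) ** 2 for y, _ in pairs) / n   # population variance of y
    var_x = sum((x - mean_x) ** 2 for _, x in pairs) / n   # population variance of x
    covar = sum((y - mean_y) * (x - mean_x) for y, x in pairs) / n
    return mean_y, mean_x, var_y, var_x, covar


def regr_slope(ys: Sequence[Num], xs: Sequence[Num]) -> Optional[float]:
    m = _moments(ys, xs)
    if m is None or m[3] == 0:
        return None  # no pairs, or x is constant: slope is undefined
    return m[4] / m[3]


def regr_intercept(ys: Sequence[Num], xs: Sequence[Num]) -> Optional[float]:
    m = _moments(ys, xs)
    if m is None or m[3] == 0:
        return None
    return m[0] - (m[4] / m[3]) * m[1]


def regr_r2(ys: Sequence[Num], xs: Sequence[Num]) -> Optional[float]:
    m = _moments(ys, xs)
    if m is None or m[3] == 0:
        return None
    if m[2] == 0:
        return 1.0  # y is constant and x is not: a perfect (flat) fit
    return (m[4] ** 2) / (m[3] * m[2])
```

For example, `regr_slope([2, 4, 6], [1, 2, 3])` fits the line y = 2x, so the slope is 2, the intercept is 0, and the coefficient of determination is 1 (up to floating-point rounding).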
[Impala-ASF-CR] IMPALA-11957: Implement Regression functions: regr_slope(), regr_intercept() and regr_r2()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19569 ) Change subject: IMPALA-11957: Implement Regression functions: regr_slope(), regr_intercept() and regr_r2() .. Patch Set 19: (1 comment) http://gerrit.cloudera.org:8080/#/c/19569/19/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/19569/19/be/src/exprs/aggregate-functions-ir.cc@298 PS19, Line 298: // https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/REGR_-Linear-Regression-Functions.html#GUID-A675B68F-2A88-4843-BE2C-FCDE9C65F9A9 line too long (151 > 90) -- To view, visit http://gerrit.cloudera.org:8080/19569 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iab6bd84ae3e0c02ec924c30183308123b951caa3 Gerrit-Change-Number: 19569 Gerrit-PatchSet: 19 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 18 Aug 2023 20:11:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12253: Improve the integration of codegen cache and async codegen
Michael Smith has removed a vote on this change. Change subject: IMPALA-12253: Improve the integration of codegen cache and async codegen .. Removed Code-Review+1 by Michael Smith -- To view, visit http://gerrit.cloudera.org:8080/20211 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: Ic5ae4b342ff8ef1c3b7ce35c927baa8b59d72908 Gerrit-Change-Number: 20211 Gerrit-PatchSet: 2 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu
[Impala-ASF-CR] IMPALA-12253: Improve the integration of codegen cache and async codegen
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/20211 ) Change subject: IMPALA-12253: Improve the integration of codegen cache and async codegen .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/20211/1/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: http://gerrit.cloudera.org:8080/#/c/20211/1/be/src/codegen/llvm-codegen.cc@1384 PS1, Line 1384: DCHECK(cache_key_); > '!cache_key_->empty()' is not checked in the new version, is it? Good catch, I was unclear what part I was referring to and missed the removal of ->empty on review. -- To view, visit http://gerrit.cloudera.org:8080/20211 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic5ae4b342ff8ef1c3b7ce35c927baa8b59d72908 Gerrit-Change-Number: 20211 Gerrit-PatchSet: 2 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Fri, 18 Aug 2023 19:56:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12383: Fix SingleNodePlanner aggregation limits
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20379 ) Change subject: IMPALA-12383: Fix SingleNodePlanner aggregation limits .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/13780/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20379 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic5eec1190e8e182152aa954897b79cc3f219c816 Gerrit-Change-Number: 20379 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 18 Aug 2023 19:10:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12383: Fix SingleNodePlanner aggregation limits
Michael Smith has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20379 Change subject: IMPALA-12383: Fix SingleNodePlanner aggregation limits .. IMPALA-12383: Fix SingleNodePlanner aggregation limits When IMPALA-2581 was implemented, it assumed all aggregation nodes would have a pre-aggregation step that limits could be pushed to. That's not the case when using SingleNodePlanner, such as when num_nodes=1. For example, the following query would return 16 rows, not 10: set num_nodes=1; select distinct l_orderkey from tpch.lineitem limit 10; Identifies all aggregation nodes that use pre-aggregation so we use fast_limit_check in only those cases. Testing: - added a test case where we assert number of rows returned by an aggregation node (rather than an exchange or top-n). - restores running with num_nodes=0 and num_nodes=1 for default test dimensions; IMPALA-561 was fixed ages ago. Change-Id: Ic5eec1190e8e182152aa954897b79cc3f219c816 --- M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M tests/common/test_dimensions.py M tests/query_test/test_aggregation.py 4 files changed, 27 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/20379/1 -- To view, visit http://gerrit.cloudera.org:8080/20379 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ic5eec1190e8e182152aa954897b79cc3f219c816 Gerrit-Change-Number: 20379 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Smith
[Impala-ASF-CR] IMPALA-10860: Allow setting mem_limit for coordinators
Abhishek Rawat has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20378 Change subject: IMPALA-10860: Allow setting mem_limit for coordinators .. IMPALA-10860: Allow setting mem_limit for coordinators Added support for MEM_LIMIT_COORDINATORS query option. This is similar to the existing MEM_LIMIT_EXECUTORS, but applies to coordinators. There are cases where Planner generates inaccurate estimates for coordinator fragments, and it would be good to be able to set a mem limit just for the coordinator, since a query's memory requirement on the coordinator tends to be much lower than that on executors. If MEM_LIMIT is set, then MEM_LIMIT_COORDINATORS is ignored. Also updated the documentation for the new query option. Testing: - Added new custom cluster tests which validate that MEM_LIMIT_COORDINATORS applies only on the coordinator. The test also validates that both MEM_LIMIT_EXECUTORS and MEM_LIMIT_COORDINATORS can be set together. - Built docs and made sure that the new changes have proper formatting. Change-Id: I2dfc9a735e82dce2fd903bdaf6bc2e46e982ef8c --- M be/src/scheduling/schedule-state.cc M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M docs/topics/impala_mem_limit.xml M tests/custom_cluster/test_admission_controller.py 8 files changed, 89 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/20378/1 -- To view, visit http://gerrit.cloudera.org:8080/20378 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I2dfc9a735e82dce2fd903bdaf6bc2e46e982ef8c Gerrit-Change-Number: 20378 Gerrit-PatchSet: 1 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-10860: Allow setting mem_limit for coordinators
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20378 ) Change subject: IMPALA-10860: Allow setting mem_limit for coordinators .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/20378/1/be/src/scheduling/schedule-state.cc File be/src/scheduling/schedule-state.cc: http://gerrit.cloudera.org:8080/#/c/20378/1/be/src/scheduling/schedule-state.cc@319 PS1, Line 319: const bool is_mem_limit_coordinators_set = query_options().__isset.mem_limit_coordinators line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/20378 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2dfc9a735e82dce2fd903bdaf6bc2e46e982ef8c Gerrit-Change-Number: 20378 Gerrit-PatchSet: 1 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 18 Aug 2023 17:54:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12384: Restore NullLiteral's uncheckedCastTo function signature
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/20376 ) Change subject: IMPALA-12384: Restore NullLiteral's uncheckedCastTo function signature .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/20376/1/fe/src/main/java/org/apache/impala/analysis/NullLiteral.java File fe/src/main/java/org/apache/impala/analysis/NullLiteral.java: http://gerrit.cloudera.org:8080/#/c/20376/1/fe/src/main/java/org/apache/impala/analysis/NullLiteral.java@68 PS1, Line 68: protected Expr uncheckedCastTo(Type targetType) { Do we also need to implement uncheckedCastTo with the compatibility argument? If that function's called elsewhere it seems like it could return the wrong result. I was thinking the change would be to override both. -- To view, visit http://gerrit.cloudera.org:8080/20376 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id9c01129d3cdcaeb222ea910521704ce2305fd2e Gerrit-Change-Number: 20376 Gerrit-PatchSet: 1 Gerrit-Owner: Peter Rozsa Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Peter Rozsa Gerrit-Comment-Date: Fri, 18 Aug 2023 17:19:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12228: Simulate the failure of an iceberg transaction.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20306 ) Change subject: IMPALA-12228: Simulate the failure of an iceberg transaction. .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/20306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafdacc3377a0a946ead9331ba6a63a1fbb11f0eb Gerrit-Change-Number: 20306 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 Aug 2023 16:42:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12228: Simulate the failure of an iceberg transaction.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/20306 ) Change subject: IMPALA-12228: Simulate the failure of an iceberg transaction. .. IMPALA-12228: Simulate the failure of an iceberg transaction. The commit of an Iceberg transaction is done by the Iceberg catalog. In the common case for Impala the Iceberg catalog is HiveCatalog, and the actual commit is performed by HMS. This means the commit could fail because of some activity outside of Impala. It is useful therefore to be able to simulate what happens when an Iceberg commit fails. Extend Java DebugAction to allow it to throw an exception. For now this is limited to throwing unchecked exceptions, which is all that is needed for this patch. Add two DebugActions that can be used to throw Iceberg CommitFailedExceptions at the point where the Iceberg transaction is about to commit. Add a new test that uses the new DebugActions to abort an insert and the addition of a column. Change-Id: Iafdacc3377a0a946ead9331ba6a63a1fbb11f0eb Reviewed-on: http://gerrit.cloudera.org:8080/20306 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/DebugUtils.java M fe/src/test/java/org/apache/impala/util/DebugUtilsTest.java M tests/query_test/test_iceberg.py 4 files changed, 123 insertions(+), 6 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/20306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iafdacc3377a0a946ead9331ba6a63a1fbb11f0eb Gerrit-Change-Number: 20306 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy
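The mechanism described in the commit message above (a named debug action that throws an unchecked exception just before the commit) can be sketched roughly as follows. This is a minimal Python illustration, not the code in DebugUtils.java: the action-string format LABEL:EXCEPTION@ClassName@message, the label ICEBERG_COMMIT, and every function name below are hypothetical and chosen only for the example.

```python
class CommitFailedException(RuntimeError):
    """Hypothetical Python stand-in for Iceberg's CommitFailedException."""


# Assumed registry of exception classes a debug action may raise.
ALLOWED_EXCEPTIONS = {"CommitFailedException": CommitFailedException}


def parse_debug_actions(spec):
    """Parse 'LABEL:EXCEPTION@Class@message|...' into a dict (assumed format)."""
    actions = {}
    for entry in filter(None, spec.split("|")):
        label, _, action = entry.partition(":")
        kind, _, rest = action.partition("@")
        if kind != "EXCEPTION":
            raise ValueError("unsupported debug action: " + entry)
        class_name, _, message = rest.partition("@")
        if class_name not in ALLOWED_EXCEPTIONS:
            raise ValueError("unknown exception class: " + class_name)
        actions[label] = (ALLOWED_EXCEPTIONS[class_name], message)
    return actions


def execute_debug_action(actions, label):
    """Raise the configured exception if an action is registered for this label."""
    if label in actions:
        exc_class, message = actions[label]
        raise exc_class(message)


def commit_transaction(actions, log):
    """Simulated commit: the debug hook fires just before the real commit step."""
    execute_debug_action(actions, "ICEBERG_COMMIT")
    log.append("committed")


def run_insert(spec):
    """Run a simulated insert and report whether it committed or was aborted."""
    log = []
    try:
        commit_transaction(parse_debug_actions(spec), log)
    except CommitFailedException as e:
        log.append("aborted: " + str(e))
    return log
```

With an empty action string the simulated insert commits; with an action registered for the commit label it aborts with the configured message, which is the behaviour a test of the failure path would assert on.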
[Impala-ASF-CR] IMPALA-12385: Enable Periodic metrics by default
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/20377 ) Change subject: IMPALA-12385: Enable Periodic metrics by default .. Patch Set 2: (5 comments) http://gerrit.cloudera.org:8080/#/c/20377/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20377/1//COMMIT_MSG@12 PS1, Line 12: resource_trace_ratio to 1 AFAIK, there is a pretty significant overhead in always sampling these metrics. It seems that parsing /proc/stat, /proc/net/dev, and /proc/diskstats does not come cheap. I'm not sure this should be enabled by default. http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/runtime/query-state.cc@221 PS1, Line 221: AddSamplingTimeSeriesCounter Will this cause interpretation problems if different hosts happen to resize their sampling periods differently? In contrast, ChunkedTimeSeriesCounter does not resize its sampling period, right? http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/periodic-counter-updater.cc File be/src/util/periodic-counter-updater.cc: http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/periodic-counter-updater.cc@30 PS1, Line 30: periodic_counter_update_period_ms, 50 I'm a bit concerned about lowering this by 10x. Can the code in PeriodicCounterUpdater::UpdateLoop() keep up with such a short sampling period under heavily concurrent queries? It looks like PeriodicCounterUpdater is a singleton per impalad. http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/runtime-profile-counters.h@807 PS1, Line 807: typedef StreamingSampler StreamingCounterSampler; If initial_period = 50ms and MAX_SAMPLES = 64, it will take 3200ms (64 x 50ms) before the sampling period doubles to 100ms. Will this hurt the performance of short-latency queries? 
http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/streaming-sampler.h File be/src/util/streaming-sampler.h: http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/streaming-sampler.h@40 PS1, Line 40: int initial_period I'd rather keep this default at 500, but then add a new parameter to AddSamplingTimeSeriesCounter for a customized initial_period. I see this kind of counter being used in other places, such as the following: be/src/runtime/fragment-instance-state.cc: mem_usage_sampled_counter_ = profile()->AddSamplingTimeSeriesCounter("MemoryUsage", be/src/runtime/fragment-instance-state.cc: thread_usage_sampled_counter_ = profile()->AddSamplingTimeSeriesCounter("ThreadUsage", be/src/runtime/krpc-data-stream-recvr.cc: enqueue_profile_->AddSamplingTimeSeriesCounter("DeferredQueueSize", TUnit::UNIT, Their sampling period should probably stay at 500, while sampling counters from host_profile_ start at a lower initial_period. -- To view, visit http://gerrit.cloudera.org:8080/20377 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic8e5cbfd4b324081158574ceb8f4b3a062a69fd1 Gerrit-Change-Number: 20377 Gerrit-PatchSet: 2 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Surya Hebbar Gerrit-Comment-Date: Fri, 18 Aug 2023 16:23:59 + Gerrit-HasComments: Yes
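The arithmetic in the review comment above (initial_period = 50ms and MAX_SAMPLES = 64 giving 3200ms before the period doubles) can be checked with a small simulation. The sketch below is Python, not Impala's C++ StreamingSampler; it assumes the sampler averages adjacent samples and doubles its period once the buffer fills, and the class and function names are hypothetical.

```python
class StreamingSamplerSketch:
    """Fixed-capacity sampler (assumed behaviour): when the buffer fills,
    adjacent samples are averaged pairwise and the period doubles."""

    def __init__(self, initial_period_ms=50, max_samples=64):
        # max_samples is assumed to be even so that pairs divide cleanly.
        self.period_ms = initial_period_ms
        self.max_samples = max_samples
        self.samples = []
        self._sum = 0.0
        self._count = 0
        self._time_ms = 0

    def add_sample(self, value, elapsed_ms):
        """Accumulate a raw reading; store one averaged sample per period."""
        self._sum += value
        self._count += 1
        self._time_ms += elapsed_ms
        if self._time_ms < self.period_ms:
            return
        self.samples.append(self._sum / self._count)
        self._sum, self._count, self._time_ms = 0.0, 0, 0
        if len(self.samples) == self.max_samples:
            # Buffer full: halve the sample count, double the time each covers.
            self.samples = [
                (self.samples[i] + self.samples[i + 1]) / 2
                for i in range(0, len(self.samples), 2)
            ]
            self.period_ms *= 2


def ms_until_first_resize(initial_period_ms, max_samples, tick_ms):
    """Feed constant readings every tick_ms until the period first doubles."""
    sampler = StreamingSamplerSketch(initial_period_ms, max_samples)
    elapsed = 0
    while sampler.period_ms == initial_period_ms:
        sampler.add_sample(1.0, tick_ms)
        elapsed += tick_ms
    return elapsed
```

Under these assumptions a 50ms initial period resizes after 64 x 50ms = 3200ms, while a 500ms initial period resizes after 32000ms, so the lower default only changes the resolution of the first few seconds of a query.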
[Impala-ASF-CR] IMPALA-12385: Enable Periodic metrics by default
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20377 ) Change subject: IMPALA-12385: Enable Periodic metrics by default .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/13779/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20377 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic8e5cbfd4b324081158574ceb8f4b3a062a69fd1 Gerrit-Change-Number: 20377 Gerrit-PatchSet: 2 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Surya Hebbar Gerrit-Comment-Date: Fri, 18 Aug 2023 15:50:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12385: Enable Periodic metrics by default
Hello Riza Suminto, David Rorke, Surya Hebbar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20377 to look at the new patch set (#2). Change subject: IMPALA-12385: Enable Periodic metrics by default .. IMPALA-12385: Enable Periodic metrics by default This patch enables periodic metrics in query profiles by default and changes the metric collectors to be more suitable for mixed workloads. -Change default of resource_trace_ratio to 1 -Use sampling counters which can automatically resize for long queries -Reduce metric interval to 50ms to support short-running queries -Fragment metrics use the same sample interval as periodic metrics Testing: Updated runtime-profile-test and test_observability.py for new defaults Manual inspection of query profile metrics for long-running queries Change-Id: Ic8e5cbfd4b324081158574ceb8f4b3a062a69fd1 --- M be/src/runtime/query-state.cc M be/src/util/periodic-counter-updater.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile-test.cc M be/src/util/runtime-profile.cc M be/src/util/streaming-sampler.h M common/thrift/Query.thrift M tests/query_test/test_observability.py 8 files changed, 35 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/20377/2 -- To view, visit http://gerrit.cloudera.org:8080/20377 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic8e5cbfd4b324081158574ceb8f4b3a062a69fd1 Gerrit-Change-Number: 20377 Gerrit-PatchSet: 2 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Surya Hebbar
[Impala-ASF-CR] IMPALA-12385: Enable Periodic metrics by default
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20377 ) Change subject: IMPALA-12385: Enable Periodic metrics by default .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/13778/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20377 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic8e5cbfd4b324081158574ceb8f4b3a062a69fd1 Gerrit-Change-Number: 20377 Gerrit-PatchSet: 1 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Surya Hebbar Gerrit-Comment-Date: Fri, 18 Aug 2023 15:02:26 + Gerrit-HasComments: No
[Impala-ASF-CR] CDPD-50675: Include crypto functions supported by Hive
pranav.lo...@cloudera.com has abandoned this change. ( http://gerrit.cloudera.org:8080/20364 ) Change subject: CDPD-50675: Include crypto functions supported by Hive .. Abandoned -- To view, visit http://gerrit.cloudera.org:8080/20364 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I78389ff527ddb2c3fc01d4c6c56eca1c07753e5d Gerrit-Change-Number: 20364 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-12385: Enable Periodic metrics by default
Kurt Deschler has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20377 ) Change subject: IMPALA-12385: Enable Periodic metrics by default .. IMPALA-12385: Enable Periodic metrics by default This patch enables periodic metrics in query profiles by default and changes the metric collectors to be more suitable for mixed workloads. -Change default of resource_trace_ratio to 1 -Use sampling counters which can automatically resize for long queries -Reduce metric interval to 50ms to support short-running queries -Fragment metrics use the same sample interval as periodic metrics Testing: Updated runtime-profile-test and test_observability.py for new defaults Manual inspection of query profile metrics for long-running queries Change-Id: Ic8e5cbfd4b324081158574ceb8f4b3a062a69fd1 --- M be/src/runtime/query-state.cc M be/src/util/periodic-counter-updater.cc M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile-test.cc M be/src/util/runtime-profile.cc M be/src/util/streaming-sampler.h M common/thrift/Query.thrift M tests/query_test/test_observability.py 8 files changed, 34 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/20377/1 -- To view, visit http://gerrit.cloudera.org:8080/20377 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ic8e5cbfd4b324081158574ceb8f4b3a062a69fd1 Gerrit-Change-Number: 20377 Gerrit-PatchSet: 1 Gerrit-Owner: Kurt Deschler
[Impala-ASF-CR] IMPALA-12385: Enable Periodic metrics by default
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20377 ) Change subject: IMPALA-12385: Enable Periodic metrics by default .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/runtime-profile.cc@2023 PS1, Line 2023: TimeSeriesCounter* counter = pool_->Add(new SamplingTimeSeriesCounter(name, unit, fn, FLAGS_periodic_counter_update_period_ms)); line too long (130 > 90) http://gerrit.cloudera.org:8080/#/c/20377/1/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/20377/1/tests/query_test/test_observability.py@532 PS1, Line 532: r flake8: W291 trailing whitespace http://gerrit.cloudera.org:8080/#/c/20377/1/tests/query_test/test_observability.py@532 PS1, Line 532: """Tests that the query profile does not contain resource usage metrics line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/20377 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic8e5cbfd4b324081158574ceb8f4b3a062a69fd1 Gerrit-Change-Number: 20377 Gerrit-PatchSet: 1 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Surya Hebbar Gerrit-Comment-Date: Fri, 18 Aug 2023 14:35:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12384: Restore NullLiteral's uncheckedCastTo function signature
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20376 ) Change subject: IMPALA-12384: Restore NullLiteral's uncheckedCastTo function signature .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/13777/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20376 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id9c01129d3cdcaeb222ea910521704ce2305fd2e Gerrit-Change-Number: 20376 Gerrit-PatchSet: 1 Gerrit-Owner: Peter Rozsa Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Peter Rozsa Gerrit-Comment-Date: Fri, 18 Aug 2023 13:38:31 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: IMPALA-12159: Support ORDER BY for collections of variable length types in select list
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20108 ) Change subject: WIP: IMPALA-12159: Support ORDER BY for collections of variable length types in select list .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/13776/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20108 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic15b29393f260b572e11a8dbb9deeb8c02981852 Gerrit-Change-Number: 20108 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 18 Aug 2023 13:33:50 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: IMPALA-12159: Support ORDER BY for collections of variable length types in select list
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20108 ) Change subject: WIP: IMPALA-12159: Support ORDER BY for collections of variable length types in select list .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/13775/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20108 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic15b29393f260b572e11a8dbb9deeb8c02981852 Gerrit-Change-Number: 20108 Gerrit-PatchSet: 2 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 18 Aug 2023 13:21:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12384: Restore NullLiteral's uncheckedCastTo function signature
Peter Rozsa has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20376 Change subject: IMPALA-12384: Restore NullLiteral's uncheckedCastTo function signature .. IMPALA-12384: Restore NullLiteral's uncheckedCastTo function signature This change restores NullLiteral's uncheckedCastTo function's signature to preserve the external compatibility of the method and make it conform with changes regarding IMPALA-10173. Change-Id: Id9c01129d3cdcaeb222ea910521704ce2305fd2e --- M fe/src/main/java/org/apache/impala/analysis/NullLiteral.java 1 file changed, 2 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/20376/1 -- To view, visit http://gerrit.cloudera.org:8080/20376 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Id9c01129d3cdcaeb222ea910521704ce2305fd2e Gerrit-Change-Number: 20376 Gerrit-PatchSet: 1 Gerrit-Owner: Peter Rozsa
[Impala-ASF-CR] WIP: IMPALA-12159: Support ORDER BY for collections of variable length types in select list
Daniel Becker has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/20108 ) Change subject: WIP: IMPALA-12159: Support ORDER BY for collections of variable length types in select list .. WIP: IMPALA-12159: Support ORDER BY for collections of variable length types in select list IMPALA-12019 implemented support for collections of fixed length types in the sorting tuple. This change implements it for collections of variable length types. Note that the limitation that structs that contain any type of collection are not allowed in the sorting tuple is still in place (see IMPALA-12160). Testing: - Renamed the 'simple_arrays_big' table to 'arrays_big' and extended it with collections containing variable length types. This table is mainly used to test that spilling works during sorting. - Renamed test_sort.py::TestArraySort::{test_simple_arrays,test_simple_arrays_with_limit} to {test_array_sort,test_array_sort_with_limit} - Extended the tests run in test_queries.py::TestQueries::{test_sort, test_top_n,test_partitioned_top_n} with collections containing var-len types. 
Change-Id: Ic15b29393f260b572e11a8dbb9deeb8c02981852 --- M be/src/codegen/codegen-anyval-read-write-info.cc M be/src/codegen/codegen-anyval-read-write-info.h M be/src/runtime/collection-value.cc M be/src/runtime/collection-value.h M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/sorter-internal.h M be/src/runtime/sorter.cc M be/src/runtime/tuple-ir.cc M be/src/runtime/tuple.cc M be/src/runtime/tuple.h M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java M fe/src/main/java/org/apache/impala/analysis/SortInfo.java M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java A testdata/ComplexTypesTbl/arrays_big.parq D testdata/ComplexTypesTbl/simple_arrays_big.parq M testdata/data/README M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test M testdata/workloads/functional-query/queries/QueryTest/nested-map-in-select-list.test M testdata/workloads/functional-query/queries/QueryTest/partitioned-top-n-complex.test M testdata/workloads/functional-query/queries/QueryTest/sort-complex.test M testdata/workloads/functional-query/queries/QueryTest/top-n-complex.test M tests/query_test/test_queries.py M tests/query_test/test_sort.py 31 files changed, 1,313 insertions(+), 578 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/20108/3 -- To view, visit http://gerrit.cloudera.org:8080/20108 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic15b29393f260b572e11a8dbb9deeb8c02981852 Gerrit-Change-Number: 20108 
Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] WIP: IMPALA-12159: Support ORDER BY for collections of variable length types in select list
Daniel Becker has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/20108 ) Change subject: WIP: IMPALA-12159: Support ORDER BY for collections of variable length types in select list .. WIP: IMPALA-12159: Support ORDER BY for collections of variable length types in select list IMPALA-12019 implemented support for collections of fixed length types in the sorting tuple. This change implements it for collections of variable length types. Note that the limitation that structs that contain any type of collection are not allowed in the sorting tuple is still in place (see IMPALA-12160). Testing: - Renamed the 'simple_arrays_big' table to 'arrays_big' and extended it with collections containing variable length types. This table is mainly used to test that spilling works during sorting. - Extended the tests run in test_queries.py::TestQueries::{test_sort, test_top_n,test_partitioned_top_n} with collections containing var-len types. Change-Id: Ic15b29393f260b572e11a8dbb9deeb8c02981852 --- M be/src/codegen/codegen-anyval-read-write-info.cc M be/src/codegen/codegen-anyval-read-write-info.h M be/src/runtime/collection-value.cc M be/src/runtime/collection-value.h M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/sorter-internal.h M be/src/runtime/sorter.cc M be/src/runtime/tuple-ir.cc M be/src/runtime/tuple.cc M be/src/runtime/tuple.h M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java M fe/src/main/java/org/apache/impala/analysis/SortInfo.java M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java A testdata/ComplexTypesTbl/arrays_big.parq D testdata/ComplexTypesTbl/simple_arrays_big.parq M testdata/data/README M 
testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test M testdata/workloads/functional-query/queries/QueryTest/nested-map-in-select-list.test M testdata/workloads/functional-query/queries/QueryTest/partitioned-top-n-complex.test M testdata/workloads/functional-query/queries/QueryTest/sort-complex.test M testdata/workloads/functional-query/queries/QueryTest/top-n-complex.test M tests/query_test/test_queries.py M tests/query_test/test_sort.py 31 files changed, 1,312 insertions(+), 577 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/20108/2 -- To view, visit http://gerrit.cloudera.org:8080/20108 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic15b29393f260b572e11a8dbb9deeb8c02981852 Gerrit-Change-Number: 20108 Gerrit-PatchSet: 2 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-12228: Simulate the failure of an iceberg transaction.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20306 ) Change subject: IMPALA-12228: Simulate the failure of an iceberg transaction. .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9606/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/20306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafdacc3377a0a946ead9331ba6a63a1fbb11f0eb Gerrit-Change-Number: 20306 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 Aug 2023 12:20:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12228: Simulate the failure of an iceberg transaction.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20306 ) Change subject: IMPALA-12228: Simulate the failure of an iceberg transaction. .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/20306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafdacc3377a0a946ead9331ba6a63a1fbb11f0eb Gerrit-Change-Number: 20306 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 Aug 2023 12:20:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12228: Simulate the failure of an iceberg transaction.
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/20306 ) Change subject: IMPALA-12228: Simulate the failure of an iceberg transaction. .. Patch Set 1: Code-Review+2 (1 comment) LGTM! http://gerrit.cloudera.org:8080/#/c/20306/1/fe/src/main/java/org/apache/impala/util/DebugUtils.java File fe/src/main/java/org/apache/impala/util/DebugUtils.java: http://gerrit.cloudera.org:8080/#/c/20306/1/fe/src/main/java/org/apache/impala/util/DebugUtils.java@166 PS1, Line 166: : > That was my first idea for implementation. Sure, I'm completely fine with this. Future implementers will extend this solution if they need to. -- To view, visit http://gerrit.cloudera.org:8080/20306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafdacc3377a0a946ead9331ba6a63a1fbb11f0eb Gerrit-Change-Number: 20306 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 Aug 2023 12:19:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12342: Erasure coding build fails on loading iceberg_lineitem_multiblock
Noemi Pap-Takacs has posted comments on this change. ( http://gerrit.cloudera.org:8080/20359 ) Change subject: IMPALA-12342: Erasure coding build fails on loading iceberg_lineitem_multiblock .. Patch Set 1: Code-Review+1 LGTM -- To view, visit http://gerrit.cloudera.org:8080/20359 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iad15a335407c12578eb822bb1cb4450647502e50 Gerrit-Change-Number: 20359 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Comment-Date: Fri, 18 Aug 2023 10:14:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11898: Add query options in the profile even if the query failed in planning
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/19517 ) Change subject: IMPALA-11898: Add query options in the profile even if the query failed in planning .. Patch Set 6: In the failed tests you probably have to modify the profiles to include the added fields. -- To view, visit http://gerrit.cloudera.org:8080/19517 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e9ce62008dd5b1671b09eda5365cbb0940ebe64 Gerrit-Change-Number: 19517 Gerrit-PatchSet: 6 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Baike Xia Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 18 Aug 2023 10:10:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11898: Add query options in the profile even if the query failed in planning
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/19517 ) Change subject: IMPALA-11898: Add query options in the profile even if the query failed in planning .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/19517/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19517/6//COMMIT_MSG@20 PS6, Line 20: Add a "Testing" section and describe briefly how this change is tested. -- To view, visit http://gerrit.cloudera.org:8080/19517 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e9ce62008dd5b1671b09eda5365cbb0940ebe64 Gerrit-Change-Number: 19517 Gerrit-PatchSet: 6 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Baike Xia Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 18 Aug 2023 10:05:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10798: Prototype a simple JSON File reader
Zihao Ye has posted comments on this change. ( http://gerrit.cloudera.org:8080/19699 ) Change subject: IMPALA-10798: Prototype a simple JSON File reader .. Patch Set 22: (1 comment) Thank you once again for the code review. I have been a little busy lately, but I managed to find some time to complete the move of json-parser.h and separate the implementation into json-parser.cc. As for the remaining task of adding new test cases, I will try to find time to finish it later. http://gerrit.cloudera.org:8080/#/c/19699/22/be/src/exec/json-parser.h File be/src/exec/json-parser.h: http://gerrit.cloudera.org:8080/#/c/19699/22/be/src/exec/json-parser.h@37 PS22, Line 37: > It seems to be tightly coupled with hdfs-json-scanner so I think we > can put it in /json unless it can be reused in other places. > > Moving the implementation codes to json-parser.cc helps to speedup > recompilation when you have code changes. Also helps to make this > header file shorter and easier for going through. You can keep some > short methods in the header file and just move large methods like > Parse(). Done. It compiles successfully in my own environment, but I'm not sure why it keeps failing to build here. -- To view, visit http://gerrit.cloudera.org:8080/19699 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I31309cb8f2d04722a0508b3f9b8f1532ad49a569 Gerrit-Change-Number: 19699 Gerrit-PatchSet: 22 Gerrit-Owner: Zihao Ye Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Zihao Ye Gerrit-Comment-Date: Fri, 18 Aug 2023 09:51:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10798: Prototype a simple JSON File reader
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19699 ) Change subject: IMPALA-10798: Prototype a simple JSON File reader .. Patch Set 26: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/13774/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/19699 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I31309cb8f2d04722a0508b3f9b8f1532ad49a569 Gerrit-Change-Number: 19699 Gerrit-PatchSet: 26 Gerrit-Owner: Zihao Ye Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Zihao Ye Gerrit-Comment-Date: Fri, 18 Aug 2023 09:40:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12228: Simulate the failure of an iceberg transaction.
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/20306 ) Change subject: IMPALA-12228: Simulate the failure of an iceberg transaction. .. Patch Set 1: Code-Review+1 Thanks Andrew! LGTM! I will give a +1, so Zoltan can give the +2. -- To view, visit http://gerrit.cloudera.org:8080/20306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafdacc3377a0a946ead9331ba6a63a1fbb11f0eb Gerrit-Change-Number: 20306 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 Aug 2023 09:24:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10798: Prototype a simple JSON File reader
Hello Quanlong Huang, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19699 to look at the new patch set (#26). Change subject: IMPALA-10798: Prototype a simple JSON File reader .. IMPALA-10798: Prototype a simple JSON File reader Prototype of HdfsJsonScanner implemented based on rapidjson, which supports scanning data from split JSON files. The scanning of JSON data is done mainly by two parts working together. The first part is the JsonParser, responsible for parsing the JSON object, which is implemented based on the SAX-style API of rapidjson. It reads data from the char stream, parses it, and calls the corresponding callback function when it encounters each JSON element. See the comments of the JsonParser class for more details. The other part is the HdfsJsonScanner, which inherits from HdfsScanner and provides callback functions for the JsonParser. The callback functions are responsible for providing data buffers to the Parser and for converting and materializing the Parser's parsing results into RowBatch. It should be noted that the parser returns numeric values as strings to the scanner. The scanner uses the TextConverter class to convert the strings to the desired types, similar to how the HdfsTextScanner works. This is an advantage compared to using the numeric values provided by rapidjson directly, as it eliminates concerns about inconsistencies in converting decimals (e.g. losing precision). Limitations - Multiline JSON objects are not fully supported yet. It is ok when each file has only one scan range. However, when a file has multiple scan ranges, there is a small probability of incomplete scanning of multiline JSON objects that span ScanRange boundaries (in such cases, parsing errors may be reported). For more details, please refer to the comments in the 'multiline_json.test'. - Compressed JSON files are not supported yet. - Complex types are not supported yet. 

Tests
- Most of the existing end-to-end tests can run on JSON format.
- Add TestQueriesJsonTables in test_queries.py for testing multiline,
  malformed, and overflow in JSON.

Change-Id: I31309cb8f2d04722a0508b3f9b8f1532ad49a569
---
M be/CMakeLists.txt
M be/src/exec/CMakeLists.txt
M be/src/exec/hdfs-scan-node-base.cc
A be/src/exec/json/CMakeLists.txt
A be/src/exec/json/hdfs-json-scanner.cc
A be/src/exec/json/hdfs-json-scanner.h
A be/src/exec/json/json-parser-test.cc
A be/src/exec/json/json-parser.cc
A be/src/exec/json/json-parser.h
M be/src/exec/text-converter.inline.h
M bin/rat_exclude_files.txt
M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M testdata/bin/generate-schema-statements.py
A testdata/data/json_test/complex.json
A testdata/data/json_test/malformed.json
A testdata/data/json_test/multiline.json
A testdata/data/json_test/overflow.json
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/functional-query_core.csv
M testdata/workloads/functional-query/functional-query_dimensions.csv
M testdata/workloads/functional-query/functional-query_exhaustive.csv
M testdata/workloads/functional-query/functional-query_pairwise.csv
A testdata/workloads/functional-query/queries/QueryTest/complex_json.test
A testdata/workloads/functional-query/queries/QueryTest/malformed_json.test
A testdata/workloads/functional-query/queries/QueryTest/multiline_json.test
A testdata/workloads/functional-query/queries/QueryTest/overflow_json.test
M testdata/workloads/tpcds/tpcds_core.csv
M testdata/workloads/tpcds/tpcds_exhaustive.csv
M testdata/workloads/tpcds/tpcds_pairwise.csv
M testdata/workloads/tpch/tpch_core.csv
M testdata/workloads/tpch/tpch_dimensions.csv
M testdata/workloads/tpch/tpch_exhaustive.csv
M testdata/workloads/tpch/tpch_pairwise.csv
M tests/common/test_dimensions.py
M tests/metadata/test_hms_integration.py
M tests/query_test/test_decimal_queries.py
M tests/query_test/test_queries.py
M tests/query_test/test_scanners.py
M tests/query_test/test_scanners_fuzz.py
M tests/query_test/test_tpch_queries.py
42 files changed, 1,498 insertions(+), 44 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/19699/26
--
To view, visit http://gerrit.cloudera.org:8080/19699
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I31309cb8f2d04722a0508b3f9b8f1532ad49a569
Gerrit-Change-Number: 19699
Gerrit-PatchSet: 26
Gerrit-Owner: Zihao Ye
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Quanlong Huang
Gerrit-Reviewer: Zihao Ye
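
The commit message for IMPALA-10798 argues that handing numeric values to the scanner as strings avoids precision loss when materializing decimals. The following is a minimal sketch of that argument only. Both helper names are hypothetical, and the code is not Impala's TextConverter or rapidjson's number handling. It compares parsing the original text directly against first going through a `double`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of Impala): parse a plain decimal literal
// such as "-12.5" into an unscaled integer with `scale` fractional digits,
// e.g. ("12.5", 2) -> 1250. Fractional digits beyond `scale` are dropped.
// Overflow and malformed input are deliberately not handled in this sketch.
int64_t ParseDecimalFromString(const std::string& text, int scale) {
  size_t i = 0;
  bool negative = false;
  if (i < text.size() && (text[i] == '-' || text[i] == '+')) {
    negative = (text[i] == '-');
    ++i;
  }
  int64_t unscaled = 0;
  int frac_digits = 0;
  bool seen_point = false;
  for (; i < text.size(); ++i) {
    const char c = text[i];
    if (c == '.') {
      seen_point = true;
      continue;
    }
    if (c < '0' || c > '9') break;  // stop at the first unexpected character
    if (seen_point) {
      if (frac_digits == scale) continue;  // ignore digits beyond the scale
      ++frac_digits;
    }
    unscaled = unscaled * 10 + (c - '0');
  }
  // Pad with zeros if the literal has fewer fractional digits than `scale`.
  for (; frac_digits < scale; ++frac_digits) unscaled *= 10;
  return negative ? -unscaled : unscaled;
}

// Hypothetical lossy route: text -> double -> scaled integer. A double has
// only 53 bits of mantissa, so long literals are rounded before scaling.
int64_t ParseDecimalViaDouble(const std::string& text, int scale) {
  const double d = std::strtod(text.c_str(), nullptr);
  return static_cast<int64_t>(std::llround(d * std::pow(10.0, scale)));
}
```

For a literal such as "12345678901234567.89" with scale 2, the string route yields the exact unscaled value 1234567890123456789, while the `double` route cannot, because the literal already exceeds what a `double` represents exactly.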
[Impala-ASF-CR] IMPALA-10798: Prototype a simple JSON File reader
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19699 )

Change subject: IMPALA-10798: Prototype a simple JSON File reader
..

Patch Set 24:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/13773/ : Initial code review checks failed. See linked job for details on the failure.

--
To view, visit http://gerrit.cloudera.org:8080/19699
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I31309cb8f2d04722a0508b3f9b8f1532ad49a569
Gerrit-Change-Number: 19699
Gerrit-PatchSet: 24
Gerrit-Owner: Zihao Ye
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Quanlong Huang
Gerrit-Reviewer: Zihao Ye
Gerrit-Comment-Date: Fri, 18 Aug 2023 08:39:56 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10798: Prototype a simple JSON File reader
Hello Quanlong Huang, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19699

to look at the new patch set (#24).

Change subject: IMPALA-10798: Prototype a simple JSON File reader
..

IMPALA-10798: Prototype a simple JSON File reader

Prototype of HdfsJsonScanner implemented based on rapidjson, which
supports scanning data from splittable JSON files.

The scanning of JSON data is mainly completed by two parts working
together. The first part is the JsonParser, which is responsible for
parsing the JSON object and is implemented on top of the SAX-style API
of rapidjson. It reads data from the char stream, parses it, and calls
the corresponding callback function when it encounters each JSON
element. See the comments of the JsonParser class for more details.

The other part is the HdfsJsonScanner, which inherits from HdfsScanner
and provides callback functions for the JsonParser. The callback
functions are responsible for providing data buffers to the parser and
for converting and materializing the parser's results into a RowBatch.

Note that the parser returns numeric values as strings to the scanner.
The scanner uses the TextConverter class to convert the strings to the
desired types, similar to how the HdfsTextScanner works. This is an
advantage over using the numeric values provided by rapidjson directly,
as it eliminates concerns about inconsistencies in converting decimals
(e.g. losing precision).

Limitations
- Multiline JSON objects are not fully supported yet. This works when
  each file has only one scan range. However, when a file has multiple
  scan ranges, there is a small probability of incomplete scanning of
  multiline JSON objects that span ScanRange boundaries (in such cases,
  parsing errors may be reported). For more details, please refer to
  the comments in the 'multiline_json.test'.
- Compressed JSON files are not supported yet.
- Complex types are not supported yet.

Tests
- Most of the existing end-to-end tests can run on JSON format.
- Add TestQueriesJsonTables in test_queries.py for testing multiline,
  malformed, and overflow in JSON.

Change-Id: I31309cb8f2d04722a0508b3f9b8f1532ad49a569
---
M be/CMakeLists.txt
M be/src/exec/CMakeLists.txt
M be/src/exec/hdfs-scan-node-base.cc
A be/src/exec/json/CMakeLists.txt
A be/src/exec/json/hdfs-json-scanner.cc
A be/src/exec/json/hdfs-json-scanner.h
A be/src/exec/json/json-parser-test.cc
A be/src/exec/json/json-parser.cc
A be/src/exec/json/json-parser.h
M be/src/exec/text-converter.inline.h
M bin/rat_exclude_files.txt
M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M testdata/bin/generate-schema-statements.py
A testdata/data/json_test/complex.json
A testdata/data/json_test/malformed.json
A testdata/data/json_test/multiline.json
A testdata/data/json_test/overflow.json
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/functional-query_core.csv
M testdata/workloads/functional-query/functional-query_dimensions.csv
M testdata/workloads/functional-query/functional-query_exhaustive.csv
M testdata/workloads/functional-query/functional-query_pairwise.csv
A testdata/workloads/functional-query/queries/QueryTest/complex_json.test
A testdata/workloads/functional-query/queries/QueryTest/malformed_json.test
A testdata/workloads/functional-query/queries/QueryTest/multiline_json.test
A testdata/workloads/functional-query/queries/QueryTest/overflow_json.test
M testdata/workloads/tpcds/tpcds_core.csv
M testdata/workloads/tpcds/tpcds_exhaustive.csv
M testdata/workloads/tpcds/tpcds_pairwise.csv
M testdata/workloads/tpch/tpch_core.csv
M testdata/workloads/tpch/tpch_dimensions.csv
M testdata/workloads/tpch/tpch_exhaustive.csv
M testdata/workloads/tpch/tpch_pairwise.csv
M tests/common/test_dimensions.py
M tests/metadata/test_hms_integration.py
M tests/query_test/test_decimal_queries.py
M tests/query_test/test_queries.py
M tests/query_test/test_scanners.py
M tests/query_test/test_scanners_fuzz.py
M tests/query_test/test_tpch_queries.py
42 files changed, 1,498 insertions(+), 44 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/19699/24
--
To view, visit http://gerrit.cloudera.org:8080/19699
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I31309cb8f2d04722a0508b3f9b8f1532ad49a569
Gerrit-Change-Number: 19699
Gerrit-PatchSet: 24
Gerrit-Owner: Zihao Ye
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Quanlong Huang
Gerrit-Reviewer: Zihao Ye
[Impala-ASF-CR] IMPALA-11957: Implement Regression functions: regr_slope(), regr_intercept() and regr_r2()
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/19569 )

Change subject: IMPALA-11957: Implement Regression functions: regr_slope(), regr_intercept() and regr_r2()
..

Patch Set 18:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/19569/16//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19569/16//COMMIT_MSG@20
PS16, Line 20:
> I think its needed
Why is it needed? Why should the line be indented one space?

http://gerrit.cloudera.org:8080/#/c/19569/18/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/19569/18/be/src/exprs/aggregate-functions-ir.cc@316
PS18, Line 316: dst->ptr = ctx->Allocate(dst->len);
You should initialise 'dst' with AllocBuffer(), otherwise 'dst' will not be set to null on failure.

http://gerrit.cloudera.org:8080/#/c/19569/18/be/src/exprs/aggregate-functions-ir.cc@510
PS18, Line 510: dst->ptr = ctx->Allocate(dst->len);
See L316.

http://gerrit.cloudera.org:8080/#/c/19569/18/be/src/exprs/aggregate-functions-ir.cc@516
PS18, Line 516:
Nit: still empty line.

http://gerrit.cloudera.org:8080/#/c/19569/18/be/src/exprs/aggregate-functions-ir.cc@748
PS18, Line 748: dst->ptr = ctx->Allocate(dst->len);
See L316.

--
To view, visit http://gerrit.cloudera.org:8080/19569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iab6bd84ae3e0c02ec924c30183308123b951caa3
Gerrit-Change-Number: 19569
Gerrit-PatchSet: 18
Gerrit-Owner: Anonymous Coward
Gerrit-Reviewer: Anonymous Coward
Gerrit-Reviewer: Daniel Becker
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Kurt Deschler
Gerrit-Reviewer: Quanlong Huang
Gerrit-Comment-Date: Fri, 18 Aug 2023 08:11:45 +
Gerrit-HasComments: Yes
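
The review comment on line 316 contrasts assigning the result of `ctx->Allocate()` directly with going through a helper that marks the destination as null when allocation fails. The sketch below illustrates that difference with stand-in types only. `MockStringVal`, `MockContext`, `InitRaw` and `InitWithHelper` are hypothetical names invented for this illustration, and the real `AllocBuffer()` in aggregate-functions-ir.cc may differ in its details.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for a UDF string value; not Impala's StringVal.
struct MockStringVal {
  bool is_null = false;
  int len = 0;
  uint8_t* ptr = nullptr;
  static MockStringVal Null() {
    MockStringVal v;
    v.is_null = true;
    return v;
  }
};

// Hypothetical stand-in for a function context with a fixed memory budget.
// Allocate() returns nullptr once the budget would be exceeded.
struct MockContext {
  size_t budget;
  explicit MockContext(size_t b) : budget(b) {}
  uint8_t* Allocate(size_t n) {
    if (n > budget) return nullptr;
    budget -= n;
    return static_cast<uint8_t*>(std::malloc(n));
  }
  void Free(uint8_t* p) { std::free(p); }
};

// Pattern flagged in the review: on failure 'dst' keeps a non-zero length
// and is not marked null, so later code may dereference a null pointer.
void InitRaw(MockContext* ctx, MockStringVal* dst, int len) {
  dst->is_null = false;
  dst->len = len;
  dst->ptr = ctx->Allocate(dst->len);
}

// Helper-style pattern: a failed allocation turns 'dst' into a null value,
// which callers can detect with a single is_null check.
void InitWithHelper(MockContext* ctx, MockStringVal* dst, int len) {
  uint8_t* ptr = ctx->Allocate(len);
  if (ptr == nullptr && len != 0) {
    *dst = MockStringVal::Null();
  } else {
    dst->is_null = false;
    dst->len = len;
    dst->ptr = ptr;
  }
}
```

With a context whose budget is smaller than the request, `InitRaw` leaves `dst` claiming 64 bytes behind a null pointer, while `InitWithHelper` leaves a null value with length 0 that callers can test before touching the buffer.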