[Impala-ASF-CR] IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17325 ) Change subject: IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17325 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If69ae90c6333ff245c2b951af5689e3071f85cb2 Gerrit-Change-Number: 17325 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 20 Apr 2021 03:06:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator .. Patch Set 23: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 23 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 20 Apr 2021 00:44:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17252 )

Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator
..

IMPALA-10647 Improve always-true min/max filter handling in coordinator

The change improves how a coordinator behaves when a newly arrived min/max filter is always true. A new member 'always_true_filter_received_' is introduced to record that fact. Similarly, the new member 'always_false_flipped_to_false_' is added to indicate that the always-false flag has flipped from 'true' to 'false'. These two members only influence how the min and max columns in "Filter routing table" and "Final filter table" in the profile are displayed, as follows.

1. 'PartialUpdates' - The min and the max are partially updated;
2. 'AlwaysTrue' - One received filter is AlwaysTrue;
3. 'AlwaysFalse' - No filter is received or all received filters are empty;
4. 'Real values' - The final accumulated min/max from all received filters.

A second change records, in the scan node, the arrival time of min/max filters (as a timestamp since the system was rebooted, obtained by calling MonotonicMillis()). A similar timestamp is recorded for HDFS Parquet scanners when a row group is processed. By comparing these two timestamps, one can easily diagnose issues related to late arrival of min/max filters.

This change also fixes a flaw in which rows were unexpectedly filtered out because the always_true_ flag in a min/max filter, when set, was ignored in the eval code path in RuntimeFilter::Eval().

Testing:
1. Added three new tests in overlap_min_max_filters.test to verify that the min/max are displayed correctly when the min/max filter in the hash join builder is set to always true, always false, or a pair of meaningful min and max values.
2. Ran unit tests;
3. Ran runtime-filter-test;
4. Ran core tests successfully.
Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Reviewed-on: http://gerrit.cloudera.org:8080/17252 Reviewed-by: Joe McDonnell Tested-by: Impala Public Jenkins --- M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/scan-node.cc M be/src/runtime/coordinator-filter-state.h M be/src/runtime/coordinator.cc M be/src/runtime/runtime-filter-ir.cc M be/src/util/min-max-filter.cc M be/src/util/min-max-filter.h M testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters.test 9 files changed, 224 insertions(+), 30 deletions(-) Approvals: Joe McDonnell: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 24 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou
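The flaw mentioned in the commit message above, where a set always_true_ flag was ignored during evaluation, can be illustrated with a small Python sketch. This is not Impala's C++ code: the class `MinMaxFilterSketch`, its fields, and both method names are hypothetical stand-ins. Only the idea that an always-true filter must accept every value is taken from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MinMaxFilterSketch:
    """Hypothetical stand-in for a min/max runtime filter (not Impala's class)."""
    min_val: Optional[int] = None
    max_val: Optional[int] = None
    always_true: bool = False

    def eval_ignoring_flag(self, value: int) -> bool:
        # Flawed variant: only the range is consulted, so a filter that is
        # marked always-true can still reject rows.
        if self.min_val is None or self.max_val is None:
            return False
        return self.min_val <= value <= self.max_val

    def eval(self, value: int) -> bool:
        # Corrected variant: an always-true filter accepts every value
        # before the range is even looked at.
        if self.always_true:
            return True
        return self.eval_ignoring_flag(value)


# A filter flagged always-true whose stored range does not cover 99.
f = MinMaxFilterSketch(min_val=10, max_val=20, always_true=True)
rejected_by_flawed_eval = not f.eval_ignoring_flag(99)
accepted_by_fixed_eval = f.eval(99)
```

In this sketch the value 99 is dropped by the flawed path and kept by the corrected one, which mirrors the "rows unexpectedly filtered out" symptom described above.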
[Impala-ASF-CR] [WIP] IMPALA-10650: Bailout min/max filters in hash join builder early
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17295 ) Change subject: [WIP] IMPALA-10650: Bailout min/max filters in hash join builder early .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8602/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17295 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183 Gerrit-Change-Number: 17295 Gerrit-PatchSet: 7 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 19 Apr 2021 23:35:23 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-10650: Bailout min/max filters in hash join builder early
Qifan Chen has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/17295 )

Change subject: [WIP] IMPALA-10650: Bailout min/max filters in hash join builder early
..

[WIP] IMPALA-10650: Bailout min/max filters in hash join builder early

This change set addresses the weakness in populating min/max filters in the hash join builder by periodically measuring the usefulness of each such filter and setting the 'always_true_' flag to true for filters that are not useful. For each insert into a filter whose always_true_ flag is true, the steps from evaluating the value from the row to verifying that the value is in the min/max range are skipped entirely. The above optimization is also covered by LLVM codegen.

Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/filter-context.cc
M be/src/exec/filter-context.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/runtime/runtime-filter-ir.cc
M be/src/util/min-max-filter-ir.cc
M be/src/util/min-max-filter.cc
M be/src/util/min-max-filter.h
9 files changed, 178 insertions(+), 38 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/17295/7
--
To view, visit http://gerrit.cloudera.org:8080/17295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
Gerrit-Change-Number: 17295
Gerrit-PatchSet: 7
Gerrit-Owner: Qifan Chen
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Qifan Chen
Gerrit-Reviewer: Wenzhe Zhou
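The early-bailout idea in this patch set can be sketched in Python as follows. Everything named here is hypothetical: the class `BailoutMinMaxFilter`, the `check_every` interval, and especially the usefulness heuristic (the fraction of a known value domain that the filter's range already covers) are assumptions made for illustration. The patch description does not say how usefulness is measured.

```python
class BailoutMinMaxFilter:
    """Hypothetical sketch of a min/max filter that gives up when not useful."""

    def __init__(self, domain_min, domain_max, check_every=1024, max_coverage=0.9):
        if domain_max <= domain_min:
            raise ValueError("domain_max must be greater than domain_min")
        self.domain_min = domain_min
        self.domain_max = domain_max
        self.check_every = check_every    # assumed measurement interval
        self.max_coverage = max_coverage  # assumed usefulness threshold
        self.min_val = None
        self.max_val = None
        self.inserts = 0
        self.always_true = False

    def insert(self, value):
        # Once the filter is always-true, each insert costs one Boolean test.
        if self.always_true:
            return
        if self.min_val is None or value < self.min_val:
            self.min_val = value
        if self.max_val is None or value > self.max_val:
            self.max_val = value
        self.inserts += 1
        if self.inserts % self.check_every == 0:
            self._measure_usefulness()

    def _measure_usefulness(self):
        # Assumed heuristic: a range covering most of the domain filters
        # almost nothing, so the filter is flipped to always-true.
        covered = (self.max_val - self.min_val) / (self.domain_max - self.domain_min)
        if covered >= self.max_coverage:
            self.always_true = True


wide = BailoutMinMaxFilter(0, 1000, check_every=4)
for v in (1, 999, 500, 2):
    wide.insert(v)
wide.insert(5000)  # skipped: the filter already bailed out

narrow = BailoutMinMaxFilter(0, 1000, check_every=4)
for v in (10, 11, 12, 13):
    narrow.insert(v)
```

Here `wide` bails out after the fourth insert and ignores the later value, while `narrow` keeps tracking its range because it still covers only a small part of the assumed domain.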
[Impala-ASF-CR] [WIP] IMPALA-10650: Bailout min/max filters in hash join builder early
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17295 ) Change subject: [WIP] IMPALA-10650: Bailout min/max filters in hash join builder early .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8601/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17295 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183 Gerrit-Change-Number: 17295 Gerrit-PatchSet: 6 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 19 Apr 2021 23:05:34 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-10650: Bailout min/max filters in hash join builder early
Qifan Chen has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/17295 )

Change subject: [WIP] IMPALA-10650: Bailout min/max filters in hash join builder early
..

[WIP] IMPALA-10650: Bailout min/max filters in hash join builder early

This change set addresses the weakness in populating min/max filters in the hash join builder by periodically measuring the usefulness of each such filter and setting the AlwaysTrue flag to true. For each insert into a filter that is not useful, this reduces the amount of work from at least two comparisons and two conditional assignments to one Boolean test.

Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
---
M be/src/exec/filter-context.cc
M be/src/exec/filter-context.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/runtime/runtime-filter-ir.cc
M be/src/util/min-max-filter-ir.cc
M be/src/util/min-max-filter.cc
M be/src/util/min-max-filter.h
8 files changed, 63 insertions(+), 13 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/17295/6
--
To view, visit http://gerrit.cloudera.org:8080/17295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
Gerrit-Change-Number: 17295
Gerrit-PatchSet: 6
Gerrit-Owner: Qifan Chen
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Qifan Chen
Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-10645: Log catalogd HMS API metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17284 ) Change subject: IMPALA-10645: Log catalogd HMS API metrics .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8600/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17284 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id41afe89bbe3395c158919bddd09f302c6752287 Gerrit-Change-Number: 17284 Gerrit-PatchSet: 8 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 19 Apr 2021 21:49:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17325 ) Change subject: IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8599/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17325 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If69ae90c6333ff245c2b951af5689e3071f85cb2 Gerrit-Change-Number: 17325 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 19 Apr 2021 21:42:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17325 ) Change subject: IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8598/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17325 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If69ae90c6333ff245c2b951af5689e3071f85cb2 Gerrit-Change-Number: 17325 Gerrit-PatchSet: 2 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 19 Apr 2021 21:41:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10645: Log catalogd HMS API metrics
Vihang Karajgaonkar has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/17284 ) Change subject: IMPALA-10645: Log catalogd HMS API metrics .. IMPALA-10645: Log catalogd HMS API metrics Expose rpc duration, cache hit ratio, etc for Catalogd HMS APIs. The metrics currently are only logged at debug level when the catalogd starts a HMS endpoint. A followup will be done separately to expose them to the debug UI. This patch was originally contributed by Kishen Das. Testing: 1. Deployed the catalogd's metastore server and made sure that the metrics are logged in the catalogd.INFO logs. Change-Id: Id41afe89bbe3395c158919bddd09f302c6752287 --- M common/thrift/JniCatalog.thrift M common/thrift/metrics.json M fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java M fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/catalog/monitor/CatalogMonitor.java 11 files changed, 433 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/17284/8 -- To view, visit http://gerrit.cloudera.org:8080/17284 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id41afe89bbe3395c158919bddd09f302c6752287 Gerrit-Change-Number: 17284 Gerrit-PatchSet: 8 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: 
Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-10645: Log catalogd HMS API metrics
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17284 )

Change subject: IMPALA-10645: Log catalogd HMS API metrics
..

Patch Set 7:

(7 comments)

> (7 comments)
>
> Thanks for adjusting the code style! It seems there are some test
> failures that we should address. Left some minor comments as well.

The test failures look similar to the ones we saw in https://gerrit.cloudera.org/#/c/17244/ and were fixed. So I am hopeful that once that patch is un-reverted and this patch is rebased on top of it, it should go through fine.

http://gerrit.cloudera.org:8080/#/c/17284/7/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/17284/7/common/thrift/metrics.json@2918
PS7, Line 2918: {
:   "description": "Catalogd HMS cache file metadata cache hit ratio.",
:   "contexts": [
:     "CATALOGSERVER"
:   ],
:   "label": "Catalogd HMS cache file metadata cache hit ratio",
:   "units": "NONE",
:   "kind" : "GAUGE",
:   "key" : "catalogd.hms.cache.status.file.metadata.cache.hit.ratio"
: }
> I think this is stale now and should be removed.
Yeah, thanks for pointing it out. Done

http://gerrit.cloudera.org:8080/#/c/17284/7/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java:

http://gerrit.cloudera.org:8080/#/c/17284/7/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@400
PS7, Line 400:
> nit: redundant blank line
Done

http://gerrit.cloudera.org:8080/#/c/17284/7/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@400
PS7, Line 400:
> nit: redundant blank line
Done

http://gerrit.cloudera.org:8080/#/c/17284/7/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@421
PS7, Line 421: HmsApiNameEnum.contains(apiName)
> It seems to be always true. Why do we only check this for cache hit ratio b
RPC duration metrics are collected for all the APIs, but the cache hit ratio is only collected for the ones in HmsApiNameEnum. Added a comment to make it more readable.

http://gerrit.cloudera.org:8080/#/c/17284/7/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@422
PS7, Line 422: double specificApiCacheHitRatio =
: ((double) CatalogMonitor.INSTANCE.getCatalogdHmsCacheMetrics()
: .getCounter(String.format(CATALOGD_CACHE_API_HIT_METRIC,
: apiName)).getCount()) /
: (double) (CatalogMonitor.INSTANCE
: .getCatalogdHmsCacheMetrics()
: .getCounter(String
: .format(CATALOGD_CACHE_API_HIT_METRIC,
: apiName)).getCount() +
: CatalogMonitor.INSTANCE.
: getCatalogdHmsCacheMetrics()
: .getCounter(String.format(
: CATALOGD_CACHE_API_MISS_METRIC,
: apiName))
: .getCount());
> nit: Could you help to refactor these codes and those at line 362? E.g. ext
Good point. Done

http://gerrit.cloudera.org:8080/#/c/17284/7/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@438
PS7, Line 438:
> nit: redundant blank line
Done

http://gerrit.cloudera.org:8080/#/c/17284/7/fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
File fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java:

http://gerrit.cloudera.org:8080/#/c/17284/7/fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java@29
PS7, Line 29: private String apiName;
> nit: this can be 'final'
Done

--
To view, visit http://gerrit.cloudera.org:8080/17284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id41afe89bbe3395c158919bddd09f302c6752287
Gerrit-Change-Number: 17284
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar
Gerrit-Reviewer: Aman Sinha
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Quanlong Huang
Gerrit-Reviewer: Vihang Karajgaonkar
Gerrit-Comment-Date: Mon, 19 Apr 2021 21:28:09 +
Gerrit-HasComments: Yes
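The refactoring requested for the hit-ratio expression at line 422 amounts to computing hits / (hits + misses) in one helper. A language-neutral sketch in Python is shown below; it is not the Java that was actually committed. The key templates `HIT_KEY_TEMPLATE` and `MISS_KEY_TEMPLATE`, the plain dictionary of counters, and the choice to return 0.0 when no calls have been recorded are all assumptions made for illustration.

```python
# Hypothetical key templates; the real values of CATALOGD_CACHE_API_HIT_METRIC
# and CATALOGD_CACHE_API_MISS_METRIC are not shown in the review.
HIT_KEY_TEMPLATE = "hypothetical.cache.%s.hits"
MISS_KEY_TEMPLATE = "hypothetical.cache.%s.misses"


def cache_hit_ratio(hits: int, misses: int) -> float:
    """Return hits / (hits + misses), or 0.0 when nothing was recorded.

    The zero-total guard is an assumption; the quoted expression has no
    such guard.
    """
    total = hits + misses
    if total == 0:
        return 0.0
    return hits / total


def api_cache_hit_ratio(counters: dict, api_name: str) -> float:
    """Look up both counters for one API and compute its hit ratio."""
    hits = counters.get(HIT_KEY_TEMPLATE % api_name, 0)
    misses = counters.get(MISS_KEY_TEMPLATE % api_name, 0)
    return cache_hit_ratio(hits, misses)


counters = {
    HIT_KEY_TEMPLATE % "get_table_req": 3,
    MISS_KEY_TEMPLATE % "get_table_req": 1,
}
ratio = api_cache_hit_ratio(counters, "get_table_req")
```

Pulling the arithmetic into one function means each counter is looked up once, which is the kind of duplication the reviewer's nit points at.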
[Impala-ASF-CR] IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17325 ) Change subject: IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7083/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/17325 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If69ae90c6333ff245c2b951af5689e3071f85cb2 Gerrit-Change-Number: 17325 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 19 Apr 2021 21:23:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17325 to look at the new patch set (#3).

Change subject: IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax
..

IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax

In EE tests, HS2 returned results with lower precision than Beeswax for FLOAT/DOUBLE/TIMESTAMP types. These differences are not inherent to the HS2 protocol: the results are returned with full precision in Thrift and lose precision during conversion in client code. This patch changes the conversion in HS2 to match Beeswax and removes the test section DBAPI_RESULTS that was used to handle the differences.

Note that FLOAT/DOUBLE are still different in impala-shell; this change only deals with EE tests.

Testing:
- ran the changed tests

Change-Id: If69ae90c6333ff245c2b951af5689e3071f85cb2
---
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
M testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test
M testdata/workloads/functional-query/queries/QueryTest/inline-view-limit.test
M testdata/workloads/functional-query/queries/QueryTest/inline-view.test
M testdata/workloads/functional-query/queries/QueryTest/limit.test
M testdata/workloads/functional-query/queries/QueryTest/subquery.test
M testdata/workloads/functional-query/queries/QueryTest/top-n.test
M tests/common/impala_connection.py
M tests/common/impala_test_suite.py
M tests/util/test_file_parser.py
10 files changed, 8 insertions(+), 249 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/17325/3
--
To view, visit http://gerrit.cloudera.org:8080/17325
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If69ae90c6333ff245c2b951af5689e3071f85cb2
Gerrit-Change-Number: 17325
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer
Gerrit-Reviewer: Impala Public Jenkins
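The point that precision is lost in client-side conversion rather than on the wire can be illustrated with a generic Python sketch. It does not reproduce the actual Beeswax or HS2 conversion code in the test framework: the two helper functions and the six-digit limit are assumptions chosen only to show how formatting a double with too few significant digits changes the value a test would compare.

```python
def to_text_full(value: float) -> str:
    # repr() emits enough digits for the string to parse back to the
    # same double.
    return repr(value)


def to_text_truncated(value: float, digits: int = 6) -> str:
    # Assumed lossy conversion: keep only `digits` significant digits.
    return "%.*g" % (digits, value)


x = 1.0 / 3.0
full_round_trips = float(to_text_full(x)) == x
truncated_round_trips = float(to_text_truncated(x)) == x
```

With these helpers the full conversion round-trips exactly while the truncated one does not, so two clients that receive the same double can still print different expected results.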
[Impala-ASF-CR] IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax
Csaba Ringhofer has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/17325 )

Change subject: IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax
..

IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax

In EE tests, HS2 returned results with lower precision than Beeswax for FLOAT/DOUBLE/TIMESTAMP types. These differences are not inherent to the HS2 protocol: the results are returned with full precision in Thrift and lose precision during conversion in client code. This patch changes the conversion in HS2 to match Beeswax and removes the test section DBAPI_RESULTS that was used to handle the differences.

Note that FLOAT/DOUBLE are still different in impala-shell; this change only deals with EE tests.

Testing:
- ran the changed tests

Change-Id: If69ae90c6333ff245c2b951af5689e3071f85cb2
---
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
M testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test
M testdata/workloads/functional-query/queries/QueryTest/inline-view-limit.test
M testdata/workloads/functional-query/queries/QueryTest/inline-view.test
M testdata/workloads/functional-query/queries/QueryTest/limit.test
M testdata/workloads/functional-query/queries/QueryTest/subquery.test
M testdata/workloads/functional-query/queries/QueryTest/top-n.test
M tests/common/impala_connection.py
M tests/common/impala_test_suite.py
M tests/util/test_file_parser.py
10 files changed, 9 insertions(+), 246 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/17325/2
--
To view, visit http://gerrit.cloudera.org:8080/17325
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If69ae90c6333ff245c2b951af5689e3071f85cb2
Gerrit-Change-Number: 17325
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer
[Impala-ASF-CR] IMPALA-7825: Upgrade Thrift version to 0.11.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17170 ) Change subject: IMPALA-7825: Upgrade Thrift version to 0.11.0 .. Patch Set 9: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idd13f177b4f7acc07872ea6399035aa180ef6ab6 Gerrit-Change-Number: 17170 Gerrit-PatchSet: 9 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 19 Apr 2021 21:16:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator .. Patch Set 23: Code-Review+2 Bumping to +2 -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 23 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 19 Apr 2021 19:02:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator .. Patch Set 23: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7082/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 23 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 19 Apr 2021 18:57:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator .. Patch Set 22: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 22 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 19 Apr 2021 18:04:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10640: Support reading Parquet Bloom filters - most common types
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17026 ) Change subject: IMPALA-10640: Support reading Parquet Bloom filters - most common types .. Patch Set 27: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7080/ -- To view, visit http://gerrit.cloudera.org:8080/17026 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287 Gerrit-Change-Number: 17026 Gerrit-PatchSet: 27 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 19 Apr 2021 17:22:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator .. Patch Set 22: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8597/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 22 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 19 Apr 2021 17:15:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17252 )

Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator
..

Patch Set 22:

> Address the addresses a flaw with rows unexpectedly
> filtered out, due to the reason that the always_true_ flag in
> a min/max filter, when set, is ignored in the eval code path
> in RuntimeFilter::Eval().

The change is a one-line code change in runtime-filter-ir.cc, plus an extra comment in min-max-filter.h for EvalOverlap().

--
To view, visit http://gerrit.cloudera.org:8080/17252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964
Gerrit-Change-Number: 17252
Gerrit-PatchSet: 22
Gerrit-Owner: Qifan Chen
Gerrit-Reviewer: Fang-Yu Rao
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Qifan Chen
Gerrit-Reviewer: Riza Suminto
Gerrit-Reviewer: Wenzhe Zhou
Gerrit-Comment-Date: Mon, 19 Apr 2021 17:04:38 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Qifan Chen has uploaded a new patch set (#22). ( http://gerrit.cloudera.org:8080/17252 )

Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator
..

IMPALA-10647 Improve always-true min/max filter handling in coordinator

The change improves how a coordinator behaves when a newly arrived min/max filter is always true. A new member 'always_true_filter_received_' is introduced to record that fact. Similarly, the new member 'always_false_flipped_to_false_' is added to indicate that the always-false flag has flipped from 'true' to 'false'. These two members only influence how the min and max columns in "Filter routing table" and "Final filter table" in the profile are displayed, as follows.

1. 'PartialUpdates' - The min and the max are partially updated;
2. 'AlwaysTrue' - One received filter is AlwaysTrue;
3. 'AlwaysFalse' - No filter is received or all received filters are empty;
4. 'Real values' - The final accumulated min/max from all received filters.

A second change records, in the scan node, the arrival time of min/max filters (as a timestamp since the system was rebooted, obtained by calling MonotonicMillis()). A similar timestamp is recorded for HDFS Parquet scanners when a row group is processed. By comparing these two timestamps, one can easily diagnose issues related to late arrival of min/max filters.

This change also fixes a flaw in which rows were unexpectedly filtered out because the always_true_ flag in a min/max filter, when set, was ignored in the eval code path in RuntimeFilter::Eval().

Testing:
1. Added three new tests in overlap_min_max_filters.test to verify that the min/max are displayed correctly when the min/max filter in the hash join builder is set to always true, always false, or a pair of meaningful min and max values.
2. Ran unit tests;
3. Ran runtime-filter-test;
4. Ran core tests successfully.
Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 --- M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/scan-node.cc M be/src/runtime/coordinator-filter-state.h M be/src/runtime/coordinator.cc M be/src/runtime/runtime-filter-ir.cc M be/src/util/min-max-filter.cc M be/src/util/min-max-filter.h M testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters.test 9 files changed, 224 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/17252/22 -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 22 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou
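The timestamp comparison mentioned in the commit message can be sketched as follows. This is a rough Python illustration only: monotonic_millis() is a hypothetical analogue of a monotonic millisecond clock, and filter_arrived_late() is an assumed helper name, not a function from the patch.

```python
import time


def monotonic_millis() -> int:
    """Hypothetical analogue of a monotonic millisecond clock."""
    return time.monotonic_ns() // 1_000_000


def filter_arrived_late(filter_arrival_ms: int, row_group_processed_ms: int) -> bool:
    """Return True if the filter arrived after the row group was processed.

    Both arguments are assumed to come from the same monotonic clock, so
    they are comparable with each other but are not wall-clock times.
    """
    return filter_arrival_ms > row_group_processed_ms


# Example: the row group was processed at t=1200 ms, the filter arrived at
# t=1500 ms, so the filter could not have been applied to that row group.
print(filter_arrived_late(1500, 1200))
```

A monotonic clock is used in this sketch because wall-clock time can jump backwards, which would make such a comparison unreliable.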
[Impala-ASF-CR] IMPALA-10611: Fix flakiness in test_wide_row
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17324 ) Change subject: IMPALA-10611: Fix flakiness in test_wide_row .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8596/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17324 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie1f0b7d4d6b3a875d9b408f057d46fdbdbdf2a34 Gerrit-Change-Number: 17324 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 19 Apr 2021 16:51:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10611: Fix flakiness in test_wide_row
Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17324 ) Change subject: IMPALA-10611: Fix flakiness in test_wide_row .. IMPALA-10611: Fix flakiness in test_wide_row test_wide_row has been failing intermittently with a "Failed to allocate row batch" error message. This is due to a recent change in IMPALA-9856 that added the query option max_row_size=10MB without raising the mem_limit. This patch fixes the flakiness by increasing the mem_limit from 100 MB to 132 MB to account for the 32 MB reservation needed by BufferedPlanRootSink. Testing: - Looped the test on a local dev machine. Change-Id: Ie1f0b7d4d6b3a875d9b408f057d46fdbdbdf2a34 --- M tests/query_test/test_scanners.py 1 file changed, 4 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/17324/1 -- To view, visit http://gerrit.cloudera.org:8080/17324 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ie1f0b7d4d6b3a875d9b408f057d46fdbdbdf2a34 Gerrit-Change-Number: 17324 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto
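The arithmetic behind the new limit in the commit message above is small enough to spell out. The sketch below only restates the numbers given there (100 MB old limit, 32 MB reservation); the function name adjusted_mem_limit is hypothetical and does not appear in test_scanners.py.

```python
MB = 1024 * 1024


def adjusted_mem_limit(old_limit_bytes: int, extra_reservation_bytes: int) -> int:
    """Hypothetical helper: add the extra reservation on top of the old limit."""
    return old_limit_bytes + extra_reservation_bytes


old_limit = 100 * MB          # previous mem_limit used by the test
sink_reservation = 32 * MB    # reservation attributed to BufferedPlanRootSink
new_limit = adjusted_mem_limit(old_limit, sink_reservation)
print(new_limit // MB)
```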
[Impala-ASF-CR] IMPALA-7825: Upgrade Thrift version to 0.11.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17170 ) Change subject: IMPALA-7825: Upgrade Thrift version to 0.11.0 .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8595/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idd13f177b4f7acc07872ea6399035aa180ef6ab6 Gerrit-Change-Number: 17170 Gerrit-PatchSet: 10 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 19 Apr 2021 15:55:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7825: Upgrade Thrift version to 0.11.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17170 ) Change subject: IMPALA-7825: Upgrade Thrift version to 0.11.0 .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8594/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idd13f177b4f7acc07872ea6399035aa180ef6ab6 Gerrit-Change-Number: 17170 Gerrit-PatchSet: 8 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 19 Apr 2021 15:49:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7825: Upgrade Thrift version to 0.11.0
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17170 to look at the new patch set (#10). Change subject: IMPALA-7825: Upgrade Thrift version to 0.11.0 .. IMPALA-7825: Upgrade Thrift version to 0.11.0 Before this patch Impala mainly used Thrift 0.9.3, but it was possible to compile Impala shell with Thrift 0.11.0, so the 0.11.0 Thrift lib was already included in the toolchain. Most of the changes are related to replacing boost:: with std:: shared_ptr-s in cpp code (this is a continuation of patch by Sahil). The Thrift upgrade also needs an Impyla release with Thrift 0.11.0, as Impala's test framework relies on Impyla. A thrift_sasl release is also needed, because it currently pins Thrift version to 0.9.3 for Python 2. The current patch uses alpha releases from Impyla and thrift_sasl that use thrift 0.11.0. Notable side effects: - THRIFT-3921 changed the stream operators to print an enum's name instead of its number, leading to slightly different messages in some cases. - "templates" was added to the thrift generator's parameters to avoid a compilation issue (related to IMPALA-10600). I didn't notice any change in compilation time. This option generates .tcc files with templatized readers/writers for Thrift types. Currently we don't use these, but they could potentially speed up (de)serialization. 
Testing: - ran Impyla's test suite with Python 2 and 3 - ran core tests TODOs: - remove preexisting extra logic needed to use 0.11.0 for python Change-Id: Idd13f177b4f7acc07872ea6399035aa180ef6ab6 --- M be/src/benchmarks/network-perf-benchmark.cc M be/src/catalog/catalog-server.h M be/src/catalog/catalog-service-client-wrapper.h M be/src/catalog/catalog-util.cc M be/src/catalog/catalogd-main.cc M be/src/rpc/TAcceptQueueServer.cpp M be/src/rpc/TAcceptQueueServer.h M be/src/rpc/auth-provider.h M be/src/rpc/authentication.cc M be/src/rpc/hs2-http-test.cc M be/src/rpc/thrift-client.h M be/src/rpc/thrift-server-test.cc M be/src/rpc/thrift-server.cc M be/src/rpc/thrift-server.h M be/src/rpc/thrift-thread.cc M be/src/rpc/thrift-thread.h M be/src/rpc/thrift-util.cc M be/src/rpc/thrift-util.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/impalad-main.cc M be/src/statestore/statestore-service-client-wrapper.h M be/src/statestore/statestore-subscriber-client-wrapper.h M be/src/statestore/statestore-subscriber.cc M be/src/statestore/statestore-subscriber.h M be/src/statestore/statestore.cc M be/src/statestore/statestore.h M be/src/testutil/in-process-servers.h M be/src/transport/THttpServer.cpp M be/src/transport/THttpServer.h M be/src/transport/THttpTransport.cpp M be/src/transport/THttpTransport.h M be/src/transport/TSaslClientTransport.cpp M be/src/transport/TSaslClientTransport.h M be/src/transport/TSaslServerTransport.cpp M be/src/transport/TSaslServerTransport.h M be/src/transport/TSaslTransport.cpp M be/src/transport/TSaslTransport.h M be/src/util/parquet-reader.cc M bin/impala-config.sh M common/thrift/CMakeLists.txt M infra/python/deps/requirements.txt M java/pom.xml M shell/ext-py/thrift_sasl-0.4.2/setup.py M tests/beeswax/impala_beeswax.py M tests/query_test/test_observability.py 46 files changed, 197 insertions(+), 193 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/17170/10 -- To view, visit 
http://gerrit.cloudera.org:8080/17170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idd13f177b4f7acc07872ea6399035aa180ef6ab6 Gerrit-Change-Number: 17170 Gerrit-PatchSet: 10 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-7825: Upgrade Thrift version to 0.11.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17170 ) Change subject: IMPALA-7825: Upgrade Thrift version to 0.11.0 .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7081/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idd13f177b4f7acc07872ea6399035aa180ef6ab6 Gerrit-Change-Number: 17170 Gerrit-PatchSet: 9 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 19 Apr 2021 15:28:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7825: Upgrade Thrift version to 0.11.0
Csaba Ringhofer has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/17170 ) Change subject: IMPALA-7825: Upgrade Thrift version to 0.11.0 .. IMPALA-7825: Upgrade Thrift version to 0.11.0 Before this patch Impala mainly used Thrift 0.9.3, but it was possible to compile Impala shell with Thrift 0.11.0, so the 0.11.0 Thrift lib was already included in the toolchain. Most of the changes are related to replacing boost:: with std:: shared_ptr-s in cpp code (this is a continuation of patch by Vihang). The Thrift upgrade also needs an Impyla release with Thrift 0.11.0, as Impala's test framework relies on Impyla. A thrift_sasl release is also needed, because it currently pins Thrift version to 0.9.3 for Python 2. The current patch uses alpha releases from Impyla and thrift_sasl that use thrift 0.11.0. Notable side effects: - THRIFT-3921 changed the stream operators to print an enum's name instead of its number, leading to slightly different messages in some cases. - "templates" was added to the thrift generator's parameters to avoid a compilation issue (related to IMPALA-10600). I didn't notice any change in compilation time. This option generates .tcc files with templatized readers/writers for Thrift types. Currently we don't use these, but they could potentially speed up (de)serialization. 
Testing: - ran Impyla's test suite with Python 2 and 3 - ran core tests TODOs: - remove preexisting extra logic needed to use 0.11.0 for python Change-Id: Idd13f177b4f7acc07872ea6399035aa180ef6ab6 --- M be/src/benchmarks/network-perf-benchmark.cc M be/src/catalog/catalog-server.h M be/src/catalog/catalog-service-client-wrapper.h M be/src/catalog/catalog-util.cc M be/src/catalog/catalogd-main.cc M be/src/rpc/TAcceptQueueServer.cpp M be/src/rpc/TAcceptQueueServer.h M be/src/rpc/auth-provider.h M be/src/rpc/authentication.cc M be/src/rpc/hs2-http-test.cc M be/src/rpc/thrift-client.h M be/src/rpc/thrift-server-test.cc M be/src/rpc/thrift-server.cc M be/src/rpc/thrift-server.h M be/src/rpc/thrift-thread.cc M be/src/rpc/thrift-thread.h M be/src/rpc/thrift-util.cc M be/src/rpc/thrift-util.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/impalad-main.cc M be/src/statestore/statestore-service-client-wrapper.h M be/src/statestore/statestore-subscriber-client-wrapper.h M be/src/statestore/statestore-subscriber.cc M be/src/statestore/statestore-subscriber.h M be/src/statestore/statestore.cc M be/src/statestore/statestore.h M be/src/testutil/in-process-servers.h M be/src/transport/THttpServer.cpp M be/src/transport/THttpServer.h M be/src/transport/THttpTransport.cpp M be/src/transport/THttpTransport.h M be/src/transport/TSaslClientTransport.cpp M be/src/transport/TSaslClientTransport.h M be/src/transport/TSaslServerTransport.cpp M be/src/transport/TSaslServerTransport.h M be/src/transport/TSaslTransport.cpp M be/src/transport/TSaslTransport.h M be/src/util/parquet-reader.cc M bin/impala-config.sh M common/thrift/CMakeLists.txt M infra/python/deps/requirements.txt M java/pom.xml M shell/ext-py/thrift_sasl-0.4.2/setup.py M tests/beeswax/impala_beeswax.py M tests/query_test/test_observability.py 46 files changed, 197 insertions(+), 193 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/17170/8 -- To view, visit 
http://gerrit.cloudera.org:8080/17170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idd13f177b4f7acc07872ea6399035aa180ef6ab6 Gerrit-Change-Number: 17170 Gerrit-PatchSet: 8 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-10640: Support reading Parquet Bloom filters - most common types
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17026 ) Change subject: IMPALA-10640: Support reading Parquet Bloom filters - most common types .. Patch Set 27: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17026 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287 Gerrit-Change-Number: 17026 Gerrit-PatchSet: 27 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 19 Apr 2021 13:41:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10640: Support reading Parquet Bloom filters - most common types
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17026 ) Change subject: IMPALA-10640: Support reading Parquet Bloom filters - most common types .. Patch Set 27: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7080/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17026 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287 Gerrit-Change-Number: 17026 Gerrit-PatchSet: 27 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 19 Apr 2021 13:41:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10640: Support reading Parquet Bloom filters - most common types
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17026 ) Change subject: IMPALA-10640: Support reading Parquet Bloom filters - most common types .. Patch Set 26: Code-Review+2 (2 comments) Great work, Daniel! http://gerrit.cloudera.org:8080/#/c/17026/24//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17026/24//COMMIT_MSG@7 PS24, Line 7: IMPALA-10640: Support reading Parquet Bloom filters - most common types > No, I don't think we are planning to add filtering for complex types. Yeah, I was thinking about leaf values inside complex types. I'm OK with deferring it. It is probably worth opening a Jira for it. http://gerrit.cloudera.org:8080/#/c/17026/23/be/src/util/parquet-bloom-filter.cc File be/src/util/parquet-bloom-filter.cc: http://gerrit.cloudera.org:8080/#/c/17026/23/be/src/util/parquet-bloom-filter.cc@18 PS23, Line 18: #include "parquet-bloom-filter.h" : : #include : #include : : #include "kudu/util/slice.h" : #include "kudu/util/status.h" : #include "util/kudu-status-util.h" : : #include "thirdparty/xxhash/xxhash.h" > Isn't the first include in a .cc file usually the corresponding .h file? Se Yeah, I think the include order is fine. -- To view, visit http://gerrit.cloudera.org:8080/17026 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287 Gerrit-Change-Number: 17026 Gerrit-PatchSet: 26 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 19 Apr 2021 13:40:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: upgrading some python requirements.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/17323 ) Change subject: WIP: upgrading some python requirements. .. Patch Set 1: > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7079/ Appears just to be a flaky test. Here's an exhaustive run that passed: https://jenkins.impala.io/view/Utility/job/ubuntu-16.04-from-scratch/13700/ -- To view, visit http://gerrit.cloudera.org:8080/17323 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1d35b73dc245781aa36282c6a268390152b63f05 Gerrit-Change-Number: 17323 Gerrit-PatchSet: 1 Gerrit-Owner: Jim Apple Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Comment-Date: Mon, 19 Apr 2021 13:38:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10656: Fire insert events before commit
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17313 ) Change subject: IMPALA-10656: Fire insert events before commit .. Patch Set 8: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/17313 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2ed812dbcb5f55efff3a910a3daeeb76cd3295b9 Gerrit-Change-Number: 17313 Gerrit-PatchSet: 8 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 19 Apr 2021 13:08:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10658: LOAD DATA INPATH silently fails between HDFS and Azure ABFS
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17316 ) Change subject: IMPALA-10658: LOAD DATA INPATH silently fails between HDFS and Azure ABFS .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/17316/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java: http://gerrit.cloudera.org:8080/#/c/17316/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@588 PS1, Line 588: return fs.exists(path); > If the qualified paths aren't the same and fs.exists(path) returns true, do My idea was that 'path.equals(qp)' is the fast-path to check if the path is on the given filesystem. If that returns false because 'path' is not a qualified path, then the ultimate check is 'fs.exists(path)'. Currently we always pass a qualified path AFAICT, so we could only use the 'fast-path'. fs.exists(path) can be useful if this method was invoked with an unqualified path, e.g. '/tmp/data/file.csv'. The other option could be to throw an exception when 'path' is not qualified, but I'm not sure what is the best way to check that. -- To view, visit http://gerrit.cloudera.org:8080/17316 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id807e8a200b83283a09d3a917185cabab930017d Gerrit-Change-Number: 17316 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 19 Apr 2021 13:00:23 + Gerrit-HasComments: Yes
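The fast-path/fallback idea in the reply above can be sketched in Python. Everything here is hypothetical: FakeFileSystem is a toy stand-in, not the Hadoop FileSystem API, and the assumption that qualifying a path from another filesystem raises an error is made only for this illustration. The function name is_path_on_filesystem is likewise invented for the sketch.

```python
class FakeFileSystem:
    """Toy stand-in for a filesystem handle (hypothetical, not a real API)."""

    def __init__(self, prefix, files):
        self.prefix = prefix      # e.g. "hdfs://nn:8020"
        self.files = set(files)   # fully qualified paths that exist

    def make_qualified(self, path):
        if "://" not in path:
            # Unqualified path: resolve it against this filesystem.
            return self.prefix + path
        if not path.startswith(self.prefix + "/"):
            # Assumption of this sketch: a foreign path is an error.
            raise ValueError("wrong filesystem: " + path)
        return path

    def exists(self, path):
        return self.make_qualified(path) in self.files


def is_path_on_filesystem(fs, path):
    try:
        qp = fs.make_qualified(path)
    except ValueError:
        return False
    if path == qp:
        return True          # fast path: already qualified for this fs
    return fs.exists(path)   # fallback for an unqualified path


hdfs = FakeFileSystem("hdfs://nn:8020", ["hdfs://nn:8020/tmp/data/file.csv"])
print(is_path_on_filesystem(hdfs, "/tmp/data/file.csv"))
```

In this sketch the fallback only matters for unqualified input such as '/tmp/data/file.csv', which matches the point in the reply that callers currently always pass qualified paths.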
[Impala-ASF-CR] IMPALA-10648: Invalidate catalogd table cache for hms ddl apis which modify tables and partitions.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17298 ) Change subject: IMPALA-10648: Invalidate catalogd table cache for hms ddl apis which modify tables and partitions. .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/17298/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17298/3//COMMIT_MSG@9 PS3, Line 9: For non transactional tables > Could you explain why we don't do this for transactional tables in the comm Could you address this? Maybe you just missed it because of the many code style comments :) http://gerrit.cloudera.org:8080/#/c/17298/3//COMMIT_MSG@13 PS3, Line 13: (since table loading in cache takes time) but ensures consistency. This change is behind catalogd > nit: please adjust the commit message body to fit into at-most 72 chars per Could you address this? Maybe you just missed it because of the many code style comments :) -- To view, visit http://gerrit.cloudera.org:8080/17298 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idb9cc22ebfb51948433e4d57f4705ce201acaf98 Gerrit-Change-Number: 17298 Gerrit-PatchSet: 5 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 19 Apr 2021 08:20:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10645: Log catalogd HMS API metrics
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17284 ) Change subject: IMPALA-10645: Log catalogd HMS API metrics .. Patch Set 7: (7 comments) Thanks for adjusting the code style! It seems there are some test failures that we should address. Left some minor comments as well. http://gerrit.cloudera.org:8080/#/c/17284/7/common/thrift/metrics.json File common/thrift/metrics.json: http://gerrit.cloudera.org:8080/#/c/17284/7/common/thrift/metrics.json@2918 PS7, Line 2918: { : "description": "Catalogd HMS cache file metadata cache hit ratio.", : "contexts": [ : "CATALOGSERVER" : ], : "label": "Catalogd HMS cache file metadata cache hit ratio", : "units": "NONE", : "kind" : "GAUGE", : "key" : "catalogd.hms.cache.status.file.metadata.cache.hit.ratio" : } I think this is stale now and should be removed. http://gerrit.cloudera.org:8080/#/c/17284/7/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java: http://gerrit.cloudera.org:8080/#/c/17284/7/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@354 PS7, Line 354:* :* @return nit: remove these http://gerrit.cloudera.org:8080/#/c/17284/7/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@400 PS7, Line 400: nit: redundant blank line http://gerrit.cloudera.org:8080/#/c/17284/7/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@421 PS7, Line 421: HmsApiNameEnum.contains(apiName) It seems to be always true. Why do we only check this for cache hit ratio but not all metrics? Could we check this at the beginning of the loop and log an error if it's false? 
http://gerrit.cloudera.org:8080/#/c/17284/7/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@422 PS7, Line 422: double specificApiCacheHitRatio = : ((double) CatalogMonitor.INSTANCE.getCatalogdHmsCacheMetrics() : .getCounter(String.format(CATALOGD_CACHE_API_HIT_METRIC, : apiName)).getCount()) / : (double) (CatalogMonitor.INSTANCE : .getCatalogdHmsCacheMetrics() : .getCounter(String : .format(CATALOGD_CACHE_API_HIT_METRIC, : apiName)).getCount() + : CatalogMonitor.INSTANCE. : getCatalogdHmsCacheMetrics() : .getCounter(String.format( : CATALOGD_CACHE_API_MISS_METRIC, : apiName)) : .getCount());
nit: Could you help refactor this code and the code at line 362? E.g. by extracting a method like this:
  private static double getHitRatio(String hitMetricName, String missMetricName) {
    long hits = CatalogMonitor.INSTANCE.getCatalogdHmsCacheMetrics()
        .getCounter(hitMetricName)
        .getCount();
    long misses = CatalogMonitor.INSTANCE.getCatalogdHmsCacheMetrics()
        .getCounter(missMetricName)
        .getCount();
    return (double)hits / (hits + misses);
  }
Then line 362 becomes:
  double cacheHitRatio = getHitRatio(CATALOGD_CACHE_HIT_METRIC, CATALOGD_CACHE_MISS_METRIC);
and the code here becomes:
  double specificApiCacheHitRatio = getHitRatio(
      String.format(CATALOGD_CACHE_API_HIT_METRIC, apiName),
      String.format(CATALOGD_CACHE_API_MISS_METRIC, apiName));
http://gerrit.cloudera.org:8080/#/c/17284/7/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@438 PS7, Line 438: nit: redundant blank line http://gerrit.cloudera.org:8080/#/c/17284/7/fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java File fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java: http://gerrit.cloudera.org:8080/#/c/17284/7/fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java@29 PS7, Line 29: private String apiName; nit: this can be 'final' -- To view, visit http://gerrit.cloudera.org:8080/17284 To unsubscribe, visit 
http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id41afe89bbe3395c158919bddd09f302c6752287 Gerrit-Change-Number: 17284 Gerrit-PatchSet: 7 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
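The hit-ratio helper suggested in the review comment can also be sketched in Python to show the shape of the refactoring. This is an illustration under stated assumptions, not the reviewed Java code: the counters are modelled as a plain dict, and the metric names "cache.hit" and "cache.miss" are placeholders invented for the example. One difference from the Java version is handled explicitly: Java's double division yields NaN for 0/0, whereas Python raises ZeroDivisionError, so the sketch returns NaN itself when no requests have been counted.

```python
import math


def get_hit_ratio(counters: dict, hit_metric_name: str, miss_metric_name: str) -> float:
    """Return hits / (hits + misses) for two named counters.

    `counters` is a hypothetical name-to-count mapping standing in for a
    metrics registry.
    """
    hits = counters.get(hit_metric_name, 0)
    misses = counters.get(miss_metric_name, 0)
    total = hits + misses
    if total == 0:
        # Mirror the NaN that floating-point 0.0 / 0.0 would produce.
        return math.nan
    return hits / total


# Placeholder metric names, for illustration only.
counters = {"cache.hit": 3, "cache.miss": 1}
print(get_hit_ratio(counters, "cache.hit", "cache.miss"))
```

Extracting the helper means the overall ratio and each per-API ratio share one computation, so only the metric names differ at the call sites.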