[Impala-ASF-CR] IMPALA-9741: Supported query Iceberg table by impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Supported query Iceberg table by impala
..

Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6612/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.

--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 7
Gerrit-Owner: wangsheng
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Reviewer: wangsheng
Gerrit-Comment-Date: Thu, 16 Jul 2020 05:13:55 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9741: Supported query Iceberg table by impala
wangsheng has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Supported query Iceberg table by impala
..

IMPALA-9741: Supported query Iceberg table by impala

This patch mainly implements querying Iceberg tables through Impala. We can use the following SQL to create an external Iceberg table:

    CREATE EXTERNAL TABLE default.iceberg_test (
        level string,
        event_time timestamp,
        message string
    )
    STORED AS ICEBERG
    LOCATION 'hdfs://xxx'
    TBLPROPERTIES ('iceberg_file_format'='parquet');

Or specify only the table name and location, like this:

    CREATE EXTERNAL TABLE default.iceberg_test
    STORED AS ICEBERG
    LOCATION 'hdfs://xxx'
    TBLPROPERTIES ('iceberg_file_format'='parquet');

'iceberg_file_format' is the file format used in Iceberg. Currently only PARQUET is supported; other formats will be supported in the future. If you don't specify this property in your SQL, the default file format is PARQUET.

We achieve this by treating the Iceberg table as a normal unpartitioned HDFS table. When querying an Iceberg table, we push down partition column predicates to Iceberg to decide which data files need to be scanned, and then pass this information to the BE to do the real scan operation.

Testing:
- Unit test for Iceberg in FileMetadataLoaderTest
- Create table tests in functional_schema_template.sql
- Iceberg table query test in custom cluster test test_iceberg.py

Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
---
M be/src/runtime/descriptors.cc
M bin/rat_exclude_files.txt
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
A fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
M testdata/data/README
A testdata/data/iceberg_test/iceberg_non_partitioned/data/1-100-e1a80ed6-1064-494d-9cdd-c4a30c1ab8dc-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/3-102-511427f2-85f0-43ae-9b39-a456f8dc57b6-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/4-103-00fc55e1-6ef7-4241-ace2-6d075b9737fc-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/6-105-ef9e76d5-c060-4040-8aa1-b7c275610daa-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/7-106-c09c9c8d-9478-44f9-8501-f85f53112bc3-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/9-108-3b4f06ac-dca3-4f4e-be60-bf42d9927b5b-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00011-110-1e653ccf-0963-4fb0-941c-32c9de13268b-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00012-111-dfa70658-eb4b-4fa0-9ffa-b892cf90d6ac-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00014-113-2d16e751-e2a4-4856-ab89-145996e3815e-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00015-114-0f710621-cbbf-4509-a93d-b58808978e2e-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00017-116-0b666c79-53df-4507-906c-542e65a83443-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00019-118-1bc6bc6e-e061-4da3-9d1e-a427a306c471-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00020-119-ae7b2c67-1538-4429-8246-4998960e3817-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00022-121-8db0f1e1-d88c-4aad-a8b3-24fd07329cdb-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00023-122-de57b6b0-f54b-40ac-85cd-e783505094b6-0.parquet
A
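
The IMPALA-9741 commit message above describes pushing partition column predicates down to Iceberg so that only the matching data files are scanned. A minimal Python sketch of that pruning idea follows. It is not code from the patch: the `DataFile` class, the `plan_files` function and the sample paths are hypothetical, and the real change does this through the Iceberg library in the Java frontend.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for one data file tracked by table metadata."""
    path: str
    partition: Dict[str, str]  # partition column name -> value


def plan_files(files: List[DataFile],
               predicate: Callable[[Dict[str, str]], bool]) -> List[str]:
    """Return the paths of files whose partition values satisfy the predicate."""
    return [f.path for f in files if predicate(f.partition)]


files = [
    DataFile("data/a.parquet", {"level": "INFO"}),
    DataFile("data/b.parquet", {"level": "ERROR"}),
    DataFile("data/c.parquet", {"level": "ERROR"}),
]

# Rough equivalent of a pushed-down predicate such as: WHERE level = 'ERROR'
to_scan = plan_files(files, lambda p: p["level"] == "ERROR")
print(to_scan)
```

Only the file paths that survive pruning would then be handed to the backend as scan ranges, which is why the table can otherwise be treated like an unpartitioned HDFS table.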
[Impala-ASF-CR] IMPALA-3127: Support incremental metadata updates in partition level
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16159 )

Change subject: IMPALA-3127: Support incremental metadata updates in partition level
..

Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6611/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.

--
To view, visit http://gerrit.cloudera.org:8080/16159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870
Gerrit-Change-Number: 16159
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang
Gerrit-Reviewer: Anurag Mantripragada
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Quanlong Huang
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Vihang Karajgaonkar
Gerrit-Comment-Date: Thu, 16 Jul 2020 04:01:15 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3127: Support incremental metadata updates in partition level
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16159 )

Change subject: IMPALA-3127: Support incremental metadata updates in partition level
..

Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6139/ DRY_RUN=true

--
To view, visit http://gerrit.cloudera.org:8080/16159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870
Gerrit-Change-Number: 16159
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang
Gerrit-Reviewer: Anurag Mantripragada
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Quanlong Huang
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Vihang Karajgaonkar
Gerrit-Comment-Date: Thu, 16 Jul 2020 03:35:01 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3127: Support incremental metadata updates in partition level
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16159 )

Change subject: IMPALA-3127: Support incremental metadata updates in partition level
..

Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16159/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/16159/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1574
PS2, Line 1574: for (Map.Entry part : hdfsTable.getPartitions().entrySet()) {
line too long (91 > 90)

http://gerrit.cloudera.org:8080/#/c/16159/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/16159/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4241
PS2, Line 4241: // TODO(IMPALA-9937): if client is a 'v1' impalad, only send back incremental updates
line too long (93 > 90)

--
To view, visit http://gerrit.cloudera.org:8080/16159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870
Gerrit-Change-Number: 16159
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang
Gerrit-Reviewer: Anurag Mantripragada
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Quanlong Huang
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Vihang Karajgaonkar
Gerrit-Comment-Date: Thu, 16 Jul 2020 03:34:44 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3127: Support incremental metadata updates in partition level
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16159 )

Change subject: IMPALA-3127: Support incremental metadata updates in partition level
..

Patch Set 2:

(6 comments)

Thanks for the review! Addressed the comments.

http://gerrit.cloudera.org:8080/#/c/16159/1/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/16159/1/common/thrift/CatalogObjects.thrift@424
PS1, Line 424:
> nit, partition
Removed this field.

http://gerrit.cloudera.org:8080/#/c/16159/1/common/thrift/CatalogObjects.thrift@425
PS1, Line 425: // Each TNetworkAddress is a datanode which contains blocks of a file in the table.
: // Used so that each THdfsFileBlock can just reference an index in this list rather
: // than duplicate the list of network address, w
> Is there any value of having a new field? Seems like this list is always de
Done. Merged the list into the partition map and introduced some flags to reveal the state.

http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@632
PS1, Line 632: ":"
> I feel having space in the catalogObjectKey is bit unconventional and can c
Sure. Will change to ":"

http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@710
PS1, Line 710: byte[] data = serializer.serialize(minimalObject);
: String v2Key = CatalogServiceConstants.CATALOG_TOPIC_V2_PREFIX + key;
:
> You may want to consider sending the updates for partitions as well since t
Yeah, good point! We can send invalidation on the old (replaced) partition id.

http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1614
PS1, Line 1614: ptor tableDesc = new TTab
> perhaps a better name could be toThriftWithMinimalPartitions since we are a
Done

http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16159/1/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@469
PS1, Line 469: pdates.
: if (newTable instanceof HdfsTable
> I think it would be more readable if we move this to a method called isFrom
Done. Merged the list into the map.

--
To view, visit http://gerrit.cloudera.org:8080/16159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870
Gerrit-Change-Number: 16159
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang
Gerrit-Reviewer: Anurag Mantripragada
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Quanlong Huang
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Vihang Karajgaonkar
Gerrit-Comment-Date: Thu, 16 Jul 2020 03:34:12 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3127: Support incremental metadata updates in partition level
Hello Anurag Mantripragada, Vihang Karajgaonkar, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16159

to look at the new patch set (#2).

Change subject: IMPALA-3127: Support incremental metadata updates in partition level
..

IMPALA-3127: Support incremental metadata updates in partition level

Currently, partitions are tightly integrated into the HdfsTable objects. Catalogd has to transmit the entire table metadata even when only a few partitions change. This wastes resources and can lead to OOM when transmitting large tables, due to the 2GB JVM array limit.

This patch makes HdfsPartition extend CatalogObject so that catalogd can send partitions as individual catalog objects. Consequently, table objects in the catalog topic update can have minimal partition maps that only contain the partition ids, which reduces the thrift object size for large tables. The catalog object key of HdfsPartition consists of db name, table name and partition name.

In "full" topic mode (catalog_topic_mode=full), catalogd only sends changed partitions with their latest table states. The latest table states are table objects with the minimal partition map. Legacy coordinators use the partition list to pick up existing (unchanged) partitions from the existing table object and new partitions in the catalog update.

Currently, partition instances are immutable: all partition modifications are implemented by deleting the old instance and adding a new one with a new partition id. Since partition ids are generated by a global counter, newer partition instances will have larger partition ids. So catalogd maintains a watermark for each table as the max sent partition id. Partition instances with ids larger than this are new partitions that should be sent in the next catalog update. Deleted partition instances are kept in a set for each table until the next catalog update. If there are no updates on the same partition name, catalogd will send a deletion on the partition. For dropped or invalidated tables, catalogd will still send deletions on their partitions. Although they are not used in coordinators (coordinators delete the partitions when they delete the table instances), they help avoid topic entry leaks in the statestore catalog topic.

In "minimal" topic mode (catalog_topic_mode=minimal), catalogd only sends invalidations on tables and stale partition instances. Each partition instance is identified by its partition id. LocalCatalog coordinators use the partition invalidations to evict stale partitions in time. For instance, let's say partition(year=2010) is updated in catalogd. This is done by deleting the old partition instance partition(id=0, year=2010) and adding a new partition instance partition(id=1, year=2010). Catalogd will send invalidations on the table and the partition instance with id=0, but not the one with id=1. A LocalCatalog coordinator will invalidate the partition instance(id=0) if it's in the cache. If the partition instance(id=1) is cached, it's already the latest version since partition instances are immutable, so we don't need to invalidate it.

Tests
- Run exhaustive tests.
- Run exhaustive test_ddl.py in LocalCatalog mode.
- (TODO) Add tests on long statestore update frequency, so that several table changes are sent in the same topic update.
- (TODO) Add tests on straggler coordinators that need to process several incremental updates at once.
- (TODO) Add tests on no statestore topic entry leak.

Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870
---
M be/src/catalog/catalog-util.cc
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogObject.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
12 files changed, 501 insertions(+), 62 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/16159/2

--
To view, visit http://gerrit.cloudera.org:8080/16159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870
Gerrit-Change-Number: 16159
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang
Gerrit-Reviewer: Anurag Mantripragada
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Quanlong Huang
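
The watermark bookkeeping described in the IMPALA-3127 commit message above ("full" topic mode) can be modelled in a few lines of Python. This is only a toy sketch under stated assumptions, not the catalogd implementation: `TableState`, `put`, `drop` and `collect_update` are hypothetical names, and partitions are reduced to a name and an id.

```python
from itertools import count
from typing import Dict, List, Set, Tuple

# Global counter (as in the commit message): newer instances get larger ids.
_partition_ids = count()


class TableState:
    """Toy model of per-table bookkeeping; all names here are hypothetical."""

    def __init__(self) -> None:
        self.partitions: Dict[str, int] = {}  # partition name -> instance id
        self.max_sent_id = -1                 # watermark: largest id already sent
        self.deleted_names: Set[str] = set()  # instances removed since last update

    def put(self, name: str) -> int:
        """Add or replace a partition; a replacement is a delete plus an add."""
        if name in self.partitions:
            self.deleted_names.add(name)
        self.partitions[name] = next(_partition_ids)
        return self.partitions[name]

    def drop(self, name: str) -> None:
        """Remove a partition and remember it until the next update."""
        del self.partitions[name]
        self.deleted_names.add(name)

    def collect_update(self) -> Tuple[List[str], List[str]]:
        """Return (partitions to send, partitions to delete), then reset state."""
        to_send = sorted(n for n, pid in self.partitions.items()
                         if pid > self.max_sent_id)
        # A deletion is only sent when no newer instance reuses the same name.
        to_delete = sorted(n for n in self.deleted_names
                           if n not in self.partitions)
        if self.partitions:
            self.max_sent_id = max(self.max_sent_id,
                                   max(self.partitions.values()))
        self.deleted_names.clear()
        return to_send, to_delete


table = TableState()
table.put("year=2009")
table.put("year=2010")
first = table.collect_update()   # both partitions are above the watermark

table.put("year=2010")           # update: new instance with a larger id
table.drop("year=2009")
second = table.collect_update()  # only the changed and the dropped partition
print(first, second)
```

In the second update only `year=2010` is resent, and only `year=2009` is reported as deleted, because the replaced `year=2010` instance has a newer instance under the same name.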
[Impala-ASF-CR] IMPALA-6788: Abort ExecFInstance() RPC loop early after query failure
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16192 )

Change subject: IMPALA-6788: Abort ExecFInstance() RPC loop early after query failure
..

Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6610/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.

--
To view, visit http://gerrit.cloudera.org:8080/16192
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I034788f7720fc97c25c54f006ff72dce6cb199c3
Gerrit-Change-Number: 16192
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall
Gerrit-Comment-Date: Thu, 16 Jul 2020 02:34:15 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6788: Abort ExecFInstance() RPC loop early after query failure
Wenzhe Zhou has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/16192 )

Change subject: IMPALA-6788: Abort ExecFInstance() RPC loop early after query failure
..

IMPALA-6788: Abort ExecFInstance() RPC loop early after query failure

Stops issuing ExecQueryFInstance RPCs and cancels any inflight ones when a backend reports failure. Adds a new debug action, CONSTRUCT_QUERY_STATE_REPORT, that runs when constructing a query state report. Adds a new test case for handling errors reported from query state.

Testing:
- Ran the following command for the new test case and verified that the code works as expected:
    ./bin/impala-py.test tests/custom_cluster/test_rpc_exception.py\
      ::TestRPCException::test_state_report_error \
      --workload_exploration_strategy=functional-query:exhaustive
- Passed exhaustive tests.

Change-Id: I034788f7720fc97c25c54f006ff72dce6cb199c3
---
M be/src/runtime/coordinator.cc
M be/src/runtime/query-state.cc
M tests/custom_cluster/test_rpc_exception.py
3 files changed, 39 insertions(+), 2 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/16192/3

--
To view, visit http://gerrit.cloudera.org:8080/16192
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I034788f7720fc97c25c54f006ff72dce6cb199c3
Gerrit-Change-Number: 16192
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall
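
The "stop issuing RPCs once a failure is reported" behaviour from the IMPALA-6788 commit message above can be illustrated with a small Python sketch. It is a simplified, sequential model under stated assumptions: the function and parameter names are hypothetical, and cancellation of RPCs that are already in flight (which the real coordinator change also handles) is left out.

```python
from typing import Callable, List


def exec_fragment_instances(backends: List[str],
                            send_rpc: Callable[[str], None],
                            query_failed: Callable[[], bool]) -> List[str]:
    """Issue one RPC per backend, stopping as soon as a failure is reported."""
    contacted: List[str] = []
    for backend in backends:
        if query_failed():
            break  # do not start RPCs for the remaining backends
        send_rpc(backend)
        contacted.append(backend)
    return contacted


# Simulated failure state, set by a fake backend while handling its RPC.
state = {"failed": False}


def fake_rpc(backend: str) -> None:
    if backend == "host2":
        state["failed"] = True  # this backend reports an error


contacted = exec_fragment_instances(["host1", "host2", "host3"],
                                    fake_rpc,
                                    lambda: state["failed"])
print(contacted)
```

Checking the failure flag at the top of every iteration is what keeps the loop from contacting `host3` after `host2` has already reported an error.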
[Impala-ASF-CR] IMPALA-9956: inline hot functions in Sorter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16202 )

Change subject: IMPALA-9956: inline hot functions in Sorter
..

Patch Set 2: Verified+1

--
To view, visit http://gerrit.cloudera.org:8080/16202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7a8034ab6d2e3c71a2d2f2fcc3d6b788e9398194
Gerrit-Change-Number: 16202
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Bikramjeet Vig
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Comment-Date: Thu, 16 Jul 2020 00:01:12 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9956: inline hot functions in Sorter
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16202 )

Change subject: IMPALA-9956: inline hot functions in Sorter
..

IMPALA-9956: inline hot functions in Sorter

Add some compiler hints to force inlining of small functions into the hot Partition() loop.

Performance:
A single node perf run on TPC-H showed no perf change.

A single node performance run with the queries that target sort performance showed up to a 19% reduction in time spent in the sort.

+-------------------+-----------------------+---------+------------+------------+----------------+
| Workload          | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+-------------------+-----------------------+---------+------------+------------+----------------+
| TARGETED-PERF(30) | parquet / none / none | 5.52    | -5.82%     | 4.00       | -9.74%         |
+-------------------+-----------------------+---------+------------+------------+----------------+

+-------------------+-------------------------------------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| Workload          | Query                               | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval   |
+-------------------+-------------------------------------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| TARGETED-PERF(30) | primitive_orderby_all               | parquet / none / none | 11.89  | 12.22       | -2.73%     | 1.07%     | 1.20%          | 10    | -2.88%         | -3.13   | -5.42  |
| TARGETED-PERF(30) | primitive_orderby_bigint_expression | parquet / none / none | 2.61   | 2.94        | I -11.27%  | 0.83%     | 1.14%          | 10    | I -12.56%      | -3.58   | -26.25 |
| TARGETED-PERF(30) | primitive_orderby_bigint            | parquet / none / none | 2.06   | 2.42        | I -14.80%  | 0.94%     | 0.68%          | 10    | I -17.43%      | -3.58   | -44.37 |
+-------------------+-------------------------------------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+

(I) Improvement: TARGETED-PERF(30) primitive_orderby_bigint_expression [parquet / none / none] (2.94s -> 2.61s [-11.27%])
+---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+-------+-----------+
| Operator            | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%) | Max      | Base Max | Delta(Max) | #Hosts | #Inst | #Rows | Est #Rows |
+---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+-------+-----------+
| 02:ANALYTIC         | 11.84%     | 332.95ms | 337.56ms | -1.37%     | 4.86%     | 360.86ms | 379.52ms | -4.92%     | 1      | 1     | 5.09M | 18.00M    |
| F00:EXCHANGE SENDER | 15.61%     | 439.03ms | 454.63ms | -3.43%     | 4.86%     | 478.29ms | 485.79ms | -1.55%     | 1      | 1     | -1    | -1        |
| 01:SORT             | 67.05%     | 1.89s    | 2.21s    | -14.88%    | 0.98%     | 1.92s    | 2.26s    | -15.07%    | 1      | 1     | 5.09M | 18.00M    |
+---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+-------+-----------+

(I) Improvement: TARGETED-PERF(30) primitive_orderby_bigint [parquet / none / none] (2.42s -> 2.06s [-14.80%])
+---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+-------+-----------+
| Operator            | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%) | Max      | Base Max | Delta(Max) | #Hosts | #Inst | #Rows | Est #Rows |
+---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+-------+-----------+
| 02:ANALYTIC         | 15.39%     | 367.90ms | 373.26ms | -1.44%     | 3.48%     | 390.03ms | 393.01ms | -0.76%     | 1      | 1     | 5.09M | 18.00M    |
| F00:EXCHANGE SENDER | 15.64%     | 373.88ms | 374.12ms | -0.07%     | 2.83%     | 389.96ms | 386.36ms | +0.93%     | 1      | 1     | -1    | -1        |
| 01:SORT             | 56.28%     | 1.35s    | 1.68s    | -20.10%    | 1.14%     | 1.38s    | 1.70s    | -18.92%    | 1      | 1     | 5.09M | 18.00M    |
| 00:SCAN HDFS        | 9.67%      | 231.18ms | 231.77ms | -0.25%     | 7.06%     | 247.79ms | 250.70ms | -1.16%     | 1      | 1     | 5.09M | 18.00M    |
[Impala-ASF-CR] IMPALA-8125: Add query option to limit number of hdfs writer instances
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16204 )

Change subject: IMPALA-8125: Add query option to limit number of hdfs writer instances
..

Patch Set 1:

(6 comments)

I had some initial things that I noticed while doing a pass over it.

http://gerrit.cloudera.org:8080/#/c/16204/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16204/1//COMMIT_MSG@9
PS1, Line 9: This patch adds a new query option MAX_HDFS_WRITERS that limits the
Maybe we should call it FS instead of HDFS? Just cause the name is a bit anachronistic at this point.

http://gerrit.cloudera.org:8080/#/c/16204/1//COMMIT_MSG@26
PS1, Line 26: - Added e2e tests to confirm that the scheduler is enforcing the limit
It seemed based on first glance that we should probably have end-to-end tests for more of the different plan shapes that could be generated. Maybe I am missing something though.

http://gerrit.cloudera.org:8080/#/c/16204/1/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/16204/1/be/src/scheduling/scheduler.cc@496
PS1, Line 496: // This implementation ensures that instances on the same host get consecutive
Can you also comment what it's trying to achieve (i.e. Create the desired number of instances while balancing them across hosts).

http://gerrit.cloudera.org:8080/#/c/16204/1/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/16204/1/be/src/service/query-options.cc@685
PS1, Line 685: break;
Uhh... oops. It would be good to file a separate JIRA for the missing breaks. Just cause it could be something people actually run into.

http://gerrit.cloudera.org:8080/#/c/16204/1/testdata/workloads/functional-planner/queries/PlannerTest/insert-hdfs-writer-limit.test
File testdata/workloads/functional-planner/queries/PlannerTest/insert-hdfs-writer-limit.test:

http://gerrit.cloudera.org:8080/#/c/16204/1/testdata/workloads/functional-planner/queries/PlannerTest/insert-hdfs-writer-limit.test@6
PS1, Line 6: PLAN
Maybe we should only include DISTRIBUTEDPLAN in these tests? I feel like the single node plans are mostly adding noise.

http://gerrit.cloudera.org:8080/#/c/16204/1/tests/custom_cluster/test_mt_dop.py
File tests/custom_cluster/test_mt_dop.py:

http://gerrit.cloudera.org:8080/#/c/16204/1/tests/custom_cluster/test_mt_dop.py@117
PS1, Line 117: @CustomClusterTestSuite.with_args(impalad_args="--unlock_mt_dop=true", cluster_size=3)
We actually set unlock_mt_dop=true for the e2e tests, so I think this could be an end-to-end test.

--
To view, visit http://gerrit.cloudera.org:8080/16204
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I17c8e61b9a32d908eec82c83618ff9caa41078a5
Gerrit-Change-Number: 16204
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig
Gerrit-Reviewer: Aman Sinha
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Comment-Date: Wed, 15 Jul 2020 23:47:08 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/16199 )

Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS
..

Patch Set 2: Code-Review+1

Hi Adam, thanks for working on this patch!

The patch looks good to me since you have implemented what Fredy had suggested at https://issues.apache.org/jira/browse/IMPALA-7001. I only have two minor comments regarding the test and the commit message.

Specifically, after your patch, a user granted only the privilege of CREATE on a specified database, e.g., functional, would be able to execute a statement like "SHOW FUNCTIONS IN functional", since according to https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/authorization/Privilege.java and https://github.com/apache/impala/blob/3a6022ce80ca1cedb629400b18caaf0d1f54137c/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java#L431-L453, such a statement would succeed as long as the user is granted any privilege in the set {ALL, OWNER, ALTER, DROP, CREATE, INSERT, SELECT, REFRESH}. Before your patch, in order for the statement above to succeed, a user has to be granted any privilege in the set {INSERT, SELECT, REFRESH}.

Thus I think it would be good to add one more test case in https://github.com/apache/impala/blob/master/tests/authorization/test_ranger.py, where we 1) grant the privilege of CREATE to a user (as admin_client), and 2) execute a statement like "SHOW FUNCTIONS IN unique_database" to verify there is no exception thrown.

On the other hand, I think it may also be good to provide more detail of the difference before and after the patch. For instance, we could mention that a user granted only the privilege of CREATE is now able to execute that SQL statement above after this patch, making it easier for the user to manage the functions it creates.

--
To view, visit http://gerrit.cloudera.org:8080/16199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
Gerrit-Change-Number: 16199
Gerrit-PatchSet: 2
Gerrit-Owner: Adam Tamas
Gerrit-Reviewer: Fang-Yu Rao
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Comment-Date: Wed, 15 Jul 2020 23:27:59 +
Gerrit-HasComments: No
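
The before/after difference described in the IMPALA-7001 review above comes down to which set of privileges is accepted for "SHOW FUNCTIONS". The Python sketch below models just that set comparison. The two privilege sets are taken from the review text; the function name and constants are hypothetical, and none of this is the actual Ranger authorization code or the suggested test_ranger.py test.

```python
from typing import FrozenSet, Iterable

# Privilege sets as quoted in the review comment.
REQUIRED_BEFORE_PATCH: FrozenSet[str] = frozenset({"INSERT", "SELECT", "REFRESH"})
REQUIRED_AFTER_PATCH: FrozenSet[str] = frozenset(
    {"ALL", "OWNER", "ALTER", "DROP", "CREATE", "INSERT", "SELECT", "REFRESH"})


def can_show_functions(granted: Iterable[str],
                       accepted: FrozenSet[str]) -> bool:
    """True if the user holds at least one privilege from the accepted set."""
    return bool(set(granted) & accepted)


create_only = ["CREATE"]
before = can_show_functions(create_only, REQUIRED_BEFORE_PATCH)
after = can_show_functions(create_only, REQUIRED_AFTER_PATCH)
print(before, after)
```

A user holding only CREATE is rejected under the old set and accepted under the new one, which is the case the review asks to cover with an additional test.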
[Impala-ASF-CR] IMPALA-8125: Add query option to limit number of hdfs writer instances
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16204 )

Change subject: IMPALA-8125: Add query option to limit number of hdfs writer instances
..

Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6609/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.

--
To view, visit http://gerrit.cloudera.org:8080/16204
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I17c8e61b9a32d908eec82c83618ff9caa41078a5
Gerrit-Change-Number: 16204
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig
Gerrit-Reviewer: Aman Sinha
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Comment-Date: Wed, 15 Jul 2020 21:48:25 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8125: Add query option to limit number of hdfs writer instances
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16204 )

Change subject: IMPALA-8125: Add query option to limit number of hdfs writer instances
..

Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/16204/1/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/16204/1/common/thrift/ImpalaService.thrift@547
PS1, Line 547: // Sets an upper limit on the number of hdfs writer instances used scheduled during insert.
line too long (93 > 90)

http://gerrit.cloudera.org:8080/#/c/16204/1/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/16204/1/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@325
PS1, Line 325: "create table test_hdfs_insert_writer_limit.unpartitioned_table (id int) location '/'");
line too long (96 > 90)

http://gerrit.cloudera.org:8080/#/c/16204/1/tests/custom_cluster/test_mt_dop.py
File tests/custom_cluster/test_mt_dop.py:

http://gerrit.cloudera.org:8080/#/c/16204/1/tests/custom_cluster/test_mt_dop.py@105
PS1, Line 105: class TestMtDopHdfsWriterLimit(CustomClusterTestSuite):
flake8: E302 expected 2 blank lines, found 1

http://gerrit.cloudera.org:8080/#/c/16204/1/tests/query_test/test_insert.py
File tests/query_test/test_insert.py:

http://gerrit.cloudera.org:8080/#/c/16204/1/tests/query_test/test_insert.py@354
PS1, Line 354: class TestInsertHdfsWriterLimit(ImpalaTestSuite):
flake8: E302 expected 2 blank lines, found 1

http://gerrit.cloudera.org:8080/#/c/16204/1/tests/query_test/test_insert.py@382
PS1, Line 382: ,
flake8: E231 missing whitespace after ','

--
To view, visit http://gerrit.cloudera.org:8080/16204
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I17c8e61b9a32d908eec82c83618ff9caa41078a5
Gerrit-Change-Number: 16204
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig
Gerrit-Reviewer: Aman Sinha
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Comment-Date: Wed, 15 Jul 2020 21:21:03 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8125: Add query option to limit number of hdfs writer instances
Bikramjeet Vig has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16204 ) Change subject: IMPALA-8125: Add query option to limit number of hdfs writer instances .. IMPALA-8125: Add query option to limit number of hdfs writer instances

This patch adds a new query option MAX_HDFS_WRITERS that limits the number of HDFS writer instances.

Highlights:
- Depending on the plan, it either restricts the number of instances of the root fragment or adds an exchange and then limits the number of instances of that.
- Assigns instances evenly across available backends.
- The "no-shuffle" query hint is ignored when this query option is used.
- Plan behavior changes only when this query option is used.
- The only exception to the previous point is that the optimization logic that decides to add an exchange now looks at the number of instances instead of the number of nodes.

Testing:
- Added planner tests to cover all cases where this enforcement kicks in and to highlight the behavior.
- Added e2e tests to confirm that the scheduler enforces the limit and distributes the instances evenly across backends.
Change-Id: I17c8e61b9a32d908eec82c83618ff9caa41078a5 --- M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/TableSink.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/insert-hdfs-writer-limit.test M tests/custom_cluster/test_mt_dop.py M tests/query_test/test_insert.py 17 files changed, 1,271 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/16204/1 -- To view, visit http://gerrit.cloudera.org:8080/16204 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I17c8e61b9a32d908eec82c83618ff9caa41078a5 Gerrit-Change-Number: 16204 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig
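The "assigns instances evenly across available backends" point in the IMPALA-8125 description above can be pictured with a small sketch. This is only an illustration, not the code in scheduler.cc: the class name, the method name, and the assumption that exactly `maxWriters` instances are placed are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; this is not Impala's scheduler code. */
final class WriterAssignmentSketch {
  /**
   * Returns, per backend, how many writer instances it receives when
   * maxWriters instances are spread as evenly as possible.
   * Assumption: exactly maxWriters instances are placed.
   */
  static List<Integer> assignEvenly(int numBackends, int maxWriters) {
    if (numBackends <= 0 || maxWriters <= 0) {
      throw new IllegalArgumentException("numBackends and maxWriters must be positive");
    }
    int base = maxWriters / numBackends;       // every backend gets at least this many
    int remainder = maxWriters % numBackends;  // the first 'remainder' backends get one extra
    List<Integer> counts = new ArrayList<>(numBackends);
    for (int i = 0; i < numBackends; i++) {
      counts.add(base + (i < remainder ? 1 : 0));
    }
    return counts;
  }
}
```

With 3 backends and a limit of 7, the sketch yields 3, 2 and 2 instances, so no backend differs from another by more than one instance.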
[Impala-ASF-CR] Bump up CDP_BUILD_NUMBER to 4493826
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16195 ) Change subject: Bump up CDP_BUILD_NUMBER to 4493826 .. Bump up CDP_BUILD_NUMBER to 4493826 This change bumps up the CDP_BUILD_NUMBER to 4493826. This is needed to fix a failing test. Hive started to assign bucket ids to files differently. Because of that I had to modify the test_full_acid_rowid test that had an assumption about how bucket ids are assigned to files. If you have problems restarting the Hive Metastore, try the following: buildall.sh -upgrade_metastore_db If you have problems restarting Kudu, try the following: Unset LD_LIBRARY_PATH in your shell, and stop setting it in impala-config-local.sh Change-Id: Ia4635feef146c945624135e0715495bb01ea4699 Reviewed-on: http://gerrit.cloudera.org:8080/16195 Tested-by: Impala Public Jenkins Reviewed-by: Tim Armstrong --- M bin/impala-config.sh M fe/pom.xml M testdata/workloads/functional-query/queries/QueryTest/full-acid-rowid.test 3 files changed, 30 insertions(+), 17 deletions(-) Approvals: Impala Public Jenkins: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/16195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ia4635feef146c945624135e0715495bb01ea4699 Gerrit-Change-Number: 16195 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Bump up CDP_BUILD_NUMBER to 4493826
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16195 ) Change subject: Bump up CDP_BUILD_NUMBER to 4493826 .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4635feef146c945624135e0715495bb01ea4699 Gerrit-Change-Number: 16195 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 15 Jul 2020 19:14:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9956: inline hot functions in Sorter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16202 ) Change subject: IMPALA-9956: inline hot functions in Sorter .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6137/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7a8034ab6d2e3c71a2d2f2fcc3d6b788e9398194 Gerrit-Change-Number: 16202 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 15 Jul 2020 18:54:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9956: inline hot functions in Sorter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16202 ) Change subject: IMPALA-9956: inline hot functions in Sorter .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7a8034ab6d2e3c71a2d2f2fcc3d6b788e9398194 Gerrit-Change-Number: 16202 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 15 Jul 2020 18:54:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9956: inline hot functions in Sorter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16202 ) Change subject: IMPALA-9956: inline hot functions in Sorter .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6608/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7a8034ab6d2e3c71a2d2f2fcc3d6b788e9398194 Gerrit-Change-Number: 16202 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 15 Jul 2020 18:48:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9956: inline hot functions in Sorter
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16202 ) Change subject: IMPALA-9956: inline hot functions in Sorter .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7a8034ab6d2e3c71a2d2f2fcc3d6b788e9398194 Gerrit-Change-Number: 16202 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 15 Jul 2020 18:32:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9956: inline hot functions in Sorter
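Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16202 ) Change subject: IMPALA-9956: inline hot functions in Sorter .. IMPALA-9956: inline hot functions in Sorter

Add some compiler hints to force inlining of small functions into the hot Partition() loop.

Performance:
A single node perf run on TPC-H showed no perf change.
A single node performance run with the queries that target sort performance showed up to a 19% reduction in time spent in the sort.

+-------------------+-----------------------+---------+------------+------------+----------------+
| Workload          | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+-------------------+-----------------------+---------+------------+------------+----------------+
| TARGETED-PERF(30) | parquet / none / none | 5.52    | -5.82%     | 4.00       | -9.74%         |
+-------------------+-----------------------+---------+------------+------------+----------------+

+-------------------+-------------------------------------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| Workload          | Query                               | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval   |
+-------------------+-------------------------------------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| TARGETED-PERF(30) | primitive_orderby_all               | parquet / none / none | 11.89  | 12.22       | -2.73%     | 1.07%     | 1.20%          | 10    | -2.88%         | -3.13   | -5.42  |
| TARGETED-PERF(30) | primitive_orderby_bigint_expression | parquet / none / none | 2.61   | 2.94        | I -11.27%  | 0.83%     | 1.14%          | 10    | I -12.56%      | -3.58   | -26.25 |
| TARGETED-PERF(30) | primitive_orderby_bigint            | parquet / none / none | 2.06   | 2.42        | I -14.80%  | 0.94%     | 0.68%          | 10    | I -17.43%      | -3.58   | -44.37 |
+-------------------+-------------------------------------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+

(I) Improvement: TARGETED-PERF(30) primitive_orderby_bigint_expression [parquet / none / none] (2.94s -> 2.61s [-11.27%])
+---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+-------+-----------+
| Operator            | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%) | Max      | Base Max | Delta(Max) | #Hosts | #Inst | #Rows | Est #Rows |
+---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+-------+-----------+
| 02:ANALYTIC         | 11.84%     | 332.95ms | 337.56ms | -1.37%     | 4.86%     | 360.86ms | 379.52ms | -4.92%     | 1      | 1     | 5.09M | 18.00M    |
| F00:EXCHANGE SENDER | 15.61%     | 439.03ms | 454.63ms | -3.43%     | 4.86%     | 478.29ms | 485.79ms | -1.55%     | 1      | 1     | -1    | -1        |
| 01:SORT             | 67.05%     | 1.89s    | 2.21s    | -14.88%    | 0.98%     | 1.92s    | 2.26s    | -15.07%    | 1      | 1     | 5.09M | 18.00M    |
+---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+-------+-----------+

(I) Improvement: TARGETED-PERF(30) primitive_orderby_bigint [parquet / none / none] (2.42s -> 2.06s [-14.80%])
+---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+-------+-----------+
| Operator            | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%) | Max      | Base Max | Delta(Max) | #Hosts | #Inst | #Rows | Est #Rows |
+---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+-------+-----------+
| 02:ANALYTIC         | 15.39%     | 367.90ms | 373.26ms | -1.44%     | 3.48%     | 390.03ms | 393.01ms | -0.76%     | 1      | 1     | 5.09M | 18.00M    |
| F00:EXCHANGE SENDER | 15.64%     | 373.88ms | 374.12ms | -0.07%     | 2.83%     | 389.96ms | 386.36ms | +0.93%     | 1      | 1     | -1    | -1        |
| 01:SORT             | 56.28%     | 1.35s    | 1.68s    | -20.10%    | 1.14%     | 1.38s    | 1.70s    | -18.92%    | 1      | 1     | 5.09M | 18.00M    |
| 00:SCAN HDFS        | 9.67%      | 231.18ms | 231.77ms | -0.25%     | 7.06%     | 247.79ms | 250.70ms | -1.16%     | 1      | 1     | 5.09M | 18.00M    |
+---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+-------+-----------+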
[Impala-ASF-CR] IMPALA-1270: add distinct aggregation to semi joins
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16180 ) Change subject: IMPALA-1270: add distinct aggregation to semi joins .. IMPALA-1270: add distinct aggregation to semi joins When generating plans with left semi/anti joins (typically resulting from subquery rewrites), the planner now considers inserting a distinct aggregation on the inner side of the join. The decision is based on whether that aggregation would reduce the number of rows by more than 75%. This is fairly conservative and the optimization might be beneficial for smaller reductions, but the conservative threshold is chosen to reduce the number of potential plan regressions. The aggregation can both reduce the # of rows and the width of the rows, by projecting out unneeded slots. ENABLE_DISTINCT_SEMI_JOIN_OPTIMIZATION query option is added to allow toggling the optimization. Tests: * Add positive and negative planner tests for various cases - including semi/anti joins, missing stats, broadcast/shuffle, different numbers of join predicates. * Add some end-to-end tests to verify plans execute correctly. 
Change-Id: Icbb955e805d9e764edf11c57b98f341b88a37fcc Reviewed-on: http://gerrit.cloudera.org:8080/16180 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test M testdata/workloads/functional-planner/queries/PlannerTest/joins.test M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test M testdata/workloads/functional-planner/queries/PlannerTest/nested-loop-join.test M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test A testdata/workloads/functional-planner/queries/PlannerTest/semi-join-distinct.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite-hdfs-num-rows-est-enabled.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test M testdata/workloads/functional-planner/queries/PlannerTest/union.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test M testdata/workloads/functional-query/queries/QueryTest/subquery.test 25 files changed, 3,746 insertions(+), 467 deletions(-) Approvals: Impala Public Jenkins: Looks 
good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16180 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Icbb955e805d9e764edf11c57b98f341b88a37fcc Gerrit-Change-Number: 16180 Gerrit-PatchSet: 13 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
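The 75% rule from the IMPALA-1270 commit message above can be pictured as a simple cardinality comparison. The sketch below is an assumption-laden illustration rather than the planner's actual code: the class and method names are hypothetical, and treating a non-positive row count as "stats missing, do not add the aggregation" is an assumption, not something the commit message states.

```java
/** Hypothetical illustration only; this is not Impala's SingleNodePlanner code. */
final class DistinctSemiJoinSketch {
  // From the commit message: the aggregation must cut the row count by more than 75%.
  static final double MIN_REDUCTION = 0.75;

  /**
   * Decides whether a distinct aggregation on the inner side of a semi/anti
   * join looks worthwhile, given estimated row counts before and after it.
   */
  static boolean shouldAddDistinctAgg(long inputRows, long distinctRows) {
    // Assumption: non-positive or negative estimates mean stats are unavailable.
    if (inputRows <= 0 || distinctRows < 0) return false;
    double reduction = 1.0 - (double) distinctRows / inputRows;
    return reduction > MIN_REDUCTION;
  }
}
```

For example, an inner side estimated at 1000 rows with 200 distinct join keys is an 80% reduction and would qualify, while 250 distinct keys is exactly 75% and would not.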
[Impala-ASF-CR] IMPALA-1270: add distinct aggregation to semi joins
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16180 ) Change subject: IMPALA-1270: add distinct aggregation to semi joins .. Patch Set 12: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16180 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icbb955e805d9e764edf11c57b98f341b88a37fcc Gerrit-Change-Number: 16180 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 15 Jul 2020 17:10:49 + Gerrit-HasComments: No
[Impala-ASF-CR] Bump up CDP_BUILD_NUMBER to 4493826
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16195 ) Change subject: Bump up CDP_BUILD_NUMBER to 4493826 .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4635feef146c945624135e0715495bb01ea4699 Gerrit-Change-Number: 16195 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 15 Jul 2020 15:16:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9741: Supported query Icebreg table by impala
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Supported query Icebreg table by impala .. Patch Set 6: (15 comments) http://gerrit.cloudera.org:8080/#/c/16143/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16143/6//COMMIT_MSG@7 PS6, Line 7: Icebreg It's still misspelled http://gerrit.cloudera.org:8080/#/c/16143/6//COMMIT_MSG@7 PS6, Line 7: Supported query nit: Support querying http://gerrit.cloudera.org:8080/#/c/16143/6//COMMIT_MSG@26 PS6, Line 26: identity specify http://gerrit.cloudera.org:8080/#/c/16143/6/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/16143/6/common/thrift/CatalogObjects.thrift@512 PS6, Line 512: source_cols_map nit: column_to_source_id ? http://gerrit.cloudera.org:8080/#/c/16143/6/common/thrift/CatalogObjects.thrift@515 PS6, Line 515: partition_col_to_source_id_map The mapping is reversed. Name it "source_id_to_partition" ? http://gerrit.cloudera.org:8080/#/c/16143/6/common/thrift/CatalogObjects.thrift@516 PS6, Line 516: map file_descriptors Please follow the above conventions for naming maps. http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java File fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java: http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java@28 PS6, Line 28: // The id of the source field in iceberg table Schema, you can get these source : // fields by Schema.columns(), the return type is List. It might be worth rewording it a bit: "The id of the source column in the Iceberg table schema. The source column is used as the input for this partition field." 
http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java File fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java: http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java@88 PS6, Line 88: if (table_ instanceof FeIcebergTable) { : if (((FeIcebergTable) table_).getPartitionColToSourceIdMap().isEmpty()) { : notPartitioned = true; : } Probably we should treat all Iceberg tables as not partitioned, since its partitioning is different from other file system tables' partitioning. http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java File fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java: http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@66 PS6, Line 66: getFileDescMap nit: getPartitionToFileDescMap http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@219 PS6, Line 219: isPartitionTable nit: isPartitioned? http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@258 PS6, Line 258: PartitionColToSourceId It returns a mapping from source ids to partition columns, therefore please name it "sourceIdToPartitionCol". http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@271 PS6, Line 271: getSourceColsMap nit: getColumnToSourceIdMap? 
http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@305 PS6, Line 305: nit: wrong indentation http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@114 PS6, Line 114: if (format == null) return null; : format = format.toUpperCase(); : if (format.equals("PARQUET")) { : return TIcebergFileFormat.PARQUET; : } : return null; How about: if ("PARQUET".equalsIgnoreCase(format)) return TIcebergFileFormat.PARQUET; return null; http://gerrit.cloudera.org:8080/#/c/16143/6/testdata/bin/generate-schema-statements.py File testdata/bin/generate-schema-statements.py: http://gerrit.cloudera.org:8080/#/c/16143/6/testdata/bin/generate-schema-statements.py@193 PS6, Line 193: 'iceberg': 'ICEBERG' You probably don't need to modify this file. I think adding HUDIPARQUET to this file was also unnecessary. Probably we can do the same thing that we did for Hudi, i.e. add the Iceberg tables under the functional_parquet database. https://gerrit.cloudera.org/c/14711/25/testdata/datasets/functional/schema_constraints.csv
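The IcebergUtil.java suggestion in the review above relies on the fact that calling `equalsIgnoreCase` on a string literal returns false for a null argument, so the explicit null check and the `toUpperCase()` call become unnecessary. The sketch below places both variants side by side; the class name and the `FileFormat` enum are hypothetical stand-ins for the patch's code and for `TIcebergFileFormat`.

```java
/** Hypothetical illustration; FileFormat stands in for TIcebergFileFormat. */
final class IcebergFormatSketch {
  enum FileFormat { PARQUET }

  /** The shape of the code under review: explicit null check, then upper-casing. */
  static FileFormat original(String format) {
    if (format == null) return null;
    format = format.toUpperCase();
    if (format.equals("PARQUET")) {
      return FileFormat.PARQUET;
    }
    return null;
  }

  /** The reviewer's suggestion: the literal is never null, so no null check is needed. */
  static FileFormat suggested(String format) {
    if ("PARQUET".equalsIgnoreCase(format)) return FileFormat.PARQUET;
    return null;
  }
}
```

Both variants return `PARQUET` for "parquet" or "Parquet" and null for a null or unrecognized format name.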
[Impala-ASF-CR] IMPALA-1270: add distinct aggregation to semi joins
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16180 ) Change subject: IMPALA-1270: add distinct aggregation to semi joins .. Patch Set 11: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16180 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icbb955e805d9e764edf11c57b98f341b88a37fcc Gerrit-Change-Number: 16180 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 15 Jul 2020 12:16:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1270: add distinct aggregation to semi joins
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16180 ) Change subject: IMPALA-1270: add distinct aggregation to semi joins .. Patch Set 12: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16180 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icbb955e805d9e764edf11c57b98f341b88a37fcc Gerrit-Change-Number: 16180 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 15 Jul 2020 11:59:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1270: add distinct aggregation to semi joins
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16180 ) Change subject: IMPALA-1270: add distinct aggregation to semi joins .. Patch Set 12: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6136/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16180 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icbb955e805d9e764edf11c57b98f341b88a37fcc Gerrit-Change-Number: 16180 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 15 Jul 2020 12:00:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1270: add distinct aggregation to semi joins
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16180 ) Change subject: IMPALA-1270: add distinct aggregation to semi joins .. Patch Set 11: Code-Review+2 Great work, LGTM! -- To view, visit http://gerrit.cloudera.org:8080/16180 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icbb955e805d9e764edf11c57b98f341b88a37fcc Gerrit-Change-Number: 16180 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 15 Jul 2020 11:59:20 + Gerrit-HasComments: No
[Impala-ASF-CR] Bump up CDP_BUILD_NUMBER to 4493826
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16195 ) Change subject: Bump up CDP_BUILD_NUMBER to 4493826 .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6607/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4635feef146c945624135e0715495bb01ea4699 Gerrit-Change-Number: 16195 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 15 Jul 2020 10:29:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16199 ) Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6606/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a Gerrit-Change-Number: 16199 Gerrit-PatchSet: 2 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 15 Jul 2020 10:15:56 + Gerrit-HasComments: No
[Impala-ASF-CR] Bump up CDP_BUILD_NUMBER to 4493826
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16195 ) Change subject: Bump up CDP_BUILD_NUMBER to 4493826 .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6135/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4635feef146c945624135e0715495bb01ea4699 Gerrit-Change-Number: 16195 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 15 Jul 2020 10:06:12 + Gerrit-HasComments: No
[Impala-ASF-CR] Bump up CDP_BUILD_NUMBER to 4493826
Hello Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16195 to look at the new patch set (#3). Change subject: Bump up CDP_BUILD_NUMBER to 4493826 .. Bump up CDP_BUILD_NUMBER to 4493826 This change bumps up the CDP_BUILD_NUMBER to 4493826. This is needed to fix a failing test. Hive started to assign bucket ids to files differently. Because of that I had to modify the test_full_acid_rowid test that had an assumption about how bucket ids are assigned to files. If you have problems restarting the Hive Metastore, try the following: buildall.sh -upgrade_metastore_db If you have problems restarting Kudu, try the following: Unset LD_LIBRARY_PATH in your shell, and stop setting it in impala-config-local.sh Change-Id: Ia4635feef146c945624135e0715495bb01ea4699 --- M bin/impala-config.sh M fe/pom.xml M testdata/workloads/functional-query/queries/QueryTest/full-acid-rowid.test 3 files changed, 30 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/16195/3 -- To view, visit http://gerrit.cloudera.org:8080/16195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia4635feef146c945624135e0715495bb01ea4699 Gerrit-Change-Number: 16195 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS
Adam Tamas has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/16199 ) Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS .. IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS In "show tables" ANY privilege was used, whereas in "show functions" the required privilege was VIEW_METADATA. To solve the inconsistency "show functions" will use ANY instead of VIEW_METADATA similar to "show tables". Testing: -Ran CORE tests. -Added new test to check the privilege. Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a --- M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/ShowFunctionsStmt.java M fe/src/test/java/org/apache/impala/analysis/AuditingTest.java 3 files changed, 15 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/16199/2 -- To view, visit http://gerrit.cloudera.org:8080/16199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a Gerrit-Change-Number: 16199 Gerrit-PatchSet: 2 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16199 ) Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6605/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a Gerrit-Change-Number: 16199 Gerrit-PatchSet: 1 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 15 Jul 2020 09:15:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16199 ) Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/16199/1/fe/src/test/java/org/apache/impala/analysis/AuditingTest.java File fe/src/test/java/org/apache/impala/analysis/AuditingTest.java: http://gerrit.cloudera.org:8080/#/c/16199/1/fe/src/test/java/org/apache/impala/analysis/AuditingTest.java@372 PS1, Line 372: Set accessEvents = AnalyzeAccessEvents(String.format("show %s in functional", qual)); line too long (109 > 90) http://gerrit.cloudera.org:8080/#/c/16199/1/fe/src/test/java/org/apache/impala/analysis/AuditingTest.java@374 PS1, Line 374: Sets.newHashSet(new TAccessEvent("functional", TCatalogObjectType.DATABASE, "ANY"))); line too long (103 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a Gerrit-Change-Number: 16199 Gerrit-PatchSet: 1 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 15 Jul 2020 08:47:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS
Adam Tamas has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16199 ) Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS .. IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS In "show tables" the ANY privilege was used, whereas in "show functions" the required privilege was VIEW_METADATA. To resolve the inconsistency, "show functions" will use ANY instead of VIEW_METADATA, similar to "show tables". Testing: - Ran CORE tests. - Added a new test to check the privilege. Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a --- M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/ShowFunctionsStmt.java M fe/src/test/java/org/apache/impala/analysis/AuditingTest.java 3 files changed, 13 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/16199/1 -- To view, visit http://gerrit.cloudera.org:8080/16199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a Gerrit-Change-Number: 16199 Gerrit-PatchSet: 1 Gerrit-Owner: Adam Tamas
[Impala-ASF-CR] IMPALA-9741: Supported query Icebreg table by impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Supported query Icebreg table by impala .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6604/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 6 Gerrit-Owner: wangsheng Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 15 Jul 2020 08:32:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9741: Supported query Icebreg table by impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Supported query Icebreg table by impala .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/16143/6/testdata/bin/generate-schema-statements.py File testdata/bin/generate-schema-statements.py: http://gerrit.cloudera.org:8080/#/c/16143/6/testdata/bin/generate-schema-statements.py@766 PS6, Line 766: n flake8: E501 line too long (94 > 90 characters) http://gerrit.cloudera.org:8080/#/c/16143/6/tests/common/test_dimensions.py File tests/common/test_dimensions.py: http://gerrit.cloudera.org:8080/#/c/16143/6/tests/common/test_dimensions.py@32 PS6, Line 32: c flake8: E501 line too long (98 > 90 characters) -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 6 Gerrit-Owner: wangsheng Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 15 Jul 2020 08:05:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9741: Supported query Icebreg table by impala
wangsheng has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Supported query Icebreg table by impala .. IMPALA-9741: Supported query Icebreg table by impala This patch mainly implements querying Iceberg tables through Impala. We can use the following SQL to create an external Iceberg table: CREATE EXTERNAL TABLE default.iceberg_test ( level string, event_time timestamp, message string ) STORED AS ICEBERG LOCATION 'hdfs://xxx' TBLPROPERTIES ('iceberg_file_format'='parquet'); Or include just the table name and location, like this: CREATE EXTERNAL TABLE default.iceberg_test STORED AS ICEBERG LOCATION 'hdfs://xxx' TBLPROPERTIES ('iceberg_file_format'='parquet'); 'iceberg_file_format' is the file format in Iceberg. Currently only PARQUET is supported; other formats will be supported in the future. If you don't specify this property in your SQL, the default file format is PARQUET. We achieved this by treating the Iceberg table as a normal unpartitioned HDFS table. When querying an Iceberg table, we push down partition column predicates to Iceberg to decide which data files need to be scanned, and then pass this information to the BE to do the real scan operation. 
Testing: - Unit test for Iceberg in FileMetadataLoaderTest - Create table tests in functional_schema_template.sql - Iceberg table query test in custom cluster test test_iceberg.py Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 --- M be/src/runtime/descriptors.cc M bin/rat_exclude_files.txt M common/thrift/CatalogObjects.thrift M fe/pom.xml M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java M testdata/bin/generate-schema-statements.py M testdata/data/README A testdata/data/iceberg_test/iceberg_non_partitioned/data/1-100-e1a80ed6-1064-494d-9cdd-c4a30c1ab8dc-0.parquet A 
testdata/data/iceberg_test/iceberg_non_partitioned/data/3-102-511427f2-85f0-43ae-9b39-a456f8dc57b6-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/4-103-00fc55e1-6ef7-4241-ace2-6d075b9737fc-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/6-105-ef9e76d5-c060-4040-8aa1-b7c275610daa-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/7-106-c09c9c8d-9478-44f9-8501-f85f53112bc3-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/9-108-3b4f06ac-dca3-4f4e-be60-bf42d9927b5b-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00011-110-1e653ccf-0963-4fb0-941c-32c9de13268b-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00012-111-dfa70658-eb4b-4fa0-9ffa-b892cf90d6ac-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00014-113-2d16e751-e2a4-4856-ab89-145996e3815e-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00015-114-0f710621-cbbf-4509-a93d-b58808978e2e-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00017-116-0b666c79-53df-4507-906c-542e65a83443-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00019-118-1bc6bc6e-e061-4da3-9d1e-a427a306c471-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00020-119-ae7b2c67-1538-4429-8246-4998960e3817-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00022-121-8db0f1e1-d88c-4aad-a8b3-24fd07329cdb-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00023-122-de57b6b0-f54b-40ac-85cd-e783505094b6-0.parquet A
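The commit message above describes pushing partition column predicates down to Iceberg to decide which data files need to be scanned. The following is only a rough Python sketch of that idea, not the patch's Java code and not the Iceberg API: DataFile, prune_files, and the example paths and partition values are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a data file entry with its partition values."""
    path: str
    partition: dict = field(default_factory=dict)


def prune_files(files, predicates):
    """Keep only the files whose partition values satisfy every predicate.

    predicates maps a partition column name to a callable(value) -> bool.
    A file that has no value for a predicate's column is kept, because
    nothing is known that would allow skipping it.
    """
    selected = []
    for f in files:
        keep = all(
            pred(f.partition[col])
            for col, pred in predicates.items()
            if col in f.partition
        )
        if keep:
            selected.append(f)
    return selected


# Illustrative data only; these paths and values are made up.
files = [
    DataFile("data/a.parquet", {"level": "ERROR"}),
    DataFile("data/b.parquet", {"level": "INFO"}),
    DataFile("data/c.parquet", {}),
]
to_scan = prune_files(files, {"level": lambda v: v == "ERROR"})
scan_paths = [f.path for f in to_scan]
```

Only the surviving file paths would then be handed to the backend for the actual scan, which matches the division of work the message describes.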
[Impala-ASF-CR] IMPALA-1270: add distinct aggregation to semi joins
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16180 ) Change subject: IMPALA-1270: add distinct aggregation to semi joins .. Patch Set 11: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6134/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16180 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icbb955e805d9e764edf11c57b98f341b88a37fcc Gerrit-Change-Number: 16180 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 15 Jul 2020 07:07:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1270: add distinct aggregation to semi joins
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16180 ) Change subject: IMPALA-1270: add distinct aggregation to semi joins .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6603/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16180 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icbb955e805d9e764edf11c57b98f341b88a37fcc Gerrit-Change-Number: 16180 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 15 Jul 2020 06:59:09 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: IMPALA-9889: Fixed flaky test runtime filters on Kudu table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16191 ) Change subject: WIP: IMPALA-9889: Fixed flaky test_runtime_filters on Kudu table .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6602/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16191 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I94a08e272f0870c04c96563fa614e3416fb5379b Gerrit-Change-Number: 16191 Gerrit-PatchSet: 2 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 15 Jul 2020 06:47:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6788: Abort ExecFInstance() RPC loop early after query failure
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16192 ) Change subject: IMPALA-6788: Abort ExecFInstance() RPC loop early after query failure .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6601/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I034788f7720fc97c25c54f006ff72dce6cb199c3 Gerrit-Change-Number: 16192 Gerrit-PatchSet: 2 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 15 Jul 2020 06:47:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1270: add distinct aggregation to semi joins
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16180 ) Change subject: IMPALA-1270: add distinct aggregation to semi joins .. Patch Set 10: (2 comments) http://gerrit.cloudera.org:8080/#/c/16180/10/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: http://gerrit.cloudera.org:8080/#/c/16180/10/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1877 PS10, Line 1877: List distinctExprs = new ArrayList<>(); > Should this be Set instead of List ? For example, if there's a correlatio Yeah, Expr.getIds() deduplicates the slot ids (I checked that when I was writing the code, but didn't add any breadcrumbs). It isn't actually documented on the method, so I added a comment. http://gerrit.cloudera.org:8080/#/c/16180/10/testdata/workloads/functional-planner/queries/PlannerTest/semi-join-distinct.test File testdata/workloads/functional-planner/queries/PlannerTest/semi-join-distinct.test: http://gerrit.cloudera.org:8080/#/c/16180/10/testdata/workloads/functional-planner/queries/PlannerTest/semi-join-distinct.test@826 PS10, Line 826: | | group by: count(*) > It is strange to see an aggregate expr in the group-by since it is not vali There is some weirdness with how expressions are shown in the explain output after substitution. It also happens with the transpose agg, where the CASE statements appear in places in the plan where they're not actually evaluated. 
-- To view, visit http://gerrit.cloudera.org:8080/16180 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icbb955e805d9e764edf11c57b98f341b88a37fcc Gerrit-Change-Number: 16180 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 15 Jul 2020 06:36:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1270: add distinct aggregation to semi joins
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16180 ) Change subject: IMPALA-1270: add distinct aggregation to semi joins .. Patch Set 11: Code-Review+1 carry +1 -- To view, visit http://gerrit.cloudera.org:8080/16180 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icbb955e805d9e764edf11c57b98f341b88a37fcc Gerrit-Change-Number: 16180 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 15 Jul 2020 06:36:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1270: add distinct aggregation to semi joins
Hello Aman Sinha, Shant Hovsepian, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16180 to look at the new patch set (#11). Change subject: IMPALA-1270: add distinct aggregation to semi joins .. IMPALA-1270: add distinct aggregation to semi joins When generating plans with left semi/anti joins (typically resulting from subquery rewrites), the planner now considers inserting a distinct aggregation on the inner side of the join. The decision is based on whether that aggregation would reduce the number of rows by more than 75%. This is fairly conservative and the optimization might be beneficial for smaller reductions, but the conservative threshold is chosen to reduce the number of potential plan regressions. The aggregation can both reduce the # of rows and the width of the rows, by projecting out unneeded slots. ENABLE_DISTINCT_SEMI_JOIN_OPTIMIZATION query option is added to allow toggling the optimization. Tests: * Add positive and negative planner tests for various cases - including semi/anti joins, missing stats, broadcast/shuffle, different numbers of join predicates. * Add some end-to-end tests to verify plans execute correctly. 
Change-Id: Icbb955e805d9e764edf11c57b98f341b88a37fcc --- M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test M testdata/workloads/functional-planner/queries/PlannerTest/joins.test M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test M testdata/workloads/functional-planner/queries/PlannerTest/nested-loop-join.test M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test A testdata/workloads/functional-planner/queries/PlannerTest/semi-join-distinct.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite-hdfs-num-rows-est-enabled.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test M testdata/workloads/functional-planner/queries/PlannerTest/union.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test M testdata/workloads/functional-query/queries/QueryTest/subquery.test 25 files changed, 3,746 insertions(+), 467 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/16180/11 -- To view, visit http://gerrit.cloudera.org:8080/16180 To unsubscribe, visit 
http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icbb955e805d9e764edf11c57b98f341b88a37fcc Gerrit-Change-Number: 16180 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
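The decision rule described in the commit message above (add the distinct aggregation only when it would cut the inner-side row count by more than 75%) can be illustrated with a small Python sketch. This is not the planner's Java code: the function name, its parameters, and the choice to return False when cardinality estimates are missing are assumptions made for illustration.

```python
def should_add_distinct_agg(inner_rows, distinct_rows, min_reduction=0.75):
    """Return True if a distinct aggregation on the join's inner side
    is estimated to remove more than min_reduction of the rows.

    inner_rows:    estimated row count of the inner side (None if unknown)
    distinct_rows: estimated row count after the distinct aggregation
    """
    # Assumption: with missing or unusable stats, stay conservative
    # and do not add the aggregation.
    if inner_rows is None or distinct_rows is None or inner_rows <= 0:
        return False
    reduction = 1.0 - (distinct_rows / inner_rows)
    # "More than 75%" is a strict comparison, so exactly 75% does not qualify.
    return reduction > min_reduction
```

For example, an inner side of 1000 rows that collapses to 100 distinct rows is a 90% reduction and qualifies, while a collapse to 250 rows is exactly 75% and does not.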
[Impala-ASF-CR] WIP: IMPALA-9889: Fixed flaky test runtime filters on Kudu table
Wenzhe Zhou has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16191 ) Change subject: WIP: IMPALA-9889: Fixed flaky test_runtime_filters on Kudu table .. WIP: IMPALA-9889: Fixed flaky test_runtime_filters on Kudu table Test cases in test_runtime_filters failed occasionally in ASAN builds because runtime filters did not arrive at scan nodes in time. Query profiles showed that codegen took 2 to 4 minutes for one fragment when this issue happened. This caused hash join nodes to wait a long time before generating and publishing runtime filters, hence the arrival delay on scan nodes. To avoid the delay, turn on ASYNC_CODEGEN for test_runtime_filters against Kudu tables when the test runs on an ASAN build. Testing: - Passed core tests for the regular debug build. TODO: pass the ASAN build with core tests. There are some unrelated issues which cause lots of failures for the ASAN build on Jenkins. The daily ASAN builds have the same issue. Change-Id: I94a08e272f0870c04c96563fa614e3416fb5379b --- M tests/query_test/test_runtime_filters.py 1 file changed, 19 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/16191/2 -- To view, visit http://gerrit.cloudera.org:8080/16191 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I94a08e272f0870c04c96563fa614e3416fb5379b Gerrit-Change-Number: 16191 Gerrit-PatchSet: 2 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Thomas Tauber-Marshall
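The workaround described above amounts to enabling one query option only for a particular combination of build type and table format. A minimal Python sketch of that selection logic follows. It does not reproduce the 19 lines added to test_runtime_filters.py: the helper name, its parameters, the way an ASAN build is detected, and the option value used to mean "on" are all assumptions; only the ASYNC_CODEGEN option name comes from the message.

```python
def runtime_filter_test_options(table_format, is_asan_build, base_options=None):
    """Build the query options for a runtime-filter test run.

    Hypothetical helper: turns on ASYNC_CODEGEN only for Kudu tables on
    ASAN builds, where slow codegen was reported to delay filter arrival.
    """
    options = dict(base_options or {})
    if is_asan_build and table_format == "kudu":
        # Assumed encoding of "enabled"; the real test may set it differently.
        options["async_codegen"] = 1
    return options
```

Keeping the condition narrow leaves every other build and table format on its existing settings, so the change cannot mask regressions elsewhere.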
[Impala-ASF-CR] IMPALA-6788: Abort ExecFInstance() RPC loop early after query failure
Wenzhe Zhou has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16192 ) Change subject: IMPALA-6788: Abort ExecFInstance() RPC loop early after query failure .. IMPALA-6788: Abort ExecFInstance() RPC loop early after query failure Stops issuing ExecQueryFInstance RPCs and cancels any in-flight ones when a backend reports failure. Adds a new debug action, CONSTRUCT_QUERY_STATE_REPORT, that runs when constructing a query state report. Adds a new test case for handling errors reported from query state. Testing: - Ran the following command for the new test case and verified that the code works as expected: ./bin/impala-py.test tests/custom_cluster/test_rpc_exception.py\ ::TestRPCException::test_state_report_error \ --workload_exploration_strategy=functional-query:exhaustive - Passed core tests. Change-Id: I034788f7720fc97c25c54f006ff72dce6cb199c3 --- M be/src/runtime/coordinator.cc M be/src/runtime/query-state.cc M tests/custom_cluster/test_rpc_exception.py 3 files changed, 38 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/16192/2 -- To view, visit http://gerrit.cloudera.org:8080/16192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I034788f7720fc97c25c54f006ff72dce6cb199c3 Gerrit-Change-Number: 16192 Gerrit-PatchSet: 2 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Thomas Tauber-Marshall
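The control flow described above (stop issuing further RPCs once a failure has been reported, then cancel what was already sent) can be sketched in a few lines of Python. This is a single-threaded simulation under stated assumptions, not the C++ in coordinator.cc: FakeCoordinator, exec_all, and the fail_after trigger are hypothetical names, and the failure is simulated rather than reported by a real backend.

```python
class FakeCoordinator:
    """Hypothetical coordinator that records which RPCs were sent and cancelled."""

    def __init__(self, fail_after=None):
        self.sent = []
        self.cancelled = []
        self.failed = False
        self._fail_after = fail_after  # simulate a failure report after N sends

    def send_exec_rpc(self, backend):
        self.sent.append(backend)
        if self._fail_after is not None and len(self.sent) == self._fail_after:
            self.failed = True

    def cancel(self, backend):
        self.cancelled.append(backend)


def exec_all(coord, backends):
    """Send one Exec RPC per backend, aborting the loop early on failure.

    Returns True if every backend was started without a reported failure.
    """
    for backend in backends:
        if coord.failed:
            break  # a failure was reported: do not start any more backends
        coord.send_exec_rpc(backend)
    if coord.failed:
        # Cancel whatever was already issued so no work is left running.
        for backend in coord.sent:
            coord.cancel(backend)
    return not coord.failed
```

With four backends and a failure reported after the second send, only the first two RPCs are issued and both are then cancelled; without a failure, all four are sent and nothing is cancelled.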