[Impala-ASF-CR] IMPALA-10940: Pick parts of recent gutil changes from Kudu repo
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17897 ) Change subject: IMPALA-10940: Pick parts of recent gutil changes from Kudu repo .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17897 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9230923111af3c1028f0e71cd37cb5ee6fbec654 Gerrit-Change-Number: 17897 Gerrit-PatchSet: 6 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 10 Nov 2021 07:49:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10940: Pick parts of recent gutil changes from Kudu repo
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17897 ) Change subject: IMPALA-10940: Pick parts of recent gutil changes from Kudu repo .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7618/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17897 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9230923111af3c1028f0e71cd37cb5ee6fbec654 Gerrit-Change-Number: 17897 Gerrit-PatchSet: 6 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 10 Nov 2021 07:49:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17858 ) Change subject: IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables .. Patch Set 15: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7616/ -- To view, visit http://gerrit.cloudera.org:8080/17858 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9 Gerrit-Change-Number: 17858 Gerrit-PatchSet: 15 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Wed, 10 Nov 2021 06:25:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5256: Force log rotation when max log size exceeded
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17997 ) Change subject: IMPALA-5256: Force log rotation when max_log_size exceeded .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17997 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b94727180354fe69989ebf3cd1a8f8cda1cf0c3 Gerrit-Change-Number: 17997 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 10 Nov 2021 04:36:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5256: Force log rotation when max log size exceeded
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17997 ) Change subject: IMPALA-5256: Force log rotation when max_log_size exceeded .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7617/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17997 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b94727180354fe69989ebf3cd1a8f8cda1cf0c3 Gerrit-Change-Number: 17997 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 10 Nov 2021 04:36:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11006: Impalad crashes during query cancel tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17999 ) Change subject: IMPALA-11006: Impalad crashes during query cancel tests .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9749/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17999 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia49411f8525734b8d463d9ffbfbce705b90a8d73 Gerrit-Change-Number: 17999 Gerrit-PatchSet: 3 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Wed, 10 Nov 2021 03:54:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11006: Impalad crashes during query cancel tests
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17999 ) Change subject: IMPALA-11006: Impalad crashes during query cancel tests .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/17999/1/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/17999/1/be/src/service/client-request-state.cc@1070 PS1, Line 1070: re > My concern is about queries that do not have a coordinator at all, as Finis CTAS is the only DDL that we call ExecQueryOrDmlRequest() and thus FinishExecQueryOrDmlRequest(). See https://github.com/apache/impala/blob/master/be/src/service/client-request-state.cc#L721. All other DDLs, if run async, will execute the rest of the code in ExecDdlRequestImpl(), or ExecDdlRequestImplSync() if run sync. Added a IF test for CTAS that requires a coordinator. If true, we further test is_cancelled_. http://gerrit.cloudera.org:8080/#/c/17999/1/be/src/service/client-request-state.cc@1076 PS1, Line 1076: be no query > MarkActive/Inactive affects query expiration only, which is not relevant in MarkInactive() call is removed. -- To view, visit http://gerrit.cloudera.org:8080/17999 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia49411f8525734b8d463d9ffbfbce705b90a8d73 Gerrit-Change-Number: 17999 Gerrit-PatchSet: 3 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Wed, 10 Nov 2021 03:32:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11006: Impalad crashes during query cancel tests
Qifan Chen has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/17999 ) Change subject: IMPALA-11006: Impalad crashes during query cancel tests .. IMPALA-11006: Impalad crashes during query cancel tests This patch addresses impalad crash during query cancel tests for CTAS. The root cause is that the wait thread in impalad assumes that the coordinator or computation results are always available once the worker thread async_exec_thread_ is joined. In reality, this never happens as the query is cancelled. The patch specifically checks the cancel status when it is found that the coordinator is not available for CTAS and bails out immediately. This prevents crash since the subsequent code that assumes the coordinator or computation results are available is never reached. Testing: 1. Ran core tests successfully. Change-Id: Ia49411f8525734b8d463d9ffbfbce705b90a8d73 --- M be/src/service/client-request-state.cc 1 file changed, 12 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/17999/3 -- To view, visit http://gerrit.cloudera.org:8080/17999 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia49411f8525734b8d463d9ffbfbce705b90a8d73 Gerrit-Change-Number: 17999 Gerrit-PatchSet: 3 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen
[Impala-ASF-CR] IMPALA-10943: Add test to verify support for multiple resource and executor pools
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17891 ) Change subject: IMPALA-10943: Add test to verify support for multiple resource and executor pools .. IMPALA-10943: Add test to verify support for multiple resource and executor pools This patch adds a test to verify that admission control accounting works when using multiple coordinators and multiple executor groups mapped to different resource pools and having different sizes. Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e Reviewed-on: http://gerrit.cloudera.org:8080/17891 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M tests/custom_cluster/test_executor_groups.py 1 file changed, 86 insertions(+), 7 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/17891 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e Gerrit-Change-Number: 17891 Gerrit-PatchSet: 6 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-10943: Add test to verify support for multiple resource and executor pools
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17891 ) Change subject: IMPALA-10943: Add test to verify support for multiple resource and executor pools .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17891 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e Gerrit-Change-Number: 17891 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Wed, 10 Nov 2021 02:53:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5256: Force log rotation when max log size exceeded
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/17997 ) Change subject: IMPALA-5256: Force log rotation when max_log_size exceeded .. Patch Set 3: > Patch Set 3: Verified-1 > > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7613/ This GVO failed on TestReusePartitionMetadata.test_reuse_partition_meta, flaky test reported by IMPALA-10886. -- To view, visit http://gerrit.cloudera.org:8080/17997 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b94727180354fe69989ebf3cd1a8f8cda1cf0c3 Gerrit-Change-Number: 17997 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 10 Nov 2021 02:26:16 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-10992 Planner changes for estimate peak memory
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17994 ) Change subject: [WIP] IMPALA-10992 Planner changes for estimate peak memory .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9748/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17994 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb Gerrit-Change-Number: 17994 Gerrit-PatchSet: 4 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Wed, 10 Nov 2021 02:20:53 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-10992 Planner changes for estimate peak memory
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17994 ) Change subject: [WIP] IMPALA-10992 Planner changes for estimate peak memory .. Patch Set 4: Changed all references to the default executor membership snapshot to refer to a particular executor group via a type available in Analyzer. Also added the new query option max_per_host_memory_threshold_for_regular_queries. -- To view, visit http://gerrit.cloudera.org:8080/17994 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb Gerrit-Change-Number: 17994 Gerrit-PatchSet: 4 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Wed, 10 Nov 2021 02:00:33 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-10992 Planner changes for estimate peak memory
Qifan Chen has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/17994 ) Change subject: [WIP] IMPALA-10992 Planner changes for estimate peak memory .. [WIP] IMPALA-10992 Planner changes for estimate peak memory This patch provides planner support to deal with regular and large query executor groups. Per a preset threshold, the planner delivers a suitable plan that runs on the regular query executor group if the total estimated memory is no more than the threshold. Otherwise, it delieves a different kind of plan suitable to run on the large query executor group. A new query option max_per_host_memory_threshold_for_regular_queries is added to define a threshold T. A query with the estimated per host memory <= T runs in the regular query executor group. Otherwise, the query runs in the large query executor group. Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb --- M be/src/scheduling/cluster-membership-mgr.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/Frontend.thrift M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/util/ExecutorMembershipSnapshot.java M fe/src/test/java/org/apache/impala/planner/ClusterSizeTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 13 files changed, 151 insertions(+), 61 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/17994/4 -- To view, visit http://gerrit.cloudera.org:8080/17994 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb Gerrit-Change-Number: 17994 Gerrit-PatchSet: 4 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen
[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17858 ) Change subject: IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables .. Patch Set 15: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9747/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17858 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9 Gerrit-Change-Number: 17858 Gerrit-PatchSet: 15 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Wed, 10 Nov 2021 00:36:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17858 ) Change subject: IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables .. Patch Set 15: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7616/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/17858 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9 Gerrit-Change-Number: 17858 Gerrit-PatchSet: 15 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Wed, 10 Nov 2021 00:15:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables
Yu-Wen Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/17858 ) Change subject: IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables .. Patch Set 15: (6 comments) http://gerrit.cloudera.org:8080/#/c/17858/13/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17858/13/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3560 PS13, Line 3560: LOG.debug("Not adding write ids to table {}.{} for event {} " + > nit: add more details in the log message like table name, event id being pr Done http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java File fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java: http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java@252 PS11, Line 252: exceptions.add(currentId); > Looked at the implementation of BitSet.get() and I think the following sequ Yes, that already works because BitSet by default returns false if it is not set. http://gerrit.cloudera.org:8080/#/c/17858/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/17858/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4368 PS13, Line 4368: throw new CatalogException( > nit: Would be good to add a log message with details about the rollback. Done http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2487 PS12, Line 2487: } finally { > nit: Original config should be restored in finally block Done http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2499 PS12, Line 2499: stubCfg.setHms_event_incremental_refresh_transactional_table(true); > nit: can include test name in the table name for example: test_abort_transa Done http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2818 PS12, Line 2818: } > Instead of creating a new method createTransactionalTable, we can enhance g I tried and verified that we need to set table params for creating transactional tables. Please see https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDefaultTransformer.java#L174. -- To view, visit http://gerrit.cloudera.org:8080/17858 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9 Gerrit-Change-Number: 17858 Gerrit-PatchSet: 15 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Wed, 10 Nov 2021 00:14:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables
Yu-Wen Lai has uploaded a new patch set (#15). ( http://gerrit.cloudera.org:8080/17858 ) Change subject: IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables .. IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables To enable fine-grained table refreshing, there are three main changes in this commit. 1. Maintain validWriteIdList in Catalogd for transactional tables. We will keep track of write id changes for partitioned tables by AllocWriteIdEvents, CommitTxnEvents, and AbortTxnEvents. 2. Conduct partition level refreshing for transactional tables' addPartitionEvents, dropPartitionEvents, and AlterPartitionEvents. 3. Introduce a config hms_event_incremental_refresh_transactional_table, which can switch on/off the fine-grained table refreshing. Performance Tests: A simple test was performed by running insert into one partition for a partitioned ACID table(50,000 partitions). Below are the time taken to refresh this table by the event. StorageBefore After = S3 50 secs 50 msecs local 3 secs 3 msecs Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/TableWriteId.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java M fe/src/main/java/org/apache/impala/hive/common/MutableValidWriteIdList.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java A fe/src/test/java/org/apache/impala/catalog/CatalogTableWriteIdTest.java M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/hive/common/MutableValidReaderWriteIdListTest.java 17 files changed, 1,130 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/17858/15 -- To view, visit http://gerrit.cloudera.org:8080/17858 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9 Gerrit-Change-Number: 17858 Gerrit-PatchSet: 15 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai
[Impala-ASF-CR] IMPALA-10940: Pick parts of recent gutil changes from Kudu repo
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17897 ) Change subject: IMPALA-10940: Pick parts of recent gutil changes from Kudu repo .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7614/ -- To view, visit http://gerrit.cloudera.org:8080/17897 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9230923111af3c1028f0e71cd37cb5ee6fbec654 Gerrit-Change-Number: 17897 Gerrit-PatchSet: 5 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 10 Nov 2021 00:14:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5256: Force log rotation when max log size exceeded
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17997 ) Change subject: IMPALA-5256: Force log rotation when max_log_size exceeded .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7613/ -- To view, visit http://gerrit.cloudera.org:8080/17997 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b94727180354fe69989ebf3cd1a8f8cda1cf0c3 Gerrit-Change-Number: 17997 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 09 Nov 2021 23:07:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10943: Add test to verify support for multiple resource and executor pools
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17891 ) Change subject: IMPALA-10943: Add test to verify support for multiple resource and executor pools .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17891 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e Gerrit-Change-Number: 17891 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 09 Nov 2021 20:43:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10943: Add test to verify support for multiple resource and executor pools
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17891 ) Change subject: IMPALA-10943: Add test to verify support for multiple resource and executor pools .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7615/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17891 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e Gerrit-Change-Number: 17891 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 09 Nov 2021 20:43:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10943: Add test to verify support for multiple resource and executor pools
Bikramjeet Vig has removed a vote on this change. Change subject: IMPALA-10943: Add test to verify support for multiple resource and executor pools .. Removed Verified-1 by Impala Public Jenkins -- To view, visit http://gerrit.cloudera.org:8080/17891 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e Gerrit-Change-Number: 17891 Gerrit-PatchSet: 4 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-10943: Add test to verify support for multiple resource and executor pools
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/17891 ) Change subject: IMPALA-10943: Add test to verify support for multiple resource and executor pools .. Patch Set 4: Hit another flaky test IMPALA-11012 -- To view, visit http://gerrit.cloudera.org:8080/17891 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e Gerrit-Change-Number: 17891 Gerrit-PatchSet: 4 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 09 Nov 2021 20:42:31 + Gerrit-HasComments: No
[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for incremental reload
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18006 ) Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for incremental reload .. Patch Set 1: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7612/ -- To view, visit http://gerrit.cloudera.org:8080/18006 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0f52186a14fb1097f26b88674c7e7cede4f68b4 Gerrit-Change-Number: 18006 Gerrit-PatchSet: 1 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 09 Nov 2021 20:37:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables
Sourabh Goyal has posted comments on this change. ( http://gerrit.cloudera.org:8080/17858 ) Change subject: IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables .. Patch Set 14: (8 comments) http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3558 PS12, Line 3558: tbl = getTable(dbName, tblName); > In this case, that would make sense to load the table before processing All Works for me. We can take it up in a follow up patch. However, I would like Vihang (since he has already reviewed the PR) to take a look at this to make sure we are not missing any issue here. http://gerrit.cloudera.org:8080/#/c/17858/13/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17858/13/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3560 PS13, Line 3560: LOG.debug("Not adding write ids since database {} was not found", dbName); nit: add more details in the log message like table name, event id being processed. http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2737 PS12, Line 2737: Preconditions.checkArgument(partsFromEvent != null > I added the check in reloadPartitions because there's where we modify the t reloadPartitionsFromEvent is a public api of HdfsTable and anyone can call this api as long as the caller has access to table object. Preconditions check will ensure that any other user of this api in future would have to acquire write lock on the table first http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java File fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java: http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java@252 PS11, Line 252: exceptions.add(currentId); > I don't think we need to rely on the caller. AbortedBits need to be set onl Looked at the implementation of BitSet.get() and I think the following sequence already works but can you just reconfirm? If it does work, I am fine with current implementation as well 1. addOpenWriteId(writeId) // add a new writeId which is greater than current highWatermark 2. isWriteIdOpen(writeId) -> should return true http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4342 PS12, Line 4342: part.setWriteId(writeId); > I assume this method is called by a commit event and it is impossible to ha No HMS won't be wrong. But adding a check enforce the expected behavior of addCommittedWriteIdsAndReloadPartitionsIfExist and in my opinion it is always good to enforce stricter checks to avoid any abuse (knowingly or unknowingly).. http://gerrit.cloudera.org:8080/#/c/17858/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/17858/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4368 PS13, Line 4368: hdfsTable.setValidWriteIds(previousWriteIdList); nit: Would be good to add a log message with details about the rollback. http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2487 PS12, Line 2487: BackendConfig.create(origCfg); nit: Original config should be restored in finally block http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2818 PS12, Line 2818: params.put("transactional", "true"); > Could you elaborate this? A transactional table must be a managed table. Wh Instead of creating a new method createTransactionalTable, we can enhance getTestTable() by passing an additional parameter which accepts a table type. Check this out for example: https://gerrit.clou
[Impala-ASF-CR] IMPALA-10940: Pick parts of recent gutil changes from Kudu repo
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17897 ) Change subject: IMPALA-10940: Pick parts of recent gutil changes from Kudu repo .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7614/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17897 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9230923111af3c1028f0e71cd37cb5ee6fbec654 Gerrit-Change-Number: 17897 Gerrit-PatchSet: 5 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 09 Nov 2021 17:45:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10940: Pick parts of recent gutil changes from Kudu repo
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17897 ) Change subject: IMPALA-10940: Pick parts of recent gutil changes from Kudu repo .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17897 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9230923111af3c1028f0e71cd37cb5ee6fbec654 Gerrit-Change-Number: 17897 Gerrit-PatchSet: 5 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 09 Nov 2021 17:45:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10940: Pick parts of recent gutil changes from Kudu repo
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/17897 ) Change subject: IMPALA-10940: Pick parts of recent gutil changes from Kudu repo .. Patch Set 4: Hit IMPALA-10983, which is irrelevant to this patch. -- To view, visit http://gerrit.cloudera.org:8080/17897 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9230923111af3c1028f0e71cd37cb5ee6fbec654 Gerrit-Change-Number: 17897 Gerrit-PatchSet: 4 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 09 Nov 2021 17:44:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10791 Add batching reading for remote temporary files
Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/17979 ) Change subject: IMPALA-10791 Add batching reading for remote temporary files .. Patch Set 6: (18 comments) Thanks Qifan for the detailed review. Changed the name to "read buffer block", previously was "read buffer" or "mem block", and MemBlock is the implementation of the "read buffer block". http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h File be/src/runtime/io/disk-file.h: http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@52 PS5, Line 52: > Need a description for MemBlock class. Done http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@62 PS5, Line 62: > nit. Return whether the memory is reserved or allocated before deletion to Done http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@68 PS5, Line 68: > nit. may switch the order of the two arguments as the lock appears first in Done http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@168 PS5, Line 168: /// The status of the memory block. > nit. May describe the content of a memory block a little bit. For example, Done http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@200 PS5, Line 200: DiskFile(const std::string& path, DiskIoMgr* io_mgr, int64_t file_size, > nit missing the comment. Done http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@206 PS5, Line 206: > Since this class is nested within DiskFile class, suggest to call it ReadBu Done http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@206 PS5, Line 206: : virtual ~DiskFile() {} > A little bit confusing on how these two names relate to mem blocks. See my Previously, a read buffer and a memory block are the same thing in our patch. Changed to name it as "read buffer block", because compared to "memory block", the TmpFileMgr may be easier to know what the block is used for by its name. And the block may not be necessarily in memory, at first we think to have it in disk, the name could be easier to extend if later we have some different implementation. Added some comments, hope to make it clearer. http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@210 PS5, Line 210: page, each time > Suggest to use a name relate to memory block, say mem_block_size_? Changed to read_buffer_block_size_. http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@213 PS5, Line 213: ilable or not. > suggest to rename to num_mem_blocks_. Changed to num_of_read_buffer_blocks_. http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@326 PS5, Line 326: id UpdateReadBufferMetaData(int64_t offset) { > I saw the IF check below which should take care of index being 12. Since th It depends on the file size. For example, if the file size is 128MB, block size is 8MB. Then there will be 128/8 = 16 mem blocks per file. Therefore, there is no limitation that to be only 4 blocks per file, the number of blocks is decided by default file size and default block size. In this case, the last index is 15 for offset over 120MB. There could be a case that the actual file size is over the default one a little bit, but it is assumed no more than a page size. For example, it could be 129MB for the file, therefore, the actual size of the last block can be 9MB, but it is not a huge difference compared to other blocks. The number of read blocks is calculated here: https://gerrit.cloudera.org/#/c/17979/6/be/src/runtime/tmp-file-mgr.cc@411 http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.cc File be/src/runtime/io/disk-file.cc: http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.cc@116 PS5, Line 116: > Since the memory is actually deleted in this method, maybe we should call t Think for a while, maybe DISABLED is better, because it means the block doesn't allow to use anymore. DELETED may mean sth has been deleted, but in our case, it doesn't necessarily delete anything for being the state. http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-io-mgr.cc File be/src/runtime/io/disk-io-mgr.cc: http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-io-mgr.cc@396 PS5, Line 396: // Fetch the data from the source file (remote) to the destination file (local). : DCHECK(disk_file_dst_ != nullptr); > May need to add a comment to describe these two files. An example would be Done http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-io-mgr.cc@424 PS5, Line 424: read_buffer_bloc, MemBlockStatus::DISABLED, dstl, &read_buffer_lock)) { > Why can't we use this DISABLED memory block here? Looks like we should not Because the memory of the block is released, if this case happens, we can't use the memory block u
[Impala-ASF-CR] IMPALA-10791 Add batching reading for remote temporary files
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17979 ) Change subject: IMPALA-10791 Add batching reading for remote temporary files .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9746/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17979 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b Gerrit-Change-Number: 17979 Gerrit-PatchSet: 6 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Tue, 09 Nov 2021 17:37:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10791 Add batching reading for remote temporary files
Yida Wu has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/17979 ) Change subject: IMPALA-10791 Add batching reading for remote temporary files .. IMPALA-10791 Add batching reading for remote temporary files The patch adds a feature to batching read from a remote temporary file in order to improve the reading performance for the spilled remote data. Originally, the design is to use the local disk file as the buffer for batching reading from the remote file. But in practice, it doesn't help to improve the performance. Therefore, the design is changed to use the memory as the read buffer. Currently, each TmpFileRemote has two DiskFile, one is for the remote, and one is for the local buffer. The patch adds MemBlocks to the local buffer file. Each local buffer file is divided into several MemBlocks evenly. Moreover, in order to guarantee a single page not being cut into two parts in different blocks, the block size could be a little different to each other in practice. The default block size is the minimum value between the default file size and MAX_REMOTE_READ_MEM_BLOCK_THRESHOLD_MB, which is 16MB. When pinning a page, the system will detect if there is enough memory for the block that holds the page, if not, we will go reading the page directly and disable this block, because it may be good to avoid duplicated reads from the remote fs for the same content. If the system decides to fetch a block, the block will be stored in the memory until all of the pages in the block are read or the query ends. One challenge of using the memory for the buffer is that, when the system is lacking of memory when it needs to spill the data. So we make a restriction to limit the percentage of the memory for the read buffer to 10% of the total, because right now the impala process will reserve 20% memory as unused memory by default, using 10% for the emergency case like spilling is reasonable. Two start options have been added for the new feature. 1. remote_batching_read. Default is false. If set true, the batching read is enabled. 2. remote_read_memory_buffer_size. Default is 1G. The maximum memory that can be used by the read buffer. The number also restricted by the total system memory, which can not exceed 10% of the total memory. The patch also increases the MAX_REMOTE_TMPFILE_SIZE_THRESHOLD_MB from 256 to 512. Tests: Ran core and exhaustive tests. Added and ran TmpFileMgrTest::TestBatchingReadFromRemote. Added e2e test test_scratch_dirs_batch_reading. Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b --- M be/src/runtime/io/disk-file.cc M be/src/runtime/io/disk-file.h M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/tmp-file-mgr-internal.h M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/runtime/tmp-file-mgr.h M common/thrift/metrics.json M tests/custom_cluster/test_scratch_disk.py 12 files changed, 1,225 insertions(+), 150 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/17979/6 -- To view, visit http://gerrit.cloudera.org:8080/17979 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b Gerrit-Change-Number: 17979 Gerrit-PatchSet: 6 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Yida Wu
[Impala-ASF-CR] IMPALA-5256: Force log rotation when max log size exceeded
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17997 ) Change subject: IMPALA-5256: Force log rotation when max_log_size exceeded .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7613/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/17997 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b94727180354fe69989ebf3cd1a8f8cda1cf0c3 Gerrit-Change-Number: 17997 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 09 Nov 2021 16:56:25 + Gerrit-HasComments: No
[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last synced event id when processing self event
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18005 ) Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last synced event id when processing self event .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/18005 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I439f59b25c6b01f9f14a0cb58d48a700d627cd91 Gerrit-Change-Number: 18005 Gerrit-PatchSet: 1 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 09 Nov 2021 16:56:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5256: Force log rotation when max log size exceeded
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17997 ) Change subject: IMPALA-5256: Force log rotation when max_log_size exceeded .. Patch Set 3: Code-Review+2 Carrying +2 -- To view, visit http://gerrit.cloudera.org:8080/17997 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b94727180354fe69989ebf3cd1a8f8cda1cf0c3 Gerrit-Change-Number: 17997 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 09 Nov 2021 16:46:08 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-9498: Allow returning arrays in select list
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/17811 ) Change subject: WIP IMPALA-9498: Allow returning arrays in select list .. Patch Set 17: (8 comments) Sorry for the late review. I managed to go through the tests and the commit msg. Will continue checking the rest tomorrow. http://gerrit.cloudera.org:8080/#/c/17811/17//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17811/17//COMMIT_MSG@7 PS17, Line 7: WIP Is this still a work in progress? What would be needed to make this "live"? http://gerrit.cloudera.org:8080/#/c/17811/17//COMMIT_MSG@15 PS17, Line 15: Things intentionally postponed for later commits: By later commits do you mean later version of this patch or covered by separate Jiras? If the latest then could you open Jiras to not to forget them and could you include the Jira IDs here? http://gerrit.cloudera.org:8080/#/c/17811/17//COMMIT_MSG@26 PS17, Line 26: Finalize behavior with column masking/row filtering Same question: is this expected in this patch or a separate one? http://gerrit.cloudera.org:8080/#/c/17811/17//COMMIT_MSG@29 PS17, Line 29: Was there anything specific in terms of testing that is worth mentioning here? http://gerrit.cloudera.org:8080/#/c/17811/17/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test File testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test: http://gerrit.cloudera.org:8080/#/c/17811/17/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test@34 PS17, Line 34: IllegalStateException: Sorting is not supported for collection columns. The error msg is misleading here: Sorting is not done by collection columns in the query. The issue is that you do a sort in a query where you actually query arrays as well, but you don't sort by collection columns. http://gerrit.cloudera.org:8080/#/c/17811/17/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test@37 PS17, Line 37: in nit: typo http://gerrit.cloudera.org:8080/#/c/17811/17/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test@147 PS17, Line 147: Could you write a test that has a WHERE filter (on a non-array column)? Could you create a view that contains arrays and query these arrays from the views? I see that there is coverage for inline views but not for regular ones. These tests are run on a small dataset. Have you performed some tests on a way bigger dataset? E.g. multiple files, multiple partitions, where the data doesn't fit in a single row batch. http://gerrit.cloudera.org:8080/#/c/17811/17/tests/query_test/test_nested_types.py File tests/query_test/test_nested_types.py: http://gerrit.cloudera.org:8080/#/c/17811/17/tests/query_test/test_nested_types.py@155 PS17, Line 155: nested-array-in-select-list I find the naming a bit misleading: this is not just for nested arrays, but for regular arrays as well, right? -- To view, visit http://gerrit.cloudera.org:8080/17811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0 Gerrit-Change-Number: 17811 Gerrit-PatchSet: 17 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 09 Nov 2021 16:20:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10984: Improve TimestampValue to String casting
Csaba Ringhofer has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17980 ) Change subject: IMPALA-10984: Improve TimestampValue to String casting .. IMPALA-10984: Improve TimestampValue to String casting TimestampValue::ToString was implemented by concatenating boost::gregorian::to_iso_extended_string and boost::posix_time::to_simple_string using stringstream. This involves multiple string allocations, copying, and might hit lock within tcmalloc::CentralFreeList. FROM_UNIXTIME and CAST expression that touches this function can be inefficient if the expression is being evaluated for millions of rows. This patch adds method TimestampValue::ToStringVal and reimplements TimestampValue::ToString by supplying default DateTimeFormatContext if no pattern was specified. "-MM-dd HH:mm:ss" will be picked as the default format if the time_ component does not have fractional seconds. Otherwise, "-MM-dd HH:mm:ss.S" will be picked as the default format. The chosen DateTimeFormatContext then is passed to TimestampParser::Format along with date_ and time_ to be formatted into the string representation. Int to string parsing method is replaced with FastInt32ToBufferLeft in TimestampParser::Format. We ran a set of expression benchmarks in a machine with Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz. This patch gives > 10X performance improvement for CAST timestamp to string and FROM_UNIXTIME without a date-time pattern. Following are the detailed results before and after the patch. Before the patch: FromUnixCodegen: Function 10%ile 50%ile 90%ile 10%ile 50%ile 90%ile (relative) (relative) (relative) --- literal 36.7 37 37.3 1X 1X 1X cast(now() as string) 2.31 2.31 2.330.0628X 0.0623X0.0626X cast(now() as string format 'Y .S') 16.9 17.5 17.5 0.459X 0.472X 0.471X from_unixtime(0,'-MM-dd HH:mm:ss') 6.3 6.3 6.37 0.171X 0.17X 0.171X from_unixtime(0,'-MM-dd') 11.8 11.8 12 0.32X 0.32X 0.322X from_unixtime(0) 2.36 2.4 2.40.0644X 0.0648X0.0644X After the patch: FromUnixCodegen: Function 10%ile 50%ile 90%ile 10%ile 50%ile 90%ile (relative) (relative) (relative) --- literal 37.7 38.1 38.4 1X 1X 1X cast(now() as string) 29.9 30.1 30.2 0.794X 0.79X 0.787X cast(now() as string format 'Y .S') 61.1 61.3 61.6 1.62X 1.61X 1.61X from_unixtime(0,'-MM-dd HH:mm:ss') 33.6 33.8 34.2 0.892X 0.887X 0.892X from_unixtime(0,'-MM-dd') 50.5 50.6 50.9 1.34X 1.33X 1.33X from_unixtime(0) 34 34.2 34.5 0.902X 0.896X 0.898X The literal expression used as the baseline in this benchmark is "cast('2012-01-01 09:10:11.123456789' as timestamp)". This patch also updates numbers in expr-benchmark for BenchmarkTimestampFunctions and tidy up expr-benchmark a bit to clear its MemPool in between benchmark iteration so that it does not run out of memory. Testing: - Pass core tests. Change-Id: I4fcb4545d9c9a3fdb38c4db58bb4b1321a429d61 Reviewed-on: http://gerrit.cloudera.org:8080/17980 Reviewed-by: Impala Public Jenkins Tested-by: Csaba Ringhofer Reviewed-by: Csaba Ringhofer --- M be/src/benchmarks/expr-benchmark.cc M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/cast-functions-ir.cc M be/src/exprs/literal.cc M be/src/exprs/timestamp-functions-ir.cc M be/src/exprs/timestamp-functions.cc M be/src/runtime/date-parse-util.cc M be/src/runtime/datetime-iso-sql-format-tokenizer.cc M be/src/runtime/datetime-iso-sql-format-tokenizer.h M be/src/runtime/datetime-parser-common.cc M be/src/runtime/datetime-parser-common.h M be/src/runtime/datetime-simple-date-format-parser.cc M be/src/runtime/datetime-simple-date-format-parser.h M be/src/runtime/timestamp-parse-util.cc M be/src/runtime/timestamp-parse-util.h M be/src/runtime/timestamp-test.cc M be/src/runtime/timestamp-value.cc M be/src/runtime/timestamp-value.h M be/src/runtime/timestamp-value.inline.h M be/src/util/min-max-filter.cc 20 files changed, 261 insertions(+), 159 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved Csaba Ringhofer: Looks good to me, approved; Veri
[Impala-ASF-CR] IMPALA-10984: Improve TimestampValue to String casting
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/17980 ) Change subject: IMPALA-10984: Improve TimestampValue to String casting .. Patch Set 10: Code-Review+2 Submitting as the GVO failed twice due to flaky tests. -- To view, visit http://gerrit.cloudera.org:8080/17980 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4fcb4545d9c9a3fdb38c4db58bb4b1321a429d61 Gerrit-Change-Number: 17980 Gerrit-PatchSet: 10 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 09 Nov 2021 16:13:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10984: Improve TimestampValue to String casting
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/17980 ) Change subject: IMPALA-10984: Improve TimestampValue to String casting .. Patch Set 10: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17980 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4fcb4545d9c9a3fdb38c4db58bb4b1321a429d61 Gerrit-Change-Number: 17980 Gerrit-PatchSet: 10 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 09 Nov 2021 16:13:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10984: Improve TimestampValue to String casting
Csaba Ringhofer has removed a vote on this change. Change subject: IMPALA-10984: Improve TimestampValue to String casting .. Removed Verified-1 by Impala Public Jenkins -- To view, visit http://gerrit.cloudera.org:8080/17980 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I4fcb4545d9c9a3fdb38c4db58bb4b1321a429d61 Gerrit-Change-Number: 17980 Gerrit-PatchSet: 10 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-10984: Improve TimestampValue to String casting
Csaba Ringhofer has removed a vote on this change. Change subject: IMPALA-10984: Improve TimestampValue to String casting .. Removed -Verified by Impala Public Jenkins -- To view, visit http://gerrit.cloudera.org:8080/17980 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I4fcb4545d9c9a3fdb38c4db58bb4b1321a429d61 Gerrit-Change-Number: 17980 Gerrit-PatchSet: 10 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-11006: Impalad crashes during query cancel tests
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/17999 ) Change subject: IMPALA-11006: Impalad crashes during query cancel tests .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/17999/1/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/17999/1/be/src/service/client-request-state.cc@1070 PS1, Line 1070: else > CTAS is processed async with a difference before the RPC patch. My concern is about queries that do not have a coordinator at all, as FinishExecQueryOrDmlRequest is not called. My understanding is that there is no coordinator in case of DDLs, with the exception of CTAS. In this case it is normal to go in this branch - we will work correctly, but I still think that we should distinguish between the two cases. http://gerrit.cloudera.org:8080/#/c/17999/1/be/src/service/client-request-state.cc@1076 PS1, Line 1076: MarkInactive > I instrumented the code to observe when MarkActive() and MarkInactive() for MarkActive/Inactive affects query expiration only, which is not relevant in case the query was cancelled. ref_count_ should reach 0 only when Impala is really not doing any work on the query and we are waiting for the client to do another RPC. -- To view, visit http://gerrit.cloudera.org:8080/17999 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia49411f8525734b8d463d9ffbfbce705b90a8d73 Gerrit-Change-Number: 17999 Gerrit-PatchSet: 2 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Tue, 09 Nov 2021 16:08:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10920: Zipping unnest for arrays
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17983 ) Change subject: IMPALA-10920: Zipping unnest for arrays .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9745/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7 Gerrit-Change-Number: 17983 Gerrit-PatchSet: 5 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 09 Nov 2021 15:09:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10920: Zipping unnest for arrays
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/17983 ) Change subject: IMPALA-10920: Zipping unnest for arrays .. Patch Set 5: (43 comments) http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h File be/src/exec/unnest-node.h: http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@40 PS2, Line 40: SlotRefs > If they are always SlotRefs, can't the type of the vector be std::vectorhttp://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@42 PS2, Line 42: 'UnnestNode::coll_values_', but instead manually retrieve the slot values to support : /// projection (see cla > I think it should be mentioned that 'coll_values_' and the class comment ar Done http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@46 PS2, Line 46: UnnestPlanNode. > 'coll_expr_evals_' has been removed. Done http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@46 PS2, Line 46: t in : /// InitCollExpr(). > It seems no longer to be true that they are set in Prepare() and set to NUL Right, these are set to null in a different class not here. Done http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@50 PS2, Line 50: Set in InitCollE > It seems no longer to be the case. Done http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@109 PS2, Line 109: . > Nit: belong. Done http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@116 PS2, Line 116: doesn > Nit: doesn't. Done http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@126 PS2, Line 126: These slots are always set to NULL in : /// Open() as a simple projection. > Does not seem to be true anymore. In fact it's partially true. These still are projected to NULL but not set in Prepare. Done http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@131 PS2, Line 131: > Does not seem to be true. Done http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@144 PS2, Line 144: length > typo Done http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.cc File be/src/exec/unnest-node.cc: http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.cc@60 PS2, Line 60: DCHECK(coll_expr->IsSlotRef( > This DCHECK is superfluous because slot_ref==null iff coll_expr==nullptr, a Done http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.cc@90 PS2, Line 90: num_collections_counter_(nullptr) { > nit: usually the constant is the second argument. Done http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.cc@143 PS2, Line 143: if (num_tuples > 0) { > nit: here and at line 146: using std::max would be easier to read IMO Done http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.cc@174 PS2, Line 174: > Nit: should be collections. Done http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.cc@218 PS2, Line 218: (*coll_slot_descs_)[coll_idx]->children_tuple_de > Can't we use nullptr everywhere? My understanding is that nullptr as tuple As I remember I had to create the tuple in the row batch and set it to null explicitly otherwise there is a point somewhere later where Impala crashes. http://gerrit.cloudera.org:8080/#/c/17983/2/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: http://gerrit.cloudera.org:8080/#/c/17983/2/fe/src/main/cup/sql-parser.cup@315 PS2, Line 315: KW_UNNEST > We should add to the comment that we have added a new keyword, making this Mentioned it in the commit message. http://gerrit.cloudera.org:8080/#/c/17983/2/fe/src/main/cup/sql-parser.cup@3030 PS2, Line 3030: Providing multiple UNNEST() in the FROM clause is not supported. > It this something that we want to support later? It depends on what semantics you want to achieve with multiple unnests. In case you want to do zipping unnest between the arrays in the same unnest but do joining unnest with the ones on different unnest then this might be feasible, but a bit hard to understand. And anyway zipping and joining unnest is not allowed together atm. I'm not really convinced we want to support this later as I see no added value. http://gerrit.cloudera.org:8080/#/c/17983/2/fe/src/main/cup/sql-parser.cup@3091 PS2, Line 3091: > What if the user provides more aliases than paths? Indeed, thx :) Added some tests as well to cover this. Done http://gerrit.cloudera.org:8080/#/c/17983/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/17983/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@574 PS2, Line 574: tableRefsFromUnnestExpr > Nit: Should end in an underscore (_). Done http://gerrit.cloudera.org:8080/#/c/17983/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@706 P
[Impala-ASF-CR] IMPALA-10920: Zipping unnest for arrays
Hello Daniel Becker, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17983 to look at the new patch set (#5). Change subject: IMPALA-10920: Zipping unnest for arrays .. IMPALA-10920: Zipping unnest for arrays This patch provides an unnest implementation for arrays where unnesting multiple arrays in one query results the items of the arrays being zipped together instead of joining. There are two different syntaxes introduced for this purpose: 1: ISO:SQL 2016 compliant syntax: SELECT a1.item, a2.item FROM complextypes_arrays t, UNNEST(t.arr1, t.arr2) AS (a1, a2); 2: Postgres compatible syntax: SELECT UNNEST(arr1), UNNEST(arr2) FROM complextypes_arrays; Let me show the expected behaviour through the following example: Inputs: arr1: {1,2,3}, arr2: {11, 12} After running any of the above queries we expect the following output: === | arr1 | arr2 | === | 1| 11 | | 2| 12 | | 3| NULL | === Expected behaviour: - When unnesting multiple arrays with zipping unnest then the 'i'th item of one array will be put next to the 'i'th item of the other arrays in the results. - In case the size of the arrays is not the same then the shorter arrays will be filled with NULL values up to the size of the longest array. On a sidenote, UNNEST is added to Impala's SQL language as a new keyword. This might interfere with use cases where a resource (db, table, column, etc.) is named "UNNEST". Restrictions: - It is not allowed to have WHERE filters on an unnested item of an array in the same SELECT query. E.g. this is not allowed: SELECT arr1.item FROM complextypes_arrays t, UNNEST(t.arr1) WHERE arr1.item < 5; Note, that it is allowed to have an outer SELECT around the one doing unnests and have a filter there on the unnested items. - If there is an outer SELECT filtering on the unnested array's items from the inner SELECT then these predicates won't be pushed down to the SCAN node. They are rather evaluated in the UNNEST node to guarantee result correctness after unnesting. Note, this restriction is only active when there are multiple arrays being unnested, or in other words when zipping unnest logic is required to produce results. - It's not allowed to do a zipping and a (traditional) joining unnest together in one SELECT query. - It's not allowed to perform zipping unnests on arrays from different tables. Testing: - Added a bunch of E2E tests to the test suite to cover both syntaxes. - Did a manual test run on a table with 1000 rows, 3 array columns with size of around 5000 items in each array. I did an unnest on all three arrays in one query to see if there are any crashes or suspicious slowness when running on this scale. Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7 --- M be/src/exec/unnest-node.cc M be/src/exec/unnest-node.h M common/thrift/PlanNodes.thrift M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/FromClause.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java A fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/planner/UnnestNode.java M fe/src/main/jflex/sql-scanner.flex M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java A testdata/ComplexTypesTbl/arrays.orc A testdata/ComplexTypesTbl/arrays.parq M testdata/data/README M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv A testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-from-clause.test A testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-select-list.test M tests/query_test/test_nested_types.py 29 files changed, 1,366 insertions(+), 136 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/17983/5 -- To view, visit http://gerrit.cloudera.org:8080/17983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7 Gerrit-Change-Num
[Impala-ASF-CR] IMPALA-11006: Impalad crashes during query cancel tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17999 ) Change subject: IMPALA-11006: Impalad crashes during query cancel tests .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9744/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17999 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia49411f8525734b8d463d9ffbfbce705b90a8d73 Gerrit-Change-Number: 17999 Gerrit-PatchSet: 2 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Tue, 09 Nov 2021 14:11:24 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-9498: Allow returning arrays in select list
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/17811 ) Change subject: WIP IMPALA-9498: Allow returning arrays in select list .. Patch Set 17: (3 comments) Thanks. http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java File fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java: http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java@344 PS15, Line 344: DO: Currently only > Yes, we want to exclude all other operators than UNION ALL here. Added a co In this case, shouldn't the error message also say that "EXCEPT" is not supported? Like "UNION and EXCEPT are not supported for collection types". http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/SlotRef.java File fe/src/main/java/org/apache/impala/analysis/SlotRef.java: http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@99 PS15, Line 99: > Removed it here, but not in L82, because the default constructor of the par You're right, thanks. http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java: http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@2296 PS15, Line 2296: Map type is > I kept it as "Map type", as we also use "Struct type" in case of structs. Okay. -- To view, visit http://gerrit.cloudera.org:8080/17811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0 Gerrit-Change-Number: 17811 Gerrit-PatchSet: 17 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 09 Nov 2021 13:57:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11006: Impalad crashes during query cancel tests
Qifan Chen has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/17999 ) Change subject: IMPALA-11006: Impalad crashes during query cancel tests .. IMPALA-11006: Impalad crashes during query cancel tests This patch addresses impalad crash during query cancel tests. The root cause is that the wait thread in impalad assumes that the coordinator or computation results are always available once the worker thread async_exec_thread_ is joined. In reality, this may not be true at all once the query is cancelled. The patch specifically checks the cancel status when it is found that the coordinator is not available and bails out immediately. This prevents crash since the subsequent code that assumes the coordinator or computation results are available is never reached. Testing: 1. Ran core tests successfully. Change-Id: Ia49411f8525734b8d463d9ffbfbce705b90a8d73 --- M be/src/service/client-request-state.cc 1 file changed, 9 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/17999/2 -- To view, visit http://gerrit.cloudera.org:8080/17999 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia49411f8525734b8d463d9ffbfbce705b90a8d73 Gerrit-Change-Number: 17999 Gerrit-PatchSet: 2 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen
[Impala-ASF-CR] IMPALA-10984: Improve TimestampValue to String casting
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17980 ) Change subject: IMPALA-10984: Improve TimestampValue to String casting .. Patch Set 10: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7610/ -- To view, visit http://gerrit.cloudera.org:8080/17980 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4fcb4545d9c9a3fdb38c4db58bb4b1321a429d61 Gerrit-Change-Number: 17980 Gerrit-PatchSet: 10 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 09 Nov 2021 13:40:13 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-9498: Allow returning arrays in select list
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17811 ) Change subject: WIP IMPALA-9498: Allow returning arrays in select list .. Patch Set 17: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9743/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0 Gerrit-Change-Number: 17811 Gerrit-PatchSet: 17 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 09 Nov 2021 13:03:39 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-9498: Allow returning arrays in select list
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/17811 ) Change subject: WIP IMPALA-9498: Allow returning arrays in select list .. Patch Set 17: (19 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1296 PS15, Line 1296: : /** :* Returns an existing or new SlotDescriptor for the given path. :* > I think this comment refers to the 1-argument overload on L1293. Done http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1315 PS15, Line 1315: SlotDescriptor existingSlotDesc = slotPathMap_.get(key); : if (existingSlotDesc != null) return existingSlotDesc; : SlotDescriptor result = null; : Preconditions.checkState(!slotPath.destType().isMapType()); : if (slotPath.destType().isArrayType()) { : result = registerCollectionSlotRef(slotPath); : } else { : result = addSlotDescriptor(slotPath.getRootDesc()); : } : result.setPath(slotPath); : slotPathMap_.put(key, result); : registerColumnPrivReq(result); : return result; : } : : /** :* Registers a collection and its descendendants. :* Creates a CollectionTableRef for all collections. :*/ : private SlotDescriptor registerCollectionSlotRef(Path slotPath) : throws AnalysisException { : Preconditions.checkState(slotPath.destType().isArrayType()); : SlotDescriptor result = null; : Path currentPath = slotPath; : List rawPath = currentPath.getRawPath(); : boolean first_level = true; : while (currentPath.destType().isArrayType()) { : TableRef tblRef = new TableRef(rawPath, null); : : CollectionTableRef collTblRef = new CollectionTableRef(tblRef, currentPath, true); : collTblRef.analyze(this); : : Preconditions.checkState(collTblRef.getCollectionExpr() != null); : Preconditions.checkState(collTblRef.getCollectionExpr() instanceof SlotRef); : SlotDescriptor desc = ((SlotRef) collTblRef.getCollectionExpr()).getDesc(); : desc.setIsMaterializedRecursively(true); : if (first_level) { : result = desc; : firs > Optional: the handling of arrays could be extracted to a function. Done http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java File fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java: http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java@25 PS15, Line 25: an be either being : * flattened during execution > Is it still true? The TODO on L104 indicates it is not. You are right, this is no longer true. http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java@52 PS15, Line 52: : /** :* Create a CollectionTableRef from the original unresolved table ref as well as :* > This comment should be for the 2 argument version on L48. The boolean argum added some explanation about inSelectList http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java@91 PS15, Line 91: d > Typo: y Done http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java@92 PS15, Line 92: n > Typo: y Done http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java File fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java: http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java@344 PS15, Line 344: DO: Currently only > Do we only have to do this check if there are only UNION ALL operations? Wh Yes, we want to exclude all other operators than UNION ALL here. Added a comment about this. http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java File fe/src/main/java/org/apache/impala/analysis
[Impala-ASF-CR] WIP IMPALA-9498: Allow returning arrays in select list
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17811 ) Change subject: WIP IMPALA-9498: Allow returning arrays in select list .. Patch Set 17: (3 comments) http://gerrit.cloudera.org:8080/#/c/17811/17/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/17811/17/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1410 PS17, Line 1410: Column column = new Column(String.join(".", slotDesc.getPath().getRawPath()), slotDesc.getType(), line too long (101 > 90) http://gerrit.cloudera.org:8080/#/c/17811/17/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java File fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java: http://gerrit.cloudera.org:8080/#/c/17811/17/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java@729 PS17, Line 729: // Select with struct in select list. Uses functional_orc_def.complextypes_nested_structs line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/17811/17/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java@757 PS17, Line 757: allExcept(TPrivilegeLevel.ALL, TPrivilegeLevel.OWNER, TPrivilegeLevel.SELECT))); line too long (96 > 90) -- To view, visit http://gerrit.cloudera.org:8080/17811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0 Gerrit-Change-Number: 17811 Gerrit-PatchSet: 17 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 09 Nov 2021 12:41:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP IMPALA-9498: Allow returning arrays in select list
Hello Daniel Becker, Gabor Kaszab, Attila Jeges, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17811 to look at the new patch set (#17). Change subject: WIP IMPALA-9498: Allow returning arrays in select list .. WIP IMPALA-9498: Allow returning arrays in select list Until now ARRAYs had to be unnested in queries. This patch adds support to return ARRAYs as STRINGs (JSON arrays) in select list. Though STRUCTs are already supported, ARRAYs and STRUCTs nested in each other are not supported yet. Things intentionally postponed for later commits: - Add MAP suppport too - this shouldn't be too tricky after ARRAY support, but I don't want to make this patch even more complex. - Unify HS2 / Beeswax logic with the way STRUCTs are handled. This could be done in a "final" logic that can handle STRUCTS/ARRAYS nested to each other - Implement "deep copy" and "deep serialize" for ARRAYs in BE. This would enable all operators, e.g. ORDER BY and UNION. TODOs: - Finalize behavior with column masking/row filtering and modify tests accordingly. Currently some related tests are broken. Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0 --- M be/src/codegen/codegen-anyval.cc M be/src/exec/blocking-plan-root-sink.cc M be/src/exec/buffered-plan-root-sink.cc M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/plan-root-sink.cc M be/src/exec/plan-root-sink.h M be/src/exprs/expr.h M be/src/exprs/slot-ref.cc M be/src/exprs/slot-ref.h M be/src/runtime/collection-value.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/types.cc M be/src/runtime/types.h M be/src/service/hs2-util.cc M be/src/service/impala-beeswax-server.cc M be/src/service/impala-server.h M be/src/service/query-result-set.cc M be/src/service/query-result-set.h M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java M fe/src/main/java/org/apache/impala/authorization/TableMask.java M fe/src/main/java/org/apache/impala/catalog/Type.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java A testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_struct_in_select_list.test M testdata/workloads/functional-query/queries/QueryTest/struct-in-select-list.test M tests/authorization/test_ranger.py M tests/query_test/test_nested_types.py 42 files changed, 734 insertions(+), 178 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/17811/17 -- To view, visit http://gerrit.cloudera.org:8080/17811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0 Gerrit-Change-Number: 17811 Gerrit-PatchSet: 17 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for incremental reload
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18006 ) Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for incremental reload .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9742/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18006 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0f52186a14fb1097f26b88674c7e7cede4f68b4 Gerrit-Change-Number: 18006 Gerrit-PatchSet: 1 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 09 Nov 2021 10:57:38 + Gerrit-HasComments: No
[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last synced event id when processing self event
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18005 ) Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last synced event id when processing self event .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9741/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18005 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I439f59b25c6b01f9f14a0cb58d48a700d627cd91 Gerrit-Change-Number: 18005 Gerrit-PatchSet: 1 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 09 Nov 2021 10:55:08 + Gerrit-HasComments: No
[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for incremental reload
Sourabh Goyal has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18006 Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for incremental reload .. [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for incremental reload Change-Id: If0f52186a14fb1097f26b88674c7e7cede4f68b4 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/custom_cluster/test_metastore_service.py 26 files changed, 3,561 insertions(+), 292 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/18006/1 -- To view, visit http://gerrit.cloudera.org:8080/18006 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: If0f52186a14fb1097f26b88674c7e7cede4f68b4 Gerrit-Change-Number: 18006 Gerrit-PatchSet: 1 Gerrit-Owner: Sourabh Goyal
[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last synced event id when processing self event
Sourabh Goyal has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18005 Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last synced event id when processing self event .. [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last synced event id when processing self event Change-Id: I439f59b25c6b01f9f14a0cb58d48a700d627cd91 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/custom_cluster/test_metastore_service.py 26 files changed, 3,495 insertions(+), 283 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/18005/1 -- To view, visit http://gerrit.cloudera.org:8080/18005 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I439f59b25c6b01f9f14a0cb58d48a700d627cd91 Gerrit-Change-Number: 18005 Gerrit-PatchSet: 1 Gerrit-Owner: Sourabh Goyal
[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for incremental reload
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18006 ) Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for incremental reload .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/18006/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/18006/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2423 PS1, Line 2423:* String, long)} which passes false for {@code refreshUpdatedPartitions} argument and ignore line too long (95 > 90) http://gerrit.cloudera.org:8080/#/c/18006/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/18006/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5877 PS1, Line 5877: updatedThriftTable = catalog_.reloadTable(tbl, req, resultType, cmdString, -1); line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/18006/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/18006/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2471 PS1, Line 2471: batchEvents = eventFactory.createBatchEvents(mockEvents, eventsProcessor_.getMetrics()); line too long (94 > 90) -- To view, visit http://gerrit.cloudera.org:8080/18006 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0f52186a14fb1097f26b88674c7e7cede4f68b4 Gerrit-Change-Number: 18006 Gerrit-PatchSet: 1 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 09 Nov 2021 10:37:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for incremental reload
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18006 ) Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for incremental reload .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7612/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/18006 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0f52186a14fb1097f26b88674c7e7cede4f68b4 Gerrit-Change-Number: 18006 Gerrit-PatchSet: 1 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 09 Nov 2021 10:37:08 + Gerrit-HasComments: No
[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last synced event id when processing self event
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18005 ) Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last synced event id when processing self event .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7611/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/18005 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I439f59b25c6b01f9f14a0cb58d48a700d627cd91 Gerrit-Change-Number: 18005 Gerrit-PatchSet: 1 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 09 Nov 2021 10:34:21 + Gerrit-HasComments: No
[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last synced event id when processing self event
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18005 ) Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last synced event id when processing self event .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/18005/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/18005/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2452 PS1, Line 2452: batchEvents = eventFactory.createBatchEvents(mockEvents, eventsProcessor_.getMetrics()); line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/18005/1/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java: http://gerrit.cloudera.org:8080/#/c/18005/1/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@85 PS1, Line 85: private static boolean flagEnableCatalogCache ,flagInvalidateCache, flagSyncToLatestEventId; line too long (96 > 90) -- To view, visit http://gerrit.cloudera.org:8080/18005 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I439f59b25c6b01f9f14a0cb58d48a700d627cd91 Gerrit-Change-Number: 18005 Gerrit-PatchSet: 1 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 09 Nov 2021 10:34:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Sourabh Goyal has abandoned this change. ( http://gerrit.cloudera.org:8080/18004 ) Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Abandoned -- To view, visit http://gerrit.cloudera.org:8080/18004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I8cae9cccdcd94ef2c0f5cfaf5ddb15eaa8c46f5c Gerrit-Change-Number: 18004 Gerrit-PatchSet: 1 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-10920: Zipping unnest for arrays
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17983 ) Change subject: IMPALA-10920: Zipping unnest for arrays .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9740/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7 Gerrit-Change-Number: 17983 Gerrit-PatchSet: 4 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 09 Nov 2021 09:55:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10920: Zipping unnest for arrays
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/17983 ) Change subject: IMPALA-10920: Zipping unnest for arrays .. Patch Set 4: (4 comments) PS3 is fixing a bug with predicates being incorrectly pushed down to SCAN node. PS4 is fixing the comments from Jenkins plus eventually writing the commit msg. Will start processing the comments from Csaba and Daniel now. Thanks for taking a look! http://gerrit.cloudera.org:8080/#/c/17983/3/fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java File fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java: http://gerrit.cloudera.org:8080/#/c/17983/3/fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java@49 PS3, Line 49: protected void analyzeImpl(Analyzer analyzer) throws AnalysisException { > line has trailing whitespace Done http://gerrit.cloudera.org:8080/#/c/17983/3/fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java@50 PS3, Line 50: Preconditions.checkNotNull(rawPath_); > line has trailing whitespace Done http://gerrit.cloudera.org:8080/#/c/17983/3/fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java@109 PS3, Line 109: Type resolvedType = > line has trailing whitespace Done http://gerrit.cloudera.org:8080/#/c/17983/3/tests/query_test/test_nested_types.py File tests/query_test/test_nested_types.py: http://gerrit.cloudera.org:8080/#/c/17983/3/tests/query_test/test_nested_types.py@196 PS3, Line 196: > flake8: E302 expected 2 blank lines, found 1 Done -- To view, visit http://gerrit.cloudera.org:8080/17983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7 Gerrit-Change-Number: 17983 Gerrit-PatchSet: 4 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 09 Nov 2021 09:34:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10920: Zipping unnest for arrays
Hello Daniel Becker, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17983 to look at the new patch set (#4). Change subject: IMPALA-10920: Zipping unnest for arrays .. IMPALA-10920: Zipping unnest for arrays This patch provides an unnest implementation for arrays where unnesting multiple arrays in one query results the items of the arrays being zipped together instead of joining. There are two different syntaxes introduced for this purpose: 1: ISO:SQL 2016 compliant syntax: SELECT a1.item, a2.item FROM complextypes_arrays t, UNNEST(t.arr1, t.arr2) AS (a1, a2); 2: Postgres compatible syntax: SELECT UNNEST(arr1), UNNEST(arr2) FROM complextypes_arrays; Let me show the expected behaviour through the following example: Inputs: arr1: {1,2,3}, arr2: {11, 12} After running any of the above queries we expect the following output: === | arr1 | arr2 | === | 1| 11 | | 2| 12 | | 3| NULL | === Expected behaviour: - When unnesting multiple arrays with zipping unnest then the 'i'th item of one array will be put next to the 'i'th item of the other arrays in the results. - In case the size of the arrays is not the same then the shorter arrays will be filled with NULL values up to the size of the longest array. Restrictions: - It is not allowed to have WHERE filters on an unnested item of an array in the same SELECT query. E.g. this is not allowed: SELECT arr1.item FROM complextypes_arrays t, UNNEST(t.arr1) WHERE arr1.item < 5; Note, that it is allowed to have an outer SELECT around the one doing unnests and have a filter there on the unnested items. - If there is an outer SELECT filtering on the unnested array's items from the inner SELECT then these predicates won't be pushed down to the SCAN node. They are rather evaluated in the UNNEST node to guarantee result correctness after unnesting. Note, this restriction is only active when there are multiple arrays being unnested, or in other words when zipping unnest logic is required to produce results. - It's not allowed to do a zipping and a (traditional) joining unnest together in one SELECT query. - It's not allowed to perform zipping unnests on arrays from different tables. Testing: - Added a bunch of E2E tests to the test suite to cover both syntaxes. - Did a manual test run on a table with 1000 rows, 3 array columns with size of around 5000 items in each array. I did an unnest on all three arrays in one query to see if there are any crashes or suspicious slowness when running on this scale. Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7 --- M be/src/exec/unnest-node.cc M be/src/exec/unnest-node.h M common/thrift/PlanNodes.thrift M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/FromClause.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java A fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/planner/UnnestNode.java M fe/src/main/jflex/sql-scanner.flex M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java A testdata/ComplexTypesTbl/arrays.orc A testdata/ComplexTypesTbl/arrays.parq M testdata/data/README M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv A testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-from-clause.test A testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-select-list.test M tests/query_test/test_nested_types.py 29 files changed, 1,348 insertions(+), 132 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/17983/4 -- To view, visit http://gerrit.cloudera.org:8080/17983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7 Gerrit-Change-Number: 17983 Gerrit-PatchSet: 4 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-10940: Pick parts of recent gutil changes from Kudu repo
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17897 ) Change subject: IMPALA-10940: Pick parts of recent gutil changes from Kudu repo .. Patch Set 4: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7609/ -- To view, visit http://gerrit.cloudera.org:8080/17897 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9230923111af3c1028f0e71cd37cb5ee6fbec654 Gerrit-Change-Number: 17897 Gerrit-PatchSet: 4 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 09 Nov 2021 08:44:09 + Gerrit-HasComments: No