[Impala-ASF-CR] IMPALA-10940: Pick parts of recent gutil changes from Kudu repo

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17897 )

Change subject: IMPALA-10940: Pick parts of recent gutil changes from Kudu repo
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17897
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9230923111af3c1028f0e71cd37cb5ee6fbec654
Gerrit-Change-Number: 17897
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 10 Nov 2021 07:49:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10940: Pick parts of recent gutil changes from Kudu repo

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17897 )

Change subject: IMPALA-10940: Pick parts of recent gutil changes from Kudu repo
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7618/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17897
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9230923111af3c1028f0e71cd37cb5ee6fbec654
Gerrit-Change-Number: 17897
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 10 Nov 2021 07:49:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..


Patch Set 15: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7616/


--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 15
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Wed, 10 Nov 2021 06:25:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5256: Force log rotation when max log size exceeded

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17997 )

Change subject: IMPALA-5256: Force log rotation when max_log_size exceeded
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17997
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1b94727180354fe69989ebf3cd1a8f8cda1cf0c3
Gerrit-Change-Number: 17997
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 10 Nov 2021 04:36:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5256: Force log rotation when max log size exceeded

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17997 )

Change subject: IMPALA-5256: Force log rotation when max_log_size exceeded
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7617/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17997
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1b94727180354fe69989ebf3cd1a8f8cda1cf0c3
Gerrit-Change-Number: 17997
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 10 Nov 2021 04:36:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11006: Impalad crashes during query cancel tests

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17999 )

Change subject: IMPALA-11006: Impalad crashes during query cancel tests
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9749/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17999
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia49411f8525734b8d463d9ffbfbce705b90a8d73
Gerrit-Change-Number: 17999
Gerrit-PatchSet: 3
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Wed, 10 Nov 2021 03:54:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11006: Impalad crashes during query cancel tests

2021-11-09 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17999 )

Change subject: IMPALA-11006: Impalad crashes during query cancel tests
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17999/1/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/17999/1/be/src/service/client-request-state.cc@1070
PS1, Line 1070:   re
> My concern is about queries that do not have a coordinator at all, as Finis
CTAS is the only DDL that we call ExecQueryOrDmlRequest() and thus 
FinishExecQueryOrDmlRequest(). See
https://github.com/apache/impala/blob/master/be/src/service/client-request-state.cc#L721.

All other DDLs, if run async, will execute the rest of the code in 
ExecDdlRequestImpl(), or ExecDdlRequestImplSync() if run sync.

Added a IF test for CTAS that requires a coordinator. If true, we further test 
is_cancelled_.


http://gerrit.cloudera.org:8080/#/c/17999/1/be/src/service/client-request-state.cc@1076
PS1, Line 1076:  be no query
> MarkActive/Inactive affects query expiration only, which is not relevant in
MarkInactive() call is removed.



--
To view, visit http://gerrit.cloudera.org:8080/17999
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia49411f8525734b8d463d9ffbfbce705b90a8d73
Gerrit-Change-Number: 17999
Gerrit-PatchSet: 3
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Wed, 10 Nov 2021 03:32:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11006: Impalad crashes during query cancel tests

2021-11-09 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/17999 )

Change subject: IMPALA-11006: Impalad crashes during query cancel tests
..

IMPALA-11006: Impalad crashes during query cancel tests

This patch addresses impalad crash during query cancel tests for
CTAS. The root cause is that the wait thread in impalad assumes
that the coordinator or computation results are always available
once the worker thread async_exec_thread_ is joined. In reality,
this never happens as the query is cancelled.

The patch specifically checks the cancel status when it is found
that the coordinator is not available for CTAS and bails out
immediately. This prevents crash since the subsequent code that
assumes the coordinator or computation results are available is
never reached.

Testing:
  1. Ran core tests successfully.

Change-Id: Ia49411f8525734b8d463d9ffbfbce705b90a8d73
---
M be/src/service/client-request-state.cc
1 file changed, 12 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/17999/3
--
To view, visit http://gerrit.cloudera.org:8080/17999
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia49411f8525734b8d463d9ffbfbce705b90a8d73
Gerrit-Change-Number: 17999
Gerrit-PatchSet: 3
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Qifan Chen 


[Impala-ASF-CR] IMPALA-10943: Add test to verify support for multiple resource and executor pools

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17891 )

Change subject: IMPALA-10943: Add test to verify support for multiple resource 
and executor pools
..

IMPALA-10943: Add test to verify support for multiple resource and
executor pools

This patch adds a test to verify that admission control accounting
works when using multiple coordinators and multiple executor groups
mapped to different resource pools and having different sizes.

Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e
Reviewed-on: http://gerrit.cloudera.org:8080/17891
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M tests/custom_cluster/test_executor_groups.py
1 file changed, 86 insertions(+), 7 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/17891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e
Gerrit-Change-Number: 17891
Gerrit-PatchSet: 6
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 


[Impala-ASF-CR] IMPALA-10943: Add test to verify support for multiple resource and executor pools

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17891 )

Change subject: IMPALA-10943: Add test to verify support for multiple resource 
and executor pools
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e
Gerrit-Change-Number: 17891
Gerrit-PatchSet: 5
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Wed, 10 Nov 2021 02:53:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5256: Force log rotation when max log size exceeded

2021-11-09 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17997 )

Change subject: IMPALA-5256: Force log rotation when max_log_size exceeded
..


Patch Set 3:

> Patch Set 3: Verified-1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7613/

This GVO failed on TestReusePartitionMetadata.test_reuse_partition_meta, flaky 
test reported by IMPALA-10886.


--
To view, visit http://gerrit.cloudera.org:8080/17997
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1b94727180354fe69989ebf3cd1a8f8cda1cf0c3
Gerrit-Change-Number: 17997
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 10 Nov 2021 02:26:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-10992 Planner changes for estimate peak memory

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17994 )

Change subject: [WIP] IMPALA-10992 Planner changes for estimate peak memory
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9748/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb
Gerrit-Change-Number: 17994
Gerrit-PatchSet: 4
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Wed, 10 Nov 2021 02:20:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-10992 Planner changes for estimate peak memory

2021-11-09 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17994 )

Change subject: [WIP] IMPALA-10992 Planner changes for estimate peak memory
..


Patch Set 4:

Changed all references to the default executor membership snapshot to refer to 
a particular executor group via a type available in Analyzer.

Also added the new query option 
max_per_host_memory_threshold_for_regular_queries.


--
To view, visit http://gerrit.cloudera.org:8080/17994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb
Gerrit-Change-Number: 17994
Gerrit-PatchSet: 4
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Wed, 10 Nov 2021 02:00:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-10992 Planner changes for estimate peak memory

2021-11-09 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/17994 )

Change subject: [WIP] IMPALA-10992 Planner changes for estimate peak memory
..

[WIP] IMPALA-10992 Planner changes for estimate peak memory

This patch provides planner support to deal with regular and large query
executor groups. Per a preset threshold, the planner delivers a suitable
plan that runs on the regular query executor group if the total estimated
memory is no more than the threshold. Otherwise, it delieves a different
kind of plan suitable to run on the large query executor group.

A new query option max_per_host_memory_threshold_for_regular_queries is
added to define a threshold T. A query with the estimated per host memory
<= T runs in the regular query executor group. Otherwise, the query runs
in the large query executor group.

Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb
---
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/Frontend.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/util/ExecutorMembershipSnapshot.java
M fe/src/test/java/org/apache/impala/planner/ClusterSizeTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
13 files changed, 151 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/17994/4
--
To view, visit http://gerrit.cloudera.org:8080/17994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb
Gerrit-Change-Number: 17994
Gerrit-PatchSet: 4
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..


Patch Set 15:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9747/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 15
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Wed, 10 Nov 2021 00:36:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..


Patch Set 15:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7616/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 15
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Wed, 10 Nov 2021 00:15:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-11-09 Thread Yu-Wen Lai (Code Review)
Yu-Wen Lai has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..


Patch Set 15:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/17858/13/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17858/13/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3560
PS13, Line 3560:   LOG.debug("Not adding write ids to table {}.{} for event 
{} " +
> nit: add more details in the log message like table name, event id being pr
Done


http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java
File 
fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java:

http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java@252
PS11, Line 252:   exceptions.add(currentId);
> Looked at the implementation of BitSet.get() and I think the following sequ
Yes, that already works because BitSet by default returns false if it is not 
set.


http://gerrit.cloudera.org:8080/#/c/17858/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17858/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4368
PS13, Line 4368:   throw new CatalogException(
> nit: Would be good to add a log message with details about the rollback.
Done


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2487
PS12, Line 2487: } finally {
> nit: Original config should be restored in finally block
Done


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2499
PS12, Line 2499:   
stubCfg.setHms_event_incremental_refresh_transactional_table(true);
> nit: can include test name in the table name for example: test_abort_transa
Done


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2818
PS12, Line 2818:   }
> Instead of creating a new method createTransactionalTable, we can enhance g
I tried and verified that we need to set table params for creating 
transactional tables. Please see 
https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDefaultTransformer.java#L174.



--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 15
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Wed, 10 Nov 2021 00:14:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-11-09 Thread Yu-Wen Lai (Code Review)
Yu-Wen Lai has uploaded a new patch set (#15). ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..

IMPALA-10923: Fine grained table refreshing at partition level events
for transactional tables

To enable fine-grained table refreshing, there are three main changes
in this commit.
1. Maintain validWriteIdList in Catalogd for transactional tables. We
  will keep track of write id changes for partitioned tables by
  AllocWriteIdEvents, CommitTxnEvents, and AbortTxnEvents.
2. Conduct partition level refreshing for transactional tables'
  addPartitionEvents, dropPartitionEvents, and AlterPartitionEvents.
3. Introduce a config
  hms_event_incremental_refresh_transactional_table, which can switch
  on/off the fine-grained table refreshing.

Performance Tests:
A simple test was performed by running insert into one partition for
a partitioned ACID table(50,000 partitions). Below are the time taken
to refresh this table by the event.

StorageBefore  After
=
S3 50 secs 50 msecs
local  3 secs  3 msecs

Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/TableWriteId.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java
M fe/src/main/java/org/apache/impala/hive/common/MutableValidWriteIdList.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A fe/src/test/java/org/apache/impala/catalog/CatalogTableWriteIdTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 
fe/src/test/java/org/apache/impala/hive/common/MutableValidReaderWriteIdListTest.java
17 files changed, 1,130 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/17858/15
--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 15
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 


[Impala-ASF-CR] IMPALA-10940: Pick parts of recent gutil changes from Kudu repo

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17897 )

Change subject: IMPALA-10940: Pick parts of recent gutil changes from Kudu repo
..


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7614/


--
To view, visit http://gerrit.cloudera.org:8080/17897
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9230923111af3c1028f0e71cd37cb5ee6fbec654
Gerrit-Change-Number: 17897
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 10 Nov 2021 00:14:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5256: Force log rotation when max log size exceeded

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17997 )

Change subject: IMPALA-5256: Force log rotation when max_log_size exceeded
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7613/


--
To view, visit http://gerrit.cloudera.org:8080/17997
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1b94727180354fe69989ebf3cd1a8f8cda1cf0c3
Gerrit-Change-Number: 17997
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 09 Nov 2021 23:07:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10943: Add test to verify support for multiple resource and executor pools

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17891 )

Change subject: IMPALA-10943: Add test to verify support for multiple resource 
and executor pools
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e
Gerrit-Change-Number: 17891
Gerrit-PatchSet: 5
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Tue, 09 Nov 2021 20:43:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10943: Add test to verify support for multiple resource and executor pools

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17891 )

Change subject: IMPALA-10943: Add test to verify support for multiple resource 
and executor pools
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7615/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e
Gerrit-Change-Number: 17891
Gerrit-PatchSet: 5
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Tue, 09 Nov 2021 20:43:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10943: Add test to verify support for multiple resource and executor pools

2021-11-09 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has removed a vote on this change.

Change subject: IMPALA-10943: Add test to verify support for multiple resource 
and executor pools
..


Removed Verified-1 by Impala Public Jenkins 
--
To view, visit http://gerrit.cloudera.org:8080/17891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e
Gerrit-Change-Number: 17891
Gerrit-PatchSet: 4
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 


[Impala-ASF-CR] IMPALA-10943: Add test to verify support for multiple resource and executor pools

2021-11-09 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17891 )

Change subject: IMPALA-10943: Add test to verify support for multiple resource 
and executor pools
..


Patch Set 4:

Hit another flaky test IMPALA-11012


--
To view, visit http://gerrit.cloudera.org:8080/17891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e
Gerrit-Change-Number: 17891
Gerrit-PatchSet: 4
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Tue, 09 Nov 2021 20:42:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for incremental reload

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18006 )

Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last 
synced event id for incremental reload
..


Patch Set 1: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7612/


--
To view, visit http://gerrit.cloudera.org:8080/18006
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0f52186a14fb1097f26b88674c7e7cede4f68b4
Gerrit-Change-Number: 18006
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 09 Nov 2021 20:37:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-11-09 Thread Sourabh Goyal (Code Review)
Sourabh Goyal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..


Patch Set 14:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3558
PS12, Line 3558:   tbl = getTable(dbName, tblName);
> In this case, that would make sense to load the table before processing All
Works for me. We can take it up in a follow up patch. However, I would like 
Vihang (since he has already reviewed the PR) to take a look at this to make 
sure we are not missing any issue here.


http://gerrit.cloudera.org:8080/#/c/17858/13/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17858/13/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3560
PS13, Line 3560:   LOG.debug("Not adding write ids since database {} was 
not found", dbName);
nit: add more details in the log message like table name, event id being 
processed.


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2737
PS12, Line 2737: Preconditions.checkArgument(partsFromEvent != null
> I added the check in reloadPartitions because there's where we modify the t
reloadPartitionsFromEvent is a public api of HdfsTable and anyone can call this 
api as long as the caller has access to table object. Preconditions check will 
ensure that any other user of this api in future would have to acquire write 
lock on the table first


http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java
File 
fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java:

http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java@252
PS11, Line 252:   exceptions.add(currentId);
> I don't think we need to rely on the caller. AbortedBits need to be set onl
Looked at the implementation of BitSet.get() and I think the following sequence 
already works but can you just reconfirm? If it does work, I am fine with 
current implementation as well

1. addOpenWriteId(writeId) // add a new writeId which is greater than current 
highWatermark
2. isWriteIdOpen(writeId) -> should return true


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4342
PS12, Line 4342:   part.setWriteId(writeId);
> I assume this method is called by a commit event and it is impossible to ha
No HMS won't be wrong. But adding a check enforce the expected behavior of 
addCommittedWriteIdsAndReloadPartitionsIfExist and in my opinion it is always 
good to enforce stricter checks to avoid any abuse (knowingly or unknowingly)..


http://gerrit.cloudera.org:8080/#/c/17858/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17858/13/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4368
PS13, Line 4368:   hdfsTable.setValidWriteIds(previousWriteIdList);
nit: Would be good to add a log message with details about the rollback.


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2487
PS12, Line 2487: BackendConfig.create(origCfg);
nit: Original config should be restored in finally block


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2818
PS12, Line 2818: params.put("transactional", "true");
> Could you elaborate this? A transactional table must be a managed table. Wh
Instead of creating a new method createTransactionalTable, we can enhance 
getTestTable() by passing an additional parameter which accepts a table type. 
Check this out for example: 
https://gerrit.clou

[Impala-ASF-CR] IMPALA-10940: Pick parts of recent gutil changes from Kudu repo

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17897 )

Change subject: IMPALA-10940: Pick parts of recent gutil changes from Kudu repo
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7614/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17897
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9230923111af3c1028f0e71cd37cb5ee6fbec654
Gerrit-Change-Number: 17897
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 09 Nov 2021 17:45:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10940: Pick parts of recent gutil changes from Kudu repo

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17897 )

Change subject: IMPALA-10940: Pick parts of recent gutil changes from Kudu repo
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17897
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9230923111af3c1028f0e71cd37cb5ee6fbec654
Gerrit-Change-Number: 17897
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 09 Nov 2021 17:45:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10940: Pick parts of recent gutil changes from Kudu repo

2021-11-09 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17897 )

Change subject: IMPALA-10940: Pick parts of recent gutil changes from Kudu repo
..


Patch Set 4:

Hit IMPALA-10983, which is irrelevant to this patch.


--
To view, visit http://gerrit.cloudera.org:8080/17897
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9230923111af3c1028f0e71cd37cb5ee6fbec654
Gerrit-Change-Number: 17897
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 09 Nov 2021 17:44:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10791 Add batching reading for remote temporary files

2021-11-09 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17979 )

Change subject: IMPALA-10791 Add batching reading for remote temporary files
..


Patch Set 6:

(18 comments)

Thanks Qifan for the detailed review.
Changed the name to "read buffer block", previously was "read buffer" or "mem 
block", and MemBlock is the implementation of the "read buffer block".

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h
File be/src/runtime/io/disk-file.h:

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@52
PS5, Line 52:
> Need a description for MemBlock class.
Done


http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@62
PS5, Line 62:
> nit. Return whether the memory is reserved or allocated before deletion to
Done


http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@68
PS5, Line 68:
> nit. may switch the order of the two arguments as the lock appears first in
Done


http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@168
PS5, Line 168:   /// The status of the memory block.
> nit. May describe the content of a memory block a little bit. For example,
Done


http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@200
PS5, Line 200:   DiskFile(const std::string& path, DiskIoMgr* io_mgr, int64_t 
file_size,
> nit missing the comment.
Done


http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@206
PS5, Line 206:
> Since this class is nested within DiskFile class, suggest to call it ReadBu
Done


http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@206
PS5, Line 206:
 :   virtual ~DiskFile() {}
> A little bit confusing on how these two names relate to mem blocks. See my
Previously, a read buffer and a memory block are the same thing in our patch. 
Changed to name it as "read buffer block", because compared to "memory block", 
the TmpFileMgr may be easier to know what the block is used for by its name. 
And the block may not be necessarily in memory, at first we think to have it in 
disk, the name could be easier to extend if later we have some different 
implementation.
Added some comments, hope to make it clearer.


http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@210
PS5, Line 210:  page, each time
> Suggest to use a name relate to memory block, say mem_block_size_?
Changed to read_buffer_block_size_.


http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@213
PS5, Line 213: ilable or not.
> suggest to rename to num_mem_blocks_.
Changed to num_of_read_buffer_blocks_.


http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@326
PS5, Line 326: id UpdateReadBufferMetaData(int64_t offset) {
> I saw the IF check below which should take care of index being 12. Since th
It depends on the file size.
For example, if the file size is 128MB, block size is 8MB. Then there will be 
128/8 = 16 mem blocks per file.
Therefore, there is no limitation that to be only 4 blocks per file, the number 
of blocks is decided by default file size and default block size. In this case, 
the last index is 15 for offset over 120MB.

There could be a case that the actual file size is over the default one a 
little bit, but it is assumed no more than a page size. For example, it could 
be 129MB for the file, therefore, the actual size of the last block can be 9MB, 
but it is not a huge difference compared to other blocks.

The number of read blocks is calculated here:
https://gerrit.cloudera.org/#/c/17979/6/be/src/runtime/tmp-file-mgr.cc@411


http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.cc
File be/src/runtime/io/disk-file.cc:

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.cc@116
PS5, Line 116:
> Since the memory is actually deleted in this method, maybe we should call t
Think for a while, maybe DISABLED is better, because it means the block doesn't 
allow to use anymore. DELETED may mean sth has been deleted, but in our case, 
it doesn't necessarily delete anything for being the state.


http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-io-mgr.cc@396
PS5, Line 396: // Fetch the data from the source file (remote) to the 
destination file (local).
 :   DCHECK(disk_file_dst_ != nullptr);
> May need to add a comment to describe these two files. An example would be
Done


http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-io-mgr.cc@424
PS5, Line 424:   read_buffer_bloc, MemBlockStatus::DISABLED, dstl, 
&read_buffer_lock)) {
> Why can't we use this DISABLED memory block here?  Looks like we should not
Because the memory of the block is released, if this case happens, we can't use 
the memory block u

[Impala-ASF-CR] IMPALA-10791 Add batching reading for remote temporary files

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17979 )

Change subject: IMPALA-10791 Add batching reading for remote temporary files
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9746/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b
Gerrit-Change-Number: 17979
Gerrit-PatchSet: 6
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 09 Nov 2021 17:37:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10791 Add batching reading for remote temporary files

2021-11-09 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/17979 )

Change subject: IMPALA-10791 Add batching reading for remote temporary files
..

IMPALA-10791 Add batching reading for remote temporary files

The patch adds a feature to batching read from a remote temporary
file in order to improve the reading performance for the spilled
remote data.

Originally, the design is to use the local disk file as the buffer
for batching reading from the remote file. But in practice, it
doesn't help to improve the performance. Therefore, the design
is changed to use the memory as the read buffer.

Currently, each TmpFileRemote has two DiskFile, one is for the
remote, and one is for the local buffer. The patch adds MemBlocks
to the local buffer file. Each local buffer file is divided into
several MemBlocks evenly. Moreover, in order to guarantee a
single page not being cut into two parts in different blocks,
the block size could be a little different to each other in
practice. The default block size is the minimum value between
the default file size and
MAX_REMOTE_READ_MEM_BLOCK_THRESHOLD_MB, which is 16MB.

When pinning a page, the system will detect if there is enough
memory for the block that holds the page, if not, we will go
reading the page directly and disable this block, because it may
be good to avoid duplicated reads from the remote fs for the same
content. If the system decides to fetch a block, the block will be
stored in the memory until all of the pages in the block are read
or the query ends.

One challenge of using the memory for the buffer is that, when the
system is lacking of memory when it needs to spill the data. So we
make a restriction to limit the percentage of the memory for the
read buffer to 10% of the total, because right now the impala
process will reserve 20% memory as unused memory by default, using
10% for the emergency case like spilling is reasonable.

Two start options have been added for the new feature.

1. remote_batching_read. Default is false. If set true, the batching
read is enabled.
2. remote_read_memory_buffer_size. Default is 1G. The maximum memory
that can be used by the read buffer. The number also restricted by
the total system memory, which can not exceed 10% of the total
memory.

The patch also increases the MAX_REMOTE_TMPFILE_SIZE_THRESHOLD_MB
from 256 to 512.

Tests:
Ran core and exhaustive tests.
Added and ran TmpFileMgrTest::TestBatchingReadFromRemote.
Added e2e test test_scratch_dirs_batch_reading.

Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b
---
M be/src/runtime/io/disk-file.cc
M be/src/runtime/io/disk-file.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr-internal.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M common/thrift/metrics.json
M tests/custom_cluster/test_scratch_disk.py
12 files changed, 1,225 insertions(+), 150 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/17979/6
--
To view, visit http://gerrit.cloudera.org:8080/17979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b
Gerrit-Change-Number: 17979
Gerrit-PatchSet: 6
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-5256: Force log rotation when max log size exceeded

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17997 )

Change subject: IMPALA-5256: Force log rotation when max_log_size exceeded
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7613/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17997
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1b94727180354fe69989ebf3cd1a8f8cda1cf0c3
Gerrit-Change-Number: 17997
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 09 Nov 2021 16:56:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last synced event id when processing self event

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18005 )

Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last 
synced event id when processing self event
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/18005
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I439f59b25c6b01f9f14a0cb58d48a700d627cd91
Gerrit-Change-Number: 18005
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 09 Nov 2021 16:56:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5256: Force log rotation when max log size exceeded

2021-11-09 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17997 )

Change subject: IMPALA-5256: Force log rotation when max_log_size exceeded
..


Patch Set 3: Code-Review+2

Carrying +2


--
To view, visit http://gerrit.cloudera.org:8080/17997
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1b94727180354fe69989ebf3cd1a8f8cda1cf0c3
Gerrit-Change-Number: 17997
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 09 Nov 2021 16:46:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-9498: Allow returning arrays in select list

2021-11-09 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17811 )

Change subject: WIP IMPALA-9498: Allow returning arrays in select list
..


Patch Set 17:

(8 comments)

Sorry for the late review. I managed to go through the tests and the commit 
msg. Will continue checking the rest tomorrow.

http://gerrit.cloudera.org:8080/#/c/17811/17//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17811/17//COMMIT_MSG@7
PS17, Line 7: WIP
Is this still a work in progress? What would be needed to make this "live"?


http://gerrit.cloudera.org:8080/#/c/17811/17//COMMIT_MSG@15
PS17, Line 15: Things intentionally postponed for later commits:
By later commits do you mean later version of this patch or covered by separate 
Jiras?
If the latest then could you open Jiras to not to forget them and could you 
include the Jira IDs here?


http://gerrit.cloudera.org:8080/#/c/17811/17//COMMIT_MSG@26
PS17, Line 26: Finalize behavior with column masking/row filtering
Same question: is this expected in this patch or a separate one?


http://gerrit.cloudera.org:8080/#/c/17811/17//COMMIT_MSG@29
PS17, Line 29:
Was there anything specific in terms of testing that is worth mentioning here?


http://gerrit.cloudera.org:8080/#/c/17811/17/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test
File 
testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test:

http://gerrit.cloudera.org:8080/#/c/17811/17/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test@34
PS17, Line 34: IllegalStateException: Sorting is not supported for collection 
columns.
The error msg is misleading here: Sorting is not done by collection columns in 
the query. The issue is that you do a sort in a query where you actually query 
arrays as well, but you don't sort by collection columns.


http://gerrit.cloudera.org:8080/#/c/17811/17/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test@37
PS17, Line 37:  in
nit: typo


http://gerrit.cloudera.org:8080/#/c/17811/17/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test@147
PS17, Line 147: 
Could you write a test that has a WHERE filter (on a non-array column)?

Could you create a view that contains arrays and query these arrays from the 
views? I see that there is coverage for inline views but not for regular ones.

These tests are run on a small dataset. Have you performed some tests on a way 
bigger dataset? E.g. multiple files, multiple partitions, where the data 
doesn't fit in a single row batch.


http://gerrit.cloudera.org:8080/#/c/17811/17/tests/query_test/test_nested_types.py
File tests/query_test/test_nested_types.py:

http://gerrit.cloudera.org:8080/#/c/17811/17/tests/query_test/test_nested_types.py@155
PS17, Line 155: nested-array-in-select-list
I find the naming a bit misleading: this is not just for nested arrays, but for 
regular arrays as well, right?



--
To view, visit http://gerrit.cloudera.org:8080/17811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
Gerrit-Change-Number: 17811
Gerrit-PatchSet: 17
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 09 Nov 2021 16:20:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10984: Improve TimestampValue to String casting

2021-11-09 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17980 )

Change subject: IMPALA-10984: Improve TimestampValue to String casting
..

IMPALA-10984: Improve TimestampValue to String casting

TimestampValue::ToString was implemented by concatenating
boost::gregorian::to_iso_extended_string and
boost::posix_time::to_simple_string using stringstream. This involves
multiple string allocations, copying, and might hit lock within
tcmalloc::CentralFreeList. FROM_UNIXTIME and CAST expression that
touches this function can be inefficient if the expression is being
evaluated for millions of rows.

This patch adds method TimestampValue::ToStringVal and reimplements
TimestampValue::ToString by supplying default DateTimeFormatContext if
no pattern was specified. "-MM-dd HH:mm:ss" will be picked as the
default format if the time_ component does not have fractional seconds.
Otherwise, "-MM-dd HH:mm:ss.S" will be picked as the default
format. The chosen DateTimeFormatContext then is passed to
TimestampParser::Format along with date_ and time_ to be formatted into
the string representation. Int to string parsing method is replaced with
FastInt32ToBufferLeft in TimestampParser::Format.

We ran a set of expression benchmarks in a machine with Intel(R)
Core(TM) i7-4790 CPU @ 3.60GHz. This patch gives > 10X performance
improvement for CAST timestamp to string and FROM_UNIXTIME without a
date-time pattern. Following are the detailed results before and after
the patch.

Before the patch:
FromUnixCodegen:   Function   10%ile   50%ile   90%ile 10%ile   
  50%ile 90%ile
   (relative) 
(relative) (relative)
---
literal 36.7   37 37.3 1X   
  1X 1X
  cast(now() as string) 2.31 2.31 2.330.0628X   
 0.0623X0.0626X
cast(now() as string format 'Y .S') 16.9 17.5 17.5 0.459X   
  0.472X 0.471X
 from_unixtime(0,'-MM-dd HH:mm:ss')  6.3  6.3 6.37 0.171X   
   0.17X 0.171X
  from_unixtime(0,'-MM-dd') 11.8 11.8   12  0.32X   
   0.32X 0.322X
   from_unixtime(0) 2.36  2.4  2.40.0644X   
 0.0648X0.0644X

After the patch:
FromUnixCodegen:   Function   10%ile   50%ile   90%ile 10%ile   
  50%ile 90%ile
   (relative) 
(relative) (relative)
---
literal 37.7 38.1 38.4 1X   
  1X 1X
  cast(now() as string) 29.9 30.1 30.2 0.794X   
   0.79X 0.787X
cast(now() as string format 'Y .S') 61.1 61.3 61.6  1.62X   
   1.61X  1.61X
 from_unixtime(0,'-MM-dd HH:mm:ss') 33.6 33.8 34.2 0.892X   
  0.887X 0.892X
  from_unixtime(0,'-MM-dd') 50.5 50.6 50.9  1.34X   
   1.33X  1.33X
   from_unixtime(0)   34 34.2 34.5 0.902X   
  0.896X 0.898X

The literal expression used as the baseline in this benchmark is
"cast('2012-01-01 09:10:11.123456789' as timestamp)".

This patch also updates numbers in expr-benchmark for
BenchmarkTimestampFunctions and tidy up expr-benchmark a bit to clear
its MemPool in between benchmark iteration so that it does not run out
of memory.

Testing:
- Pass core tests.

Change-Id: I4fcb4545d9c9a3fdb38c4db58bb4b1321a429d61
Reviewed-on: http://gerrit.cloudera.org:8080/17980
Reviewed-by: Impala Public Jenkins 
Tested-by: Csaba Ringhofer 
Reviewed-by: Csaba Ringhofer 
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/literal.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/runtime/date-parse-util.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.h
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M be/src/runtime/timestamp-value.inline.h
M be/src/util/min-max-filter.cc
20 files changed, 261 insertions(+), 159 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved
  Csaba Ringhofer: Looks good to me, approved; Veri

[Impala-ASF-CR] IMPALA-10984: Improve TimestampValue to String casting

2021-11-09 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17980 )

Change subject: IMPALA-10984: Improve TimestampValue to String casting
..


Patch Set 10: Code-Review+2

Submitting as the GVO failed twice due to flaky tests.


--
To view, visit http://gerrit.cloudera.org:8080/17980
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4fcb4545d9c9a3fdb38c4db58bb4b1321a429d61
Gerrit-Change-Number: 17980
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 09 Nov 2021 16:13:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10984: Improve TimestampValue to String casting

2021-11-09 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17980 )

Change subject: IMPALA-10984: Improve TimestampValue to String casting
..


Patch Set 10: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17980
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4fcb4545d9c9a3fdb38c4db58bb4b1321a429d61
Gerrit-Change-Number: 17980
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 09 Nov 2021 16:13:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10984: Improve TimestampValue to String casting

2021-11-09 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has removed a vote on this change.

Change subject: IMPALA-10984: Improve TimestampValue to String casting
..


Removed Verified-1 by Impala Public Jenkins 
--
To view, visit http://gerrit.cloudera.org:8080/17980
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I4fcb4545d9c9a3fdb38c4db58bb4b1321a429d61
Gerrit-Change-Number: 17980
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-10984: Improve TimestampValue to String casting

2021-11-09 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has removed a vote on this change.

Change subject: IMPALA-10984: Improve TimestampValue to String casting
..


Removed -Verified by Impala Public Jenkins 
--
To view, visit http://gerrit.cloudera.org:8080/17980
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I4fcb4545d9c9a3fdb38c4db58bb4b1321a429d61
Gerrit-Change-Number: 17980
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-11006: Impalad crashes during query cancel tests

2021-11-09 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17999 )

Change subject: IMPALA-11006: Impalad crashes during query cancel tests
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17999/1/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/17999/1/be/src/service/client-request-state.cc@1070
PS1, Line 1070: else
> CTAS is processed async with a difference before the RPC patch.
My concern is about queries that do not have a coordinator at all, as 
FinishExecQueryOrDmlRequest is not called. My understanding is that there is no 
coordinator in case of DDLs, with the exception of CTAS. In this case it is 
normal to go in this branch - we will work correctly, but I still think that we 
should distinguish between the two cases.


http://gerrit.cloudera.org:8080/#/c/17999/1/be/src/service/client-request-state.cc@1076
PS1, Line 1076: MarkInactive
> I instrumented the code to observe when MarkActive() and MarkInactive() for
MarkActive/Inactive affects query expiration only, which is not relevant in 
case the query was cancelled. ref_count_ should reach 0 only when Impala is 
really not doing any work on the query and we are waiting for the client to do 
another RPC.



--
To view, visit http://gerrit.cloudera.org:8080/17999
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia49411f8525734b8d463d9ffbfbce705b90a8d73
Gerrit-Change-Number: 17999
Gerrit-PatchSet: 2
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Tue, 09 Nov 2021 16:08:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10920: Zipping unnest for arrays

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17983 )

Change subject: IMPALA-10920: Zipping unnest for arrays
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9745/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7
Gerrit-Change-Number: 17983
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 09 Nov 2021 15:09:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10920: Zipping unnest for arrays

2021-11-09 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17983 )

Change subject: IMPALA-10920: Zipping unnest for arrays
..


Patch Set 5:

(43 comments)

http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h
File be/src/exec/unnest-node.h:

http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@40
PS2, Line 40: SlotRefs
> If they are always SlotRefs, can't the type of the vector be std::vectorhttp://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@42
PS2, Line 42: 'UnnestNode::coll_values_', but instead manually retrieve the 
slot values to support
:   /// projection (see cla
> I think it should be mentioned that 'coll_values_' and the class comment ar
Done


http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@46
PS2, Line 46:  UnnestPlanNode.
> 'coll_expr_evals_' has been removed.
Done


http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@46
PS2, Line 46: t in
:   /// InitCollExpr().
> It seems no longer to be true that they are set in Prepare() and set to NUL
Right, these are set to null in a different class not here.
Done


http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@50
PS2, Line 50: Set in InitCollE
> It seems no longer to be the case.
Done


http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@109
PS2, Line 109: .
> Nit: belong.
Done


http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@116
PS2, Line 116: doesn
> Nit: doesn't.
Done


http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@126
PS2, Line 126: These slots are always set to NULL in
 :   /// Open() as a simple projection.
> Does not seem to be true anymore.
In fact it's partially true. These still are projected to NULL but not set in 
Prepare.
Done


http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@131
PS2, Line 131:
> Does not seem to be true.
Done


http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.h@144
PS2, Line 144: length
> typo
Done


http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.cc
File be/src/exec/unnest-node.cc:

http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.cc@60
PS2, Line 60: DCHECK(coll_expr->IsSlotRef(
> This DCHECK is superfluous because slot_ref==null iff coll_expr==nullptr, a
Done


http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.cc@90
PS2, Line 90: num_collections_counter_(nullptr) {
> nit: usually the constant is the second argument.
Done


http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.cc@143
PS2, Line 143:   if (num_tuples > 0) {
> nit: here and at line 146: using std::max would be easier to read IMO
Done


http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.cc@174
PS2, Line 174:
> Nit: should be collections.
Done


http://gerrit.cloudera.org:8080/#/c/17983/2/be/src/exec/unnest-node.cc@218
PS2, Line 218:   (*coll_slot_descs_)[coll_idx]->children_tuple_de
> Can't we use nullptr everywhere? My understanding is that nullptr as tuple
As I remember I had to create the tuple in the row batch and set it to null 
explicitly otherwise there is a point somewhere later where Impala crashes.


http://gerrit.cloudera.org:8080/#/c/17983/2/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/17983/2/fe/src/main/cup/sql-parser.cup@315
PS2, Line 315: KW_UNNEST
> We should add to the comment that we have added a new keyword, making this
Mentioned it in the commit message.


http://gerrit.cloudera.org:8080/#/c/17983/2/fe/src/main/cup/sql-parser.cup@3030
PS2, Line 3030: Providing multiple UNNEST() in the FROM clause is not supported.
> It this something that we want to support later?
It depends on what semantics you want to achieve with multiple unnests. In case 
you want to do zipping unnest between the arrays in the same unnest but do 
joining unnest with the ones on different unnest then this might be feasible, 
but a bit hard to understand.
And anyway zipping and joining unnest is not allowed together atm.
I'm not really convinced we want to support this later as I see no added value.


http://gerrit.cloudera.org:8080/#/c/17983/2/fe/src/main/cup/sql-parser.cup@3091
PS2, Line 3091:
> What if the user provides more aliases than paths?
Indeed, thx :) Added some tests as well to cover this.
Done


http://gerrit.cloudera.org:8080/#/c/17983/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/17983/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@574
PS2, Line 574: tableRefsFromUnnestExpr
> Nit: Should end in an underscore (_).
Done


http://gerrit.cloudera.org:8080/#/c/17983/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@706
P

[Impala-ASF-CR] IMPALA-10920: Zipping unnest for arrays

2021-11-09 Thread Gabor Kaszab (Code Review)
Hello Daniel Becker, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17983

to look at the new patch set (#5).

Change subject: IMPALA-10920: Zipping unnest for arrays
..

IMPALA-10920: Zipping unnest for arrays

This patch provides an unnest implementation for arrays where unnesting
multiple arrays in one query results the items of the arrays being
zipped together instead of joining. There are two different syntaxes
introduced for this purpose:

1: ISO:SQL 2016 compliant syntax:
SELECT a1.item, a2.item
FROM complextypes_arrays t, UNNEST(t.arr1, t.arr2) AS (a1, a2);

2: Postgres compatible syntax:
SELECT UNNEST(arr1), UNNEST(arr2) FROM complextypes_arrays;

Let me show the expected behaviour through the following example:
Inputs: arr1: {1,2,3}, arr2: {11, 12}
After running any of the above queries we expect the following output:
===
| arr1 | arr2 |
===
| 1| 11   |
| 2| 12   |
| 3| NULL |
===

Expected behaviour:
 - When unnesting multiple arrays with zipping unnest then the 'i'th
   item of one array will be put next to the 'i'th item of the other
   arrays in the results.
 - In case the size of the arrays is not the same then the shorter
   arrays will be filled with NULL values up to the size of the longest
   array.

On a sidenote, UNNEST is added to Impala's SQL language as a new
keyword. This might interfere with use cases where a resource (db,
table, column, etc.) is named "UNNEST".

Restrictions:
 - It is not allowed to have WHERE filters on an unnested item of an
   array in the same SELECT query. E.g. this is not allowed:
   SELECT arr1.item
   FROM complextypes_arrays t, UNNEST(t.arr1) WHERE arr1.item < 5;

   Note, that it is allowed to have an outer SELECT around the one
   doing unnests and have a filter there on the unnested items.
 - If there is an outer SELECT filtering on the unnested array's items
   from the inner SELECT then these predicates won't be pushed down to
   the SCAN node. They are rather evaluated in the UNNEST node to
   guarantee result correctness after unnesting.
   Note, this restriction is only active when there are multiple arrays
   being unnested, or in other words when zipping unnest logic is
   required to produce results.
 - It's not allowed to do a zipping and a (traditional) joining unnest
   together in one SELECT query.
 - It's not allowed to perform zipping unnests on arrays from different
   tables.

Testing:
 - Added a bunch of E2E tests to the test suite to cover both syntaxes.
 - Did a manual test run on a table with 1000 rows, 3 array columns
   with size of around 5000 items in each array. I did an unnest on all
   three arrays in one query to see if there are any crashes or
   suspicious slowness when running on this scale.

Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7
---
M be/src/exec/unnest-node.cc
M be/src/exec/unnest-node.h
M common/thrift/PlanNodes.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
A fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
A testdata/ComplexTypesTbl/arrays.orc
A testdata/ComplexTypesTbl/arrays.parq
M testdata/data/README
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
A 
testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-from-clause.test
A 
testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-select-list.test
M tests/query_test/test_nested_types.py
29 files changed, 1,366 insertions(+), 136 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/17983/5
--
To view, visit http://gerrit.cloudera.org:8080/17983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7
Gerrit-Change-Num

[Impala-ASF-CR] IMPALA-11006: Impalad crashes during query cancel tests

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17999 )

Change subject: IMPALA-11006: Impalad crashes during query cancel tests
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9744/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17999
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia49411f8525734b8d463d9ffbfbce705b90a8d73
Gerrit-Change-Number: 17999
Gerrit-PatchSet: 2
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Tue, 09 Nov 2021 14:11:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-9498: Allow returning arrays in select list

2021-11-09 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17811 )

Change subject: WIP IMPALA-9498: Allow returning arrays in select list
..


Patch Set 17:

(3 comments)

Thanks.

http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java
File fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java:

http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java@344
PS15, Line 344: DO: Currently only
> Yes, we want to exclude all other operators than UNION ALL here. Added a co
In this case, shouldn't the error message also say that "EXCEPT" is not 
supported? Like "UNION and EXCEPT are not supported for collection types".


http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/SlotRef.java
File fe/src/main/java/org/apache/impala/analysis/SlotRef.java:

http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@99
PS15, Line 99:
> Removed it here, but not in L82, because the default constructor of the par
You're right, thanks.


http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@2296
PS15, Line 2296: Map type is
> I kept it as "Map type", as we also use "Struct type" in case of structs.
Okay.



--
To view, visit http://gerrit.cloudera.org:8080/17811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
Gerrit-Change-Number: 17811
Gerrit-PatchSet: 17
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 09 Nov 2021 13:57:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11006: Impalad crashes during query cancel tests

2021-11-09 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/17999 )

Change subject: IMPALA-11006: Impalad crashes during query cancel tests
..

IMPALA-11006: Impalad crashes during query cancel tests

This patch addresses impalad crash during query cancel tests. The
root cause is that the wait thread in impalad assumes that the
coordinator or computation results are always available once the
worker thread async_exec_thread_ is joined. In reality, this may
not be true at all once the query is cancelled.

The patch specifically checks the cancel status when it is found
that the coordinator is not available and bails out immediately.
This prevents crash since the subsequent code that assumes the
coordinator or computation results are available is never reached.

Testing:
  1. Ran core tests successfully.

Change-Id: Ia49411f8525734b8d463d9ffbfbce705b90a8d73
---
M be/src/service/client-request-state.cc
1 file changed, 9 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/17999/2
--
To view, visit http://gerrit.cloudera.org:8080/17999
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia49411f8525734b8d463d9ffbfbce705b90a8d73
Gerrit-Change-Number: 17999
Gerrit-PatchSet: 2
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Qifan Chen 


[Impala-ASF-CR] IMPALA-10984: Improve TimestampValue to String casting

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17980 )

Change subject: IMPALA-10984: Improve TimestampValue to String casting
..


Patch Set 10: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7610/


--
To view, visit http://gerrit.cloudera.org:8080/17980
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4fcb4545d9c9a3fdb38c4db58bb4b1321a429d61
Gerrit-Change-Number: 17980
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 09 Nov 2021 13:40:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-9498: Allow returning arrays in select list

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17811 )

Change subject: WIP IMPALA-9498: Allow returning arrays in select list
..


Patch Set 17:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9743/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
Gerrit-Change-Number: 17811
Gerrit-PatchSet: 17
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 09 Nov 2021 13:03:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-9498: Allow returning arrays in select list

2021-11-09 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17811 )

Change subject: WIP IMPALA-9498: Allow returning arrays in select list
..


Patch Set 17:

(19 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1296
PS15, Line 1296: 
   :   /**
   :* Returns an existing or new SlotDescriptor for the given 
path.
   :*
> I think this comment refers to the 1-argument overload on L1293.
Done


http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1315
PS15, Line 1315: SlotDescriptor existingSlotDesc = slotPathMap_.get(key);
   : if (existingSlotDesc != null) return existingSlotDesc;
   : SlotDescriptor result = null;
   : Preconditions.checkState(!slotPath.destType().isMapType());
   : if (slotPath.destType().isArrayType()) {
   :   result = registerCollectionSlotRef(slotPath);
   : } else {
   :   result = addSlotDescriptor(slotPath.getRootDesc());
   : }
   : result.setPath(slotPath);
   : slotPathMap_.put(key, result);
   : registerColumnPrivReq(result);
   : return result;
   :   }
   :
   :   /**
   :* Registers a collection and its descendendants.
   :* Creates a CollectionTableRef for all collections.
   :*/
   :   private SlotDescriptor registerCollectionSlotRef(Path 
slotPath)
   :   throws AnalysisException {
   : 
Preconditions.checkState(slotPath.destType().isArrayType());
   : SlotDescriptor result = null;
   : Path currentPath = slotPath;
   : List rawPath = currentPath.getRawPath();
   : boolean first_level = true;
   : while (currentPath.destType().isArrayType()) {
   :   TableRef tblRef = new TableRef(rawPath, null);
   :
   :   CollectionTableRef collTblRef = new 
CollectionTableRef(tblRef, currentPath, true);
   :   collTblRef.analyze(this);
   :
   :   Preconditions.checkState(collTblRef.getCollectionExpr() 
!= null);
   :   Preconditions.checkState(collTblRef.getCollectionExpr() 
instanceof SlotRef);
   :   SlotDescriptor desc = ((SlotRef) 
collTblRef.getCollectionExpr()).getDesc();
   :   desc.setIsMaterializedRecursively(true);
   :   if (first_level) {
   : result = desc;
   : firs
> Optional: the handling of arrays could be extracted to a function.
Done


http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java
File fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java:

http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java@25
PS15, Line 25: an be either being
 :  * flattened during execution
> Is it still true? The TODO on L104 indicates it is not.
You are right, this is no longer true.


http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java@52
PS15, Line 52:
 :   /**
 :* Create a CollectionTableRef from the original unresolved 
table ref as well as
 :*
> This comment should be for the 2 argument version on L48. The boolean argum
added some explanation about inSelectList


http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java@91
PS15, Line 91: d
> Typo: y
Done


http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java@92
PS15, Line 92: n
> Typo: y
Done


http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java
File fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java:

http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java@344
PS15, Line 344: DO: Currently only
> Do we only have to do this check if there are only UNION ALL operations? Wh
Yes, we want to exclude all other operators than UNION ALL here. Added a 
comment about this.


http://gerrit.cloudera.org:8080/#/c/17811/15/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
File fe/src/main/java/org/apache/impala/analysis

[Impala-ASF-CR] WIP IMPALA-9498: Allow returning arrays in select list

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17811 )

Change subject: WIP IMPALA-9498: Allow returning arrays in select list
..


Patch Set 17:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17811/17/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/17811/17/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1410
PS17, Line 1410: Column column = new Column(String.join(".", 
slotDesc.getPath().getRawPath()), slotDesc.getType(),
line too long (101 > 90)


http://gerrit.cloudera.org:8080/#/c/17811/17/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
File 
fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java:

http://gerrit.cloudera.org:8080/#/c/17811/17/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java@729
PS17, Line 729: // Select with struct in select list. Uses 
functional_orc_def.complextypes_nested_structs
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/17811/17/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java@757
PS17, Line 757: allExcept(TPrivilegeLevel.ALL, 
TPrivilegeLevel.OWNER, TPrivilegeLevel.SELECT)));
line too long (96 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/17811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
Gerrit-Change-Number: 17811
Gerrit-PatchSet: 17
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 09 Nov 2021 12:41:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP IMPALA-9498: Allow returning arrays in select list

2021-11-09 Thread Csaba Ringhofer (Code Review)
Hello Daniel Becker, Gabor Kaszab, Attila Jeges, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17811

to look at the new patch set (#17).

Change subject: WIP IMPALA-9498: Allow returning arrays in select list
..

WIP IMPALA-9498: Allow returning arrays in select list

Until now ARRAYs had to be unnested in queries. This patch adds
support to return ARRAYs as STRINGs (JSON arrays) in select list.

Though STRUCTs are already supported, ARRAYs and STRUCTs nested in
each other are not supported yet.

Things intentionally postponed for later commits:
- Add MAP suppport too - this shouldn't be too tricky after
  ARRAY support, but I don't want to make this patch even more
  complex.
- Unify HS2 / Beeswax logic with the way STRUCTs are handled.
  This could be done in a "final" logic that can handle
  STRUCTS/ARRAYS nested to each other
- Implement "deep copy" and "deep serialize" for ARRAYs in BE.
  This would enable all operators, e.g. ORDER BY and UNION.

TODOs:
- Finalize behavior with column masking/row filtering and
  modify tests accordingly. Currently some related tests are
  broken.

Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
---
M be/src/codegen/codegen-anyval.cc
M be/src/exec/blocking-plan-root-sink.cc
M be/src/exec/buffered-plan-root-sink.cc
M be/src/exec/parquet/parquet-collection-column-reader.cc
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/exprs/expr.h
M be/src/exprs/slot-ref.cc
M be/src/exprs/slot-ref.h
M be/src/runtime/collection-value.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/types.cc
M be/src/runtime/types.h
M be/src/service/hs2-util.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-server.h
M be/src/service/query-result-set.cc
M be/src/service/query-result-set.h
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/catalog/Type.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
A 
testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test
M 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_struct_in_select_list.test
M 
testdata/workloads/functional-query/queries/QueryTest/struct-in-select-list.test
M tests/authorization/test_ranger.py
M tests/query_test/test_nested_types.py
42 files changed, 734 insertions(+), 178 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/17811/17
--
To view, visit http://gerrit.cloudera.org:8080/17811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
Gerrit-Change-Number: 17811
Gerrit-PatchSet: 17
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for incremental reload

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18006 )

Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last 
synced event id for incremental reload
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9742/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18006
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0f52186a14fb1097f26b88674c7e7cede4f68b4
Gerrit-Change-Number: 18006
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 09 Nov 2021 10:57:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last synced event id when processing self event

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18005 )

Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last 
synced event id when processing self event
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9741/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18005
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I439f59b25c6b01f9f14a0cb58d48a700d627cd91
Gerrit-Change-Number: 18005
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 09 Nov 2021 10:55:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for incremental reload

2021-11-09 Thread Sourabh Goyal (Code Review)
Sourabh Goyal has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/18006


Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last 
synced event id for incremental reload
..

[DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for 
incremental reload

Change-Id: If0f52186a14fb1097f26b88674c7e7cede4f68b4
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M 
fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,561 insertions(+), 292 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/18006/1
--
To view, visit http://gerrit.cloudera.org:8080/18006
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If0f52186a14fb1097f26b88674c7e7cede4f68b4
Gerrit-Change-Number: 18006
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal 


[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last synced event id when processing self event

2021-11-09 Thread Sourabh Goyal (Code Review)
Sourabh Goyal has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/18005


Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last 
synced event id when processing self event
..

[DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last synced event id
when processing self event

Change-Id: I439f59b25c6b01f9f14a0cb58d48a700d627cd91
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M 
fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,495 insertions(+), 283 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/18005/1
-- 
To view, visit http://gerrit.cloudera.org:8080/18005
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I439f59b25c6b01f9f14a0cb58d48a700d627cd91
Gerrit-Change-Number: 18005
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal 


[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for incremental reload

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18006 )

Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last 
synced event id for incremental reload
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/18006/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/18006/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2423
PS1, Line 2423:* String, long)} which passes false for {@code 
refreshUpdatedPartitions} argument and ignore
line too long (95 > 90)


http://gerrit.cloudera.org:8080/#/c/18006/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/18006/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5877
PS1, Line 5877:   updatedThriftTable = catalog_.reloadTable(tbl, 
req, resultType, cmdString, -1);
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/18006/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/18006/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2471
PS1, Line 2471:   batchEvents = eventFactory.createBatchEvents(mockEvents, 
eventsProcessor_.getMetrics());
line too long (94 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/18006
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0f52186a14fb1097f26b88674c7e7cede4f68b4
Gerrit-Change-Number: 18006
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 09 Nov 2021 10:37:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for incremental reload

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18006 )

Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last 
synced event id for incremental reload
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7612/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/18006
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0f52186a14fb1097f26b88674c7e7cede4f68b4
Gerrit-Change-Number: 18006
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 09 Nov 2021 10:37:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last synced event id when processing self event

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18005 )

Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last 
synced event id when processing self event
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7611/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/18005
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I439f59b25c6b01f9f14a0cb58d48a700d627cd91
Gerrit-Change-Number: 18005
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 09 Nov 2021 10:34:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last synced event id when processing self event

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18005 )

Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset:24 - set last 
synced event id when processing self event
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18005/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/18005/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2452
PS1, Line 2452:   batchEvents = eventFactory.createBatchEvents(mockEvents, 
eventsProcessor_.getMetrics());
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/18005/1/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
File 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java:

http://gerrit.cloudera.org:8080/#/c/18005/1/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@85
PS1, Line 85: private static boolean flagEnableCatalogCache 
,flagInvalidateCache, flagSyncToLatestEventId;
line too long (96 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/18005
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I439f59b25c6b01f9f14a0cb58d48a700d627cd91
Gerrit-Change-Number: 18005
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 09 Nov 2021 10:34:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-11-09 Thread Sourabh Goyal (Code Review)
Sourabh Goyal has abandoned this change. ( 
http://gerrit.cloudera.org:8080/18004 )

Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to 
latest HMS event id when performing DDL operations via catalog HMS endpoints
..


Abandoned
--
To view, visit http://gerrit.cloudera.org:8080/18004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I8cae9cccdcd94ef2c0f5cfaf5ddb15eaa8c46f5c
Gerrit-Change-Number: 18004
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10920: Zipping unnest for arrays

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17983 )

Change subject: IMPALA-10920: Zipping unnest for arrays
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9740/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7
Gerrit-Change-Number: 17983
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 09 Nov 2021 09:55:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10920: Zipping unnest for arrays

2021-11-09 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17983 )

Change subject: IMPALA-10920: Zipping unnest for arrays
..


Patch Set 4:

(4 comments)

PS3 is fixing a bug with predicates being incorrectly pushed down to SCAN node.
PS4 is fixing the comments from Jenkins plus eventually writing the commit msg.

Will start processing the comments from Csaba and Daniel now. Thanks for taking 
a look!

http://gerrit.cloudera.org:8080/#/c/17983/3/fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java
File fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java:

http://gerrit.cloudera.org:8080/#/c/17983/3/fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java@49
PS3, Line 49:   protected void analyzeImpl(Analyzer analyzer) throws 
AnalysisException {
> line has trailing whitespace
Done


http://gerrit.cloudera.org:8080/#/c/17983/3/fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java@50
PS3, Line 50: Preconditions.checkNotNull(rawPath_);
> line has trailing whitespace
Done


http://gerrit.cloudera.org:8080/#/c/17983/3/fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java@109
PS3, Line 109: Type resolvedType =
> line has trailing whitespace
Done


http://gerrit.cloudera.org:8080/#/c/17983/3/tests/query_test/test_nested_types.py
File tests/query_test/test_nested_types.py:

http://gerrit.cloudera.org:8080/#/c/17983/3/tests/query_test/test_nested_types.py@196
PS3, Line 196:
> flake8: E302 expected 2 blank lines, found 1
Done



--
To view, visit http://gerrit.cloudera.org:8080/17983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7
Gerrit-Change-Number: 17983
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 09 Nov 2021 09:34:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10920: Zipping unnest for arrays

2021-11-09 Thread Gabor Kaszab (Code Review)
Hello Daniel Becker, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17983

to look at the new patch set (#4).

Change subject: IMPALA-10920: Zipping unnest for arrays
..

IMPALA-10920: Zipping unnest for arrays

This patch provides an unnest implementation for arrays where unnesting
multiple arrays in one query results the items of the arrays being
zipped together instead of joining. There are two different syntaxes
introduced for this purpose:

1: ISO:SQL 2016 compliant syntax:
SELECT a1.item, a2.item
FROM complextypes_arrays t, UNNEST(t.arr1, t.arr2) AS (a1, a2);

2: Postgres compatible syntax:
SELECT UNNEST(arr1), UNNEST(arr2) FROM complextypes_arrays;

Let me show the expected behaviour through the following example:
Inputs: arr1: {1,2,3}, arr2: {11, 12}
After running any of the above queries we expect the following output:
===
| arr1 | arr2 |
===
| 1| 11   |
| 2| 12   |
| 3| NULL |
===

Expected behaviour:
 - When unnesting multiple arrays with zipping unnest then the 'i'th
   item of one array will be put next to the 'i'th item of the other
   arrays in the results.
 - In case the size of the arrays is not the same then the shorter
   arrays will be filled with NULL values up to the size of the longest
   array.

Restrictions:
 - It is not allowed to have WHERE filters on an unnested item of an
   array in the same SELECT query. E.g. this is not allowed:
   SELECT arr1.item
   FROM complextypes_arrays t, UNNEST(t.arr1) WHERE arr1.item < 5;

   Note, that it is allowed to have an outer SELECT around the one
   doing unnests and have a filter there on the unnested items.
 - If there is an outer SELECT filtering on the unnested array's items
   from the inner SELECT then these predicates won't be pushed down to
   the SCAN node. They are rather evaluated in the UNNEST node to
   guarantee result correctness after unnesting.
   Note, this restriction is only active when there are multiple arrays
   being unnested, or in other words when zipping unnest logic is
   required to produce results.
 - It's not allowed to do a zipping and a (traditional) joining unnest
   together in one SELECT query.
 - It's not allowed to perform zipping unnests on arrays from different
   tables.

Testing:
 - Added a bunch of E2E tests to the test suite to cover both syntaxes.
 - Did a manual test run on a table with 1000 rows, 3 array columns
   with size of around 5000 items in each array. I did an unnest on all
   three arrays in one query to see if there are any crashes or
   suspicious slowness when running on this scale.

Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7
---
M be/src/exec/unnest-node.cc
M be/src/exec/unnest-node.h
M common/thrift/PlanNodes.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
A fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
A testdata/ComplexTypesTbl/arrays.orc
A testdata/ComplexTypesTbl/arrays.parq
M testdata/data/README
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
A 
testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-from-clause.test
A 
testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-select-list.test
M tests/query_test/test_nested_types.py
29 files changed, 1,348 insertions(+), 132 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/17983/4
--
To view, visit http://gerrit.cloudera.org:8080/17983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7
Gerrit-Change-Number: 17983
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10940: Pick parts of recent gutil changes from Kudu repo

2021-11-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17897 )

Change subject: IMPALA-10940: Pick parts of recent gutil changes from Kudu repo
..


Patch Set 4: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7609/


--
To view, visit http://gerrit.cloudera.org:8080/17897
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9230923111af3c1028f0e71cd37cb5ee6fbec654
Gerrit-Change-Number: 17897
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 09 Nov 2021 08:44:09 +
Gerrit-HasComments: No