[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter

2022-02-17 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18184 )

Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18184/10/be/src/util/tuple-row-compare.h
File be/src/util/tuple-row-compare.h:

http://gerrit.cloudera.org:8080/#/c/18184/10/be/src/util/tuple-row-compare.h@138
PS10, Line 138: ordering_expr_evals_lhs_.data(), 
ordering_expr_evals_rhs_.data(), lhs, rhs);
nit: weird indentation



--
To view, visit http://gerrit.cloudera.org:8080/18184
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202
Gerrit-Change-Number: 18184
Gerrit-PatchSet: 10
Gerrit-Owner: Noemi Pap-Takacs 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 18 Feb 2022 07:55:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter

2022-02-17 Thread Noemi Pap-Takacs (Code Review)
Hello Kurt Deschler, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/18184

to look at the new patch set (#10).

Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
..

IMPALA-10961: Implementing adaptive 3-way quicksort in sorter

Based on a 3-way partitioning implementation by Kurt Deschler.
3-way quicksort performs much better on data with a large number of
duplicates, but has a small regression in case of large NDV.
This adaptive implementation keeps the advantages of both 2-way
and 3-way quicksort. If duplicates are found during pivot selection
(among the 3 randomly selected candidates), the 3-way partitioning
function is called in SortHelper, otherwise partitioning goes 2-way.
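
The adaptive choice described above can be illustrated with a minimal Python
sketch. It is not the Impala C++ code: the function names, the use of a plain
list, and the median-of-three pivot are assumptions made for illustration.
Only the rule "duplicates among the 3 random candidates select 3-way
partitioning, otherwise 2-way" comes from the commit message.

```python
import random


def partition_2way(a, lo, hi, pivot):
    """Hoare-style split of a[lo..hi] around a pivot value taken from that range.

    Returns (left_hi, right_lo): a[lo..left_hi] <= pivot <= a[right_lo..hi].
    """
    i, j = lo, hi
    while i <= j:
        while a[i] < pivot:
            i += 1
        while a[j] > pivot:
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]
            i += 1
            j -= 1
    return j, i


def partition_3way(a, lo, hi, pivot):
    """Dutch-national-flag split: < pivot | == pivot | > pivot.

    Returns (left_hi, right_lo); everything strictly between them equals pivot.
    """
    lt, i, gt = lo, lo, hi
    while i <= gt:
        if a[i] < pivot:
            a[lt], a[i] = a[i], a[lt]
            lt += 1
            i += 1
        elif a[i] > pivot:
            a[i], a[gt] = a[gt], a[i]
            gt -= 1
        else:
            i += 1
    return lt - 1, gt + 1


def adaptive_quicksort(a, lo=0, hi=None, rng=random):
    """Sort a[lo..hi] in place, picking the partitioning scheme per step."""
    if hi is None:
        hi = len(a) - 1
    while lo < hi:
        if hi - lo < 2:  # two elements: a single compare-and-swap is enough
            if a[lo] > a[hi]:
                a[lo], a[hi] = a[hi], a[lo]
            return
        # Three random candidates; the median is the pivot (an assumption here).
        c = sorted(a[k] for k in rng.sample(range(lo, hi + 1), 3))
        pivot = c[1]
        has_duplicates = c[0] == c[1] or c[1] == c[2]
        partition = partition_3way if has_duplicates else partition_2way
        left_hi, right_lo = partition(a, lo, hi, pivot)
        # Recurse into the smaller side, loop on the larger one.
        if left_hi - lo < hi - right_lo:
            adaptive_quicksort(a, lo, left_hi, rng)
            lo = right_lo
        else:
            adaptive_quicksort(a, right_lo, hi, rng)
            hi = left_hi
```

With low-NDV input the 3-way branch removes the whole run of pivot-equal
elements from further work, which is where the gains in the benchmark come
from; with unique data the sketch never leaves the 2-way path.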

Some benchmark results:
On a view created from 4 tpch_parquet lineitem tables
Full sort, 1 node, 1 run - no spills (only in-memory sort is changed)
Time of sorting adaptively during query execution compared to
the original implementation (sort node profile):

 
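+----------------------------------------------+----------------+--------------------+
| Test                                         | Original 2-way | Adaptive Quicksort |
+----------------------------------------------+----------------+--------------------+
| select * order by l_linestatus, NDV=2        | 1              | 0.67               |
| select l_shipmode order by l_shipmode, NDV=7 | 1              | 0.42               |
| select * order by l_shipmode, NDV=7          | 1              | 0.57               |
| large NDV, unique data                       | 1              | 1 (no difference)  |
+----------------------------------------------+----------------+--------------------+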

Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202
---
M be/src/runtime/sorter-internal.h
M be/src/runtime/sorter-ir.cc
M be/src/runtime/sorter.cc
M be/src/util/tuple-row-compare.h
4 files changed, 184 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/18184/10
--
To view, visit http://gerrit.cloudera.org:8080/18184
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202
Gerrit-Change-Number: 18184
Gerrit-PatchSet: 10
Gerrit-Owner: Noemi Pap-Takacs 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] WIP IMPALA-10898: Add runtime IN-list filters for ORC tables

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18141 )

Change subject: WIP IMPALA-10898: Add runtime IN-list filters for ORC tables
..


Patch Set 9:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/10181/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/18141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I25080628233799aa0b6be18d5a832f1385414501
Gerrit-Change-Number: 18141
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 18 Feb 2022 06:47:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-10898: Add runtime IN-list filters for ORC tables

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18141 )

Change subject: WIP IMPALA-10898: Add runtime IN-list filters for ORC tables
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18141/9/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/18141/9/be/src/exec/hdfs-orc-scanner.cc@318
PS9, Line 318:   ADD_COUNTER(scan_node_->runtime_profile(), 
"NumPushedDownRuntimeFilters", TUnit::UNIT);
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/18141/9/tests/query_test/test_runtime_filters.py
File tests/query_test/test_runtime_filters.py:

http://gerrit.cloudera.org:8080/#/c/18141/9/tests/query_test/test_runtime_filters.py@70
PS9, Line 70: [
flake8: E131 continuation line unaligned for hanging indent



--
To view, visit http://gerrit.cloudera.org:8080/18141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I25080628233799aa0b6be18d5a832f1385414501
Gerrit-Change-Number: 18141
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 18 Feb 2022 06:35:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP IMPALA-10898: Add runtime IN-list filters for ORC tables

2022-02-17 Thread Quanlong Huang (Code Review)
Hello Qifan Chen, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/18141

to look at the new patch set (#9).

Change subject: WIP IMPALA-10898: Add runtime IN-list filters for ORC tables
..

WIP IMPALA-10898: Add runtime IN-list filters for ORC tables

ORC files have optional bloom filter indexes for each column. Since
ORC-1.7.0, the C++ reader supports pushing down predicates to skip
unrelated RowGroups. The pushed-down predicates will be evaluated on
file indexes (i.e. statistics and bloom filter indexes). Note that only
EQUALS and IN-list predicates can leverage bloom filter indexes.

Currently Impala has two kinds of runtime filters: bloom filter and
min-max filter. Unfortunately they can't be converted into EQUALS or
IN-list predicates. So they can't leverage the file level bloom filter
indexes.

This patch adds runtime IN-list filters for this purpose. Currently they
are generated only for small build side (e.g. #rows <= 1024) of a
broadcast join. They will only be applied on ORC tables and be pushed
down to the ORC reader (i.e. ORC lib). To avoid exploding the IN-list,
if #rows of the build side exceeds the threshold (1024), we set the
filter to ALWAYS_TRUE. The threshold can be configured by a new query
option, RUNTIME_IN_LIST_FILTER_ENTRY_LIMIT.
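
A minimal Python sketch of the fallback behaviour described above. The class
and method names are hypothetical and this is not the Impala implementation;
it also counts distinct values rather than build-side rows, which is an
assumption. Only the idea of degrading to ALWAYS_TRUE once the limit (1024 by
default) is exceeded comes from the commit message.

```python
class InListFilter:
    """Collects build-side values; degrades to ALWAYS_TRUE past the limit."""

    def __init__(self, entry_limit=1024):
        self.entry_limit = entry_limit
        self.values = set()
        self.always_true = False

    def insert(self, value):
        if self.always_true:
            return  # already degraded; nothing more to track
        self.values.add(value)
        if len(self.values) > self.entry_limit:
            # Too many entries: stop filtering instead of growing the IN-list.
            self.values.clear()
            self.always_true = True

    def may_contain(self, value):
        # An ALWAYS_TRUE filter passes every probe-side value.
        return self.always_true or value in self.values
```

Degrading to ALWAYS_TRUE keeps the filter correct (no row is wrongly dropped)
while bounding the size of the predicate handed to the reader.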

Example query that will benefit from this patch:
  use tpch_orc_def;
  select count(*) from lineitem_bf join (
select * from partsupp, part
where ps_partkey = p_partkey and p_size = 15
  and p_type like '%BRASS' and ps_availqty < 10) v
  on l_partkey = ps_partkey and l_suppkey = ps_suppkey;

The inline-view populates a runtime IN-list filter with 4 items. Note that
we need to re-generate the lineitem table with bloom filter indexes enabled
(e.g. setting orc.bloom.filter.columns to
"l_orderkey,l_partkey,l_suppkey,l_linenumber,l_quantity" in
tblproperties before inserting the data), so the runtime IN-list filter
can have a better filter rate.

Evaluating runtime IN-list filters is much slower than evaluating
runtime bloom filters due to the current simple implementation (i.e.
std::unordered_set). So we disable it at row level.

For visibility, this patch adds two counters in the HdfsScanNode:
 - NumPushedDownPredicates
 - NumPushedDownRuntimeFilters
They reflect the predicates and runtime filters that are pushed down to
the ORC reader.

Ran perf tests on a 3-instance cluster on my desktop using TPC-DS with
scale factor 20. It shows significant improvements in some queries:

+-----------+-------------+--------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+--------+
| Workload  | Query       | File Format        | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval   |
+-----------+-------------+--------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+--------+
| TPCDS(20) | TPCDS-Q67A  | orc / snap / block | 35.07  | 44.01       | I -20.32%  | 0.38%      | 1.38%          | 10    | I -25.69%      | -3.58   | -45.33 |
| TPCDS(20) | TPCDS-Q37   | orc / snap / block | 1.08   | 1.45        | I -25.23%  | 7.14%      | 3.09%          | 10    | I -34.09%      | -3.58   | -12.94 |
| TPCDS(20) | TPCDS-Q70A  | orc / snap / block | 6.30   | 8.60        | I -26.81%  | 5.24%      | 4.21%          | 10    | I -36.67%      | -3.58   | -14.88 |
| TPCDS(20) | TPCDS-Q16   | orc / snap / block | 1.33   | 1.85        | I -28.28%  | 4.98%      | 5.92%          | 10    | I -39.38%      | -3.58   | -12.93 |
| TPCDS(20) | TPCDS-Q18A  | orc / snap / block | 5.70   | 8.06        | I -29.25%  | 3.00%      | 4.12%          | 10    | I -40.30%      | -3.58   | -19.95 |
| TPCDS(20) | TPCDS-Q22A  | orc / snap / block | 2.01   | 2.97        | I -32.21%  | 6.12%      | 5.94%          | 10    | I -47.68%      | -3.58   | -14.05 |
| TPCDS(20) | TPCDS-Q77A  | orc / snap / block | 8.49   | 12.44       | I -31.75%  | 6.44%      | 3.96%          | 10    | I -49.71%      | -3.58   | -16.97 |
| TPCDS(20) | TPCDS-Q75   | orc / snap / block | 7.76   | 12.27       | I -36.76%  | 5.01%      | 3.87%          | 10    | I -59.56%      | -3.58   | -23.26 |
| TPCDS(20) | TPCDS-Q21   | orc / snap / block | 0.71   | 1.27        | I -44.26%  | 4.56%      | 4.24%          | 10    | I -77.31%      | -3.58   | -28.31 |
| TPCDS(20) | TPCDS-Q80A  | orc / snap / block | 9.24   | 20.42       | I -54.77%  | 4.03%      | 3.82%          | 10    | I -123.12%     | -3.58   | -40.90 |
| TPCDS(20) | TPCDS-Q39-1 | orc / snap / block | 1.07   | 2.26        | I -52.74%  | * 23.83% * | 2.60%          | 10    | I -149.68%     | -3.58   | -14.43 |
| TPCDS(20) | TPCDS-Q39-2 | orc / snap / block | 1.00   | 2.33        | I -56.95%  | * 19.53% * | 2.07%          | 10    | I -151.89%

[Impala-ASF-CR] IMPALA-11131: Replace 'cnd cwnd' with 'snd cwnd' in www/rpcz.tmpl

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18249 )

Change subject: IMPALA-11131: Replace 'cnd_cwnd' with 'snd_cwnd' in 
www/rpcz.tmpl
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7854/


--
To view, visit http://gerrit.cloudera.org:8080/18249
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf53f3420b447a895223642df3e9e1c7d0aee7df
Gerrit-Change-Number: 18249
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 18 Feb 2022 04:38:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11130: Upgrade Postgres JDBC driver to 42.3.3

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18244 )

Change subject: IMPALA-11130: Upgrade Postgres JDBC driver to 42.3.3
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/18244
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0827f9e5f24248d2ba0906f7921e38df189c3d9
Gerrit-Change-Number: 18244
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 18 Feb 2022 04:01:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11132 Front-end test PlannerTest.testResourceRequirements can fail

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18250 )

Change subject: IMPALA-11132 Front-end test 
PlannerTest.testResourceRequirements can fail
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/10180/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11c51f76212e1337a7e726097931890c2edab182
Gerrit-Change-Number: 18250
Gerrit-PatchSet: 3
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 17 Feb 2022 23:09:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11132 Front-end test PlannerTest.testResourceRequirements can fail

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18250 )

Change subject: IMPALA-11132 Front-end test 
PlannerTest.testResourceRequirements can fail
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/10179/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11c51f76212e1337a7e726097931890c2edab182
Gerrit-Change-Number: 18250
Gerrit-PatchSet: 2
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 17 Feb 2022 23:00:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11132 Front-end test PlannerTest.testResourceRequirements can fail

2022-02-17 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/18250 )

Change subject: IMPALA-11132 Front-end test 
PlannerTest.testResourceRequirements can fail
..

IMPALA-11132 Front-end test PlannerTest.testResourceRequirements can fail

This patch addresses the potential row count over-estimation against
HBase tables by capping the estimation by the row count when available
from HMS.
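
As a rough illustration of the capping rule, here is a minimal Python sketch.
The function name and the convention that `None` or a negative value means
"no HMS row count available" are assumptions for this example; the actual
change lives in HBaseScanNode.java.

```python
def capped_row_estimate(estimated_rows, hms_row_count=None):
    """Cap a sampled row-count estimate by the HMS row count, if one exists."""
    # Hypothetical convention: None or a negative value means "not available".
    if hms_row_count is not None and hms_row_count >= 0:
        return min(estimated_rows, hms_row_count)
    return estimated_rows
```

For example, an estimate of 12000 rows against an HMS count of 10000 would be
reported as 10000, while an estimate below the HMS count is left unchanged.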

Testing:
  1. PlannerTest.testResourceRequirements passes with an update on the
 cardinality of the HBase tables involved.
  2. Core test [TBD]

Change-Id: I11c51f76212e1337a7e726097931890c2edab182
---
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
2 files changed, 10 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/18250/3
--
To view, visit http://gerrit.cloudera.org:8080/18250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I11c51f76212e1337a7e726097931890c2edab182
Gerrit-Change-Number: 18250
Gerrit-PatchSet: 3
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-11132 Front-end test PlannerTest.testResourceRequirements can fail

2022-02-17 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/18250


Change subject: IMPALA-11132 Front-end test 
PlannerTest.testResourceRequirements can fail
..

IMPALA-11132 Front-end test PlannerTest.testResourceRequirements can fail

This patch addresses the potential row count over-estimation against HBase
tables by capping the estimation by the row count when available from HMS.

Testing:
  1. PlannerTest.testResourceRequirements passes with an update on the
     cardinality of the HBase tables involved.
  2. Core test [TBD]

Change-Id: I11c51f76212e1337a7e726097931890c2edab182
---
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
2 files changed, 10 insertions(+), 8 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/18250/2
--
To view, visit http://gerrit.cloudera.org:8080/18250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I11c51f76212e1337a7e726097931890c2edab182
Gerrit-Change-Number: 18250
Gerrit-PatchSet: 2
Gerrit-Owner: Qifan Chen 


[Impala-ASF-CR] IMPALA-11131: Replace 'cnd cwnd' with 'snd cwnd' in www/rpcz.tmpl

2022-02-17 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18249 )

Change subject: IMPALA-11131: Replace 'cnd_cwnd' with 'snd_cwnd' in 
www/rpcz.tmpl
..


Patch Set 2: Code-Review+2

Thanks for fixing the bug.


--
To view, visit http://gerrit.cloudera.org:8080/18249
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf53f3420b447a895223642df3e9e1c7d0aee7df
Gerrit-Change-Number: 18249
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 17 Feb 2022 22:19:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11131: Replace 'cnd cwnd' with 'snd cwnd' in www/rpcz.tmpl

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18249 )

Change subject: IMPALA-11131: Replace 'cnd_cwnd' with 'snd_cwnd' in 
www/rpcz.tmpl
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/10178/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18249
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf53f3420b447a895223642df3e9e1c7d0aee7df
Gerrit-Change-Number: 18249
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 17 Feb 2022 22:08:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11131: Replace 'cnd cwnd' with 'snd cwnd' in www/rpcz.tmpl

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18249 )

Change subject: IMPALA-11131: Replace 'cnd_cwnd' with 'snd_cwnd' in 
www/rpcz.tmpl
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7854/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/18249
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf53f3420b447a895223642df3e9e1c7d0aee7df
Gerrit-Change-Number: 18249
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 17 Feb 2022 22:04:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11131: Replace 'cnd cwnd' with 'snd cwnd' in www/rpcz.tmpl

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18249 )

Change subject: IMPALA-11131: Replace 'cnd_cwnd' with 'snd_cwnd' in 
www/rpcz.tmpl
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/18249
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf53f3420b447a895223642df3e9e1c7d0aee7df
Gerrit-Change-Number: 18249
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 17 Feb 2022 22:04:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11131: Replace 'cnd cwnd' with 'snd cwnd' in www/rpcz.tmpl

2022-02-17 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18249 )

Change subject: IMPALA-11131: Replace 'cnd_cwnd' with 'snd_cwnd' in 
www/rpcz.tmpl
..


Patch Set 1: Code-Review+2

LGTM


--
To view, visit http://gerrit.cloudera.org:8080/18249
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf53f3420b447a895223642df3e9e1c7d0aee7df
Gerrit-Change-Number: 18249
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 17 Feb 2022 21:58:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11131: Replace 'cnd cwnd' with 'snd cwnd' in www/rpcz.tmpl

2022-02-17 Thread Riza Suminto (Code Review)
Riza Suminto has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/18249


Change subject: IMPALA-11131: Replace 'cnd_cwnd' with 'snd_cwnd' in 
www/rpcz.tmpl
..

IMPALA-11131: Replace 'cnd_cwnd' with 'snd_cwnd' in www/rpcz.tmpl

This patch fixes a typo in www/rpcz.tmpl. The JavaScript at the end of
the tmpl file refers to a JSON key 'cnd_cwnd', but the correct key is
'snd_cwnd'. This typo caused the DataTables script to repeatedly show a
warning dialog with a message such as the following:

DataTables warning: table id=inbound_per_conn_metrics - Requested
unknown parameter '4' for row 0, column 4.
For more information about this error, please see
http://datatables.net/tn/4

Testing:
- Ran Q4 of TPC-DS against tpcds_parquet table manually. Verified that
  the page displays the KRPC inbound/outbound connections table properly
  without any warning dialog.

Change-Id: Ibf53f3420b447a895223642df3e9e1c7d0aee7df
---
M www/rpcz.tmpl
1 file changed, 2 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/18249/1
--
To view, visit http://gerrit.cloudera.org:8080/18249
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibf53f3420b447a895223642df3e9e1c7d0aee7df
Gerrit-Change-Number: 18249
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 


[Impala-ASF-CR] IMPALA-11130: Upgrade Postgres JDBC driver to 42.3.3

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18244 )

Change subject: IMPALA-11130: Upgrade Postgres JDBC driver to 42.3.3
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7853/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/18244
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0827f9e5f24248d2ba0906f7921e38df189c3d9
Gerrit-Change-Number: 18244
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 17 Feb 2022 21:29:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11130: Upgrade Postgres JDBC driver to 42.3.3

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18244 )

Change subject: IMPALA-11130: Upgrade Postgres JDBC driver to 42.3.3
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/10177/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18244
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0827f9e5f24248d2ba0906f7921e38df189c3d9
Gerrit-Change-Number: 18244
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 17 Feb 2022 21:26:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11130: Upgrade Postgres JDBC driver to 42.3.3

2022-02-17 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/18244


Change subject: IMPALA-11130: Upgrade Postgres JDBC driver to 42.3.3
..

IMPALA-11130: Upgrade Postgres JDBC driver to 42.3.3

This addresses CVE-2022-21724.

Change-Id: Ic0827f9e5f24248d2ba0906f7921e38df189c3d9
---
M bin/impala-config.sh
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/18244/1
--
To view, visit http://gerrit.cloudera.org:8080/18244
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic0827f9e5f24248d2ba0906f7921e38df189c3d9
Gerrit-Change-Number: 18244
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-11124: Reuse local TPCH/TPCDS data in testdata loading

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/18233 )

Change subject: IMPALA-11124: Reuse local TPCH/TPCDS data in testdata loading
..

IMPALA-11124: Reuse local TPCH/TPCDS data in testdata loading

When loading testdata for TPC-H/TPC-DS, we first run a preload script to
generate local data, and then upload them to HDFS to be used by Hive.
The preload script currently always generates the data, which is
time-consuming at large scale factors.

This patch modifies the preload scripts to check whether the last run
succeeded, and to reuse the data if it did. Otherwise, they generate the
data and leave a success marker in the data directory.
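
The marker logic can be sketched in Python as below. The real preload scripts
are not shown in this message, so the function name, the `generate` callback
and the `_SUCCESS` marker file name are all hypothetical.

```python
import os


def preload(data_dir, generate, marker_name="_SUCCESS"):
    """Generate data into data_dir unless a previous run left a success marker.

    Returns True if data was generated, False if existing data was reused.
    """
    marker = os.path.join(data_dir, marker_name)
    if os.path.exists(marker):
        return False  # last run succeeded: reuse the local data
    os.makedirs(data_dir, exist_ok=True)
    generate(data_dir)  # may raise; in that case no marker is written
    with open(marker, "w"):
        pass  # an empty file is enough to record success
    return True
```

Writing the marker only after generation finishes means an interrupted run is
not mistaken for a successful one.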

Tests:
 - Verified the scripts locally.

Change-Id: Ied40e599cda009ae0ad88ad13385e7bb86428bb4
Reviewed-on: http://gerrit.cloudera.org:8080/18233
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M testdata/datasets/tpcds/preload
M testdata/datasets/tpch/preload
2 files changed, 14 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/18233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ied40e599cda009ae0ad88ad13385e7bb86428bb4
Gerrit-Change-Number: 18233
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-11124: Reuse local TPCH/TPCDS data in testdata loading

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18233 )

Change subject: IMPALA-11124: Reuse local TPCH/TPCDS data in testdata loading
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/18233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ied40e599cda009ae0ad88ad13385e7bb86428bb4
Gerrit-Change-Number: 18233
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 17 Feb 2022 20:28:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10049: Include RPC call id in slow RPC logs

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18243 )

Change subject: IMPALA-10049: Include RPC call_id in slow RPC logs
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/10176/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18243
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7fb5746fa0be575745a8e168405d43115c425389
Gerrit-Change-Number: 18243
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 17 Feb 2022 20:10:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10049: Include RPC call id in slow RPC logs

2022-02-17 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18243 )

Change subject: IMPALA-10049: Include RPC call_id in slow RPC logs
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/18243
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7fb5746fa0be575745a8e168405d43115c425389
Gerrit-Change-Number: 18243
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 17 Feb 2022 20:03:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10049: Include RPC call id in slow RPC logs

2022-02-17 Thread Riza Suminto (Code Review)
Riza Suminto has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/18243


Change subject: IMPALA-10049: Include RPC call_id in slow RPC logs
..

IMPALA-10049: Include RPC call_id in slow RPC logs

KRPC logs slow RPC traces on the receiver side. The trace log has the
call_id info that matches the sender's. However, our slow RPC logging
on the sender side does not log this call_id, so it is hard to associate
the slow RPC logs between sender and receiver.

With the recent KRPC rebase in IMPALA-10931, we can now log the call_id
on the sender side.

Testing:
I tested this with a low threshold and delays added (the same as we did
in IMPALA-9128):

  start-impala-cluster.py \
  --impalad_args=--impala_slow_rpc_threshold_ms=1 \
  --impalad_args=--debug_actions=END_DATA_STREAM_DELAY:JITTER@3000@1.0

The following is what the logs look like on the sender and receiver sides:

impalad_node1.INFO (sender):
I0217 10:29:36.278754  6606 krpc-data-stream-sender.cc:394] Slow TransmitData RPC (request call id 414) to 127.0.0.1:27002 (fragment_instance_id=d8453c2785c38df4:3473e28b0041): took 343.279ms. Receiver time: 342.780ms Network time: 498.405us

impalad_node2.INFO (receiver):
I0217 10:29:36.278379  6775 rpcz_store.cc:269] Call impala.DataStreamService.TransmitData from 127.0.0.1:39702 (request call id 414) took 342ms. Trace:
I0217 10:29:36.278479  6775 rpcz_store.cc:270] 0217 10:29:35.935586 (+ 0us) impala-service-pool.cc:179] Inserting onto call queue
0217 10:29:36.277730 (+342144us) impala-service-pool.cc:278] Handling call
0217 10:29:36.277859 (+   129us) krpc-data-stream-recvr.cc:397] Deserializing batch
0217 10:29:36.278330 (+   471us) krpc-data-stream-recvr.cc:424] Enqueuing deserialized batch
0217 10:29:36.278369 (+39us) inbound_call.cc:171] Queueing success response
Metrics: {}
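
To illustrate how the shared call id makes the two logs easy to join, here is a minimal Python sketch. It only assumes the "request call id N" wording seen in the sample lines above; the function names are hypothetical and this is not part of the patch. Call ids may not be unique across connections, so a real tool would probably also compare addresses and timestamps.

```python
import re

# Matches the "request call id N" fragment that appears in both sample logs.
# \s+ tolerates a line wrap between "id" and the number.
CALL_ID_RE = re.compile(r"request call id\s+(\d+)")


def extract_call_ids(log_text):
    """Return the set of call ids mentioned anywhere in a chunk of log text."""
    return {int(m) for m in CALL_ID_RE.findall(log_text)}


def common_call_ids(sender_log, receiver_log):
    """Call ids present on both sides, i.e. candidates for the same RPC."""
    return extract_call_ids(sender_log) & extract_call_ids(receiver_log)


sender = ("Slow TransmitData RPC (request call id 414) to 127.0.0.1:27002: "
          "took 343.279ms.")
receiver = ("Call impala.DataStreamService.TransmitData from 127.0.0.1:39702 "
            "(request call id 414) took 342ms.")
matched = common_call_ids(sender, receiver)
print(matched)
```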

Change-Id: I7fb5746fa0be575745a8e168405d43115c425389
---
M be/src/runtime/krpc-data-stream-sender.cc
1 file changed, 4 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/18243/1
--
To view, visit http://gerrit.cloudera.org:8080/18243
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7fb5746fa0be575745a8e168405d43115c425389
Gerrit-Change-Number: 18243
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 


[Impala-ASF-CR] IMPALA-9433: Improved caching of HdfsFileHandles

2022-02-17 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18191 )

Change subject: IMPALA-9433: Improved caching of HdfsFileHandles
..


Patch Set 21:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/runtime/io/handle-cache.h
File be/src/runtime/io/handle-cache.h:

http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/runtime/io/handle-cache.h@118
PS21, Line 118: return file_handle.mtime() == mtime;
The semantics of mtime are unclear to me: why do we handle it separately and 
not as part of the key? My understanding is that we always do lookups with 
the exact file name and mtime, and that they remain constant for the lifetime 
of the handle.
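
A rough sketch of the alternative raised in this comment, with mtime folded into the lookup key. This is a toy Python model with hypothetical names (HandleCacheSketch, open_fn), not the handle-cache.h code; it only shows that a composite key makes a stale-mtime lookup miss without a separate comparison.

```python
class HandleCacheSketch:
    """Toy cache keyed by (file_name, mtime) instead of file_name alone."""

    def __init__(self):
        self._handles = {}  # (file_name, mtime) -> handle object

    def get(self, file_name, mtime, open_fn):
        # A lookup with a different mtime is simply a different key, so no
        # separate "does the cached mtime match" check is needed.
        key = (file_name, mtime)
        if key not in self._handles:
            self._handles[key] = open_fn(file_name, mtime)
        return self._handles[key]


cache = HandleCacheSketch()
opened = []


def fake_open(name, mtime):
    # Stand-in for opening a real file handle; records each open.
    opened.append((name, mtime))
    return object()


h1 = cache.get("/data/f.parq", 100, fake_open)
h2 = cache.get("/data/f.parq", 100, fake_open)  # same key: cache hit
h3 = cache.get("/data/f.parq", 200, fake_open)  # newer mtime: new handle
```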


http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.h
File be/src/util/lru-multi-cache.h:

http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.h@78
PS21, Line 78: /// this is a intrusive (as known as not owning) list, used 
for eviction
nit: an


http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.inline.h
File be/src/util/lru-multi-cache.inline.h:

http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.inline.h@197
PS21, Line 197:   p_value_internal->timestamp_seconds = MonotonicSeconds()
I think that we didn't refresh timestamp_seconds in the old version, so sooner 
or later we closed a handle even if it was used again and again. This can 
influence what happens if the file is deleted: keeping a file handle always 
open means the replica on the given data node cannot be deleted.

https://www.quora.com/What-happens-if-you-attempt-to-delete-a-file-in-HDFS-while-its-being-used-for-a-job
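
The behavioural difference described in this comment can be shown with a small simulation. This is a simplified model under assumed parameters (a one-second eviction tick, a 10 second maximum age); the function name and arguments are hypothetical and do not mirror lru-multi-cache.inline.h.

```python
def still_open(use_times, horizon_s, max_age_s, refresh_on_use):
    """Simulate one cached handle opened at t=0.

    Each second an eviction pass closes the handle if its timestamp is older
    than max_age_s. If refresh_on_use is True, every use resets the timestamp.
    Returns True if the handle is still open at horizon_s.
    """
    timestamp = 0
    uses = set(use_times)
    for now in range(horizon_s + 1):
        if now - timestamp > max_age_s:
            return False  # the eviction pass closes the handle
        if refresh_on_use and now in uses:
            timestamp = now
    return True


uses = range(0, 61, 5)  # the handle is used every 5 seconds
old_behaviour = still_open(uses, 60, 10, refresh_on_use=False)
new_behaviour = still_open(uses, 60, 10, refresh_on_use=True)
print(old_behaviour, new_behaviour)
```

In this model a frequently used handle is eventually closed without the refresh, and stays open indefinitely with it, which is the case the comment is concerned about.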



--
To view, visit http://gerrit.cloudera.org:8080/18191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b5c5e9e2b5db2847ab88c41f667c9ca1b03d51a
Gerrit-Change-Number: 18191
Gerrit-PatchSet: 21
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 17 Feb 2022 19:03:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11053: Impala should be able to read migrated partitioned Iceberg tables

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18240 )

Change subject: IMPALA-11053: Impala should be able to read migrated 
partitioned Iceberg tables
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7851/


--
To view, visit http://gerrit.cloudera.org:8080/18240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iac11a02de709d43532056f71359c49d20c1be2b8
Gerrit-Change-Number: 18240
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 17 Feb 2022 18:51:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9498: Allow returning arrays in select list

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17811 )

Change subject: IMPALA-9498: Allow returning arrays in select list
..

IMPALA-9498: Allow returning arrays in select list

Until now ARRAYs had to be unnested in queries. This patch adds
support to return ARRAYs as STRINGs (JSON arrays) in select list,
for example:
select id, int_array from functional_parquet.complextypestbl where id = 1;
returns: 1, [1,2,3]
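
As an illustration of the output shape only, the following Python sketch renders an array column as a compact JSON string. It reproduces the example row above and says nothing about how Impala formats NULLs or other element types; the helper names are made up.

```python
import json


def render_array(values):
    """Render a list as a compact JSON array string, e.g. [1,2,3]."""
    return json.dumps(values, separators=(",", ":"))


def render_row(row_id, values):
    """Render an (id, array) row the way the example result is written."""
    return f"{row_id}, {render_array(values)}"


row = render_row(1, [1, 2, 3])
print(row)
```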

Returning ARRAYs from inline or HMS views is also supported -
these arrays can be used both in the select list and as relative
table references. Using them as a non-relative table reference is
not supported (IMPALA-11052).

Though STRUCTs are already supported, ARRAYs and STRUCTs nested in
each other are not supported yet.

Things intentionally postponed for later commits:
- Add MAP support too - this shouldn't be too tricky after
  ARRAY support, but I don't want to make this patch even more
  complex.
- Unify HS2 / Beeswax logic with the way STRUCTs are handled.
  This could be done in a "final" logic that can handle
  STRUCTs/ARRAYs nested in each other
- Implement "deep copy" and "deep serialize" for ARRAYs in BE.
  This would enable all operators, e.g. ORDER BY and UNION.

Testing:
- FE tests were added for analysis and authorization
- EE tests were added
- core tests were run

Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
Reviewed-on: http://gerrit.cloudera.org:8080/17811
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/codegen/codegen-anyval.cc
M be/src/exec/blocking-plan-root-sink.cc
M be/src/exec/buffered-plan-root-sink.cc
M be/src/exec/parquet/parquet-collection-column-reader.cc
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/exprs/expr.h
M be/src/exprs/slot-ref.cc
M be/src/exprs/slot-ref.h
M be/src/runtime/collection-value.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/types.cc
M be/src/runtime/types.h
M be/src/service/hs2-util.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-server.h
M be/src/service/query-result-set.cc
M be/src/service/query-result-set.h
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/catalog/Type.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
A 
testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test
M 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test
M 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_struct_in_select_list.test
M 
testdata/workloads/functional-query/queries/QueryTest/struct-in-select-list.test
M 
testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-from-clause.test
M 
testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-select-list.test
M tests/authorization/test_ranger.py
M tests/query_test/test_nested_types.py
56 files changed, 1,158 insertions(+), 254 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified


[Impala-ASF-CR] IMPALA-9498: Allow returning arrays in select list

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17811 )

Change subject: IMPALA-9498: Allow returning arrays in select list
..


Patch Set 34: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
Gerrit-Change-Number: 17811
Gerrit-PatchSet: 34
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 17 Feb 2022 18:51:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11115: Fix hitting DCHECK for brotli and deflate compressions

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18242 )

Change subject: IMPALA-11115: Fix hitting DCHECK for brotli and deflate 
compressions
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/10175/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic38294b108ff3c4aa0b49117df95c5a1b8c60a4b
Gerrit-Change-Number: 18242
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 17 Feb 2022 16:46:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11115: Fix hitting DCHECK for brotli and deflate compressions

2022-02-17 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/18242 )


Change subject: IMPALA-11115: Fix hitting DCHECK for brotli and deflate 
compressions
..

IMPALA-11115: Fix hitting DCHECK for brotli and deflate compressions

The DCHECK was hit when an unsupported compression was included in
enum THdfsCompression but not in COMPRESSION_MAP.
Removed COMPRESSION_MAP as we can get the names from enum
THdfsCompression directly.

In release builds this didn't cause a crash, only a weird error
message ("INVALID" instead of the compression name).

Testing:
- added ee tests that try to insert with brotli and deflate

Change-Id: Ic38294b108ff3c4aa0b49117df95c5a1b8c60a4b
---
M be/src/util/codec.cc
M common/thrift/CatalogObjects.thrift
M 
testdata/workloads/functional-query/queries/QueryTest/insert_parquet_invalid_codec.test
M tests/query_test/test_insert_parquet.py
4 files changed, 17 insertions(+), 23 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/18242/1
--
To view, visit http://gerrit.cloudera.org:8080/18242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic38294b108ff3c4aa0b49117df95c5a1b8c60a4b
Gerrit-Change-Number: 18242
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 


[Impala-ASF-CR] IMPALA-10838: Error when struct returned from WITH()

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17847 )

Change subject: IMPALA-10838: Error when struct returned from WITH()
..


Patch Set 16:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/10174/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17847
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadb9233677355b85d424cc3f22b00b5a3bf61c57
Gerrit-Change-Number: 17847
Gerrit-PatchSet: 16
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Thu, 17 Feb 2022 15:42:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10838: Error when struct returned from WITH()

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17847 )

Change subject: IMPALA-10838: Error when struct returned from WITH()
..


Patch Set 16:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17847/16/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
File fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java:

http://gerrit.cloudera.org:8080/#/c/17847/16/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java@336
PS16, Line 336:   // // List rawPath = new 
ArrayList<>(childSlotRef.getResolvedPath().getRawPath());
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/17847/16/fe/src/main/java/org/apache/impala/analysis/Path.java
File fe/src/main/java/org/apache/impala/analysis/Path.java:

http://gerrit.cloudera.org:8080/#/c/17847/16/fe/src/main/java/org/apache/impala/analysis/Path.java@481
PS16, Line 481:   public static List 
getRawPathWithoutPrefix(List path, List prefix) {
line too long (94 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/17847
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadb9233677355b85d424cc3f22b00b5a3bf61c57
Gerrit-Change-Number: 17847
Gerrit-PatchSet: 16
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Thu, 17 Feb 2022 15:19:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10838: Error when struct returned from WITH()

2022-02-17 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#16). ( 
http://gerrit.cloudera.org:8080/17847 )

Change subject: IMPALA-10838: Error when struct returned from WITH()
..

IMPALA-10838: Error when struct returned from WITH()

The following query fails:
'''
with sub as (
select id, outer_struct
from functional_orc_def.complextypes_nested_structs)
select sub.id, sub.outer_struct.inner_struct2 from sub;
'''

with the following error:
'''
ERROR: IllegalStateException: Illegal reference to non-materialized
tuple: debugname=InlineViewRef sub alias=sub tid=6
'''

while if 'outer_struct.inner_struct2' is added to the select list of the
inline view, the query works as expected.

This change fixes the problem with two modifications:
  - if a field of a struct needs to be materialised, also materialise
    all of its enclosing structs (ancestors)
  - in InlineViewRef, struct fields are inserted into the 'smap' and
    'baseTableSmap' with the appropriate inline view prefix
TODO: Is this just a hack with the labels?

This change also changes the way struct fields are materialised: until
now, if a member of a struct needed to be materialised, the whole
struct, including its other members, was materialised. This
behaviour can lead to using significantly more memory than necessary if,
for example, we query a single member of a large struct. This change
modifies this behaviour so that we only materialise the struct members
that are actually needed.
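
The rule "materialise the requested field plus all of its enclosing structs, and nothing else" can be sketched in a few lines of Python. This is a toy model over dotted path strings; the function name is hypothetical and it is not how the frontend represents slots.

```python
def paths_to_materialise(requested_paths):
    """Return each requested dotted path together with all of its ancestors.

    Sibling members that were not requested are left out.
    """
    result = set()
    for path in requested_paths:
        parts = path.split(".")
        # Add every prefix: "a", "a.b", "a.b.c" for a request of "a.b.c".
        for i in range(1, len(parts) + 1):
            result.add(".".join(parts[:i]))
    return result


needed = paths_to_materialise(["outer_struct.inner_struct2"])
print(sorted(needed))
```

Here outer_struct is included because it encloses the requested field, while other members of outer_struct are not part of the result.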

Tests:
  - added queries that are fixed by this change (including the one
    above) in nested-struct-in-select-list.test
  - added a planner test in
    fe/src/test/java/org/apache/impala/planner/PlannerTest.java that
    asserts that only the required parts of structs are materialised

Change-Id: Iadb9233677355b85d424cc3f22b00b5a3bf61c57
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M 
testdata/workloads/functional-query/queries/QueryTest/nested-struct-in-select-list.test
15 files changed, 776 insertions(+), 92 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/17847/16
--
To view, visit http://gerrit.cloudera.org:8080/17847
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iadb9233677355b85d424cc3f22b00b5a3bf61c57
Gerrit-Change-Number: 17847
Gerrit-PatchSet: 16
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 


[Impala-ASF-CR] IMPALA-11124: Reuse local TPCH/TPCDS data in testdata loading

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18233 )

Change subject: IMPALA-11124: Reuse local TPCH/TPCDS data in testdata loading
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7852/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/18233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ied40e599cda009ae0ad88ad13385e7bb86428bb4
Gerrit-Change-Number: 18233
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 17 Feb 2022 13:48:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11124: Reuse local TPCH/TPCDS data in testdata loading

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18233 )

Change subject: IMPALA-11124: Reuse local TPCH/TPCDS data in testdata loading
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/18233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ied40e599cda009ae0ad88ad13385e7bb86428bb4
Gerrit-Change-Number: 18233
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 17 Feb 2022 13:48:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11053: Impala should be able to read migrated partitioned Iceberg tables

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18240 )

Change subject: IMPALA-11053: Impala should be able to read migrated 
partitioned Iceberg tables
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7851/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/18240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iac11a02de709d43532056f71359c49d20c1be2b8
Gerrit-Change-Number: 18240
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 17 Feb 2022 13:33:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11008: fix incorrect to propagate inferred predicates

2022-02-17 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18234 )

Change subject: IMPALA-11008: fix incorrect to propagate inferred predicates
..


Patch Set 4:

(7 comments)

Thanks for fixing this subtle bug!

http://gerrit.cloudera.org:8080/#/c/18234/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18234/4//COMMIT_MSG@15
PS4, Line 15: select * from (select id is not null and col is null as a from
: (select A.id, B.col from A left join B on A.id = B.id) t ) t
: where a = 1
nit: could you format this a bit for readability? E.g.

 select * from (
   select id is not null and col is null as a
   from (select A.id, B.col from A left join B on A.id = B.id) t
 ) v
 where a = 1


http://gerrit.cloudera.org:8080/#/c/18234/4//COMMIT_MSG@18
PS4, Line 18: (B.id is not null and
: B.col is null = 1)
nit: for readability, could you format this a bit? E.g.

 "(B.id is not null and B.col is null) = 1"


http://gerrit.cloudera.org:8080/#/c/18234/4//COMMIT_MSG@20
PS4, Line 20: (A.id is not null and B.col is null = 1)
nit: could you format this a bit as above?


http://gerrit.cloudera.org:8080/#/c/18234/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/18234/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2009
PS4, Line 2009: if (hasOuterJoinedTuple && 
isTrueWithNullSlots(srcConjunct)) continue;
Is it possible to reject the conjunct here? This code aims to reject conjuncts 
that can't be pushed down to the nullable side. Unfortunately, 
isTrueWithNullSlots() can't cover this case. We probably need a stricter 
check, i.e. only substitute the slots that belong to this tuple with nulls 
(instead of substituting all slots).
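
A toy check of the reasoning in this comment, using the predicate from the commit message. Python's None stands in for SQL NULL, and the function and argument names are illustrative, not Analyzer code. Since IS NULL and IS NOT NULL never yield NULL themselves, plain booleans are enough here.

```python
def predicate(a_id, b_col):
    """(A.id IS NOT NULL AND B.col IS NULL) = 1, with None modelling NULL."""
    return ((a_id is not None) and (b_col is None)) == True  # noqa: E712


# Substituting NULL for all slots: the predicate is false, so a check of this
# kind does not flag the conjunct.
all_slots_null = predicate(None, None)

# Substituting NULL only for the slots of the nullable tuple B, keeping a
# non-NULL A.id: the predicate is true for null-extended rows, so it must not
# be evaluated at the scan of B.
only_b_slots_null = predicate(1, None)

print(all_slots_null, only_b_slots_null)
```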


http://gerrit.cloudera.org:8080/#/c/18234/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2043
PS4, Line 2043:   // It is incorrect to propagate predicates inferred 
from equi-join conjuncts
  :   // into a plan subtree that is on the nullable side 
of an outer join if the
  :   // predicate is not null-filtering conditions for the 
nullable side.
  :   // For example:
  :   // select * from (select id is not null and col is 
null as a from (select A.id,
  :   // B.col from A left join B on A.id = B.id) t ) t 
where a = 1
  :   // In this query the inferred predicate (B.id is not 
null and B.col is null = 1)
  :   // should not be evaluated at the scanner of B.
Could you also add some comments about "ojsmap"? I think we want to substitute 
the non-outer-join slots first and do some checks on the conjunct before the 
final substitution.


http://gerrit.cloudera.org:8080/#/c/18234/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2647
PS4, Line 2647:   registerOjEqualSlots(outerSlot, innerSlot);
Should we only do this for outer joins? Now we also register anti join slots.


http://gerrit.cloudera.org:8080/#/c/18234/4/testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test:

http://gerrit.cloudera.org:8080/#/c/18234/4/testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test@1617
PS4, Line 1617: |  other predicates: CASE WHEN t1.id IS NOT NULL AND 
t2.some_nulls IS NULL THEN TRUE ELSE NULL END IS NOT NULL
Here we are missing a runtime filter, which causes the test failure: 
https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/15787/testReport/junit/org.apache.impala.planner/PlannerTest/testPredicatePropagation/



--
To view, visit http://gerrit.cloudera.org:8080/18234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e64230f6d0c2b9ef1560186ceba349a5920ccdf
Gerrit-Change-Number: 18234
Gerrit-PatchSet: 4
Gerrit-Owner: Xianqing He 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 17 Feb 2022 12:43:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10948: Default scale and DecimalType

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18224 )

Change subject: IMPALA-10948: Default scale and DecimalType
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/10173/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I003220b6e2ef39d25d1c33df62c8432803fdc6eb
Gerrit-Change-Number: 18224
Gerrit-PatchSet: 5
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 17 Feb 2022 12:33:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9498: Allow returning arrays in select list

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17811 )

Change subject: IMPALA-9498: Allow returning arrays in select list
..


Patch Set 34:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7850/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
Gerrit-Change-Number: 17811
Gerrit-PatchSet: 34
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 17 Feb 2022 12:16:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9498: Allow returning arrays in select list

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17811 )

Change subject: IMPALA-9498: Allow returning arrays in select list
..


Patch Set 34: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
Gerrit-Change-Number: 17811
Gerrit-PatchSet: 34
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 17 Feb 2022 12:16:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11053: Impala should be able to read migrated partitioned Iceberg tables

2022-02-17 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18240 )

Change subject: IMPALA-11053: Impala should be able to read migrated 
partitioned Iceberg tables
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18240/2/be/src/exec/file-metadata-utils.cc
File be/src/exec/file-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/18240/2/be/src/exec/file-metadata-utils.cc@69
PS2, Line 69: auto transforms = ice_metadata->partition_keys();
: if (transforms != nullptr) {
Should be moved after L58.



--
To view, visit http://gerrit.cloudera.org:8080/18240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iac11a02de709d43532056f71359c49d20c1be2b8
Gerrit-Change-Number: 18240
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 17 Feb 2022 12:09:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10948: Default scale and DecimalType

2022-02-17 Thread Code Review
Gergely Fürnstáhl has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/18224 )

Change subject: IMPALA-10948: Default scale and DecimalType
..

IMPALA-10948: Default scale and DecimalType

Added a default of 0 for scale if it is not set, to comply with the Parquet spec.

Wrapped reading scale and precision in a function to support reading
LogicalType.DecimalType if it is set, falling back to old ones if it is
not, for backward compatibility.

Regenerated the bad_parquet_decimals table with DecimalType filled in, and
moved the missing scale test, as it is no longer a bad table.

Added the no_scale.parquet table to test reading a table whose scale is not set.

Checked it with parquet-tools:
message schema {
  optional fixed_len_byte_array(2) d1 (DECIMAL(4,0));
}
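
The lookup order described in this commit message (prefer LogicalType.DecimalType, fall back to the old fields, default the scale to 0) can be sketched as follows. The Python classes and function names are hypothetical stand-ins for the Thrift schema element, not the code in parquet-data-converter.h.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DecimalLogicalType:
    # Stand-in for LogicalType.DecimalType.
    precision: int
    scale: int


@dataclass
class SchemaElement:
    # Stand-in for a Parquet schema element; None means "not set".
    precision: Optional[int] = None
    scale: Optional[int] = None
    decimal_type: Optional[DecimalLogicalType] = None


def get_scale(element):
    """Prefer the logical type, then the old field, then default to 0."""
    if element.decimal_type is not None:
        return element.decimal_type.scale
    if element.scale is not None:
        return element.scale
    return 0


def get_precision(element):
    """Prefer the logical type, then the old field; precision is required."""
    if element.decimal_type is not None:
        return element.decimal_type.precision
    if element.precision is None:
        raise ValueError("precision is not set")
    return element.precision


no_scale = SchemaElement(precision=4)  # like d1 (DECIMAL(4,0)) above
with_logical = SchemaElement(precision=9, scale=1,
                             decimal_type=DecimalLogicalType(4, 2))
print(get_precision(no_scale), get_scale(no_scale))
```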

Change-Id: I003220b6e2ef39d25d1c33df62c8432803fdc6eb
---
M be/src/exec/parquet/parquet-data-converter.h
M be/src/exec/parquet/parquet-metadata-utils.cc
M testdata/bad_parquet_data/README
M testdata/bad_parquet_data/illegal_decimals.parq
M testdata/data/README
A testdata/data/no_scale.parquet
A testdata/workloads/functional-query/queries/QueryTest/default-scale.test
M 
testdata/workloads/functional-query/queries/QueryTest/parquet-abort-on-error.test
M 
testdata/workloads/functional-query/queries/QueryTest/parquet-continue-on-error.test
M tests/query_test/test_scanners.py
10 files changed, 126 insertions(+), 48 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/18224/5
--
To view, visit http://gerrit.cloudera.org:8080/18224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I003220b6e2ef39d25d1c33df62c8432803fdc6eb
Gerrit-Change-Number: 18224
Gerrit-PatchSet: 5
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10948: Default scale and DecimalType

2022-02-17 Thread Code Review
Gergely Fürnstáhl has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18224 )

Change subject: IMPALA-10948: Default scale and DecimalType
..


Patch Set 4:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/18224/4/be/src/exec/parquet/parquet-data-converter.h
File be/src/exec/parquet/parquet-data-converter.h:

http://gerrit.cloudera.org:8080/#/c/18224/4/be/src/exec/parquet/parquet-data-converter.h@74
PS4, Line 74: if (parquet_element_->__isset.logicalType
: && parquet_element_->logicalType.__isset.DECIMAL)
:   return parquet_element_->logicalType.DECIMAL.precision;
> nit: multi-line if stmts should use braces.
Done


http://gerrit.cloudera.org:8080/#/c/18224/4/be/src/exec/parquet/parquet-metadata-utils.cc
File be/src/exec/parquet/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/18224/4/be/src/exec/parquet/parquet-metadata-utils.cc@208
PS4, Line 208: Precision is required, this should be called after checking 
IsPrecisionSet
> We could add a DCHECK(IsPrecisionSet(schema_element));
Done
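
A minimal sketch of the suggested precondition check. Standard `assert` stands in for Impala's `DCHECK` macro here, and `SchemaElementSketch` is a hypothetical, reduced struct; the function names follow the ones mentioned in the comment, but their bodies are assumptions rather than the code under review.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, reduced view of a Parquet schema element.
struct DecimalAnnotation {
  bool is_set = false;
  int32_t precision = 0;
};

struct SchemaElementSketch {
  bool has_precision = false;  // legacy field flag
  int32_t precision = 0;       // legacy field value
  DecimalAnnotation decimal;   // newer logical-type annotation
};

// True if the precision is available from either source.
bool IsPrecisionSet(const SchemaElementSketch& e) {
  return e.decimal.is_set || e.has_precision;
}

// Precision is required, so callers must check IsPrecisionSet() first.
int32_t GetPrecision(const SchemaElementSketch& e) {
  assert(IsPrecisionSet(e));  // stand-in for DCHECK(IsPrecisionSet(schema_element))
  if (e.decimal.is_set) {
    return e.decimal.precision;
  }
  return e.precision;
}
```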


http://gerrit.cloudera.org:8080/#/c/18224/4/testdata/data/README
File testdata/data/README:

http://gerrit.cloudera.org:8080/#/c/18224/4/testdata/data/README@682
PS4, Line 682: .__set_scale(1);
> Do we need this line?
I set scale=1 to show that the default 0 is not read from this field but is 
supplied by the getter.


http://gerrit.cloudera.org:8080/#/c/18224/4/testdata/data/README@684
PS4, Line 684: +  file_metadata_.schema[1].logicalType.DECIMAL.scale = 1;
 : +  file_metadata_.schema[1].logicalType.__isset.DECIMAL = false;
> Are these lines needed?
I set scale=1 to show that the default 0 is not read from this field but is 
supplied by the getter.

Possibly __isset.logicalType = false is enough, but I think the intention is 
clearer this way.


http://gerrit.cloudera.org:8080/#/c/18224/4/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/18224/4/tests/query_test/test_scanners.py@393
PS4, Line 393:
> nit: unnecessary blank line
Done


http://gerrit.cloudera.org:8080/#/c/18224/4/tests/query_test/test_scanners.py@394
PS4, Line 394: default-scale
> default-scale.test is not added to the PS.
Done



--
To view, visit http://gerrit.cloudera.org:8080/18224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I003220b6e2ef39d25d1c33df62c8432803fdc6eb
Gerrit-Change-Number: 18224
Gerrit-PatchSet: 4
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 17 Feb 2022 12:08:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10737: Optimize the number of Iceberg API Metadata requests

2022-02-17 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18226 )

Change subject: IMPALA-10737: Optimize the number of Iceberg API Metadata 
requests
..


Patch Set 1:

(1 comment)

The change looks good. I only had one ask about caching which was overlooked 
earlier.

http://gerrit.cloudera.org:8080/#/c/18226/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/18226/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1036
PS1, Line 1036: TGetPartialCatalogObjectResponse resp = sendRequest(req);
Currently we always send the request for every query. Can we add caching? See 
loadWithCaching() above.
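
One way to picture the request: memoize the response per key so that only the first lookup sends a request. The sketch below is in C++ and every name in it is hypothetical; it does not reproduce the signature or behavior of the Java loadWithCaching() helper referenced in the comment, which may handle versioning, eviction, and concurrency differently.

```cpp
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical memoizing wrapper: calls `loader` only on a cache miss.
class ResponseCache {
 public:
  const std::string& LoadWithCaching(
      const std::string& key, const std::function<std::string()>& loader) {
    auto it = cache_.find(key);
    if (it == cache_.end()) {
      ++misses_;
      it = cache_.emplace(key, loader()).first;
    }
    return it->second;
  }

  // Number of times the loader actually ran.
  int misses() const { return misses_; }

 private:
  std::unordered_map<std::string, std::string> cache_;
  int misses_ = 0;
};
```

A real cache would also need an invalidation rule (for example, keying on the table's catalog version), which this sketch leaves out.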



--
To view, visit http://gerrit.cloudera.org:8080/18226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e62a1fb9753ea1b022c7763047d9ccfd1d27d62
Gerrit-Change-Number: 18226
Gerrit-PatchSet: 1
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 17 Feb 2022 12:04:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9498: Allow returning arrays in select list

2022-02-17 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17811 )

Change subject: IMPALA-9498: Allow returning arrays in select list
..


Patch Set 33: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17811/27/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test
File testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test:

http://gerrit.cloudera.org:8080/#/c/17811/27/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test@105
PS27, Line 105: Changing a column to a different type
> Sorry, I didn't read your comment carefully
Thanks for looking into this!



--
To view, visit http://gerrit.cloudera.org:8080/17811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
Gerrit-Change-Number: 17811
Gerrit-PatchSet: 33
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 17 Feb 2022 11:53:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11053: Impala should be able to read migrated partitioned Iceberg tables

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18240 )

Change subject: IMPALA-11053: Impala should be able to read migrated 
partitioned Iceberg tables
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/10172/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iac11a02de709d43532056f71359c49d20c1be2b8
Gerrit-Change-Number: 18240
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Thu, 17 Feb 2022 11:22:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10948: Default scale and DecimalType

2022-02-17 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18224 )

Change subject: IMPALA-10948: Default scale and DecimalType
..


Patch Set 4:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/18224/4/be/src/exec/parquet/parquet-data-converter.h
File be/src/exec/parquet/parquet-data-converter.h:

http://gerrit.cloudera.org:8080/#/c/18224/4/be/src/exec/parquet/parquet-data-converter.h@74
PS4, Line 74: if (parquet_element_->__isset.logicalType
: && parquet_element_->logicalType.__isset.DECIMAL)
:   return parquet_element_->logicalType.DECIMAL.precision;
nit: multi-line if stmts should use braces.


http://gerrit.cloudera.org:8080/#/c/18224/4/be/src/exec/parquet/parquet-metadata-utils.cc
File be/src/exec/parquet/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/18224/4/be/src/exec/parquet/parquet-metadata-utils.cc@208
PS4, Line 208: Precision is required, this should be called after checking 
IsPrecisionSet
We could add a DCHECK(IsPrecisionSet(schema_element));


http://gerrit.cloudera.org:8080/#/c/18224/4/testdata/data/README
File testdata/data/README:

http://gerrit.cloudera.org:8080/#/c/18224/4/testdata/data/README@682
PS4, Line 682: .__set_scale(1);
Do we need this line?


http://gerrit.cloudera.org:8080/#/c/18224/4/testdata/data/README@684
PS4, Line 684: +  file_metadata_.schema[1].logicalType.DECIMAL.scale = 1;
 : +  file_metadata_.schema[1].logicalType.__isset.DECIMAL = false;
Are these lines needed?


http://gerrit.cloudera.org:8080/#/c/18224/4/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/18224/4/tests/query_test/test_scanners.py@393
PS4, Line 393:
nit: unnecessary blank line


http://gerrit.cloudera.org:8080/#/c/18224/4/tests/query_test/test_scanners.py@394
PS4, Line 394: default-scale
default-scale.test is not added to the PS.



--
To view, visit http://gerrit.cloudera.org:8080/18224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I003220b6e2ef39d25d1c33df62c8432803fdc6eb
Gerrit-Change-Number: 18224
Gerrit-PatchSet: 4
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 17 Feb 2022 11:11:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11008: fix incorrect to propagate inferred predicates

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18234 )

Change subject: IMPALA-11008: fix incorrect to propagate inferred predicates
..


Patch Set 4: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7849/


--
To view, visit http://gerrit.cloudera.org:8080/18234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e64230f6d0c2b9ef1560186ceba349a5920ccdf
Gerrit-Change-Number: 18234
Gerrit-PatchSet: 4
Gerrit-Owner: Xianqing He 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 17 Feb 2022 11:08:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10948: Default scale and DecimalType

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18224 )

Change subject: IMPALA-10948: Default scale and DecimalType
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/10171/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I003220b6e2ef39d25d1c33df62c8432803fdc6eb
Gerrit-Change-Number: 18224
Gerrit-PatchSet: 4
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 17 Feb 2022 10:59:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11053: Impala should be able to read migrated partitioned Iceberg tables

2022-02-17 Thread Zoltan Borok-Nagy (Code Review)
Hello Tamas Mate, Gergely Fürnstáhl, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/18240

to look at the new patch set (#2).

Change subject: IMPALA-11053: Impala should be able to read migrated 
partitioned Iceberg tables
..

IMPALA-11053: Impala should be able to read migrated partitioned Iceberg tables

When Hive (and probably other engines as well) converts a legacy Hive
table to Iceberg it doesn't rewrite the data files. This means that the
data files have neither write ids nor partition column data. Currently
Impala expects the partition columns to be present in the data files,
so it is not able to read converted partitioned tables.

With this patch Impala loads partition values from the Iceberg metadata.
The extra metadata information is attached to the file descriptor
objects and propagated to the scanners. This metadata contains the
Iceberg data file format (later it could be used to handle mixed-format
tables), and partition data.

We use the partition data in the HdfsScanner to create the template
tuple that contains the partition values of identity-partitioned
columns. This applies not only to migrated tables but to all Iceberg
tables with identity partitions, which means we also save some IO
and CPU time for such columns. The partition information could also
be used for Dynamic Partition Pruning later.
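
The template-tuple idea can be sketched as follows: partition values known from metadata are written once into a template row, and every output row starts as a copy of that template before the columns read from the data file are filled in. This is a simplified illustration with hypothetical names and string-valued slots, not the HdfsScanner implementation.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Row = std::vector<std::string>;
using SlotValues = std::vector<std::pair<std::size_t, std::string>>;

// Builds a row whose partition slots are pre-filled from metadata.
Row MakeTemplateRow(std::size_t num_cols, const SlotValues& partition_values) {
  Row tmpl(num_cols);
  for (const auto& pv : partition_values) {
    tmpl.at(pv.first) = pv.second;
  }
  return tmpl;
}

// Starts from the template, then fills the slots read from the data file.
Row MaterializeRow(const Row& tmpl, const SlotValues& file_values) {
  Row row = tmpl;
  for (const auto& fv : file_values) {
    row.at(fv.first) = fv.second;
  }
  return row;
}
```

Because the partition slots come from the template, the scanner in this sketch never has to read or decode those columns from the file.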

We use the (human-readable) string representation of the partition data
when storing them in the flat buffers. This helps debugging and also
provides the needed flexibility when the partition columns
evolve (e.g. INT -> BIGINT, DECIMAL(4,2) -> DECIMAL(6,2)).
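
To illustrate why a string representation tolerates type evolution: the stored text is parsed according to the column's current type, so a value written while the column was INT still parses after the column becomes BIGINT. The sketch below is a simplified assumption about how such parsing could work; the enum, the function name, and the use of `std::stoll` are illustrative and not taken from the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>

enum class ColType { INT, BIGINT };

// Parses `text` as an integer of the column's *current* type.
// Returns false if the text is not a number or does not fit the type.
bool ParsePartitionValue(const std::string& text, ColType type, int64_t* out) {
  long long v = 0;
  try {
    std::size_t pos = 0;
    v = std::stoll(text, &pos);
    if (pos != text.size()) return false;  // trailing garbage
  } catch (const std::logic_error&) {
    // std::invalid_argument and std::out_of_range both derive from logic_error.
    return false;
  }
  if (type == ColType::INT &&
      (v < std::numeric_limits<int32_t>::min() ||
       v > std::numeric_limits<int32_t>::max())) {
    return false;
  }
  *out = v;
  return true;
}
```

The same stored string "11" is accepted for both column types, while a value that only fits BIGINT is rejected for an INT column.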

Testing
 * e2e test for all data types that can be used to partition a table
 * e2e test for migrated partitioned table + schema evolution (without
   renaming columns)
 * e2e test for a table where all columns are used as identity partitions

Change-Id: Iac11a02de709d43532056f71359c49d20c1be2b8
---
M be/src/exec/CMakeLists.txt
A be/src/exec/file-metadata-utils.cc
A be/src/exec/file-metadata-utils.h
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/orc-column-readers.cc
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/parquet-metadata-utils.h
M be/src/runtime/dml-exec-state.cc
M be/src/scheduling/scheduler.cc
M common/fbs/CatalogObjects.fbs
M common/fbs/IcebergObjects.fbs
M common/protobuf/planner.proto
M common/thrift/CatalogObjects.thrift
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/283c54cb-5a45-4a2c-bca8-4bfa0e61cdbd-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/snap-6167994413873848621-1-283c54cb-5a45-4a2c-bca8-4bfa0e61cdbd.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/v1.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/v2.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/p_bool=true/p_int=1/p_bigint=11/p_float=1.1/p_double=2.222/p_decimal=123.321/p_date=2022-02-22/p_string=impala/00_0
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/db72fbf2-f9f6-4985-8a5f-fd9f632f2c77-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/snap-7569365419257304230-1-db72fbf2-f9f6-4985-8a5f-fd9f632f2c77.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/v1.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/v2.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/version-hint.text
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/p_bool=true/p_int=1/p_bigint=11/p_float=1.1/p_double=2.222/p_decimal=123.321/p_date=2022-02-22/p_string=impala/00_0
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/2d05a7d4-c229-44c3-860e-e77e46e71a19-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/snap-6654673546382518186-1-2d05a7d4-c229-44c3-860e-e77e46e71a19.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/v1.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/v2.metadata.json
A 

[Impala-ASF-CR] IMPALA-10948: Default scale and DecimalType

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18224 )

Change subject: IMPALA-10948: Default scale and DecimalType
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/10170/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I003220b6e2ef39d25d1c33df62c8432803fdc6eb
Gerrit-Change-Number: 18224
Gerrit-PatchSet: 3
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 17 Feb 2022 10:52:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10948: Default scale and DecimalType

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18224 )

Change subject: IMPALA-10948: Default scale and DecimalType
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/10169/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I003220b6e2ef39d25d1c33df62c8432803fdc6eb
Gerrit-Change-Number: 18224
Gerrit-PatchSet: 2
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 17 Feb 2022 10:49:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11053: Impala should be able to read migrated partitioned Iceberg tables

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18240 )

Change subject: IMPALA-11053: Impala should be able to read migrated 
partitioned Iceberg tables
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/10168/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/18240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iac11a02de709d43532056f71359c49d20c1be2b8
Gerrit-Change-Number: 18240
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Thu, 17 Feb 2022 10:42:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10948: Default scale and DecimalType

2022-02-17 Thread Code Review
Gergely Fürnstáhl has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/18224 )

Change subject: IMPALA-10948: Default scale and DecimalType
..

IMPALA-10948: Default scale and DecimalType

Added default 0 for scale if it is not set to comply with parquet spec.

Wrapped reading scale and precision in a function to support reading
LogicalType.DecimalType if it is set, falling back to old ones if it is
not, for backward compatibility.

Regenerated bad_parquet_decimals table with filled DecimalType, moved
missing scale test, as it is no longer a bad table.

Added no_scale.parquet table to test reading table without set scale.

Checked it with parquet-tools:
message schema {
  optional fixed_len_byte_array(2) d1 (DECIMAL(4,0));
}

Change-Id: I003220b6e2ef39d25d1c33df62c8432803fdc6eb
---
M be/src/exec/parquet/parquet-data-converter.h
M be/src/exec/parquet/parquet-metadata-utils.cc
M testdata/bad_parquet_data/README
M testdata/bad_parquet_data/illegal_decimals.parq
M testdata/data/README
A testdata/data/no_scale.parquet
M testdata/workloads/functional-query/queries/QueryTest/parquet-abort-on-error.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-continue-on-error.test
M tests/query_test/test_scanners.py
9 files changed, 120 insertions(+), 48 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/18224/4
--
To view, visit http://gerrit.cloudera.org:8080/18224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I003220b6e2ef39d25d1c33df62c8432803fdc6eb
Gerrit-Change-Number: 18224
Gerrit-PatchSet: 4
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10948: Default scale and DecimalType

2022-02-17 Thread Code Review
Gergely Fürnstáhl has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/18224 )

Change subject: IMPALA-10948: Default scale and DecimalType
..

IMPALA-10948: Default scale and DecimalType

Added default 0 for scale if it is not set to comply with parquet spec.

Wrapped reading scale and precision in a function to support reading
LogicalType.DecimalType if it is set, falling back to old ones if it is
not, for backward compatibility.

Regenerated bad_parquet_decimals table with filled DecimalType, moved
missing scale test, as it is no longer a bad table.

Added no_scale.parquet table to test reading table without set scale.

Change-Id: I003220b6e2ef39d25d1c33df62c8432803fdc6eb
---
M be/src/exec/parquet/parquet-data-converter.h
M be/src/exec/parquet/parquet-metadata-utils.cc
M testdata/bad_parquet_data/README
M testdata/bad_parquet_data/illegal_decimals.parq
M testdata/data/README
A testdata/data/no_scale.parquet
M testdata/workloads/functional-query/queries/QueryTest/parquet-abort-on-error.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-continue-on-error.test
M tests/query_test/test_scanners.py
9 files changed, 120 insertions(+), 48 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/18224/3
--
To view, visit http://gerrit.cloudera.org:8080/18224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I003220b6e2ef39d25d1c33df62c8432803fdc6eb
Gerrit-Change-Number: 18224
Gerrit-PatchSet: 3
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10948: Default scale and DecimalType

2022-02-17 Thread Code Review
Gergely Fürnstáhl has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/18224 )


Change subject: IMPALA-10948: Default scale and DecimalType
..

IMPALA-10948: Default scale and DecimalType

Added default 0 for scale if it is not set to comply with parquet spec.

Wrapped reading scale and precision in a function to support reading
LogicalType.DecimalType if it is set, falling back to old ones if it is
not, for backward compatibility.

Regenerated bad_parquet_decimals table with filled DecimalType, moved
missing scale test, as it is no longer a bad table.

Added no_scale.parquet table to test reading table without set scale.

Change-Id: I003220b6e2ef39d25d1c33df62c8432803fdc6eb
---
M be/src/exec/parquet/parquet-data-converter.h
M be/src/exec/parquet/parquet-metadata-utils.cc
M testdata/bad_parquet_data/README
M testdata/bad_parquet_data/illegal_decimals.parq
M testdata/data/README
A testdata/data/no_scale.parquet
M testdata/workloads/functional-query/queries/QueryTest/parquet-abort-on-error.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-continue-on-error.test
M tests/query_test/test_scanners.py
9 files changed, 117 insertions(+), 48 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/18224/2
--
To view, visit http://gerrit.cloudera.org:8080/18224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I003220b6e2ef39d25d1c33df62c8432803fdc6eb
Gerrit-Change-Number: 18224
Gerrit-PatchSet: 2
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-11053: Impala should be able to read migrated partitioned Iceberg tables

2022-02-17 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/18240 )


Change subject: IMPALA-11053: Impala should be able to read migrated 
partitioned Iceberg tables
..

IMPALA-11053: Impala should be able to read migrated partitioned Iceberg tables

When Hive (and probably other engines as well) converts a legacy Hive
table to Iceberg it doesn't rewrite the data files. This means that the
data files have neither write ids nor partition column data. Currently
Impala expects the partition columns to be present in the data files,
so it is not able to read converted partitioned tables.

With this patch Impala loads partition values from the Iceberg metadata.
The extra metadata information is attached to the file descriptor
objects and propagated to the scanners. This metadata contains the
Iceberg data file format (later it could be used to handle mixed-format
tables), and partition data.

We use the partition data in the HdfsScanner to create the template
tuple that contains the partition values of identity-partitioned
columns. This applies not only to migrated tables but to all Iceberg
tables with identity partitions, which means we also save some IO
and CPU time for such columns. The partition information could also
be used for Dynamic Partition Pruning later.

We use the (human-readable) string representation of the partition data
when storing them in the flat buffers. This helps debugging and also
provides the needed flexibility when the partition columns
evolve (e.g. INT -> BIGINT, DECIMAL(4,2) -> DECIMAL(6,2)).

Testing
 * e2e test for all data types that can be used to partition a table
 * e2e test for migrated partitioned table + schema evolution (without
   renaming columns)
 * e2e test for a table where all columns are used as identity partitions

Change-Id: Iac11a02de709d43532056f71359c49d20c1be2b8
---
M be/src/exec/CMakeLists.txt
A be/src/exec/file-metadata-utils.cc
A be/src/exec/file-metadata-utils.h
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/orc-column-readers.cc
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/parquet-metadata-utils.h
M be/src/runtime/dml-exec-state.cc
M be/src/scheduling/scheduler.cc
M common/fbs/CatalogObjects.fbs
M common/fbs/IcebergObjects.fbs
M common/protobuf/planner.proto
M common/thrift/CatalogObjects.thrift
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/283c54cb-5a45-4a2c-bca8-4bfa0e61cdbd-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/snap-6167994413873848621-1-283c54cb-5a45-4a2c-bca8-4bfa0e61cdbd.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/v1.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/v2.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/p_bool=true/p_int=1/p_bigint=11/p_float=1.1/p_double=2.222/p_decimal=123.321/p_date=2022-02-22/p_string=impala/00_0
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/db72fbf2-f9f6-4985-8a5f-fd9f632f2c77-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/snap-7569365419257304230-1-db72fbf2-f9f6-4985-8a5f-fd9f632f2c77.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/v1.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/v2.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/version-hint.text
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/p_bool=true/p_int=1/p_bigint=11/p_float=1.1/p_double=2.222/p_decimal=123.321/p_date=2022-02-22/p_string=impala/00_0
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/2d05a7d4-c229-44c3-860e-e77e46e71a19-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/snap-6654673546382518186-1-2d05a7d4-c229-44c3-860e-e77e46e71a19.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/v1.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/v2.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/version-hint.text
A 

[Impala-ASF-CR] IMPALA-9498: Allow returning arrays in select list

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17811 )

Change subject: IMPALA-9498: Allow returning arrays in select list
..


Patch Set 33:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/10167/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
Gerrit-Change-Number: 17811
Gerrit-PatchSet: 33
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 17 Feb 2022 10:13:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9498: Allow returning arrays in select list

2022-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17811 )

Change subject: IMPALA-9498: Allow returning arrays in select list
..


Patch Set 32:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/10166/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
Gerrit-Change-Number: 17811
Gerrit-PatchSet: 32
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 17 Feb 2022 10:02:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9498: Allow returning arrays in select list

2022-02-17 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17811 )

Change subject: IMPALA-9498: Allow returning arrays in select list
..


Patch Set 33:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17811/27/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test
File testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test:

http://gerrit.cloudera.org:8080/#/c/17811/27/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test@105
PS27, Line 105: Changing a column to a different type
> Sorry that I should not mark this test. The query I have question is
Sorry, I didn't read your comment carefully.

It turned out that consts always lead to a non-pass-through union, probably 
because its tuple won't contain the const members, so its size will 
differ from the result size we expect: 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/planner/UnionNode.java#L213



--
To view, visit http://gerrit.cloudera.org:8080/17811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
Gerrit-Change-Number: 17811
Gerrit-PatchSet: 33
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 17 Feb 2022 09:51:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9498: Allow returning arrays in select list

2022-02-17 Thread Csaba Ringhofer (Code Review)
Hello Quanlong Huang, Qifan Chen, Daniel Becker, Gabor Kaszab, Attila Jeges, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17811

to look at the new patch set (#33).

Change subject: IMPALA-9498: Allow returning arrays in select list
..

IMPALA-9498: Allow returning arrays in select list

Until now ARRAYs had to be unnested in queries. This patch adds
support for returning ARRAYs as STRINGs (JSON arrays) in the select
list, for example:
select id, int_array from functional_parquet.complextypestbl where id = 1;
returns: 1, [1,2,3]
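
To illustrate the JSON-array representation described above, a client could decode the returned STRING with a standard JSON parser. This is only a minimal Python sketch, assuming the row reaches the client as the tuple (1, "[1,2,3]"); the decode_array_column helper and the row literal are hypothetical and not part of this patch:

```python
import json


def decode_array_column(value):
    """Parse an ARRAY column that was returned as a JSON-array STRING.

    Hypothetical helper for illustration; not part of Impala or this patch.
    """
    # Assumption: a NULL array reaches the client as None.
    if value is None:
        return None
    parsed = json.loads(value)
    if not isinstance(parsed, list):
        raise ValueError("expected a JSON array, got %r" % (parsed,))
    return parsed


# Hypothetical row shaped like the example result above: id, int_array.
row = (1, "[1,2,3]")
print(row[0], decode_array_column(row[1]))
```

The helper rejects any value that does not parse to a list, so a column that is not an ARRAY rendered as JSON fails loudly instead of being silently accepted.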

Returning ARRAYs from inline or HMS views is also supported -
these arrays can be used either in the select list or as relative
table references. Using them as non-relative table references is
not supported (IMPALA-11052).

Though STRUCTs are already supported, ARRAYs and STRUCTs nested in
each other are not supported yet.

Things intentionally postponed for later commits:
- Add MAP support too - this shouldn't be too tricky after
  ARRAY support, but I don't want to make this patch even more
  complex.
- Unify HS2 / Beeswax logic with the way STRUCTs are handled.
  This could be done in a "final" logic that can handle
  STRUCTs/ARRAYs nested in each other.
- Implement "deep copy" and "deep serialize" for ARRAYs in BE.
  This would enable all operators, e.g. ORDER BY and UNION.

Testing:
- FE tests were added for analysis and authorization
- EE tests were added
- core tests were run

Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
---
M be/src/codegen/codegen-anyval.cc
M be/src/exec/blocking-plan-root-sink.cc
M be/src/exec/buffered-plan-root-sink.cc
M be/src/exec/parquet/parquet-collection-column-reader.cc
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/exprs/expr.h
M be/src/exprs/slot-ref.cc
M be/src/exprs/slot-ref.h
M be/src/runtime/collection-value.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/types.cc
M be/src/runtime/types.h
M be/src/service/hs2-util.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-server.h
M be/src/service/query-result-set.cc
M be/src/service/query-result-set.h
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/catalog/Type.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
A testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test
M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test
M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_struct_in_select_list.test
M testdata/workloads/functional-query/queries/QueryTest/struct-in-select-list.test
M testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-from-clause.test
M testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-select-list.test
M tests/authorization/test_ranger.py
M tests/query_test/test_nested_types.py
56 files changed, 1,158 insertions(+), 254 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF 

[Impala-ASF-CR] IMPALA-9498: Allow returning arrays in select list

2022-02-17 Thread Csaba Ringhofer (Code Review)
Hello Quanlong Huang, Qifan Chen, Daniel Becker, Gabor Kaszab, Attila Jeges, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17811

to look at the new patch set (#32).

Change subject: IMPALA-9498: Allow returning arrays in select list
..

IMPALA-9498: Allow returning arrays in select list

Until now ARRAYs had to be unnested in queries. This patch adds
support for returning ARRAYs as STRINGs (JSON arrays) in the select
list, for example:
select id, int_array from functional_parquet.complextypestbl where id = 1;
returns: 1, [1,2,3]

Returning ARRAYs from inline or HMS views is also supported -
these arrays can be used either in the select list or as relative
table references. Using them as non-relative table references is
not supported (IMPALA-11052).

Though STRUCTs are already supported, ARRAYs and STRUCTs nested in
each other are not supported yet.

Things intentionally postponed for later commits:
- Add MAP support too - this shouldn't be too tricky after
  ARRAY support, but I don't want to make this patch even more
  complex.
- Unify HS2 / Beeswax logic with the way STRUCTs are handled.
  This could be done in a "final" logic that can handle
  STRUCTs/ARRAYs nested in each other.
- Implement "deep copy" and "deep serialize" for ARRAYs in BE.
  This would enable all operators, e.g. ORDER BY and UNION.

Testing:
- FE tests were added for analysis and authorization
- EE tests were added
- core tests were run

Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
---
M be/src/codegen/codegen-anyval.cc
M be/src/exec/blocking-plan-root-sink.cc
M be/src/exec/buffered-plan-root-sink.cc
M be/src/exec/parquet/parquet-collection-column-reader.cc
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/exprs/expr.h
M be/src/exprs/slot-ref.cc
M be/src/exprs/slot-ref.h
M be/src/runtime/collection-value.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/types.cc
M be/src/runtime/types.h
M be/src/service/hs2-util.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-server.h
M be/src/service/query-result-set.cc
M be/src/service/query-result-set.h
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/catalog/Type.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
A testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test
M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test
M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_struct_in_select_list.test
M testdata/workloads/functional-query/queries/QueryTest/struct-in-select-list.test
M testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-from-clause.test
M testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-select-list.test
M tests/authorization/test_ranger.py
M tests/query_test/test_nested_types.py
56 files changed, 1,158 insertions(+), 254 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF 

[Impala-ASF-CR] IMPALA-11124: Reuse local TPCH/TPCDS data in testdata loading

2022-02-17 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18233 )

Change subject: IMPALA-11124: Reuse local TPCH/TPCDS data in testdata loading
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/18233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ied40e599cda009ae0ad88ad13385e7bb86428bb4
Gerrit-Change-Number: 18233
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 17 Feb 2022 08:40:53 +
Gerrit-HasComments: No