[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19429 )

Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in 
resolvePathWithMasking
..

IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking

resolvePathWithMasking() is a wrapper on resolvePath() to further
resolve nested columns inside the table masking view. When it was
added, complex types in the select list hadn't been supported yet. So
the table masking view can't expose complex type columns directly in the
select list. Any paths in nested types will be further resolved inside
the table masking view in resolvePathWithMasking().

Take the following query as an example:
  select id, nested_struct.* from complextypestbl;
If Ranger column-masking/row-filter policies applied on the table, the
query is rewritten as
  select id, nested_struct.* from (
select mask(id) from complextypestbl
where row-filtering-condition
  ) t;
Table masking view "t" can't expose the nested column "nested_struct".
So we further resolve "nested_struct" inside the inlineView to use the
masked table "complextypestbl". The underlying TableRef is expected to
be a BaseTableRef.

Paths that don't reference nested columns should be resolved and
returned directly (just like the original resolvePath() does). E.g.
  select v.* from masked_view v
is rewritten to
  select v.* from (
select mask(c1), mask(c2), ..., mask(cn)
from masked_view
where row-filtering-condition
  ) v;

The STAR path "v.*" should be resolved directly. However, it's treated
as a nested column unexpectedly. The code then tries to resolve it
inside the table "masked_view" and found "masked_view" is not a table so
throws the IllegalStateException.

These are the current conditions for identifying nested STAR paths:
 - The destType is STRUCT
 - And the resolved path is rooted at a valid tuple descriptor

They don't really recognize the nested struct columns because STAR paths
on table/view also match these conditions. When the STAR path is an
expansion on a catalog table/view, the root tuple descriptor is
exactly the output tuple of the table/view. The destType is the type of
the tuple descriptor which is always a StructType.

Note that STAR paths on other nested types, i.e. array/map, are invalid.
So the first condition matches for all valid cases. The second condition
also matches all valid cases since both the table/view and struct STAR
expansion have the path rooted at a valid tuple descriptor.

This patch fixes the check for nested struct STAR path by checking
the matched types instead. Note that if "v.*" is a table/view expansion,
the matched type list is empty. If "v.*" is a struct column expansion,
the matched type list contains the STRUCT column type.

Tests:
 - Add missing coverage on STAR paths (v.*) on masked views.

Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77
Reviewed-on: http://gerrit.cloudera.org:8080/19429
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
M 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test
4 files changed, 166 insertions(+), 5 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/19429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77
Gerrit-Change-Number: 19429
Gerrit-PatchSet: 10
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19429 )

Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in 
resolvePathWithMasking
..


Patch Set 9: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/19429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77
Gerrit-Change-Number: 19429
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 01 Feb 2023 04:23:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11756: Disable auto analyze table triggered by Hive

2023-01-31 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19464 )

Change subject: IMPALA-11756: Disable auto analyze table triggered by Hive
..


Patch Set 1:

Thanks Riza for the quick review!


--
To view, visit http://gerrit.cloudera.org:8080/19464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc23100ae74d6cb07894053a26806e01258065ec
Gerrit-Change-Number: 19464
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 01 Feb 2023 03:47:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11756: Disable auto analyze table triggered by Hive

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19464 )

Change subject: IMPALA-11756: Disable auto analyze table triggered by Hive
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9012/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/19464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc23100ae74d6cb07894053a26806e01258065ec
Gerrit-Change-Number: 19464
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 01 Feb 2023 03:47:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11756: Disable auto analyze table triggered by Hive

2023-01-31 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19464 )

Change subject: IMPALA-11756: Disable auto analyze table triggered by Hive
..


Patch Set 1: Code-Review+2

This looks OK to me.


--
To view, visit http://gerrit.cloudera.org:8080/19464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc23100ae74d6cb07894053a26806e01258065ec
Gerrit-Change-Number: 19464
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 01 Feb 2023 03:45:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
..


Patch Set 13:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12280/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 13
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 01 Feb 2023 02:53:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11859: Add bytes-read-encrypted metric

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19461 )

Change subject: IMPALA-11859: Add bytes-read-encrypted metric
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/19461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9dbc194a4bc31cb0e01545fb6032a0853db60f34
Gerrit-Change-Number: 19461
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Wed, 01 Feb 2023 02:44:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11756: Disable auto analyze table triggered by Hive

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19464 )

Change subject: IMPALA-11756: Disable auto analyze table triggered by Hive
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12279/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc23100ae74d6cb07894053a26806e01258065ec
Gerrit-Change-Number: 19464
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 01 Feb 2023 02:43:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

2023-01-31 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#13). ( 
http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
..

IMPALA-11809: Support non unique primary key for Kudu

Kudu engine recently enables the auto-incrementing column feature. The
feature works by appending a system generated auto-incrementing column
to the primary key columns to guarantee the uniqueness on primary key
when the primary key columns can be non-unique. The non unique primary
key columns and the auto-incrementing column form the effective unique
composite primary key.

This auto-incrementing column is named as 'auto_incrementing_id' with
big int type. The assignment to it during insertion is automatic so
insertion statements should not specify values for auto-incrementing
column. In current Kudu implementation, there is no central key provider
for auto-incrementing columns. It uses a per tablet-server global
counter to assign values for auto-incrementing columns. So the values
of auto-incrementing columns are not unique in a Kudu table, but unique
in a tablet-server.

This patch upgraded Kudu version to 345fd44ca3 to pick up Kudu changes
needed for supporting non-unique primary key. It added syntactic
support for creating Kudu table with non unique primary key.
When creating a Kudu table, specifying PRIMARY KEY is optional.
If there is no primary key attribute specified, the partition key
columes will be promoted as non unique primary key if those columns
are the beginning columns of the table.
New column "key_unique" is added to the output of 'describe' table
command for Kudu table.

Examples of CREATE TABLE statement with non unique primary key:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) partitions 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

SELECT statement does not show the system generated auto-incrementing
column unless the column is explicitly specified in the select list.
Auto-incrementing column cannot be added, removed or renamed with
ALTER TABLE statements.
UPSERT operation is not supported now for Kudu tables with auto
incrementing column due to limitation in Kudu engine.

Testing:
 - Ran manual test in impala-shell with queries to create tables
   with non unique primary key, and tested insert/update/delete
   operations for Kudu tables with non unique primary key.
 - Added front end tests, and end to end unit tests for Kudu tables
   with non unique primary key.
 - Passed exhaustive test.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAlterColStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M 

[Impala-ASF-CR] IMPALA-11756: Disable auto analyze table triggered by Hive

2023-01-31 Thread Quanlong Huang (Code Review)
Quanlong Huang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19464


Change subject: IMPALA-11756: Disable auto analyze table triggered by Hive
..

IMPALA-11756: Disable auto analyze table triggered by Hive

Hive will automatically recompute stats asynchronously after major
compactions. The tasks could fail if the table is removed, which usually
happens on our test tables. The failure could cause further failures on
other query-based compactions in the same session, and results in test
failures. See the analysis in the JIRA for an example.

We don't rely on Hive to recompute the stats. So we can disable this
feature to avoid the issue. This patch turns off
"hive.compactor.gather.stats" to disable it.

Tests:
 - Ran exhaustive tests

Change-Id: Idc23100ae74d6cb07894053a26806e01258065ec
---
M fe/src/test/resources/hive-site.xml.py
1 file changed, 5 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/19464/1
--
To view, visit http://gerrit.cloudera.org:8080/19464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Idc23100ae74d6cb07894053a26806e01258065ec
Gerrit-Change-Number: 19464
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 


[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19429 )

Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in 
resolvePathWithMasking
..


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9011/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/19429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77
Gerrit-Change-Number: 19429
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 31 Jan 2023 23:13:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19429 )

Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in 
resolvePathWithMasking
..


Patch Set 9: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/19429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77
Gerrit-Change-Number: 19429
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 31 Jan 2023 23:13:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking

2023-01-31 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19429 )

Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in 
resolvePathWithMasking
..


Patch Set 8:

> Patch Set 8: Verified-1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/9007/

The failure is due to IMPALA-11572:
https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/6947


--
To view, visit http://gerrit.cloudera.org:8080/19429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77
Gerrit-Change-Number: 19429
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 31 Jan 2023 23:12:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11745: Add Hive's ESRI geospatial functions as builtins

2023-01-31 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19425 )

Change subject: IMPALA-11745: Add Hive's ESRI geospatial functions as builtins
..


Patch Set 18:

(10 comments)

Looks pretty clean. Mostly have a few questions, and we need to think about 
documentation.

http://gerrit.cloudera.org:8080/#/c/19425/18//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19425/18//COMMIT_MSG@31
PS18, Line 31: Known limitations:
This could all use some documentation. Let's at least file a ticket for it if 
you feel it's too early to publish documentation.


http://gerrit.cloudera.org:8080/#/c/19425/18//COMMIT_MSG@35
PS18, Line 35:  - ST_MultiPoint, ST_LineStrin supports a maximum of 7
Should be ST_LineString?


http://gerrit.cloudera.org:8080/#/c/19425/18/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/19425/18/be/src/exprs/hive-udf-call.cc@195
PS18, Line 195: bool builtin_udf = fn_.hdfs_location.empty();
Why was this check necessary?


http://gerrit.cloudera.org:8080/#/c/19425/18/be/src/kudu/rpc/CMakeLists.txt
File be/src/kudu/rpc/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/19425/18/be/src/kudu/rpc/CMakeLists.txt@110
PS18, Line 110: add_dependencies(protoc-gen-krpc thrift-deps ImpalaThrift)
Why was this needed? It looks a little weird to depend on Impala things in our 
borrowed kudu source.


http://gerrit.cloudera.org:8080/#/c/19425/18/common/function-registry/gen_geospatial_udf_wrappers.py
File common/function-registry/gen_geospatial_udf_wrappers.py:

http://gerrit.cloudera.org:8080/#/c/19425/18/common/function-registry/gen_geospatial_udf_wrappers.py@53
PS18, Line 53: UDF_PACKAGE = "org.apache.hadoop.hive.ql.udf.esri"
What library does this come from? Would be nice to document it to tie it back 
to how it's included (presumably pom.xml).


http://gerrit.cloudera.org:8080/#/c/19425/18/fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyFunctionExtractor.java
File 
fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyFunctionExtractor.java:

http://gerrit.cloudera.org:8080/#/c/19425/18/fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyFunctionExtractor.java@66
PS18, Line 66: fnArgsList,
nit: Formatting seems a little odd here.


http://gerrit.cloudera.org:8080/#/c/19425/18/fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyJavaFunction.java
File 
fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyJavaFunction.java:

http://gerrit.cloudera.org:8080/#/c/19425/18/fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyJavaFunction.java@156
PS18, Line 156: LOG.warn("Ignoring incompatible method: " + m + " 
during load of "
How about using slf4j variable substitution instead of concatenation?


http://gerrit.cloudera.org:8080/#/c/19425/18/fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyJavaFunction.java@171
PS18, Line 171:   "No compatible signatures found for function:" + 
hiveFn_.getFunctionName());
nit: space after ":"


http://gerrit.cloudera.org:8080/#/c/19425/18/fe/src/test/java/org/apache/impala/hive/executor/TestHiveJavaFunctionFactory.java
File 
fe/src/test/java/org/apache/impala/hive/executor/TestHiveJavaFunctionFactory.java:

http://gerrit.cloudera.org:8080/#/c/19425/18/fe/src/test/java/org/apache/impala/hive/executor/TestHiveJavaFunctionFactory.java@32
PS18, Line 32: public List extract() throws 
CatalogException {
nit: Presumably you don't need this override anymore, since the updated 
implementation will create a default HiveLegacyFunctionExtractor and pass it to 
the new signature.


http://gerrit.cloudera.org:8080/#/c/19425/18/testdata/workloads/functional-query/queries/QueryTest/geospatial-esri.test
File testdata/workloads/functional-query/queries/QueryTest/geospatial-esri.test:

http://gerrit.cloudera.org:8080/#/c/19425/18/testdata/workloads/functional-query/queries/QueryTest/geospatial-esri.test@1
PS18, Line 1: =
Is this test set something we could generate? I guess at this point you've 
presumably enumerated all the function signatures.



--
To view, visit http://gerrit.cloudera.org:8080/19425
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225
Gerrit-Change-Number: 19425
Gerrit-PatchSet: 18
Gerrit-Owner: Peter Rozsa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Tue, 31 Jan 2023 22:51:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11859: Add bytes-read-encrypted metric

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19461 )

Change subject: IMPALA-11859: Add bytes-read-encrypted metric
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9010/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/19461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9dbc194a4bc31cb0e01545fb6032a0853db60f34
Gerrit-Change-Number: 19461
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 31 Jan 2023 21:30:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11604 Planner changes for CPU usage

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: IMPALA-11604 Planner changes for CPU usage
..


Patch Set 38:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12278/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 38
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 31 Jan 2023 21:27:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11604 Planner changes for CPU usage

2023-01-31 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: IMPALA-11604 Planner changes for CPU usage
..


Patch Set 38:

(17 comments)

Thank you for your review, Qifan!

http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/scheduling/scheduler.cc@266
PS37, Line 266: cheduler::CheckEffective
> nit. is positive
Done. Moved to scheduler.h.


http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/scheduling/scheduler.cc@272
PS37, Line 272: || IsExceedMaxFsWriters(fragment_state, state)) {
> Is it possible to exclude the checking for scan fragment and certain table
Yes! Consolidated the case checking in this method.


http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/util/backend-gflag-util.cc
File be/src/util/backend-gflag-util.cc:

http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/util/backend-gflag-util.cc@201
PS37, Line 201: 0.5
> Just wonder why not directly use a value of 2 here.
My intention is to associate the scaling flag to CPU requirement of the query, 
not the total CPU cores of executor group set.
Hence, this means halving the CpuRequirement returned by costing algorithm 
(oversubscribing the executor group).


http://gerrit.cloudera.org:8080/#/c/19033/37/common/thrift/Planner.thrift
File common/thrift/Planner.thrift:

http://gerrit.cloudera.org:8080/#/c/19033/37/common/thrift/Planner.thrift@87
PS37, Line 87: //
> Miss the comment here
Added comment.


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/analysis/Expr.java@473
PS37, Line 473: public long getNumDistinctValues() { return n
> This is unused. However, this point me to existing evalCost_ added in IMPAL
Removed.


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
File fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@364
PS37, Line 364: 'partitionByEq_' and 'orderBy
> This does not match with the code at line 368: ExprUtil.computeExprCost(par
Removed partitionByEq_ from cost calculation.


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@365
PS37, Line 365: titioned and sorte
> This does not match with the code at line 368 ExprUtil.computeExprCost(orde
Removed orderByEq_ from cost calculation.


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java
File fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java@33
PS37, Line 33: childP
> May use the name childProcessingCost_  or add a comment to indicate this is
Done


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java@47
PS37, Line 47: onditions.checkState(
> Code duplication with line at 40.
Replaced preconditions in line 40 with single call to getMultiple()


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/CoreRequirement.java
File fe/src/main/java/org/apache/impala/planner/CoreRequirement.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/CoreRequirement.java@29
PS37, Line 29:
> Comment is missing.
Done


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/CoreRequirement.java@31
PS37, Line 31: * A container class that represent CPU core requirement of 
certain subtree of a query or
 :  * the query itself.
 :  */
> A comment for each data member.
Done


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/DataSink.java
File fe/src/main/java/org/apache/impala/planner/DataSink.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/DataSink.java@49
PS37, Line 49: Set in computeProc
> nit Set in computeProcessingCost() for a meaningful value.
Done


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/DataSink.java@131
PS37, Line 131: ws p
> nit. data fields in
Done


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/DataSink.java@141
PS37, Line 141: setNumRowToProduce(Mat
> This is missing in the comment at line 131 above.
Done


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/MultipleProcessingCost.java
File 

[Impala-ASF-CR] IMPALA-11604 Planner changes for CPU usage

2023-01-31 Thread Riza Suminto (Code Review)
Riza Suminto has uploaded a new patch set (#38) to the change originally 
created by Qifan Chen. ( http://gerrit.cloudera.org:8080/19033 )

Change subject: IMPALA-11604 Planner changes for CPU usage
..

IMPALA-11604 Planner changes for CPU usage

This patch augments IMPALA-10992 by establishing an infrastructure to
allow the weighted total amount of data to process to be used as a new
factor in the definition and selection of an executor group. At the
basis of the CPU costing model, we define ProcessingCost as a cost for a
distinct PlanNode / DataSink / PlanFragment to process its input rows
globally across all of its instances. The costing algorithm then tries
to adjust the number of instances for each fragment by considering their
production-consumption ratio and blocking-operator nature between their
plan nodes, and finally then returns a number representing an ideal CPU
core count required for a query to run efficiently. A more detailed
explanation of the CPU costing algorithm can be explained in four steps
below.

I. Compute ProcessingCost for each plan node and data sink.

ProcessingCost of a PlanNode/DataSink is a weighted amount of data
processed by that node/sink. The basic ProcessingCost is computed with a
general formula as follows.

  ProcessingCost is a pair: PC(D, N), where D = I * (C + M)

  where D is the weighted amount of data processed
I is the input cardinality
C is the expression evaluation cost per row.
  Set to total weight of expression evaluation in node/sink.
M is a materialization cost per row.
  Only used by scan and exchange node. Otherwise, 0.
N is the number of instances.
  Default to D / MIN_COST_PER_THREAD (1 million), but
  is fixed for a certain node/sink and adjustable in step III.

In this patch, the weight of each expression evaluation is set to a
constant of 1. A description of the computation for each kind of
PlanNode/DataSink is given below.

01. AggregationNode:
Each AggregateInfo has its C as a sum of grouping expression and
aggregate expression and then assigned a single ProcessingCost
individually. These ProcessingCosts then summed to be the Aggregation
node's ProcessingCost;

02. AnalyticEvalNode:
C is the sum of the evaluation costs for analytic functions;

03. CardinalityCheckNode:
Use the general formula, I = 1;

04. DataSourceScanNode:
Follow the formula from the superclass ScanNode;

05. EmptySetNode:
  I = 0;

06. ExchangeNode:
  M = 1 / num rows per batch.

A modification of the general formula when in broadcast mode:
  D = D * number of receivers;

07. HashJoinNode:
  probe cost = PC(I0 * C(equiJoin predicate),  N)  +
   PC(output cardinality * C(otherJoin predicate), N)
  build cost = PC(I1 * C(equi-join predicate), N)

With I0 and I1 as input cardinality of the probe and build side
accordingly. If the plan node does not have a separate build, ProcessingCost
is the sum of probe cost and build cost. Otherwise, ProcessingCost is
equal to probeCost.

08. HbaseScanNode:
Follow the formula from the superclass ScanNode;

09. HdfsScanNode and KuduScanNode:
Follow the formula from the superclass ScanNode with modified N.
N is mt_dop when query option mt_dop >= 1, otherwise
N is the number of nodes * max scan threads;

10. Nested loop join node:
When the right child is not a SingularRowSrcNode:

  probe cost = PC(I0 * C(equiJoin predicate), N)  +
   PC(output cardinality * C(otherJoin predicate), N)
  build cost = PC(I1 * C(equiJoin predicate), N)

When the right child is a SingularRowSrcNode:

  probe cost = PC(I0, N)
  build cost = PC(I0 * I1, N)

With I0 and I1 as input cardinality of the probe and build side
accordingly. If the plan node does not have a separate build, ProcessingCost
is the sum of probe cost and build cost. Otherwise, ProcessingCost is
equal to probeCost.

11. ScanNode:
  M = average row size / ROWBATCH_MAX_MEM_USAGE (8MB);

12. SelectNode:
Use the general formula;

13. SingularRowSrcNode:
Since the node is involved once per input in nested loop join, the
contribution of this node is computed in nested loop join;

14. SortNode:
C is the evaluation cost for the sort expression;

15. SubplanNode:
C is 1. I is the multiplication of the cardinality of the left and
the right child;

16. Union node:
C is the cost of result expression evaluation from all non-pass-through
children;

17. Unnest node:
I is the cardinality of the containing SubplanNode and C is 1.

18. DataStreamSink:
  M = 1 / num rows per batch.

19. JoinBuildSink:
ProcessingCost is the build cost of its associated JoinNode.

20. PlanRootSink:
If result spooling is enabled, C is the cost of output expression
evaluation. Otherwise. 

[Impala-ASF-CR] IMPALA-11859: Add bytes-read-encrypted metric

2023-01-31 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19461 )

Change subject: IMPALA-11859: Add bytes-read-encrypted metric
..


Patch Set 3: Code-Review+2

LGTM


--
To view, visit http://gerrit.cloudera.org:8080/19461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9dbc194a4bc31cb0e01545fb6032a0853db60f34
Gerrit-Change-Number: 19461
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 31 Jan 2023 20:48:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11859: Add bytes-read-encrypted metric

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19461 )

Change subject: IMPALA-11859: Add bytes-read-encrypted metric
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12277/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9dbc194a4bc31cb0e01545fb6032a0853db60f34
Gerrit-Change-Number: 19461
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 31 Jan 2023 19:09:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11859: Add bytes-read-encrypted metric

2023-01-31 Thread Michael Smith (Code Review)
Hello Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19461

to look at the new patch set (#3).

Change subject: IMPALA-11859: Add bytes-read-encrypted metric
..

IMPALA-11859: Add bytes-read-encrypted metric

Adds a metric bytes-read-encrypted to track encrypted reads.

Testing:
- ran test_io_metrics.py with Ozone (encrypts by default)
- ran test_io_metrics.py with HDFS (no encryption)

Change-Id: I9dbc194a4bc31cb0e01545fb6032a0853db60f34
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/orc/hdfs-orc-scanner.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/scheduling/scheduler.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M common/fbs/CatalogObjects.fbs
M common/protobuf/planner.proto
M common/thrift/PlanNodes.thrift
M common/thrift/metrics.json
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
D fe/src/main/java/org/apache/impala/compat/HdfsShim.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/query_test/test_io_metrics.py
19 files changed, 76 insertions(+), 46 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/19461/3
--
To view, visit http://gerrit.cloudera.org:8080/19461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9dbc194a4bc31cb0e01545fb6032a0853db60f34
Gerrit-Change-Number: 19461
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] Make Docker-based tests available on more hosting OS platforms

2023-01-31 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12892 )

Change subject: Make Docker-based tests available on more hosting OS platforms
..


Patch Set 9: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/12892
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6d326611772facc193d8e509a827515f10207808
Gerrit-Change-Number: 12892
Gerrit-PatchSet: 9
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 31 Jan 2023 18:47:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] Make Docker-based tests available on more hosting OS platforms

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12892 )

Change subject: Make Docker-based tests available on more hosting OS platforms
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12276/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12892
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6d326611772facc193d8e509a827515f10207808
Gerrit-Change-Number: 12892
Gerrit-PatchSet: 9
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Tue, 31 Jan 2023 18:44:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] Make Docker-based tests available on more hosting OS platforms

2023-01-31 Thread Laszlo Gaal (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12892

to look at the new patch set (#9).

Change subject: Make Docker-based tests available on more hosting OS platforms
..

Make Docker-based tests available on more hosting OS platforms

Tweak the test driver and the monitoring code to be compatible with
more OS versions:

1. Change the code collecting memory statistics to go directly to
   /proc/meminfo instead of parsing the output of 'free', which
   requires different arguments on different OS distributions.
   This makes test-with-docker.py depend on the MemAvailable field
   in /proc/meminfo, which should be present on kernels v3.14 and above.
   This also foregoes forking 'free' once per second.

2. Relax the check for a valid timezone specifier.
   On Ubuntu 14.04 /etc/localtime is a copy of a timezone file instead
   of being a symlink to a timezone file.
   Add an alternative check that accepts this option for timezone
   synchronization between the host and the Docker containers.

3. Centos 6 did not like the leading dash in the 'ps' command argument
   list. Change this to a universally accepted form by dropping
   the leading dash.
   Tested by running 'ps axho' manually on all OSs supported for
   Docker-based tests.

Change-Id: I6d326611772facc193d8e509a827515f10207808
---
M docker/entrypoint.sh
M docker/monitor.py
M docker/test-with-docker.py
3 files changed, 51 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/12892/9
--
To view, visit http://gerrit.cloudera.org:8080/12892
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6d326611772facc193d8e509a827515f10207808
Gerrit-Change-Number: 12892
Gerrit-PatchSet: 9
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 


[Impala-ASF-CR] IMPALA-11604 Planner changes for CPU usage

2023-01-31 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: IMPALA-11604 Planner changes for CPU usage
..


Patch Set 37:

(14 comments)

Looks good!

http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/scheduling/scheduler.cc@266
PS37, Line 266: effective_instance_count
nit. is positive


http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/scheduling/scheduler.cc@272
PS37, Line 272:   if (effective_instance_count > 0) {
Is it possible to exclude the checking for scan fragment and certain table sink 
fragment here?


http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/util/backend-gflag-util.cc
File be/src/util/backend-gflag-util.cc:

http://gerrit.cloudera.org:8080/#/c/19033/37/be/src/util/backend-gflag-util.cc@201
PS37, Line 201: 0.5
Just wonder why not directly use a value of 2 here.


http://gerrit.cloudera.org:8080/#/c/19033/37/common/thrift/Planner.thrift
File common/thrift/Planner.thrift:

http://gerrit.cloudera.org:8080/#/c/19033/37/common/thrift/Planner.thrift@87
PS37, Line 87: 14:
Miss the comment here


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
File fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@364
PS37, Line 364: 'partitionExprs_' is excluded
This does not match with the code at line 368: 
ExprUtil.computeExprCost(partitionByEq_)


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@365
PS37, Line 365: sorted within each
This does not match with the code at line 368 
ExprUtil.computeExprCost(orderByEq_)


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java
File fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java@33
PS37, Line 33: cost_;
May use the name childProcessingCost_  or add a comment to indicate this is the 
cost from the child.


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java@47
PS37, Line 47: multiple_.get() > 0, "BroadcastProcessingCost: multiple must be 
greater than 0!");
Code duplication with line at 40.


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/CoreRequirement.java
File fe/src/main/java/org/apache/impala/planner/CoreRequirement.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/CoreRequirement.java@29
PS37, Line 29:
Comment is missing.


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/CoreRequirement.java@31
PS37, Line 31:  private final List ids_;
 :   private final List counts_;
 :   private int total_ = -1;
A comment for each data member.


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/DataSink.java
File fe/src/main/java/org/apache/impala/planner/DataSink.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/DataSink.java@49
PS37, Line 49: Gets set correctly
nit Set in computeProcessingCost() for a meaningful value.


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/DataSink.java@131
PS37, Line 131: into
nit. data fields in


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/DataSink.java@141
PS37, Line 141: setNumInstanceExpected
This is missing in the comment at line 131 above.


http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/MultipleProcessingCost.java
File fe/src/main/java/org/apache/impala/planner/MultipleProcessingCost.java:

http://gerrit.cloudera.org:8080/#/c/19033/37/fe/src/main/java/org/apache/impala/planner/MultipleProcessingCost.java@26
PS37, Line 26: multiple_
multiplier_ is better.



--
To view, visit http://gerrit.cloudera.org:8080/19033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 37
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 31 Jan 2023 17:29:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11662: Improve 'refresh iceberg tbl on oss' performance

2023-01-31 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19379 )

Change subject: IMPALA-11662: Improve 'refresh iceberg_tbl_on_oss' performance
..

IMPALA-11662: Improve 'refresh iceberg_tbl_on_oss' performance

As the cost of directory listing on Cloud Storage Systems such as OSS or
S3 is higher than the cost on HDFS, we could create the file descriptors
from the rich metadata provided by Iceberg instead of using
org.apache.hadoop.fs.FileSystem#listFiles. The only thing missing there
is the last_modification_time of the files. But since Iceberg files are
immutable, we could just come up with a special timestamp for these
files.

At the same time, we can also construct file descriptors ourselves
during time travel to reduce the cost of requests with OSS services.

Test:
 * existing tests
 * test on COS with my local test environment

Change-Id: If2ee8b6b7559e6590698b46ef1d574e55ed52f9a
Reviewed-on: http://gerrit.cloudera.org:8080/19379
Tested-by: Impala Public Jenkins 
Reviewed-by: Zoltan Borok-Nagy 
---
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
A fe/src/main/java/org/apache/impala/catalog/IcebergFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/GroupedContentFiles.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
9 files changed, 332 insertions(+), 107 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Zoltan Borok-Nagy: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/19379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If2ee8b6b7559e6590698b46ef1d574e55ed52f9a
Gerrit-Change-Number: 19379
Gerrit-PatchSet: 13
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Xiaoqing Gao 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-11662: Improve 'refresh iceberg tbl on oss' performance

2023-01-31 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19379 )

Change subject: IMPALA-11662: Improve 'refresh iceberg_tbl_on_oss' performance
..


Patch Set 12: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/19379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2ee8b6b7559e6590698b46ef1d574e55ed52f9a
Gerrit-Change-Number: 19379
Gerrit-PatchSet: 12
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Xiaoqing Gao 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 31 Jan 2023 17:23:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11693: Enable allow erasure coded files by default

2023-01-31 Thread Michael Smith (Code Review)
Michael Smith has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19362 )

Change subject: IMPALA-11693: Enable allow_erasure_coded_files by default
..

IMPALA-11693: Enable allow_erasure_coded_files by default

Enables allow_erasure_coded_files by default as we've now completed all
planned work to support it.

Testing
- Ran HDFS+EC test suite
- Ran Ozone+EC test suite

Change-Id: I0cfef087f2a7ae0889f47e85c5fab61a795d8fd4
Reviewed-on: http://gerrit.cloudera.org:8080/19362
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
---
M bin/run-all-tests.sh
M common/thrift/Query.thrift
M docs/topics/impala_allow_erasure_coded_files.xml
M testdata/bin/create-load-data.sh
M tests/common/custom_cluster_test_suite.py
M tests/query_test/test_observability.py
M tests/util/auto_scaler.py
7 files changed, 5 insertions(+), 18 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/19362
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I0cfef087f2a7ae0889f47e85c5fab61a795d8fd4
Gerrit-Change-Number: 19362
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-11855: Upgrade jetty to 9.4.50

2023-01-31 Thread Michael Smith (Code Review)
Michael Smith has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19436 )

Change subject: IMPALA-11855: Upgrade jetty to 9.4.50
..

IMPALA-11855: Upgrade jetty to 9.4.50

Upgrades jetty to 9.4.50 due to CVE-2022-2047, CVE-2022-2048.

Change-Id: Icbc6d3ad40b63986137ea1b5c71b9af61bd9e637
Reviewed-on: http://gerrit.cloudera.org:8080/19436
Reviewed-by: Riza Suminto 
Tested-by: Impala Public Jenkins 
---
M fe/pom.xml
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Riza Suminto: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/19436
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Icbc6d3ad40b63986137ea1b5c71b9af61bd9e637
Gerrit-Change-Number: 19436
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-9551: Allow mixed complex types in select list

2023-01-31 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19322 )

Change subject: IMPALA-9551: Allow mixed complex types in select list
..


Patch Set 15: Code-Review+1

(10 comments)

looks good to me, just indentation + few more test ideas

http://gerrit.cloudera.org:8080/#/c/19322/15/be/src/runtime/complex-value-writer.inline.h
File be/src/runtime/complex-value-writer.inline.h:

http://gerrit.cloudera.org:8080/#/c/19322/15/be/src/runtime/complex-value-writer.inline.h@90
PS15, Line 90: struct_slot_desc.children_tuple_descriptor();
nit: +2 indentation


http://gerrit.cloudera.org:8080/#/c/19322/15/be/src/runtime/complex-value-writer.inline.h@116
PS15, Line 116: reinterpret_cast(child);
+2


http://gerrit.cloudera.org:8080/#/c/19322/15/be/src/runtime/complex-value-writer.inline.h@118
PS15, Line 118: child_slot_desc.children_tuple_descriptor();
+2


http://gerrit.cloudera.org:8080/#/c/19322/15/be/src/runtime/complex-value-writer.inline.h@236
PS15, Line 236: children_item_tuple_desc->slots();
+2


http://gerrit.cloudera.org:8080/#/c/19322/15/be/src/service/hs2-util.cc
File be/src/service/hs2-util.cc:

http://gerrit.cloudera.org:8080/#/c/19322/15/be/src/service/hs2-util.cc@390
PS15, Line 390: static_cast(scalar_expr).GetSlotDescriptor();
+2 indentation


http://gerrit.cloudera.org:8080/#/c/19322/15/be/src/service/query-result-set.cc
File be/src/service/query-result-set.cc:

http://gerrit.cloudera.org:8080/#/c/19322/15/be/src/service/query-result-set.cc@256
PS15, Line 256:   static_cast(scalar_expr).GetSlotDescriptor();
+2


http://gerrit.cloudera.org:8080/#/c/19322/15/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
File fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java:

http://gerrit.cloudera.org:8080/#/c/19322/15/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java@311
PS15, Line 311: if (colExpr.getType().isStructType()) {
else if?


http://gerrit.cloudera.org:8080/#/c/19322/15/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java@417
PS15, Line 417:   childSlotDescRawPath.get(childSlotDescRawPath.size() - 1);
nit, here and next rows: indentation


http://gerrit.cloudera.org:8080/#/c/19322/15/testdata/workloads/functional-query/queries/QueryTest/mixed-collections-and-structs.test
File 
testdata/workloads/functional-query/queries/QueryTest/mixed-collections-and-structs.test:

http://gerrit.cloudera.org:8080/#/c/19322/15/testdata/workloads/functional-query/queries/QueryTest/mixed-collections-and-structs.test@257
PS15, Line 257: Zipping unnest
If we look at it strictly this is not zipping unnest, just simply unnest, as 
the zipping part comes in when there are at least 2 columns, e.g 
arr_contains_nested_struct and arr_contains_struct

Other zipping unnest related test idedas:
- test zipping unnest in the from clause
- test zipping unnest with where filter on an unnested col


http://gerrit.cloudera.org:8080/#/c/19322/15/testdata/workloads/functional-query/queries/QueryTest/mixed-collections-and-structs.test@387
PS15, Line 387:  WHERE clauses at different levels.
Can you also add one where different view levels have where conditions on the 
same struct member?



--
To view, visit http://gerrit.cloudera.org:8080/19322
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I476d98884b5fd192dfcd4feeec7947526aebe993
Gerrit-Change-Number: 19322
Gerrit-PatchSet: 15
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Tue, 31 Jan 2023 15:55:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19429 )

Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in 
resolvePathWithMasking
..


Patch Set 8: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/9007/


--
To view, visit http://gerrit.cloudera.org:8080/19429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77
Gerrit-Change-Number: 19429
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 31 Jan 2023 15:19:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11856: Use POST requests to set log level

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19199 )

Change subject: IMPALA-11856: Use POST requests to set log level
..

IMPALA-11856: Use POST requests to set log level

Set and reset loglevel handlers now require a POST. Implements
Cross-Site Request Forgery (CSRF) prevention in Impala's webserver using
the Double Submit Cookie pattern - where POST requests must include a
csrf_token field in their post with the random value from the cookie -
or a custom header.

CSRF attacks rely on the browser always sending a cookie or
'Authorization: Basic' header.
- With cookies, attacks don't have access to default form values or the
  original cookie, so we can include the cookie's random value in the
  form as a cross-check. As the cookie is cryptographically signed, they
  also can't be replaced with one that would match an attack's forms.
- When not using cookies, a custom header (X-Requested-By) is required
  as CSRFs are unable to add custom headers. This approach is also used
  by Jersey; see
  http://blog.alutam.com/2011/09/14/jersey-and-cross-site-request-forgery-csrf

In a broader implementation this would require a separate cookie so it
can be used to protect logins as well, but login is handled external to
Impala so we re-use the cookie the page already has.

Cookies are now generated for the HTPASSWD authentication method.
Authenticating via JWT still omits cookies because the JWT is already
provided via custom header (preventing CSRF) and disabling
authentication (NONE) means anyone could directly send a request so CSRF
protection is meaningless.

We also start an additional Webserver instance with authentication NONE
when metrics_webserver_port > 0, and the Webserver metric
"impala.webserver.total-cookie-auth-success" can only be registered
once. Additional changes would be necessary to make metric names unique
in Webserver (based on port); for the moment we avoid that by ensuring
all metrics counters are only instantiated for Webservers that use
authentication.

Cookie generation and authentication were updated to provide access to
the random value.

Adds flag to enable SameSite=Strict for defense in depth as mentioned in
https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-rfc6265bis.
This can be enabled if another CSRF attack method is found.

Verified that this prevents CSRF attacks by disabling SameSite=Strict
and visiting (via https://security.love/CSRF-PoC-Genorator):
```

  http://localhost:45000/set_glog_level;>

  
glog

  

http://localhost:45000/set_glog_level;>
  

```

Adds tests for the webserver with basic authentication, LDAP, and SPNEGO
that authorization fails on POST unless
- using a cookie and csrf_token is correctly set in the POST body
- the X-Requested-By header is set

Change-Id: I4be8694492b8ba16737f644ac8c56d8124f19693
Reviewed-on: http://gerrit.cloudera.org:8080/19199
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/rpc/authentication-util.cc
M be/src/rpc/authentication-util.h
M be/src/util/logging-support.cc
M be/src/util/webserver-test.cc
M be/src/util/webserver.cc
M be/src/util/webserver.h
M fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java
M fe/src/test/java/org/apache/impala/customcluster/JwtWebserverTest.java
M fe/src/test/java/org/apache/impala/customcluster/LdapHS2Test.java
M fe/src/test/java/org/apache/impala/customcluster/LdapImpalaShellTest.java
M fe/src/test/java/org/apache/impala/customcluster/LdapImpylaHttpTest.java
M fe/src/test/java/org/apache/impala/customcluster/LdapJdbcTest.java
M fe/src/test/java/org/apache/impala/customcluster/LdapWebserverTest.java
M fe/src/test/java/org/apache/impala/service/JdbcTest.java
A fe/src/test/java/org/apache/impala/testutil/WebClient.java
D fe/src/test/java/org/apache/impala/util/Metrics.java
M tests/webserver/test_web_pages.py
M www/form-hidden-inputs.tmpl
M www/log_level.tmpl
19 files changed, 900 insertions(+), 369 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/19199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4be8694492b8ba16737f644ac8c56d8124f19693
Gerrit-Change-Number: 19199
Gerrit-PatchSet: 25
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-11856: Use POST requests to set log level

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19199 )

Change subject: IMPALA-11856: Use POST requests to set log level
..


Patch Set 24: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/19199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4be8694492b8ba16737f644ac8c56d8124f19693
Gerrit-Change-Number: 19199
Gerrit-PatchSet: 24
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 31 Jan 2023 14:40:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11626: Handle COMMIT COMPACTION EVENT from HMS

2023-01-31 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19155 )

Change subject: IMPALA-11626: Handle COMMIT_COMPACTION_EVENT from HMS
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19155/7/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/19155/7/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@577
PS7, Line 577: throws MetastoreNotificationException
> It is possible that the new impala client can talk to the old HMS where the
Sorry that I'm not clear how the MetastoreNotificationException will be thrown 
here. This is a function declaration, not an error handling that catch any 
exception. Impala could talk to old HMS. But it should use the HMS client that 
it compiles with. And the HMS client seems won't thrown any exceptions here.

To be specifit, the invoked methods don't throw exceptions:
MetastoreEventsProcessor#getMessageDeserializer()
https://github.com/apache/impala/blob/9ebe8ccdaa8a7e1c671beabb30597e089917482e/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java#L1083
MessageDeserializer#getCommitCompactionMessage()
https://github.com/apache/hive/blob/fcf80448513f17c48cfac5ff5cf613e93dd313f2/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/messaging/MessageDeserializer.java#L245
NotificationEvent#getMessage()
https://github.com/apache/hive/blob/fcf80448513f17c48cfac5ff5cf613e93dd313f2/standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/NotificationEvent.java#L324
CommitCompactionMessage#getPartName()
https://github.com/apache/hive/blob/fcf80448513f17c48cfac5ff5cf613e93dd313f2/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/messaging/CommitCompactionMessage.java#L46


http://gerrit.cloudera.org:8080/#/c/19155/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/19155/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2632
PS7, Line 2632: MetastoreEvent
> There is no method in CatalogServiceCatalog that would only refresh the fil
I see. I thought we can use MetastoreTableEvent#reloadTableFromCatalog() for 
unpartitioned tables and MetastoreTableEvent#reloadPartitions() for partitioned 
tables. But reloadTableFromCatalog() will reload the table schema. 
reloadPartitions() requires the hmsPartition objects. Here we just have the 
partition names. So they don't fit all the requirements here.

However, I think we do have methods that reload the file metadata only, e.g. 
HdfsTable#reloadPartitionsFromNames()
https://github.com/apache/impala/blob/9ebe8ccdaa8a7e1c671beabb30597e089917482e/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L2752

Such methods are more mature since they handle some corner cases, e.g. if the 
partition is already removed in HMS while processing the event. So I suggest 
adding a reloadPartitionsForNames method in MetastoreTableEvent and let 
CommitCompactionEvent extends MetastoreTableEvent to reuse it. We can move the 
current codes of refreshFileMetadataForCommitCompaction() into it, and use 
HdfsTable#reloadPartitionsFromNames() instead.



--
To view, visit http://gerrit.cloudera.org:8080/19155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I464faedb4e3bbcd417bab2e3cb0d57e339d42605
Gerrit-Change-Number: 19155
Gerrit-PatchSet: 8
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 31 Jan 2023 12:27:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11662: Improve 'refresh iceberg tbl on oss' performance

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19379 )

Change subject: IMPALA-11662: Improve 'refresh iceberg_tbl_on_oss' performance
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12275/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2ee8b6b7559e6590698b46ef1d574e55ed52f9a
Gerrit-Change-Number: 19379
Gerrit-PatchSet: 12
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Xiaoqing Gao 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 31 Jan 2023 12:10:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11662: Improve 'refresh iceberg tbl on oss' performance

2023-01-31 Thread Anonymous Coward (Code Review)
lipeng...@apache.org has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/19379 )

Change subject: IMPALA-11662: Improve 'refresh iceberg_tbl_on_oss' performance
..

IMPALA-11662: Improve 'refresh iceberg_tbl_on_oss' performance

As the cost of directory listing on Cloud Storage Systems such as OSS or
S3 is higher than the cost on HDFS, we could create the file descriptors
from the rich metadata provided by Iceberg instead of using
org.apache.hadoop.fs.FileSystem#listFiles. The only thing missing there
is the last_modification_time of the files. But since Iceberg files are
immutable, we could just come up with a special timestamp for these
files.

At the same time, we can also construct file descriptors ourselves
during time travel to reduce the cost of requests with OSS services.

Test:
 * existing tests
 * test on COS with my local test environment

Change-Id: If2ee8b6b7559e6590698b46ef1d574e55ed52f9a
---
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
A fe/src/main/java/org/apache/impala/catalog/IcebergFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/GroupedContentFiles.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
9 files changed, 332 insertions(+), 107 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/19379/12
--
To view, visit http://gerrit.cloudera.org:8080/19379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If2ee8b6b7559e6590698b46ef1d574e55ed52f9a
Gerrit-Change-Number: 19379
Gerrit-PatchSet: 12
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Xiaoqing Gao 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-11662: Improve 'refresh iceberg tbl on oss' performance

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19379 )

Change subject: IMPALA-11662: Improve 'refresh iceberg_tbl_on_oss' performance
..


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9009/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/19379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2ee8b6b7559e6590698b46ef1d574e55ed52f9a
Gerrit-Change-Number: 19379
Gerrit-PatchSet: 12
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Xiaoqing Gao 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 31 Jan 2023 11:54:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11662: Improve 'refresh iceberg tbl on oss' performance

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19379 )

Change subject: IMPALA-11662: Improve 'refresh iceberg_tbl_on_oss' performance
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9008/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/19379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2ee8b6b7559e6590698b46ef1d574e55ed52f9a
Gerrit-Change-Number: 19379
Gerrit-PatchSet: 11
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Xiaoqing Gao 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 31 Jan 2023 10:21:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19429 )

Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in 
resolvePathWithMasking
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9007/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/19429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77
Gerrit-Change-Number: 19429
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 31 Jan 2023 10:15:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19429 )

Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in 
resolvePathWithMasking
..


Patch Set 8: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/19429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77
Gerrit-Change-Number: 19429
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 31 Jan 2023 10:15:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking

2023-01-31 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19429 )

Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in 
resolvePathWithMasking
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/19429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77
Gerrit-Change-Number: 19429
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 31 Jan 2023 10:09:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19429 )

Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in 
resolvePathWithMasking
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12274/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77
Gerrit-Change-Number: 19429
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 31 Jan 2023 10:02:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking

2023-01-31 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19429 )

Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in 
resolvePathWithMasking
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19429/6/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test
File 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test:

http://gerrit.cloudera.org:8080/#/c/19429/6/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test@63
PS6, Line 63: # Test resolving nested columns in expanding star expression.
> Can you add a test with EXPAND_COMPLEX_TYPES=1? It should return an analysi
Done. That would be a good coverage!



--
To view, visit http://gerrit.cloudera.org:8080/19429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77
Gerrit-Change-Number: 19429
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 31 Jan 2023 09:43:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking

2023-01-31 Thread Quanlong Huang (Code Review)
Hello Fang-Yu Rao, Daniel Becker, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19429

to look at the new patch set (#7).

Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in 
resolvePathWithMasking
..

IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking

resolvePathWithMasking() is a wrapper on resolvePath() to further
resolve nested columns inside the table masking view. When it was
added, complex types in the select list hadn't been supported yet. So
the table masking view can't expose complex type columns directly in the
select list. Any paths in nested types will be further resolved inside
the table masking view in resolvePathWithMasking().

Take the following query as an example:
  select id, nested_struct.* from complextypestbl;
If Ranger column-masking/row-filter policies applied on the table, the
query is rewritten as
  select id, nested_struct.* from (
select mask(id) from complextypestbl
where row-filtering-condition
  ) t;
Table masking view "t" can't expose the nested column "nested_struct".
So we further resolve "nested_struct" inside the inlineView to use the
masked table "complextypestbl". The underlying TableRef is expected to
be a BaseTableRef.

Paths that don't reference nested columns should be resolved and
returned directly (just like the original resolvePath() does). E.g.
  select v.* from masked_view v
is rewritten to
  select v.* from (
select mask(c1), mask(c2), ..., mask(cn)
from masked_view
where row-filtering-condition
  ) v;

The STAR path "v.*" should be resolved directly. However, it's treated
as a nested column unexpectedly. The code then tries to resolve it
inside the table "masked_view" and found "masked_view" is not a table so
throws the IllegalStateException.

These are the current conditions for identifying nested STAR paths:
 - The destType is STRUCT
 - And the resolved path is rooted at a valid tuple descriptor

They don't really recognize the nested struct columns because STAR paths
on table/view also match these conditions. When the STAR path is an
expansion on a catalog table/view, the root tuple descriptor is
exactly the output tuple of the table/view. The destType is the type of
the tuple descriptor which is always a StructType.

Note that STAR paths on other nested types, i.e. array/map, are invalid.
So the first condition matches for all valid cases. The second condition
also matches all valid cases since both the table/view and struct STAR
expansion have the path rooted at a valid tuple descriptor.

This patch fixes the check for nested struct STAR path by checking
the matched types instead. Note that if "v.*" is a table/view expansion,
the matched type list is empty. If "v.*" is a struct column expansion,
the matched type list contains the STRUCT column type.

Tests:
 - Add missing coverage on STAR paths (v.*) on masked views.

Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
M 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test
4 files changed, 166 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/19429/7
--
To view, visit http://gerrit.cloudera.org:8080/19429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77
Gerrit-Change-Number: 19429
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-11856: Use POST requests to set log level

2023-01-31 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19199 )

Change subject: IMPALA-11856: Use POST requests to set log level
..


Patch Set 23: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/19199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4be8694492b8ba16737f644ac8c56d8124f19693
Gerrit-Change-Number: 19199
Gerrit-PatchSet: 23
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 31 Jan 2023 09:29:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking

2023-01-31 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19429 )

Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in 
resolvePathWithMasking
..


Patch Set 6: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19429/6/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test
File 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test:

http://gerrit.cloudera.org:8080/#/c/19429/6/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test@63
PS6, Line 63: # Test resolving nested columns in expanding star expression.
Can you add a test with EXPAND_COMPLEX_TYPES=1? It should return an analysis 
for now, but it is something that should work with column mask on ID once 
https://gerrit.cloudera.org/#/c/19322/ is merged



--
To view, visit http://gerrit.cloudera.org:8080/19429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77
Gerrit-Change-Number: 19429
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 31 Jan 2023 09:27:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11856: Use POST requests to set log level

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19199 )

Change subject: IMPALA-11856: Use POST requests to set log level
..


Patch Set 24:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9006/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/19199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4be8694492b8ba16737f644ac8c56d8124f19693
Gerrit-Change-Number: 19199
Gerrit-PatchSet: 24
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 31 Jan 2023 09:31:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11856: Use POST requests to set log level

2023-01-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19199 )

Change subject: IMPALA-11856: Use POST requests to set log level
..


Patch Set 24: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/19199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4be8694492b8ba16737f644ac8c56d8124f19693
Gerrit-Change-Number: 19199
Gerrit-PatchSet: 24
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 31 Jan 2023 09:31:21 +
Gerrit-HasComments: No