[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
..


Patch Set 10:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/12176/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/19383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 17 Jan 2023 06:34:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

2023-01-16 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
..

WIP IMPALA-11809: Support non unique primary key for Kudu

Kudu engine recently adds support for non unique primary key. Kudu
server automatically adds an auto-incrementing column named
'auto_increment_id' in a Kudu table which has non unique primary key.
The non unique primary key and 'auto_increment_id' form the effective
unique composite primary key.

This patch integrated new version of Kudu which support non unique
primary key, added syntactic support for creating table with non unique
primary key.
Example:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) partitions 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

Kudu assign values for auto-incrementing column automatically
when inserting rows so insertion statements don't need to specify
values for auto-incrementing column.
SELECT statement does not show auto-incrementing column unless the
column is explicitly specified in select list.
UPSERT operation is not supported now for Kudu table with auto
incrementing column due to limitation in Kudu engine.

When creating a Kudu table, specifying PRIMARY KEY is optional.
If there is no primary key attribute specified, the partition key
columes will be promoted as non unique primary key if those columns
are the beginning columns of the table.
New column "key_unique" is added to the output of 'describe' table
command for Kudu table.

Testing:
 - Integrated new version of Kudu built on local machine, ran
   manual test in impala-shell with queries to create tables
   with non unique primary key, and tested insert/update/delete
   operations for Kudu tables with non unique primary key.
 - Added front end and end to end unit tests.
   Passed query_test/test_kudu.py and custom_cluster/test_kudu.py
   on local environment with new version of Kudu built on local
   machine.
 - TODO build toolchian with new version of Kudu, including
   the commits for KUDU-1945. Run core test.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/custom_cluster/test_kudu.py
M tests/query_test/test_kudu.py

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

2023-01-16 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
..


Patch Set 9:

(16 comments)

Thanks Qifan for your review.

http://gerrit.cloudera.org:8080/#/c/19383/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19383/9//COMMIT_MSG@9
PS9, Line 9: Kudu engine adds support for non unique primary key. It adds
> nit recently
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/19383/9/common/thrift/CatalogObjects.thrift@580
PS9, Line 580: add
> nit adds
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/common/thrift/CatalogObjects.thrift@581
PS9, Line 581: .
> nit. ", in this case, this field is set to false"
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java@99
PS9, Line 99: c.isPrimaryKeyUnique() ? "" : "non unique ", c.toString()
> This statement can be integrated into ColumnDef.toString().
use KuduUtil.getPrimaryKeyString()


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@227
PS9, Line 227: createStmt_.isPrimaryKeyUnique(), 
createStmt_.getPrimaryKeyColumnDefs(),
> nit. It may be more logic to switch the two arguments, as the new argument
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/TableDef.java
File fe/src/main/java/org/apache/impala/analysis/TableDef.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/TableDef.java@87
PS9, Line 87: An auto-incrementing column will be added
:   // automatically by Kudu engine as key column if primary key is 
not unique.
> nit. If not, an auto-incrementing column will be added
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/TableDef.java@537
PS9, Line 537: only the columns, on which the
 :   // partitions are being created,
> nit. the partition columns have to be first in the primary key columns.
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/TableDef.java@566
PS9, Line 566:   throw new AnalysisException(String.format("Multiple 
%sprimary keys specified. " +
 :   "Composite %sprimary keys can be specified using the " 
+
 :   "%sPRIMARY KEY (col1, col2, ...) syntax at the end of 
the column definition.",
 :   isPrimaryKeyUnique() ? "" : "non unique ",
 :   isPrimaryKeyUnique() ? "" : "non unique ",
 :   isPrimaryKeyUnique() ? "" : "NON UNIQUE "));
> nit. This looks similar to the lines starting at 520.
use KuduUtil.getPrimaryKeyString()


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/Db.java@242
PS9, Line 242: boolean isPrimaryKeyUnique, List primaryKeyColumnDefs,
> As noted in another comment, these two arguments probably should be swapped
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
File fe/src/main/java/org/apache/impala/catalog/KuduColumn.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java@134
PS9, Line 134: position
> nit. 'position'.
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java@135
PS9, Line 135: if
> nit. when
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java@144
PS9, Line 144: false
> One may call this function even when column.isIs_primary_key_unique() is tr
This is a static function and it is only called when the primary key of the 
table is not unique.


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java@133
PS9, Line 133:   if (isKey && !isKeyUnique) {
 :   csb.nonUniqueKey(true);
 : } else {

[Impala-ASF-CR] IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit pushdown

2023-01-16 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19422 )

Change subject: IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit 
pushdown
..


Patch Set 1:

Thank Aman! Merging this.


--
To view, visit http://gerrit.cloudera.org:8080/19422
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d6bd58be58d09a29b8b81972e76665f41cf103
Gerrit-Change-Number: 19422
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 17 Jan 2023 02:38:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit pushdown

2023-01-16 Thread Quanlong Huang (Code Review)
Quanlong Huang has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19422 )

Change subject: IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit 
pushdown
..

IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit pushdown

When finding analytic conjuncts for analytic limit pushdown, the
following conditions are checked:
 - The conjunct should be a binary predicate
 - Left hand side is a SlotRef referencing the analytic expression, e.g.
   "rn" of "row_number() as rn"
 - The underlying analytic function is rank(), dense_rank() or row_number()
 - The window frame is UNBOUNDED PRECEDING to CURRENT ROW
 - Right hand side is a valid numeric limit
 - The op is =, <, or <=
See more details in AnalyticPlanner.inferPartitionLimits().

While checking the 2nd and 3rd condition, we get the source exprs of the
SlotRef. The source exprs could be empty if the SlotRef is actually
referencing a column of the table, i.e. a column materialized by the
scan node. Currently, we check the first source expr directly regardless
whether the list is empty, which causes the IndexOutOfBoundsException.

This patch fixes it by augmenting the check to consider an empty list.
Also fixes a similar code in AnalyticEvalNode.

Tests:
 - Add FE and e2e regression tests

Change-Id: I26d6bd58be58d09a29b8b81972e76665f41cf103
Reviewed-on: http://gerrit.cloudera.org:8080/19422
Reviewed-by: Aman Sinha 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-partitioned-top-n.test
M testdata/workloads/functional-query/queries/QueryTest/partitioned-top-n.test
4 files changed, 46 insertions(+), 2 deletions(-)

Approvals:
  Aman Sinha: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/19422
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I26d6bd58be58d09a29b8b81972e76665f41cf103
Gerrit-Change-Number: 19422
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit pushdown

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19422 )

Change subject: IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit 
pushdown
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/19422
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d6bd58be58d09a29b8b81972e76665f41cf103
Gerrit-Change-Number: 19422
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 17 Jan 2023 01:38:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables when there are no predicates

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19419 )

Change subject: IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables 
when there are no predicates
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/19419
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia46bd2dce248a9e096fc1c0bd914fc3fa4686fb0
Gerrit-Change-Number: 19419
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 16 Jan 2023 22:18:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit pushdown

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19422 )

Change subject: IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit 
pushdown
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8968/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/19422
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d6bd58be58d09a29b8b81972e76665f41cf103
Gerrit-Change-Number: 19422
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 16 Jan 2023 20:31:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit pushdown

2023-01-16 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19422 )

Change subject: IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit 
pushdown
..


Patch Set 1: Code-Review+2

LGTM.  Thanks for the patch. I am surprised we did not already have such a test 
case with mixed conjuncts.


--
To view, visit http://gerrit.cloudera.org:8080/19422
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d6bd58be58d09a29b8b81972e76665f41cf103
Gerrit-Change-Number: 19422
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 16 Jan 2023 19:27:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11798: Property 'external.table.purge' should not be ignored when CREATE Iceberg tables

2023-01-16 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19416 )

Change subject: IMPALA-11798: Property 'external.table.purge' should not be 
ignored when CREATE Iceberg tables
..


Patch Set 1: Code-Review+1

(1 comment)

LGTM, thanks for fixing this.

http://gerrit.cloudera.org:8080/#/c/19416/1/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java:

http://gerrit.cloudera.org:8080/#/c/19416/1/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@768
PS1, Line 768: // HMS will override 'external.table.purge' to 'TRUE' When 
'iceberg.catalog' is not
 : // the Hive Catalog for managed tables.
Do you want to open a Jira for this?



--
To view, visit http://gerrit.cloudera.org:8080/19416
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2649dd38fbe050044817d6c425ef447245aa2829
Gerrit-Change-Number: 19416
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 16 Jan 2023 18:03:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11658: Implement Iceberg manifest caching config for Impala

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19423 )

Change subject: IMPALA-11658: Implement Iceberg manifest caching config for 
Impala
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12175/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19423
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a60a700d2ae6302dfe395d1ef602e6b1d821888
Gerrit-Change-Number: 19423
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 16 Jan 2023 17:31:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables when there are no predicates

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19419 )

Change subject: IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables 
when there are no predicates
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12174/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19419
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia46bd2dce248a9e096fc1c0bd914fc3fa4686fb0
Gerrit-Change-Number: 19419
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 16 Jan 2023 17:30:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables when there are no predicates

2023-01-16 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19419 )

Change subject: IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables 
when there are no predicates
..


Patch Set 3:

(13 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@754
PS2, Line 754: ble, contentFile);
 : nameToStats.put(name, mergePartitionStats(nameToStats, 
contentFile, name));
 :   }
 :   return nameToStats;
> I prefer to put this logic in GroupedContentFiles, which could be nameed li
Done


http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java
File fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java:

http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@49
PS2, Line 49: file
> nit: file?
Done


http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@149
PS2, Line 149:   public List getDataFilesWithoutDeletes() {
> +1 indeed.
Done


http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@32
PS2, Line 32: import org.apache.impala.analysis.IcebergPartitionField;
: import org.apache.impala.analysis.Ice
> nit: unused import.
Done


http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@37
PS2, Line 37: ec;
> ditto
Done


http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@39
PS2, Line 39: import org.apache.impala.thrift.TGetP
> ditto
Done


http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/catalog/iceberg/GroupedContentFiles.java
File 
fe/src/main/java/org/apache/impala/catalog/iceberg/GroupedContentFiles.java:

http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/catalog/iceberg/GroupedContentFiles.java@34
PS2, Line 34:  * Struct-like object to group different Iceberg content files:
> line has trailing whitespace
Done


http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/catalog/iceberg/GroupedContentFiles.java@36
PS2, Line 36: es with deleted row
> This is a nice encapsulation, the class name is explicit, and it's more ele
Yeah, I agree. Sometimes a Pair is a handy lazy solution when we only need to 
group two objects, but I agree that it's nicer to introduce a new class instead.


http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/catalog/iceberg/GroupedContentFiles.java@42
PS2, Line 42:   public Set deleteFiles = new HashSet<>();
> This is a style thing, but it can be neater to do these initializations in
Done


http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java:

http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@162
PS2, Line 162:   private void setFileDescriptorsBasedOnFileStore() {
> +1 indeed
Done


http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@166
PS2, Line 166: deleteFiles_ = new HashSet<>(fileStore.getDeleteFiles());
> +1 indeed
Done


http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
File fe/src/main/java/org/apache/impala/util/IcebergUtil.java:

http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@558
PS2, Line 558: try (CloseableIterable fileScanTasks = 
planFiles(
> dataFilesWithoutDeletes, dataFilesWithDeletes, deleteFiles are unused now.
Done


http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java
File fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java:

http://gerrit.cloudera.org:8080/#/c/19419/2/fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java@287
PS2, Line 287: tblInfo.getIceberg_table().getContent_files(),
> Can add
Good point, thanks.



--
To view, visit http://gerrit.cloudera.org:8080/19419
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: 

[Impala-ASF-CR] IMPALA-11658: Implement Iceberg manifest caching config for Impala

2023-01-16 Thread Riza Suminto (Code Review)
Riza Suminto has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19423


Change subject: IMPALA-11658: Implement Iceberg manifest caching config for 
Impala
..

IMPALA-11658: Implement Iceberg manifest caching config for Impala

Impala need to supply Iceberg's catalog properties to enable manifest
caching feature. This commit implement the necessary config reading.
Iceberg related config is read from hadoop-conf.xml and supplied as a
Map in catalog instantiation.

Additionally, this patch also replace deprecated RuntimeIOException with
its superclass, UncheckedIOException.

Testing:
- Pass core tests.
- Checked that manifest caching works through debug logging.

Change-Id: I5a60a700d2ae6302dfe395d1ef602e6b1d821888
---
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalog.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalogs.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopCatalog.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopTables.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.py
7 files changed, 65 insertions(+), 23 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/19423/1
--
To view, visit http://gerrit.cloudera.org:8080/19423
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5a60a700d2ae6302dfe395d1ef602e6b1d821888
Gerrit-Change-Number: 19423
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 


[Impala-ASF-CR] IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables when there are no predicates

2023-01-16 Thread Zoltan Borok-Nagy (Code Review)
Hello Andrew Sherman, Tamas Mate, lipeng...@apache.org, Gergely Fürnstáhl, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19419

to look at the new patch set (#3).

Change subject: IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables 
when there are no predicates
..

IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables when there are no 
predicates

Similar to IMPALA-11591 but this Jira extends it to V2 tables. With
this patch we group data files into two categories in
IcebergContentFileStore:
 * data files without deletes
 * data files with deletes

With this information we can avoid calling planFiles() when planning
the scans of Iceberg tables. We can just set the lists of the file
descriptors based on IcebergContentFileStore then invoke the regular
planning methods.

iceberg-v2-tables.test had to be updated a bit because now we are
calculating the lengths of the file paths based on Impala's file
descriptor objects + table location, and not based on data file
information in Iceberg metadata (which has the file system prefix
stripped)

Testing:
  * executed existing tests
  * Updated plan tests

Change-Id: Ia46bd2dce248a9e096fc1c0bd914fc3fa4686fb0
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
A fe/src/main/java/org/apache/impala/catalog/iceberg/GroupedContentFiles.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
9 files changed, 277 insertions(+), 172 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/19419/3
--
To view, visit http://gerrit.cloudera.org:8080/19419
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia46bd2dce248a9e096fc1c0bd914fc3fa4686fb0
Gerrit-Change-Number: 19419
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 


[Impala-ASF-CR] IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables when there are no predicates

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19419 )

Change subject: IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables 
when there are no predicates
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8967/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/19419
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia46bd2dce248a9e096fc1c0bd914fc3fa4686fb0
Gerrit-Change-Number: 19419
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 16 Jan 2023 17:11:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11833: Fixed manifest length in snapshot files

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19409 )

Change subject: IMPALA-11833: Fixed manifest_length in snapshot files
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/19409
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I258055998a1d41b7f6047b6879e919834ed2c247
Gerrit-Change-Number: 19409
Gerrit-PatchSet: 6
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 16 Jan 2023 14:37:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11833: Fixed manifest length in snapshot files

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19409 )

Change subject: IMPALA-11833: Fixed manifest_length in snapshot files
..

IMPALA-11833: Fixed manifest_length in snapshot files

Some manifest files were manually edited to support multiple
environments, namely hdfs://localhost:20500 were cropped from every
path. The snapshot files contain the length of these manifest files,
which was not adjusted.

Testing:
 - Ran test_iceberg.py locally
 - Verified lengths manually

Change-Id: I258055998a1d41b7f6047b6879e919834ed2c247
Reviewed-on: http://gerrit.cloudera.org:8080/19409
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M testdata/data/README
M 
testdata/data/iceberg_test/hadoop_catalog/ice/airports_orc/metadata/snap-4990977953383402321-1-1ebf435e-7da7-41e7-bebf-eb3ebf1b1002.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/airports_parquet/metadata/snap-2304960110511088609-1-2d65964e-90ea-4442-bab5-71a67b84dfd9.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/complextypestbl_iceberg_orc/metadata/snap-8747481058330439933-1-46b4a907-2ff3-4799-ba4a-074d04734265.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/snap-6167994413873848621-1-283c54cb-5a45-4a2c-bca8-4bfa0e61cdbd.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/snap-7569365419257304230-1-db72fbf2-f9f6-4985-8a5f-fd9f632f2c77.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/snap-6654673546382518186-1-2d05a7d4-c229-44c3-860e-e77e46e71a19.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/snap-888589552112488046-1-8db62f0e-38e5-434b-94dc-c84210302ad8.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_timestamp_part/metadata/snap-1967339514069250436-1-a366370e-6b9a-4698-82d0-95fb69b19afb.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_timestamptz_part/metadata/snap-2778998487482282437-1-94003077-eabb-4dab-95ec-52a1727ef853.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/snap-5725822353600261755-1-0eadf173-0c84-4378-a9d0-5d7f47183978.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/snap-6816997371555012807-1-8cbef400-daea-478a-858a-2baf2438f644.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_no_deletes/metadata/snap-728158873687794725-1-5c80922f-01b5-4d52-bc93-6505be3b977b.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_no_deletes_orc/metadata/snap-1041485290740594175-1-a72290c9-c518-4719-8502-6c83a881de07.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_partitioned_position_deletes/metadata/snap-2057976186205897384-1-771485e9-78ac-4ffc-b1ef-1fda5bab33cf.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_partitioned_position_deletes/metadata/snap-8885697082976537578-1-464c179e-c9ba-40f5-a35f-144106a1f16c.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_partitioned_position_deletes_orc/metadata/snap-5359840930115020310-1-5b14eaa8-83cb-4f71-a473-402f345fa5b5.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_partitioned_position_deletes_orc/metadata/snap-5416468273053855108-1-588ee1ca-6a85-4af2-8ba2-e595e71712ba.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_positional_delete_all_rows/metadata/snap-444149380144800647-1-816400dd-012d-40c5-ab65-bc16ff18d2d7.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_positional_delete_all_rows/metadata/snap-8593920101374128463-1-236523f7-a5bc-459f-b4c9-16af5bd43bca.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_positional_delete_all_rows_orc/metadata/snap-1801547319505512253-1-6d54c2e9-a4c7-4c5c-8a6f-a17f92de4c48.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_positional_delete_all_rows_orc/metadata/snap-4807054508647143162-1-83d17f01-336e-41ab-a791-ffd5f511f6ab.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_positional_not_all_data_files_have_delete_files/metadata/snap-1497619269847778439-1-ea749da1-7b98-4dca-a4eb-f7d5d62f9dde.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_positional_not_all_data_files_have_delete_files/metadata/snap-4363979609026842966-1-db6f17fe-6fb6-4120-839d-4d6ca5244a1c.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_positional_not_all_data_files_have_delete_files/metadata/snap-5762682948883272650-1-969de65c-8915-4ae5-8d54-a82701195c55.avro
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_positional_not_all_data_files_have_delete_files/metadata/snap-7490459762454857930-1-ec9e7ecc-b546-42dd-8d0d-0dde2182dbc7.avro
M 

[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12173/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 10
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Mon, 16 Jan 2023 13:39:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12172/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 9
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Mon, 16 Jan 2023 13:39:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Code Review
Gergely Fürnstáhl has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..

IMPALA-11821: Adjusting manifest_length and absolute paths in case of metadata 
rewrite

testdata/bin/rewrite-iceberg-metadata.py rewrites manifest and snapshot
files using the provided prefix for file pathes. Snapshot files store
the length of manifest files as well, this needs to be adjusted too.

Additionally, improved path rewrite to be able to rewrite absolute
paths correctly and pretty dumping metadata jsons.

Testing:
 - Tested locally, manually

Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
---
M testdata/bin/rewrite-iceberg-metadata.py
M tests/common/skip.py
M tests/query_test/test_iceberg.py
3 files changed, 47 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/19412/10
--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 10
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Code Review
Gergely Fürnstáhl has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19412/7/testdata/bin/rewrite-iceberg-metadata.py
File testdata/bin/rewrite-iceberg-metadata.py:

http://gerrit.cloudera.org:8080/#/c/19412/7/testdata/bin/rewrite-iceberg-metadata.py@35
PS7, Line 35: pathes
> nit: paths
Done


http://gerrit.cloudera.org:8080/#/c/19412/7/testdata/bin/rewrite-iceberg-metadata.py@40
PS7, Line 40: pathes
> nit: paths
Done



--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 7
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Mon, 16 Jan 2023 13:19:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Code Review
Gergely Fürnstáhl has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..

IMPALA-11821: Adjusting manifest_length and absolute paths in case of metadata 
rewrite

testdata/bin/rewrite-iceberg-metadata.py rewrites manifest and snapshot
files using the provided prefix for file pathes. Snapshot files store
the length of manifest files as well, this needs to be adjusted too.

Additionally, improved path rewrite to be able to rewrite absolute
paths correctly and pretty dumping metadata jsons.

Testing:
 - Tested locally, manually

Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
---
M testdata/bin/rewrite-iceberg-metadata.py
M tests/common/skip.py
M tests/query_test/test_iceberg.py
3 files changed, 47 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/19412/9
--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 9
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Code Review
Gergely Fürnstáhl has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19412/7/testdata/bin/rewrite-iceberg-metadata.py
File testdata/bin/rewrite-iceberg-metadata.py:

http://gerrit.cloudera.org:8080/#/c/19412/7/testdata/bin/rewrite-iceberg-metadata.py@97
PS7, Line 97: prefix + metadata['location']
> generate_new_path(prefix, metadata['location'])
Thanks, nice catch! Fixed



--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 7
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Mon, 16 Jan 2023 13:18:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9551: Allow mixed complex types in select list

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19322 )

Change subject: IMPALA-9551: Allow mixed complex types in select list
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12171/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19322
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I476d98884b5fd192dfcd4feeec7947526aebe993
Gerrit-Change-Number: 19322
Gerrit-PatchSet: 12
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Mon, 16 Jan 2023 12:51:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Noemi Pap-Takacs (Code Review)
Noemi Pap-Takacs has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..


Patch Set 7:

(3 comments)

Thank you for reverting the 'SkipIf' annotations in the avro and mixed format 
tests, and also for the path rewrites.
'Location' also needs to be rewritten.

http://gerrit.cloudera.org:8080/#/c/19412/7/testdata/bin/rewrite-iceberg-metadata.py
File testdata/bin/rewrite-iceberg-metadata.py:

http://gerrit.cloudera.org:8080/#/c/19412/7/testdata/bin/rewrite-iceberg-metadata.py@35
PS7, Line 35: pathes
nit: paths


http://gerrit.cloudera.org:8080/#/c/19412/7/testdata/bin/rewrite-iceberg-metadata.py@40
PS7, Line 40: pathes
nit: paths


http://gerrit.cloudera.org:8080/#/c/19412/7/testdata/bin/rewrite-iceberg-metadata.py@97
PS7, Line 97: prefix + metadata['location']
generate_new_path(prefix, metadata['location'])



--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 7
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Mon, 16 Jan 2023 12:52:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9551: Allow mixed complex types in select list

2023-01-16 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/19322 )

Change subject: IMPALA-9551: Allow mixed complex types in select list
..

IMPALA-9551: Allow mixed complex types in select list

Currently collections and structs are supported in the select list, also
when they are nested (structs in structs and collections in
collections), but mixing different kinds of complex types, i.e. having
structs in collections or vice versa, is not supported.

This patch adds support for mixed complex types in the select list.

There is a limitation: zipping unnest for arrays that are within a
struct is not supported, for example the following query:

  use functional_parquet;
  select unnest(struct_contains_nested_arr.arr) from
  collection_struct_mix;

Testing:
 - Created a new test table, 'collection_struct_mix', that contains
   mixed complex types.
 - Added tests in mixed-collections-and-structs.test that test having
   mixed complex types in the select list. These tests are called from
   test_nested_types.py::TestMixedCollectionsAndStructsInSelectList.
 - Ran existing tests that test collections and structs in the select
   list; test queries that expected a failure in case of mixed complex
   types have been moved to mixed-collections-and-structs.test and now
   expect success.

Change-Id: I476d98884b5fd192dfcd4feeec7947526aebe993
---
M be/src/exec/unnest-node.cc
M be/src/exprs/slot-ref.h
M be/src/runtime/complex-value-writer.h
M be/src/runtime/complex-value-writer.inline.h
M be/src/runtime/descriptors.cc
M be/src/runtime/raw-value.cc
M be/src/service/hs2-util.cc
M be/src/service/query-result-set.cc
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/QueryTest/map_null_keys.test
A 
testdata/workloads/functional-query/queries/QueryTest/mixed-collections-and-structs.test
M 
testdata/workloads/functional-query/queries/QueryTest/struct-in-select-list.test
M tests/query_test/test_nested_types.py
23 files changed, 1,113 insertions(+), 257 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/19322/12
--
To view, visit http://gerrit.cloudera.org:8080/19322
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I476d98884b5fd192dfcd4feeec7947526aebe993
Gerrit-Change-Number: 19322
Gerrit-PatchSet: 12
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Peter Rozsa 


[Impala-ASF-CR] IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit pushdown

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19422 )

Change subject: IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit 
pushdown
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12170/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19422
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d6bd58be58d09a29b8b81972e76665f41cf103
Gerrit-Change-Number: 19422
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 16 Jan 2023 12:00:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit pushdown

2023-01-16 Thread Quanlong Huang (Code Review)
Quanlong Huang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19422


Change subject: IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit 
pushdown
..

IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit pushdown

When finding analytic conjuncts for analytic limit pushdown, the
following conditions are checked:
 - The conjunct should be a binary predicate
 - Left hand side is a SlotRef referencing the analytic expression, e.g.
   "rn" of "row_number() as rn"
 - The underlying analytic function is rank(), dense_rank() or row_number()
 - The window frame is UNBOUNDED PRECEDING to CURRENT ROW
 - Right hand side is a valid numeric limit
 - The op is =, <, or <=
See more details in AnalyticPlanner.inferPartitionLimits().

While checking the 2nd and 3rd condition, we get the source exprs of the
SlotRef. The source exprs could be empty if the SlotRef is actually
referencing a column of the table, i.e. a column materialized by the
scan node. Currently, we check the first source expr directly regardless
whether the list is empty, which causes the IndexOutOfBoundsException.

This patch fixes it by augmenting the check to consider an empty list.
Also fixes a similar code in AnalyticEvalNode.

Tests:
 - Add FE and e2e regression tests

Change-Id: I26d6bd58be58d09a29b8b81972e76665f41cf103
---
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-partitioned-top-n.test
M testdata/workloads/functional-query/queries/QueryTest/partitioned-top-n.test
4 files changed, 46 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/19422/1
--
To view, visit http://gerrit.cloudera.org:8080/19422
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I26d6bd58be58d09a29b8b81972e76665f41cf103
Gerrit-Change-Number: 19422
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12169/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 7
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Mon, 16 Jan 2023 11:14:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Code Review
Gergely Fürnstáhl has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..

IMPALA-11821: Adjusting manifest_length and absolute paths in case of metadata 
rewrite

testdata/bin/rewrite-iceberg-metadata.py rewrites manifest and snapshot
files using the provided prefix for file pathes. Snapshot files store
the length of manifest files as well, this needs to be adjusted too.

Additionally, improved path rewrite to be able to rewrite absolute
paths correctly.

Testing:
 - Tested locally, manually

Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
---
M testdata/bin/rewrite-iceberg-metadata.py
M tests/common/skip.py
M tests/query_test/test_iceberg.py
3 files changed, 45 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/19412/7
--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 7
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12168/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 6
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Mon, 16 Jan 2023 10:06:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12167/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 5
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Mon, 16 Jan 2023 10:00:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Code Review
Gergely Fürnstáhl has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..

IMPALA-11821: Adjusting manifest_length and absolute paths in case of metadata 
rewrite

testdata/bin/rewrite-iceberg-metadata.py rewrites manifest and snapshot
files using the provided prefix for file pathes. Snapshot files store
the length of manifest files as well, this needs to be adjusted too.

Additionally, improved path rewrite to be able to rewrite absolute
paths correctly.

Testing:
 - Tested locally, manually

Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
---
M testdata/bin/rewrite-iceberg-metadata.py
M tests/common/skip.py
M tests/query_test/test_iceberg.py
3 files changed, 44 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/19412/6
--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 6
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12166/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 4
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Mon, 16 Jan 2023 09:50:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Code Review
Gergely Fürnstáhl has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..

IMPALA-11821: Adjusting manifest_length and absolute paths in case of metadata 
rewrite

testdata/bin/rewrite-iceberg-metadata.py rewrites manifest and snapshot
files using the provided prefix for file pathes. Snapshot files store
the length of manifest files as well, this needs to be adjusted too.

Additionally, improved path rewrite to be able to rewrite absolute
paths correctly.

Testing:
 - Tested locally, manually

Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
---
M testdata/bin/rewrite-iceberg-metadata.py
M tests/common/skip.py
M tests/query_test/test_iceberg.py
3 files changed, 41 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/19412/5
--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 5
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..


Patch Set 5:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/19412/5/testdata/bin/rewrite-iceberg-metadata.py
File testdata/bin/rewrite-iceberg-metadata.py:

http://gerrit.cloudera.org:8080/#/c/19412/5/testdata/bin/rewrite-iceberg-metadata.py@38
PS5, Line 38: def generate_new_path(prefix, file_path):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/19412/5/testdata/bin/rewrite-iceberg-metadata.py@44
PS5, Line 44: )
flake8: E501 line too long (92 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/19412/5/testdata/bin/rewrite-iceberg-metadata.py@52
PS5, Line 52: s
flake8: E501 line too long (94 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/19412/5/testdata/bin/rewrite-iceberg-metadata.py@65
PS5, Line 65: a
flake8: E501 line too long (96 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/19412/5/tests/query_test/test_iceberg.py
File tests/query_test/test_iceberg.py:

http://gerrit.cloudera.org:8080/#/c/19412/5/tests/query_test/test_iceberg.py@788
PS5, Line 788: 
flake8: W293 blank line contains whitespace


http://gerrit.cloudera.org:8080/#/c/19412/5/tests/query_test/test_iceberg.py@788
PS5, Line 788:
line has trailing whitespace



--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 5
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Mon, 16 Jan 2023 09:41:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10986 (Addendum): Add and refactor some E2E tests

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19394 )

Change subject: IMPALA-10986 (Addendum): Add and refactor some E2E tests
..

IMPALA-10986 (Addendum): Add and refactor some E2E tests

This patch adds an additional test case in test_select_function to
verify Impala's behavior when a user tries to execute a UDF in an INSERT
statement. Moreover, some test functions added in IMPALA-10986 are
refactored according to reviewers' suggestion.

Change-Id: Ieb4f69934401a745da66a983528a7a3679279c28
Reviewed-on: http://gerrit.cloudera.org:8080/19394
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M tests/authorization/test_ranger.py
1 file changed, 239 insertions(+), 242 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/19394
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ieb4f69934401a745da66a983528a7a3679279c28
Gerrit-Change-Number: 19394
Gerrit-PatchSet: 11
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-10986 (Addendum): Add and refactor some E2E tests

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19394 )

Change subject: IMPALA-10986 (Addendum): Add and refactor some E2E tests
..


Patch Set 10: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/19394
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieb4f69934401a745da66a983528a7a3679279c28
Gerrit-Change-Number: 19394
Gerrit-PatchSet: 10
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 16 Jan 2023 09:36:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Code Review
Gergely Fürnstáhl has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19412/2/testdata/bin/rewrite-iceberg-metadata.py
File testdata/bin/rewrite-iceberg-metadata.py:

http://gerrit.cloudera.org:8080/#/c/19412/2/testdata/bin/rewrite-iceberg-metadata.py@69
PS2, Line 69: def fix_manifest_length(entry):
> I think we should remove 'v' from the regex to also match metadata.json fil
Done


http://gerrit.cloudera.org:8080/#/c/19412/2/testdata/bin/rewrite-iceberg-metadata.py@83
PS2, Line 83: if 'format-version' not in metadata:
> Could you please add support for absolute paths?
Done



--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 4
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Mon, 16 Jan 2023 09:30:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Code Review
Gergely Fürnstáhl has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19412/1/testdata/bin/rewrite-iceberg-metadata.py
File testdata/bin/rewrite-iceberg-metadata.py:

http://gerrit.cloudera.org:8080/#/c/19412/1/testdata/bin/rewrite-iceberg-metadata.py@121
PS1, Line 121:
> Should address this.
Done



--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 1
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Mon, 16 Jan 2023 09:29:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Code Review
Gergely Fürnstáhl has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..

IMPALA-11821: Adjusting manifest_length and absolute paths in case of metadata 
rewrite

testdata/bin/rewrite-iceberg-metadata.py rewrites manifest and snapshot
files using the provided prefix for file pathes. Snapshot files store
the length of manifest files as well, this needs to be adjusted too.

Additionally, improved path rewrite to be able to rewrite absolute
paths correctly.

Testing:
 - Tested locally, manually

Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
---
M testdata/bin/rewrite-iceberg-metadata.py
M tests/common/skip.py
M tests/query_test/test_iceberg.py
3 files changed, 41 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/19412/4
--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 4
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 


[Impala-ASF-CR] IMPALA-11821: Adjusting manifest length and absolute paths in case of metadata rewrite

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19412 )

Change subject: IMPALA-11821: Adjusting manifest_length and absolute paths in 
case of metadata rewrite
..


Patch Set 4:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/19412/4/testdata/bin/rewrite-iceberg-metadata.py
File testdata/bin/rewrite-iceberg-metadata.py:

http://gerrit.cloudera.org:8080/#/c/19412/4/testdata/bin/rewrite-iceberg-metadata.py@38
PS4, Line 38: def generate_new_path(prefix, file_path):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/19412/4/testdata/bin/rewrite-iceberg-metadata.py@44
PS4, Line 44: )
flake8: E501 line too long (92 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/19412/4/testdata/bin/rewrite-iceberg-metadata.py@52
PS4, Line 52: s
flake8: E501 line too long (94 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/19412/4/testdata/bin/rewrite-iceberg-metadata.py@65
PS4, Line 65: a
flake8: E501 line too long (96 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/19412/4/tests/query_test/test_iceberg.py
File tests/query_test/test_iceberg.py:

http://gerrit.cloudera.org:8080/#/c/19412/4/tests/query_test/test_iceberg.py@788
PS4, Line 788: 
flake8: W293 blank line contains whitespace


http://gerrit.cloudera.org:8080/#/c/19412/4/tests/query_test/test_iceberg.py@788
PS4, Line 788:
line has trailing whitespace



--
To view, visit http://gerrit.cloudera.org:8080/19412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89b9208f25552012cc1ab16fa60a819dd5a683d9
Gerrit-Change-Number: 19412
Gerrit-PatchSet: 4
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Mon, 16 Jan 2023 09:30:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11833: Fixed manifest length in snapshot files

2023-01-16 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19409 )

Change subject: IMPALA-11833: Fixed manifest_length in snapshot files
..


Patch Set 5: Code-Review+2

Thanks for fixing this!


--
To view, visit http://gerrit.cloudera.org:8080/19409
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I258055998a1d41b7f6047b6879e919834ed2c247
Gerrit-Change-Number: 19409
Gerrit-PatchSet: 5
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 16 Jan 2023 09:23:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11833: Fixed manifest length in snapshot files

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19409 )

Change subject: IMPALA-11833: Fixed manifest_length in snapshot files
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8966/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/19409
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I258055998a1d41b7f6047b6879e919834ed2c247
Gerrit-Change-Number: 19409
Gerrit-PatchSet: 6
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 16 Jan 2023 09:25:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11833: Fixed manifest length in snapshot files

2023-01-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19409 )

Change subject: IMPALA-11833: Fixed manifest_length in snapshot files
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/19409
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I258055998a1d41b7f6047b6879e919834ed2c247
Gerrit-Change-Number: 19409
Gerrit-PatchSet: 6
Gerrit-Owner: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 16 Jan 2023 09:25:52 +
Gerrit-HasComments: No