[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15778/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 8
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 04 Apr 2024 05:28:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in Streaming Aggregation

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21235 )

Change subject: IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in 
Streaming Aggregation
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10491/ 
DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/21235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59205a4b06824ee1607a25e906db1f96dc4eda9f
Gerrit-Change-Number: 21235
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 04 Apr 2024 05:27:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12612: SELECT * queries expand complex type columns from Iceberg metadata tables

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21236 )

Change subject: IMPALA-12612: SELECT * queries expand complex type columns from 
Iceberg metadata tables
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15777/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21236
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
Gerrit-Change-Number: 21236
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 04 Apr 2024 05:25:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-03 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..

IMPALA-12925: Fix decimal data type for external JDBC table

Decimal is a primitive data type in Impala. The current code returns
wrong values for columns with decimal data type in external JDBC tables.

This patch fixes the wrong values returned from the JDBC data source and
supports pushing down predicates on decimal columns to the remote
database and to remote Impala.
The decimal precision and scale of each column in the external JDBC
table must be no less than the precision and scale of the corresponding
column in the remote database table. Otherwise, Impala fails with an
error, since the mismatch could truncate decimal data.
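
The precision/scale rule above can be illustrated with a minimal sketch. This is not the code from the patch; the names DecimalType and check_decimal_compatible are hypothetical and only restate the rule as the commit message words it.

```python
from typing import NamedTuple


class DecimalType(NamedTuple):
    """Hypothetical stand-in for a DECIMAL(precision, scale) column type."""
    precision: int  # total number of digits
    scale: int      # digits after the decimal point


def check_decimal_compatible(local: DecimalType, remote: DecimalType) -> None:
    """Raise if the external JDBC table column is narrower than the remote column.

    Mirrors the stated rule: local precision and scale must each be
    no less than the remote precision and scale.
    """
    if local.precision < remote.precision or local.scale < remote.scale:
        raise ValueError(
            f"DECIMAL({local.precision},{local.scale}) is narrower than remote "
            f"DECIMAL({remote.precision},{remote.scale}); values may be truncated"
        )
```

For example, a local DECIMAL(12,4) column over a remote DECIMAL(10,2) column passes the check, while a local DECIMAL(8,2) column over the same remote column raises an error.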

Testing:
 - Added Planner test for pushing down predicates on decimal columns.
 - Added end-to-end unit-tests for tables with decimal columns
   for Postgres, MySQL, and Impala-to-Impala.
 - Passed core-tests.

Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
---
M fe/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java
M fe/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java
M fe/src/main/java/org/apache/impala/extdatasource/jdbc/util/QueryConditionUtil.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M testdata/bin/clean-mysql-env.sh
M testdata/bin/create-ext-data-source-table.sql
M testdata/bin/load-ext-data-sources.sh
M testdata/bin/setup-mysql-env.sh
M testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
M testdata/workloads/functional-query/queries/QueryTest/impala-ext-jdbc-tables-predicates.test
M testdata/workloads/functional-query/queries/QueryTest/impala-ext-jdbc-tables.test
M testdata/workloads/functional-query/queries/QueryTest/jdbc-data-source.test
M testdata/workloads/functional-query/queries/QueryTest/mysql-ext-jdbc-tables.test
13 files changed, 566 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/21218/8
--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 8
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 


[Impala-ASF-CR] IMPALA-12612: SELECT * queries expand complex type columns from Iceberg metadata tables

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21236 )

Change subject: IMPALA-12612: SELECT * queries expand complex type columns from 
Iceberg metadata tables
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10490/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21236
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
Gerrit-Change-Number: 21236
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 04 Apr 2024 05:03:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12612: SELECT * queries expand complex type columns from Iceberg metadata tables

2024-04-03 Thread Gabor Kaszab (Code Review)
Hello Daniel Becker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21236

to look at the new patch set (#2).

Change subject: IMPALA-12612: SELECT * queries expand complex type columns from 
Iceberg metadata tables
..

IMPALA-12612: SELECT * queries expand complex type columns from Iceberg 
metadata tables

Currently, as with regular tables, nested columns are omitted from a
SELECT * on Iceberg metadata tables unless the user turns
EXPAND_COMPLEX_TYPES on. This patch changes this behaviour to
unconditionally include the nested columns from Iceberg metadata tables.
Note, the behavior of handling nested columns from regular tables
doesn't change with this patch.
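
The behaviour described above can be sketched roughly as follows. This is an illustration only, not the logic in SelectStmt.java; the Column class and the expand_star function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Column:
    """Hypothetical column descriptor."""
    name: str
    is_complex: bool  # True for nested types such as struct, array, or map


def expand_star(columns: List[Column],
                is_iceberg_metadata_table: bool,
                expand_complex_types: bool) -> List[str]:
    """Return the column names that SELECT * would produce.

    Nested columns are always included for Iceberg metadata tables;
    for regular tables they are included only when the
    EXPAND_COMPLEX_TYPES option is on.
    """
    include_complex = is_iceberg_metadata_table or expand_complex_types
    return [c.name for c in columns if include_complex or not c.is_complex]
```

With a scalar column and a nested column, a metadata table returns both regardless of the option, while a regular table returns only the scalar column unless the option is enabled.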

Testing:
  - Adjusted the SELECT * metadata table queries to add the nested
columns into the results.
  - Added some new tests where both metadata tables and regular tables
were queried in the same query.

Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
---
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
2 files changed, 103 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/21236/2
--
To view, visit http://gerrit.cloudera.org:8080/21236
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
Gerrit-Change-Number: 21236
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-12612: SELECT * queries expand complex type columns from Iceberg metadata tables

2024-04-03 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21236 )

Change subject: IMPALA-12612: SELECT * queries expand complex type columns from 
Iceberg metadata tables
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21236/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21236/1//COMMIT_MSG@7
PS1, Line 7: IMPALA-12612: SELECT * queries expand complex type columns from 
Iceberg metadata tables
> The title could include "select *", otherwise it's not clear what this refe
Done


http://gerrit.cloudera.org:8080/#/c/21236/1//COMMIT_MSG@9
PS1, Line 9:
> Nit: should come after "behave".
Done


http://gerrit.cloudera.org:8080/#/c/21236/1//COMMIT_MSG@14
PS1, Line 14: Note, the behavior of handling nested columns from regular tables
> We could mention that although this is technically a breaking change, metad
I don't think this is a breaking change because we don't have a release ATM 
that contains metadata querying. I checked that the parser and planner parts 
went into 4.3 but the executor part is waiting for 4.4 so we are safe with this 
change.


http://gerrit.cloudera.org:8080/#/c/21236/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
File 
testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test:

http://gerrit.cloudera.org:8080/#/c/21236/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test@25
PS1, Line 25: $NAMENODE/test-warehou
> Is it intentional that these are changed from "$NAMENODE" to "hdfs://localh
It isn't, thanks for noticing. Dockerised GVO also broke because of this.


http://gerrit.cloudera.org:8080/#/c/21236/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test@1110
PS1, Line 1110: select readable_metrics.i.* from 
functional_parquet.iceberg_query_metadata.`files`;
> This example can be confusing at first because the 'readable_metrics' struc
I changed this to expand readable_metrics.i



--
To view, visit http://gerrit.cloudera.org:8080/21236
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
Gerrit-Change-Number: 21236
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 04 Apr 2024 05:03:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..

IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

IMPALA-12018 adds reduceCardinalityForScanNode to lower the cardinality
estimate when a runtime filter is involved. It calls
JoinNode.computeGenericJoinCardinality(). However, if the originating
join node has an FK-PK conjunct, it should be possible to obtain a lower
cardinality estimate by calling JoinNode.getFkPkJoinCardinality()
instead.

This patch adds that analysis and calls
JoinNode.getFkPkJoinCardinality() when possible. This is, however,
limited to runtime filters that are evaluated at the storage layer, such
as partition filters and pushed-down Kudu filters. Row-level runtime
filters that are evaluated at the scan node continue to use
JoinNode.computeGenericJoinCardinality().
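
The selection rule in the paragraph above can be summarised in a small sketch. The FilterLevel enum and pick_estimator function are hypothetical; only the two JoinNode method names come from the commit message, and the real planner code may apply further conditions.

```python
from enum import Enum


class FilterLevel(Enum):
    """Hypothetical classification of where a runtime filter is evaluated."""
    STORAGE = "storage"  # partition filter or pushed-down Kudu filter
    ROW = "row"          # evaluated row by row at the scan node


def pick_estimator(level: FilterLevel, has_fk_pk_conjunct: bool) -> str:
    """Name the JoinNode method the description says is used for the estimate."""
    if level is FilterLevel.STORAGE and has_fk_pk_conjunct:
        return "getFkPkJoinCardinality"
    # Row-level filters, and joins without an FK-PK conjunct, keep the
    # generic estimate.
    return "computeGenericJoinCardinality"
```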

This distinction exists because a storage layer filter is applied more
consistently than a row-level filter. For example, a partition filter
evaluates every partition_id and is never disabled regardless of its
precision (see HdfsScanNodeBase::PartitionPassesFilters). On the other
hand, the scan node can later disable a row-level filter if it is deemed
ineffective / not precise enough (see
HdfsScanner::CheckFiltersEffectiveness,
LocalFilterStats::enabled_for_row, and min_filter_reject_ratio flag).
For the pushed-down Kudu filter, Impala relies on Kudu to evaluate
the filter.
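
A rough sketch of the row-level case, assuming a filter is kept only while the fraction of rejected rows stays at or above a threshold. The function name, its parameters, and the exact comparison are illustrative assumptions and are not taken from HdfsScanner::CheckFiltersEffectiveness.

```python
def row_filter_stays_enabled(rows_considered: int,
                             rows_rejected: int,
                             min_reject_ratio: float) -> bool:
    """Decide whether a row-level runtime filter is worth keeping.

    Assumption: the filter is kept while rejected / considered is at
    least min_reject_ratio. With no rows seen yet there is no evidence
    against the filter, so it stays enabled.
    """
    if rows_considered == 0:
        return True
    return rows_rejected / rows_considered >= min_reject_ratio
```

Under this assumption, a filter that rejects 50 of 1000 rows with a threshold of 0.1 would be disabled, while one that rejects 500 of 1000 would stay enabled. A storage layer filter has no such check, which is why its effect on cardinality is more predictable.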

Runtime filters can arrive late as well. But for both storage layer
filters and row-level filters, the scan node can stop waiting and start
scanning after runtime_filter_wait_time_ms has passed. The scan node will
still evaluate a late runtime filter if the scan is still ongoing.

Also, note that this cardinality reduction algorithm considers only
highly selective runtime filters, to increase confidence in its
estimate (see RuntimeFilter.isHighlySelective()).

Testing:
- Update TpcdsCpuCostPlannerTest.
- Pass FE tests.

Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Reviewed-on: http://gerrit.cloudera.org:8080/21118
Reviewed-by: Wenzhe Zhou 
Reviewed-by: Michael Smith 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M testdata/workloads/functional-planner/queries/PlannerTest/outer-to-inner-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-cardinality-reduction-on-kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-cardinality-reduction.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q17.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q25.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q29.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q33.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q34.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q42.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q46.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q48.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q49.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q52.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q53.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q55.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q56.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q60.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q61.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q63.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q64.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q66.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q75.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q89.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test
M 

[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..


Patch Set 8: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Gerrit-Change-Number: 21118
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 04 Apr 2024 04:46:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15776/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 7
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 04 Apr 2024 04:11:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-03 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..

IMPALA-12925: Fix decimal data type for external JDBC table

Decimal is a primitive data type in Impala. The current code returns
wrong values for columns with decimal data type in external JDBC tables.

This patch fixes the wrong values returned from the JDBC data source and
supports pushing down predicates on decimal columns to the remote
database and to remote Impala.
The decimal precision and scale of each column in the external JDBC
table must be no less than the precision and scale of the corresponding
column in the remote database table. Otherwise, Impala fails with an
error, since the mismatch could truncate decimal data.

Testing:
 - Added Planner test for pushing down predicates on decimal columns.
 - Added end-to-end unit-tests for tables with decimal columns
   for Postgres, MySQL, and Impala-to-Impala.
 - Passed core-tests.

Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
---
M fe/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java
M fe/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java
M fe/src/main/java/org/apache/impala/extdatasource/jdbc/util/QueryConditionUtil.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M testdata/bin/clean-mysql-env.sh
M testdata/bin/create-ext-data-source-table.sql
M testdata/bin/load-ext-data-sources.sh
M testdata/bin/setup-mysql-env.sh
M testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
M testdata/workloads/functional-query/queries/QueryTest/impala-ext-jdbc-tables-predicates.test
M testdata/workloads/functional-query/queries/QueryTest/impala-ext-jdbc-tables.test
M testdata/workloads/functional-query/queries/QueryTest/jdbc-data-source.test
M testdata/workloads/functional-query/queries/QueryTest/mysql-ext-jdbc-tables.test
13 files changed, 565 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/21218/7
--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 7
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 


[Impala-ASF-CR] IMPALA-12612: Expand complex type columns from Iceberg metadata tables

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21236 )

Change subject: IMPALA-12612: Expand complex type columns from Iceberg metadata 
tables
..


Patch Set 1: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10488/


--
To view, visit http://gerrit.cloudera.org:8080/21236
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
Gerrit-Change-Number: 21236
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 04 Apr 2024 02:05:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12969: Release JNI array if DeserializeThriftMsg failed

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21234 )

Change subject: IMPALA-12969: Release JNI array if DeserializeThriftMsg failed
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2c0335b12e9289ae851d0ec050765951a8ca6c7
Gerrit-Change-Number: 21234
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 04 Apr 2024 00:13:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5323: Support BINARY columns in Kudu tables

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18868 )

Change subject: IMPALA-5323: Support BINARY columns in Kudu tables
..


Patch Set 7: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10486/


--
To view, visit http://gerrit.cloudera.org:8080/18868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
Gerrit-Change-Number: 18868
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Wed, 03 Apr 2024 23:57:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12152: Add query option to wait for events sync up

2024-04-03 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20131 )

Change subject: IMPALA-12152: Add query option to wait for events sync up
..


Patch Set 14: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20131/14/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/20131/14/be/src/service/client-request-state.cc@601
PS14, Line 601:   coord_->AddErrorLog(early_error_msg_);
Can/should we clear early_error_msg_ here? Could at least free up the memory.


http://gerrit.cloudera.org:8080/#/c/20131/14/tests/metadata/test_event_processing_base.py
File tests/metadata/test_event_processing_base.py:

http://gerrit.cloudera.org:8080/#/c/20131/14/tests/metadata/test_event_processing_base.py@30
PS14, Line 30: class TestEventProcessingBase(ImpalaTestSuite):
nit: either this doesn't need to inherit from ImpalaTestSuite, or you could 
have TestEventProcessing inherit from it. Do the SkipIf clauses do anything?



--
To view, visit http://gerrit.cloudera.org:8080/20131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36ac941bb2c2217b09fcfa2eb567b011b38efa2a
Gerrit-Change-Number: 20131
Gerrit-PatchSet: 14
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 03 Apr 2024 23:57:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in Streaming Aggregation

2024-04-03 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21235 )

Change subject: IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in 
Streaming Aggregation
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59205a4b06824ee1607a25e906db1f96dc4eda9f
Gerrit-Change-Number: 21235
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 03 Apr 2024 23:53:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in Streaming Aggregation

2024-04-03 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21235 )

Change subject: IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in 
Streaming Aggregation
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59205a4b06824ee1607a25e906db1f96dc4eda9f
Gerrit-Change-Number: 21235
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 03 Apr 2024 23:48:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-04-03 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..


Patch Set 8: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Gerrit-Change-Number: 21118
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 03 Apr 2024 23:40:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10489/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Gerrit-Change-Number: 21118
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 03 Apr 2024 23:40:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-04-03 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..


Patch Set 8: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Gerrit-Change-Number: 21118
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 03 Apr 2024 23:38:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-04-03 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21118/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21118/7//COMMIT_MSG@37
PS7, Line 37: evaluate a late runtime filter later on if the scan process is 
still
> nit: capital A?
Done



--
To view, visit http://gerrit.cloudera.org:8080/21118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Gerrit-Change-Number: 21118
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 03 Apr 2024 23:33:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12920: Support ai_generate_text built-in function for OpenAI's chat completion API

2024-04-03 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21168 )

Change subject: IMPALA-12920: Support ai_generate_text built-in function for 
OpenAI's chat completion API
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21168/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21168/6//COMMIT_MSG@38
PS6, Line 38:
I don't see a test case for key_jceks_secret.


http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/runtime/exec-env.cc@524
PS6, Line 524: frontend_->GetSecretFromKeyStore(
It's better to get jceks_secret on each AiFunctions::AiGenerateTextInternal() 
call instead of caching it.


http://gerrit.cloudera.org:8080/#/c/21168/6/fe/src/main/java/org/apache/impala/service/JniFrontend.java
File fe/src/main/java/org/apache/impala/service/JniFrontend.java:

http://gerrit.cloudera.org:8080/#/c/21168/6/fe/src/main/java/org/apache/impala/service/JniFrontend.java@812
PS6, Line 812: new String()
nit: init as null



--
To view, visit http://gerrit.cloudera.org:8080/21168

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id4446957f6030bab1f985fdd69185c3da07d7c4b
Gerrit-Change-Number: 21168
Gerrit-PatchSet: 6
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Wed, 03 Apr 2024 23:33:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-04-03 Thread Riza Suminto (Code Review)
Hello Daniel Becker, Csaba Ringhofer, Wenzhe Zhou, Michael Smith, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21118

to look at the new patch set (#8).

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..

IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

IMPALA-12018 adds reduceCardinalityForScanNode to lower cardinality
estimation when a runtime filter is involved. It calls
JoinNode.computeGenericJoinCardinality(). However, if the originating
join node has FK-PK conjunct, it should be possible to obtain a lower
cardinality estimate by calling JoinNode.getFkPkJoinCardinality()
instead.

This patch adds that analysis and calls
JoinNode.getFkPkJoinCardinality() when possible. It is, however, only
limited to runtime filters that evaluate at the storage layer, such as
partition filter and pushed-down Kudu filter. Row-level runtime filters
that evaluate at scan node will continue using
JoinNode.computeGenericJoinCardinality().

This distinction is because a storage layer filter is applied more
consistently than a row-level filter. For example, a partition filter
evaluates all partition_id values and is never disabled regardless of its
precision (see HdfsScanNodeBase::PartitionPassesFilters). On the other
hand, scan node can disable a row-level filter later on if it is deemed
ineffective / not precise enough (see
HdfsScanner::CheckFiltersEffectiveness,
LocalFilterStats::enabled_for_row, and min_filter_reject_ratio flag).
For the pushed-down Kudu filter, Impala will rely on Kudu to evaluate
the filter.

Runtime filters can arrive late as well. But for both storage layer
filter and row-level filter, the scan node can stop waiting and start
scanning after runtime_filter_wait_time_ms has passed. Scan node will still
evaluate a late runtime filter later on if the scan process is still
ongoing.

Also, note that this cardinality reduction algorithm is based only on
highly selective runtime filters to increase its estimate
confidence (see RuntimeFilter.isHighlySelective()).
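
The selection rule described above can be sketched roughly as follows. This is
an illustrative Python sketch only, not the planner's Java code: the function
pick_estimator, the filter kind labels, and the boolean inputs are hypothetical
names; only the two JoinNode method names come from the description.

```python
# Hypothetical labels for filters that are evaluated at the storage layer.
STORAGE_LAYER_FILTERS = {"partition", "kudu_pushdown"}


def pick_estimator(filter_kind, has_fk_pk_conjunct, highly_selective):
    """Return the name of the JoinNode estimator the description implies.

    Returns None when the filter is not highly selective, since such a
    filter is not used for cardinality reduction at all.
    """
    if not highly_selective:
        return None
    if has_fk_pk_conjunct and filter_kind in STORAGE_LAYER_FILTERS:
        return "getFkPkJoinCardinality"
    # Row-level filters, and joins without an FK-PK conjunct, keep the
    # generic estimate.
    return "computeGenericJoinCardinality"
```

Under these assumptions, a partition filter from an FK-PK join selects the
FK-PK estimate, while a row-level filter from the same join keeps the generic
one.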

Testing:
- Update TpcdsCpuCostPlannerTest.
- Pass FE tests.

Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
---
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/outer-to-inner-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-cardinality-reduction-on-kudu.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-cardinality-reduction.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q17.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q25.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q29.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q33.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q34.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q42.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q46.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q48.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q49.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q52.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q53.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q55.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q56.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q60.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q61.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q63.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q64.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q66.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q75.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q89.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test
M 

[Impala-ASF-CR] IMPALA-12291 impala checks hdfs ranger policy

2024-04-03 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20221 )

Change subject: IMPALA-12291 impala checks hdfs ranger policy
..


Patch Set 9:

(2 comments)

Thanks for the reply Halim!

Other than changing how the catalog server initializes the access level of an 
HdfsTable during table loading, it may also make sense to disable 
analyzeWriteAccess() at
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java#L406-L408
 depending on the value of the newly introduced startup flag.

I do not have a strong objection against adding a new startup flag in this 
patch. But maybe Quanlong and Aman could chime in to see if there is a better 
alternative since we have already got a lot of flags.

http://gerrit.cloudera.org:8080/#/c/20221/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/20221/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@852
PS5, Line 852: location = location.getParent();
> Your test code will be very helpful. Thank you Fang-Yu.
Thanks Halim! Not a problem.

Adding a startup flag like 'hdfs_permission_check' or 
'skip_fs_permissions_check_in_analysis' to give the Impala administrator a 
choice is reasonable and it does not have to depend on whether Ranger is 
enabled.

It would be good to mention the purpose of not performing the file system 
permissions check somewhere (maybe in the commit message).

This prevents users from encountering an AnalysisException during query
analysis when the target table or any partition inside is not writable
to the Impala service according to Impala's FsPermissionChecker() even
though the Impala service user is granted the READ and WRITE privileges
on the respective file system paths via the Ranger policy repository of the
corresponding storage.

If we add such a startup flag, e.g., 'skip_fs_permissions_check_in_analysis', 
then we could still add end-to-end tests to 
https://github.com/apache/impala/blob/master/tests/authorization/test_ranger.py 
like the following to make sure file system permissions check is still 
performed if the Impala administrator does not want to skip it.

  @pytest.mark.execute_serially
  @SkipIfFS.hdfs_acls
  @CustomClusterTestSuite.with_args(
    impalad_args="{0} {1}".format(CATALOGD_ARGS,
                                  "--skip_fs_permissions_check_in_analysis=false"),
    catalogd_args="{0} {1}".format(CATALOGD_ARGS,
                                   "--skip_fs_permissions_check_in_analysis=false"))
  def test_insert_with_catalog_v1_not_skip_fs_permissions_check(self, unique_name):
    self._test_insert_with_catalog_v1(unique_name, False)

  def _test_insert_with_catalog_v1(self, unique_name, skip_fs_permissions_check=True):
    """
    Test that when Ranger is the authorization provider in the legacy catalog mode,
    Impala skips or performs the file system permissions checking in query analysis
    depending on the startup flag 'skip_fs_permissions_check_in_analysis'.
    """
    user = getuser()
    admin_client = self.create_impala_client()
    unique_database = unique_name + "_db"
    unique_table = unique_name + "_tbl"
    table_path = "test-warehouse/{0}.db/{1}".format(unique_database, unique_table)
    try:
      admin_client.execute("drop database if exists {0} cascade"
                           .format(unique_database), user=ADMIN)
      admin_client.execute("create database {0}".format(unique_database), user=ADMIN)
      admin_client.execute("create table {0}.{1} (x int)"
                           .format(unique_database, unique_table), user=ADMIN)
      admin_client.execute("grant insert on table {0}.{1} to user {2}"
                           .format(unique_database, unique_table, user))

      # Change the owner user and group of the HDFS path corresponding to the table
      # so that according to Impala's FsPermissionChecker, the table could not be
      # writable to the user that loads the table. This user usually is the one
      # representing the Impala service.
      self.hdfs_client.chown(table_path, "another_user", "another_group")

      # Invalidate the table metadata to force the catalog server to reload the HDFS
      # table.
      admin_client.execute("invalidate metadata {0}.{1}"
                           .format(unique_database, unique_table), user=ADMIN)

      # Verify that Impala skips or performs the permissions checking in
      # HdfsTable#getAvailableAccessLevel() depending on the startup flag
      # 'skip_fs_permissions_check_in_analysis'.
      query = "insert into {0}.{1} values (1)".format(unique_database, unique_table)
      if skip_fs_permissions_check:
        self._run_query_as_user(query, user, True)
      else:
        result = self._run_query_as_user(query, user, False)
        err = "Unable to INSERT into target table"
   

[Impala-ASF-CR] IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in Streaming Aggregation

2024-04-03 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21235 )

Change subject: IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in 
Streaming Aggregation
..


Patch Set 1: Code-Review+1

The fix makes sense to me.


--
To view, visit http://gerrit.cloudera.org:8080/21235

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59205a4b06824ee1607a25e906db1f96dc4eda9f
Gerrit-Change-Number: 21235
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 03 Apr 2024 23:31:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12543: Detect self-events before finishing DDL

2024-04-03 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21029 )

Change subject: IMPALA-12543: Detect self-events before finishing DDL
..


Patch Set 18:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21029/18//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21029/18//COMMIT_MSG@51
PS18, Line 51: - Pass exhaustive tests.
I reran the exhaustive tests overnight and some tests are now failing. I'll take 
a closer look at what's happening.



--
To view, visit http://gerrit.cloudera.org:8080/21029

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79
Gerrit-Change-Number: 21029
Gerrit-PatchSet: 18
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 03 Apr 2024 23:21:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-04-03 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..


Patch Set 7: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21118/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21118/7//COMMIT_MSG@37
PS7, Line 37: evaluate A late runtime filter later on if the scan process is 
still
nit: capital A?



--
To view, visit http://gerrit.cloudera.org:8080/21118

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Gerrit-Change-Number: 21118
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 03 Apr 2024 23:24:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12920: Support ai_generate_text built-in function for OpenAI's chat completion API

2024-04-03 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21168 )

Change subject: IMPALA-12920: Support ai_generate_text built-in function for 
OpenAI's chat completion API
..


Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/exprs/ai-functions-ir.cc
File be/src/exprs/ai-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/exprs/ai-functions-ir.cc@101
PS6, Line 101:   const rapidjson::Value& firstChoice = 
document[OPEN_AI_RESPONSE_FIELD_CHOICES][0];
Theoretically you could set the 'n' parameter, which would return multiple 
choices. This function doesn't support more than one response choice; we should 
mention it somewhere in the documentation.

We could potentially return an error when parsing params below.
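
A minimal sketch of that idea, written in Python for brevity rather than the
C++ of ai-functions-ir.cc. The helper names validate_params and
first_choice_content are hypothetical; the "n" request parameter and the
"choices"/"message"/"content" response fields follow OpenAI's chat completion
format.

```python
import json


def validate_params(params):
    """Reject request params that ask for more than one response choice."""
    n = params.get("n", 1)
    if n != 1:
        raise ValueError(
            "only a single response choice is supported, got n=%r" % (n,))


def first_choice_content(response_text):
    """Return the message content of the first choice in the response."""
    document = json.loads(response_text)
    choices = document.get("choices") or []
    if not choices:
        raise ValueError("response has no choices")
    # Any additional choices are ignored, which is why n > 1 is rejected
    # up front in validate_params().
    return choices[0]["message"]["content"]
```

Rejecting n > 1 while parsing params keeps the limitation visible to the user
instead of silently dropping the extra choices.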


http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/exprs/ai-functions-ir.cc@128
PS6, Line 128:   string endpoint_str(FLAGS_ai_endpoint);
This makes an unnecessary copy; can we use string_view instead? 
https://en.cppreference.com/w/cpp/string/basic_string_view was added in C++17, 
which we now use.

Could also just make initializing the value from FLAGS_ai_endpoint happen in an 
else clause, since it's going to be used as a 'const string&' for 
curl.PostToURL.


http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/exprs/scalar-expr-evaluator.cc
File be/src/exprs/scalar-expr-evaluator.cc:

http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/exprs/scalar-expr-evaluator.cc@453
PS6, Line 453:   AiFunctions::AiGenerateText(nullptr, StringVal::null(), 
StringVal::null(),
Presumably this results in an error because 'prompt' is a null string, but it 
might make sense to use the dry_run=true version to be safe.


http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/runtime/exec-env.cc@a213
PS6, Line 213:
nit: unnecessary whitespace change; spacing here is pretty arbitrary


http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/runtime/exec-env.cc@528
PS6, Line 528:   AiFunctions::set_api_key(api_key);
Is this safe to permanently cache? I guess this comes from a site file, so it 
probably can't be dynamically updated.


http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/runtime/exec-env.cc@535
PS6, Line 535:   LOG(ERROR) << "Config 'ai_endpoint' (" << 
FLAGS_ai_endpoint << ") is invalid"
These don't cause anything immediately to fail. What's the rationale for not 
failing startup on invalid config?

Could they be implemented via DEFINE_validator?


http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/udf/udf.h
File be/src/udf/udf.h:

http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/udf/udf.h@a742
PS6, Line 742:
nit: unnecessary whitespace change, although I think the new form is a little 
more consistent with the rest of our code.



--
To view, visit http://gerrit.cloudera.org:8080/21168

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id4446957f6030bab1f985fdd69185c3da07d7c4b
Gerrit-Change-Number: 21168
Gerrit-PatchSet: 6
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Wed, 03 Apr 2024 22:39:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15775/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21218

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Wed, 03 Apr 2024 22:15:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12612: Expand complex type columns from Iceberg metadata tables

2024-04-03 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21236 )

Change subject: IMPALA-12612: Expand complex type columns from Iceberg metadata 
tables
..


Patch Set 1:

(5 comments)

Thanks Gábor, it looks good, I have only a few remarks.

http://gerrit.cloudera.org:8080/#/c/21236/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21236/1//COMMIT_MSG@7
PS1, Line 7: IMPALA-12612: Expand complex type columns from Iceberg metadata 
tables
The title could include "select *", otherwise it's not clear what this refers 
to.


http://gerrit.cloudera.org:8080/#/c/21236/1//COMMIT_MSG@9
PS1, Line 9: ,
Nit: should come after "behave".


http://gerrit.cloudera.org:8080/#/c/21236/1//COMMIT_MSG@14
PS1, Line 14: Note, the behavior of handling nested columns from regular tables
We could mention that although this is technically a breaking change, metadata 
tables are a very recent feature so it is not problematic.


http://gerrit.cloudera.org:8080/#/c/21236/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
File 
testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test:

http://gerrit.cloudera.org:8080/#/c/21236/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test@25
PS1, Line 25: hdfs://localhost:20500
Is it intentional that these are changed from "$NAMENODE" to 
"hdfs://localhost:20500"? Applies also to some of the other queries.


http://gerrit.cloudera.org:8080/#/c/21236/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test@1110
PS1, Line 1110: select readable_metrics.* from 
functional_parquet.iceberg_query_metadata.`files`;
This example can be confusing at first because the 'readable_metrics' struct 
itself only contains a single struct. I thought first that it failed to expand 
to the columns "column_size", "value_count" etc. We could either mention it in 
a comment and/or expand "readable_metrics.i" instead.



--
To view, visit http://gerrit.cloudera.org:8080/21236

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
Gerrit-Change-Number: 21236
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 03 Apr 2024 21:52:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-03 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..

IMPALA-12925: Fix decimal data type for external JDBC table

Decimal type is a primitive data type for Impala. Current code returns
wrong values for columns with decimal data type in external JDBC tables.

This patch fixes wrong values returned from JDBC data source, and
supports pushing down decimal type of predicates to remote database
and remote Impala.
The decimal precisions and scales of the columns in external JDBC table
must be no less than the decimal precisions and scales of the
corresponding columns in the table of remote database. Otherwise,
Impala fails with an error since it may cause truncation of decimal
data.
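
As a rough illustration of the rule stated above (a literal reading of this
description, not the actual Impala check), the compatibility test could be
sketched in Python as below. The function name can_hold and the
(precision, scale) tuples are hypothetical.

```python
def can_hold(local, remote):
    """Return True if the local DECIMAL column can hold remote values.

    Both arguments are (precision, scale) tuples. Following the rule in
    the description, the local precision and scale must each be no less
    than the remote ones, otherwise values may be truncated.
    """
    local_precision, local_scale = local
    remote_precision, remote_scale = remote
    return local_precision >= remote_precision and local_scale >= remote_scale
```

For example, a local DECIMAL(18,4) column passes against a remote
DECIMAL(10,2) column, while the reverse pairing does not.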

Testing:
 - Added Planner test for pushing down decimal type of predicates.
 - Added end-to-end unit-tests for tables with decimal type of columns
   for Postgres, MySQL, and Impala-to-Impala.
 - Passed core-tests.

Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
---
M fe/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java
M 
fe/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java
M 
fe/src/main/java/org/apache/impala/extdatasource/jdbc/util/QueryConditionUtil.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M testdata/bin/clean-mysql-env.sh
M testdata/bin/create-ext-data-source-table.sql
M testdata/bin/load-ext-data-sources.sh
M testdata/bin/setup-mysql-env.sh
M 
testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
M 
testdata/workloads/functional-query/queries/QueryTest/impala-ext-jdbc-tables-predicates.test
M 
testdata/workloads/functional-query/queries/QueryTest/impala-ext-jdbc-tables.test
M testdata/workloads/functional-query/queries/QueryTest/jdbc-data-source.test
M 
testdata/workloads/functional-query/queries/QueryTest/mysql-ext-jdbc-tables.test
13 files changed, 564 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/21218/6
--
To view, visit http://gerrit.cloudera.org:8080/21218

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 


[Impala-ASF-CR] IMPALA-12969: Release JNI array if DeserializeThriftMsg failed

2024-04-03 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21234 )

Change subject: IMPALA-12969: Release JNI array if DeserializeThriftMsg failed
..


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21234/1/be/src/rpc/jni-thrift-util.h
File be/src/rpc/jni-thrift-util.h:

http://gerrit.cloudera.org:8080/#/c/21234/1/be/src/rpc/jni-thrift-util.h@61
PS1, Line 61:   Status status = DeserializeThriftMsg(
We could use something like JniUtil::JniUtfCharGuard for arrays as well. I 
understand that it is urgent to correct this error so it can be done in another 
commit.



--
To view, visit http://gerrit.cloudera.org:8080/21234

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2c0335b12e9289ae851d0ec050765951a8ca6c7
Gerrit-Change-Number: 21234
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 03 Apr 2024 21:39:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-04-03 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..


Patch Set 7: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21118

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Gerrit-Change-Number: 21118
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 03 Apr 2024 21:27:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12965: Add debug query option RUNTIME_FILTER_IDS_TO_SKIP

2024-04-03 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21230 )

Change subject: IMPALA-12965: Add debug query option RUNTIME_FILTER_IDS_TO_SKIP
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21230/2/be/src/service/query-options-test.cc
File be/src/service/query-options-test.cc:

http://gerrit.cloudera.org:8080/#/c/21230/2/be/src/service/query-options-test.cc@739
PS2, Line 739: 0, 1, , 2
nit: just curious why "0,,1" is accepted, but "0, 1, , 2" is rejected. Maybe we 
should trim the spaces after splitting the string and skip empty strings.
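
A minimal sketch of the suggested behavior, in Python for brevity (the real
option parsing is C++ in query-options.cc; the function name parse_filter_ids
is hypothetical):

```python
def parse_filter_ids(value):
    """Parse a comma-separated list of integer ids, ignoring blank tokens."""
    ids = []
    for token in value.split(","):
        token = token.strip()  # trim spaces around each token
        if not token:
            continue  # skip empty strings such as the gap in "0,,1"
        ids.append(int(token))  # raises ValueError on non-numeric input
    return ids
```

With this handling, "0,,1" and "0, 1, , 2" are treated consistently: both are
accepted and the blank entries are dropped.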


http://gerrit.cloudera.org:8080/#/c/21230/2/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/21230/2/be/src/service/query-options.cc@1257
PS2, Line 1257: t
trim the spaces and skip empty string?



--
To view, visit http://gerrit.cloudera.org:8080/21230

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I897e37685dd1ec279989b55560ec7616a00d2280
Gerrit-Change-Number: 21230
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 03 Apr 2024 21:25:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12612: Expand complex type columns from Iceberg metadata tables

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21236 )

Change subject: IMPALA-12612: Expand complex type columns from Iceberg metadata 
tables
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15774/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21236

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
Gerrit-Change-Number: 21236
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 03 Apr 2024 21:21:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12612: Expand complex type columns from Iceberg metadata tables

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21236 )

Change subject: IMPALA-12612: Expand complex type columns from Iceberg metadata 
tables
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10488/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21236

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
Gerrit-Change-Number: 21236
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 03 Apr 2024 21:01:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12612: Expand complex type columns from Iceberg metadata tables

2024-04-03 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21236


Change subject: IMPALA-12612: Expand complex type columns from Iceberg metadata 
tables
..

IMPALA-12612: Expand complex type columns from Iceberg metadata tables

Similarly to how regular tables, behave the nested columns are omitted
when we do a SELECT * on Iceberg metadata tables and the user needs to
turn EXPAND_COMPLEX_TYPES on to include the nested columns into the
result. This patch changes this behaviour to unconditionally include
the nested columns from Iceberg metadata tables.
Note, the behavior of handling nested columns from regular tables
doesn't change with this patch.

Testing:
  - Adjusted the SELECT * metadata table queries to add the nested
columns into the results.
  - Added some new tests where both metadata tables and regular tables
were queried in the same query.

Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
---
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
2 files changed, 103 insertions(+), 82 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/21236/1
--
To view, visit http://gerrit.cloudera.org:8080/21236

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
Gerrit-Change-Number: 21236
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 


[Impala-ASF-CR] IMPALA-12965: Add debug query option RUNTIME_FILTER_IDS_TO_SKIP

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21230 )

Change subject: IMPALA-12965: Add debug query option RUNTIME_FILTER_IDS_TO_SKIP
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15772/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21230
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I897e37685dd1ec279989b55560ec7616a00d2280
Gerrit-Change-Number: 21230
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 03 Apr 2024 19:52:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in Streaming Aggregation

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21235 )

Change subject: IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in 
Streaming Aggregation
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15773/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59205a4b06824ee1607a25e906db1f96dc4eda9f
Gerrit-Change-Number: 21235
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 03 Apr 2024 19:52:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in Streaming Aggregation

2024-04-03 Thread Yida Wu (Code Review)
Yida Wu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21235


Change subject: IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in 
Streaming Aggregation
..

IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in Streaming Aggregation

This patch fixes a bug in the RowsPassedThrough metric within the
query profile while using Streaming Aggregation. The issue comes from
the AddBatchStreaming() function's logic: the number of rows
in the output batch isn't necessarily initialized to 0, yet the
function uses num_rows() of the output batch directly as the
actual number of rows returned and passed through by this specific
aggregator. This discrepancy can significantly impact the accuracy
of the returned and passed-through numbers, as well as the
calculation of reduction rates during hash table expansion in
Streaming Aggregation. Huge differences can be observed, especially
when using the rollup function.

The solution is to calculate the actual number of rows added
to the output batch within each round of the AddBatchStreaming()
function.
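
The counting change described above can be illustrated with a minimal Python sketch. This is not the code from grouping-aggregator.cc: OutputBatch and add_batch_streaming are hypothetical stand-ins, and the sketch only shows why reading the batch's total row count overstates the rows contributed by one call when the batch is not empty on entry.

```python
class OutputBatch:
    """Hypothetical stand-in for an output row batch that may already hold rows."""

    def __init__(self, rows=None):
        self.rows = list(rows or [])

    def num_rows(self):
        return len(self.rows)


def add_batch_streaming(out_batch, passthrough_rows):
    """Append rows and return (count_from_total, count_from_delta) for this call."""
    rows_before = out_batch.num_rows()
    out_batch.rows.extend(passthrough_rows)
    # Reading the total is only correct if the batch was empty on entry.
    count_from_total = out_batch.num_rows()
    # Counting the rows added in this round is correct either way.
    count_from_delta = out_batch.num_rows() - rows_before
    return count_from_total, count_from_delta


batch = OutputBatch(rows=["r0", "r1", "r2"])  # three rows already present on entry
total, delta = add_batch_streaming(batch, ["r3", "r4"])
print(total, delta)  # prints "5 2"
```

With an empty batch on entry both counts agree, which is presumably why the discrepancy only shows up in some query shapes.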

Tests:
Passed exhaustive tests.
Added a corresponding case in tpch-passthrough-aggregations.test.

Change-Id: I59205a4b06824ee1607a25e906db1f96dc4eda9f
---
M be/src/exec/grouping-aggregator.cc
M testdata/workloads/tpch/queries/tpch-passthrough-aggregations.test
2 files changed, 27 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/21235/1
--
To view, visit http://gerrit.cloudera.org:8080/21235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I59205a4b06824ee1607a25e906db1f96dc4eda9f
Gerrit-Change-Number: 21235
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 


[Impala-ASF-CR] IMPALA-12965: Add debug query option RUNTIME_FILTER_IDS_TO_SKIP

2024-04-03 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21230 )

Change subject: IMPALA-12965: Add debug query option RUNTIME_FILTER_IDS_TO_SKIP
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21230/1/be/src/service/query-options-test.cc
File be/src/service/query-options-test.cc:

http://gerrit.cloudera.org:8080/#/c/21230/1/be/src/service/query-options-test.cc@734
PS1, Line 734: EXPECT_TRUE(SetQueryOption(KEY, "0,,1", , 
nullptr).ok());
> more cases, like "0,  1", "0, 1, , 2"
Done


http://gerrit.cloudera.org:8080/#/c/21230/1/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/21230/1/be/src/service/query-options.cc@1299
PS1, Line 1299:
  : end++;
> if (options.at(end) == ',' && double_quote_ct == 0 && begin == end), need t
Makes sense. Done.


http://gerrit.cloudera.org:8080/#/c/21230/1/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/21230/1/common/thrift/ImpalaService.thrift@928
PS1, Line 928: List of runtime filter id to skip
> nit: could you give format of id list with sample? Double quoted numbers se
Done



--
To view, visit http://gerrit.cloudera.org:8080/21230
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I897e37685dd1ec279989b55560ec7616a00d2280
Gerrit-Change-Number: 21230
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 03 Apr 2024 19:28:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12969: Release JNI array if DeserializeThriftMsg failed

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21234 )

Change subject: IMPALA-12969: Release JNI array if DeserializeThriftMsg failed
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15771/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2c0335b12e9289ae851d0ec050765951a8ca6c7
Gerrit-Change-Number: 21234
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 03 Apr 2024 19:33:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12965: Add debug query option RUNTIME_FILTER_IDS_TO_SKIP

2024-04-03 Thread Riza Suminto (Code Review)
Hello Wenzhe Zhou, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21230

to look at the new patch set (#2).

Change subject: IMPALA-12965: Add debug query option RUNTIME_FILTER_IDS_TO_SKIP
..

IMPALA-12965: Add debug query option RUNTIME_FILTER_IDS_TO_SKIP

Runtime filters can still have a negative effect in certain scenarios, such as
a long wait time that delays the scan and a cascading runtime filter chain that
prevents parallel execution of fragments. Having a debug query option to
simply skip a runtime filter id from being scheduled can help us
investigate and test a solution early before implementing the
improvement code.

This patch adds the RUNTIME_FILTER_IDS_TO_SKIP option to do that. This patch
also improves parsing of multi-value query options to not split at a ','
char that is within two double quotes.
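
The quote-aware splitting mentioned above could look roughly like the following Python sketch. It is an illustration only, not the parser in query-options.cc: split_multi_value is a hypothetical name, and keeping the quotes and empty tokens in the result is an assumption of this sketch rather than documented Impala behaviour.

```python
def split_multi_value(value):
    """Split on ',' unless the comma sits between two double quotes."""
    tokens = []
    current = []
    in_quotes = False
    for ch in value:
        if ch == '"':
            # Toggle the quoted state; the quote itself is kept in the token.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == ',' and not in_quotes:
            tokens.append("".join(current))
            current = []
        else:
            current.append(ch)
    tokens.append("".join(current))
    return tokens


print(split_multi_value('a,"1,2,3",b'))  # prints ['a', '"1,2,3"', 'b']
```

A single boolean is enough here because the sketch does not handle escaped quotes; a real parser may need to.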

Testing:
- Add BE test in query-options-test.cc
- Add FE test in runtime-filter-query-options.test

Change-Id: I897e37685dd1ec279989b55560ec7616a00d2280
---
M be/src/service/child-query.cc
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-query-options.test
9 files changed, 350 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/21230/2
--
To view, visit http://gerrit.cloudera.org:8080/21230
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I897e37685dd1ec279989b55560ec7616a00d2280
Gerrit-Change-Number: 21230
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-12969: Release JNI array if DeserializeThriftMsg failed

2024-04-03 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21234 )

Change subject: IMPALA-12969: Release JNI array if DeserializeThriftMsg failed
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2c0335b12e9289ae851d0ec050765951a8ca6c7
Gerrit-Change-Number: 21234
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 03 Apr 2024 19:24:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15770/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Gerrit-Change-Number: 21118
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 03 Apr 2024 19:22:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12969: Release JNI array if DeserializeThriftMsg failed

2024-04-03 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21234


Change subject: IMPALA-12969: Release JNI array if DeserializeThriftMsg failed
..

IMPALA-12969: Release JNI array if DeserializeThriftMsg failed

Before this patch, ReleaseByteArrayElements was not called in case
the deserialization failed (e.g. by hitting Thrift's MaxMessageSize).
This could potentially cause a JVM/native heap leak, depending on how
the JVM handled the array allocation.
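
The general pattern, releasing an acquired buffer on the failure path as well as on the success path, can be sketched in Python as below. This is an analogy, not the change in jni-thrift-util.h: FakeJniArray, deserialize and the size check are hypothetical stand-ins for the JNI array, the deserialization call and Thrift's MaxMessageSize limit.

```python
class FakeJniArray:
    """Hypothetical stand-in for a byte array obtained through JNI."""

    def __init__(self, data):
        self.data = data
        self.released = False

    def release(self):
        self.released = True


def deserialize(array, max_message_size):
    """Decode the bytes, always releasing the array, even when decoding fails."""
    try:
        if len(array.data) > max_message_size:
            raise ValueError("message exceeds max_message_size")
        return array.data.decode("utf-8")
    finally:
        array.release()  # runs on both the success and the failure path


big = FakeJniArray(b"x" * 10)
try:
    deserialize(big, max_message_size=4)
except ValueError:
    pass
print(big.released)  # prints "True"
```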

Change-Id: Id2c0335b12e9289ae851d0ec050765951a8ca6c7
---
M be/src/rpc/jni-thrift-util.h
1 file changed, 3 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/21234/1
--
To view, visit http://gerrit.cloudera.org:8080/21234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id2c0335b12e9289ae851d0ec050765951a8ca6c7
Gerrit-Change-Number: 21234
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 


[Impala-ASF-CR] IMPALA-12969: Release JNI array if DeserializeThriftMsg failed

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21234 )

Change subject: IMPALA-12969: Release JNI array if DeserializeThriftMsg failed
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10487/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2c0335b12e9289ae851d0ec050765951a8ca6c7
Gerrit-Change-Number: 21234
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 03 Apr 2024 19:11:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5323: Support BINARY columns in Kudu tables

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18868 )

Change subject: IMPALA-5323: Support BINARY columns in Kudu tables
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15769/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
Gerrit-Change-Number: 18868
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Wed, 03 Apr 2024 19:11:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-04-03 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..


Patch Set 7:

ps7 is a rebase.


--
To view, visit http://gerrit.cloudera.org:8080/21118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Gerrit-Change-Number: 21118
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 03 Apr 2024 18:59:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-04-03 Thread Riza Suminto (Code Review)
Hello Daniel Becker, Csaba Ringhofer, Wenzhe Zhou, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21118

to look at the new patch set (#7).

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..

IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

IMPALA-12018 adds reduceCardinalityForScanNode to lower cardinality
estimation when a runtime filter is involved. It calls
JoinNode.computeGenericJoinCardinality(). However, if the originating
join node has FK-PK conjunct, it should be possible to obtain a lower
cardinality estimate by calling JoinNode.getFkPkJoinCardinality()
instead.

This patch adds that analysis and calls
JoinNode.getFkPkJoinCardinality() when possible. It is, however, only
limited to runtime filters that evaluate at the storage layer, such as
partition filter and pushed-down Kudu filter. Row-level runtime filters
that evaluate at scan node will continue using
JoinNode.computeGenericJoinCardinality().
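
As a rough illustration of the selection rule described above, the following Python sketch picks which estimator would apply. The function name, the filter-kind labels and the boolean flag are hypothetical; only the two JoinNode method names come from the commit message, and the actual cardinality formulas are not reproduced here.

```python
# Hypothetical labels for filters that evaluate at the storage layer.
STORAGE_LAYER_FILTERS = {"partition", "kudu_pushdown"}


def pick_estimator(has_fk_pk_conjunct, filter_kind):
    """Return the name of the estimator the commit message says would apply."""
    if has_fk_pk_conjunct and filter_kind in STORAGE_LAYER_FILTERS:
        return "getFkPkJoinCardinality"
    # Row-level filters, and joins without an FK-PK conjunct, keep the generic path.
    return "computeGenericJoinCardinality"


print(pick_estimator(True, "partition"))  # prints "getFkPkJoinCardinality"
print(pick_estimator(True, "row_level"))  # prints "computeGenericJoinCardinality"
```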

This distinction is because a storage layer filter is applied more
consistently than a row-level filter. For example, a partition filter
evaluates all partition_id and is never disabled regardless of its
precision (see HdfsScanNodeBase::PartitionPassesFilters). On the other
hand, the scan node can disable a row-level filter later on if it is deemed
ineffective / not precise enough (see
HdfsScanner::CheckFiltersEffectiveness,
LocalFilterStats::enabled_for_row, and min_filter_reject_ratio flag).
For the pushed-down Kudu filter, Impala will rely on Kudu to evaluate
the filter.

Runtime filters can arrive late as well. But for both storage layer
filter and row-level filter, the scan node can stop waiting and start
scanning after runtime_filter_wait_time_ms has passed. The scan node will
still evaluate a late runtime filter later on if the scan process is still
ongoing.

Also, note that this cardinality reduction algorithm is based only on
highly selective runtime filters to increase its estimate
confidence (see RuntimeFilter.isHighlySelective()).

Testing:
- Update TpcdsCpuCostPlannerTest.
- Pass FE tests.

Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
---
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/outer-to-inner-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-cardinality-reduction-on-kudu.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-cardinality-reduction.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q17.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q25.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q29.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q33.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q34.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q42.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q46.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q48.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q49.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q52.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q53.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q55.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q56.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q60.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q61.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q63.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q64.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q66.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q75.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q89.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test
M 

[Impala-ASF-CR] IMPALA-5323: Support BINARY columns in Kudu tables

2024-04-03 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18868 )

Change subject: IMPALA-5323: Support BINARY columns in Kudu tables
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18868/5/testdata/workloads/functional-query/queries/QueryTest/binary-type.test
File testdata/workloads/functional-query/queries/QueryTest/binary-type.test:

http://gerrit.cloudera.org:8080/#/c/18868/5/testdata/workloads/functional-query/queries/QueryTest/binary-type.test@170
PS5, Line 170: ants folding and non ascii char
> oops, what I wrote is incorrect
Cleaned up the commit message. It was confusing BINARY and STRING: case a. was 
only possible for STRING, as a BINARY constant will always have a cast and need 
constant folding.



--
To view, visit http://gerrit.cloudera.org:8080/18868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
Gerrit-Change-Number: 18868
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Wed, 03 Apr 2024 18:48:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5323: Support BINARY columns in Kudu tables

2024-04-03 Thread Csaba Ringhofer (Code Review)
Hello Daniel Becker, Peter Rozsa, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/18868

to look at the new patch set (#7).

Change subject: IMPALA-5323: Support BINARY columns in Kudu tables
..

IMPALA-5323: Support BINARY columns in Kudu tables

The patch adds read and write support for BINARY columns in Kudu
tables.

Predicate push down is implemented, but is incomplete:
a constant binary argument will only be pushed down if
the constant folding never encounters non-ascii strings.
Examples:
 - cast(unhex(hex("aa")) as binary) can be pushed down
 - cast(hex(unhex("aa")) as binary) can't be pushed
   down as unhex("aa") is not ascii (even though the
   final result is ascii)
See IMPALA-10349 for more details on this limitation.
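
The two examples above can be traced byte by byte with a small Python sketch. The helpers only approximate hex() and unhex() (uppercase hex digits are an assumption), and all_ascii is a hypothetical check that models the stated limitation; it is not the planner's actual logic.

```python
def impala_like_hex(data: bytes) -> bytes:
    """Hex-encode bytes as uppercase ASCII digits (assumed to mirror hex())."""
    return data.hex().upper().encode("ascii")


def impala_like_unhex(data: bytes) -> bytes:
    """Decode a string of hex digits into raw bytes (assumed to mirror unhex())."""
    return bytes.fromhex(data.decode("ascii"))


def all_ascii(steps):
    """True if every intermediate value seen during folding is ASCII."""
    return all(step.isascii() for step in steps)


# cast(unhex(hex("aa")) as binary): "aa" -> "6161" -> "aa"
pushable = [b"aa", impala_like_hex(b"aa"),
            impala_like_unhex(impala_like_hex(b"aa"))]
# cast(hex(unhex("aa")) as binary): "aa" -> byte 0xAA -> "AA"
not_pushable = [b"aa", impala_like_unhex(b"aa"),
                impala_like_hex(impala_like_unhex(b"aa"))]

print(all_ascii(pushable), all_ascii(not_pushable))  # prints "True False"
```

In the second chain the final value is ASCII again, but the intermediate byte 0xAA is not, which matches the explanation given for why that expression is not pushed down.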

The patch also changes casting BINARY <-> STRING from a no-op
to calling an actual function. While this may add some small
overhead, it allows the backend to know whether an expression
returns STRING or BINARY.

Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
---
M be/src/exec/kudu/kudu-util-ir.cc
M be/src/exec/kudu/kudu-util.cc
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/cast-functions.h
M be/src/runtime/types.cc
M be/src/runtime/types.h
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/binary-type.test
M tests/common/kudu_test_suite.py
M tests/query_test/test_kudu.py
M tests/query_test/test_scanners.py
16 files changed, 195 insertions(+), 80 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/18868/7
--
To view, visit http://gerrit.cloudera.org:8080/18868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
Gerrit-Change-Number: 18868
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 


[Impala-ASF-CR] IMPALA-5323: Support BINARY columns in Kudu tables

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18868 )

Change subject: IMPALA-5323: Support BINARY columns in Kudu tables
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10486/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/18868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
Gerrit-Change-Number: 18868
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Wed, 03 Apr 2024 18:49:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-03 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21218/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21218/5//COMMIT_MSG@16
PS5, Line 16: less than the decimal scales of the corresponding columns in the 
table
Is it just scale or precision also? I think we should enforce both.


http://gerrit.cloudera.org:8080/#/c/21218/5/fe/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java
File 
fe/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java:

http://gerrit.cloudera.org:8080/#/c/21218/5/fe/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java@154
PS5, Line 154: if (scalarType.scale < valScale) {
We should probably check both scale and precision here since Impala does 
enforce strict conversion from Decimal to Decimal.
https://impala.apache.org/docs/build/html/topics/impala_decimal.html

Updating test cases to capture this will also be good.
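
A check covering both scale and precision, as suggested above, might look like the following Python sketch. It is not the code in JdbcRecordIterator.java: fits_decimal is a hypothetical helper, and it assumes the usual reading of DECIMAL(precision, scale), i.e. at most scale fractional digits and at most precision - scale integer digits.

```python
from decimal import Decimal


def fits_decimal(value: Decimal, precision: int, scale: int) -> bool:
    """Return True if value fits DECIMAL(precision, scale) without rounding.

    Trailing fractional zeros count toward the value's scale in this sketch.
    """
    if not value.is_finite():
        return False
    _, digits, exponent = value.as_tuple()
    value_scale = max(0, -exponent)
    if value == 0:
        integer_digits = 0
    else:
        integer_digits = max(0, len(digits) + exponent)
    return value_scale <= scale and integer_digits <= precision - scale


print(fits_decimal(Decimal("123.45"), 5, 2))   # prints "True"
print(fits_decimal(Decimal("123.456"), 6, 2))  # prints "False" (scale too large)
print(fits_decimal(Decimal("1234.5"), 5, 2))   # prints "False" (too many integer digits)
```

The third case is the one a scale-only comparison would miss: the value's scale is small enough, but its integer part does not fit.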



--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Wed, 03 Apr 2024 18:09:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5323: Support BINARY columns in Kudu tables

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18868 )

Change subject: IMPALA-5323: Support BINARY columns in Kudu tables
..


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10485/


--
To view, visit http://gerrit.cloudera.org:8080/18868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
Gerrit-Change-Number: 18868
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Wed, 03 Apr 2024 15:43:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12920: Support ai_generate_text built-in function for OpenAI's chat completion API

2024-04-03 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21168 )

Change subject: IMPALA-12920: Support ai_generate_text built-in function for 
OpenAI's chat completion API
..


Patch Set 6: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21168/5/be/src/exprs/ai-functions-ir.cc
File be/src/exprs/ai-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/21168/5/be/src/exprs/ai-functions-ir.cc@83
PS5, Line 83:   != nullptr);
> This was suggested by clang to improve readability, I think. I'm inclined t
Ack


http://gerrit.cloudera.org:8080/#/c/21168/5/be/src/exprs/ai-functions-ir.cc@251
PS5, Line 251: response
> Some of that would be controlled by parameters such as max_tokens that you
That makes sense. Ack.



--
To view, visit http://gerrit.cloudera.org:8080/21168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id4446957f6030bab1f985fdd69185c3da07d7c4b
Gerrit-Change-Number: 21168
Gerrit-PatchSet: 6
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Wed, 03 Apr 2024 15:05:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12899: Temporary workaround for BINARY in complex types

2024-04-03 Thread Daniel Becker (Code Review)
Daniel Becker has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21219 )

Change subject: IMPALA-12899: Temporary workaround for BINARY in complex types
..

IMPALA-12899: Temporary workaround for BINARY in complex types

The BINARY type is currently not supported inside complex types and a
cross-component decision is probably needed to support it (see
IMPALA-11491). We would like to enable EXPAND_COMPLEX_TYPES for Iceberg
metadata tables (IMPALA-12612), which requires that queries with BINARY
inside complex types don't fail. Enabling EXPAND_COMPLEX_TYPES is a
higher-priority issue than IMPALA-11491, so we have come up with a
temporary solution.

This change NULLs out BINARY values in complex types coming from Iceberg
metadata tables and logs a warning.

BINARYs in complex types from regular tables are not affected by this
change.
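
The described behaviour, NULLing out BINARY values only when they are nested inside a complex value, can be sketched in Python as follows. This is an illustration and not the code in iceberg-row-reader.cc: lists and dicts stand in for ARRAY/MAP/STRUCT values, bytes stands in for BINARY, and null_out_nested_binary is a hypothetical name.

```python
import logging

logger = logging.getLogger("sketch")


def null_out_nested_binary(value, nested=False):
    """Replace bytes found inside lists/dicts with None; leave top-level bytes alone."""
    if isinstance(value, (bytes, bytearray)):
        if nested:
            logger.warning("BINARY inside a complex type is not supported; returning NULL")
            return None
        return value
    if isinstance(value, dict):
        return {k: null_out_nested_binary(v, True) for k, v in value.items()}
    if isinstance(value, list):
        return [null_out_nested_binary(v, True) for v in value]
    return value


row = {"id": 1, "key": b"\x01", "bounds": [b"\x00", b"\x02"]}
result = {col: null_out_nested_binary(val) for col, val in row.items()}
print(result["key"], result["bounds"])
```

The top-level "key" column keeps its bytes, while both elements of "bounds" become None and a warning is logged for each.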

Testing:
 - Added test queries in iceberg-metadata-tables.test.

Change-Id: I0d834126c7d702a25e957bb6071ecbf0fda2c203
Reviewed-on: http://gerrit.cloudera.org:8080/21219
Reviewed-by: Gabor Kaszab 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/iceberg-metadata/iceberg-row-reader.cc
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
5 files changed, 64 insertions(+), 6 deletions(-)

Approvals:
  Gabor Kaszab: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/21219
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I0d834126c7d702a25e957bb6071ecbf0fda2c203
Gerrit-Change-Number: 21219
Gerrit-PatchSet: 10
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-12899: Temporary workaround for BINARY in complex types

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21219 )

Change subject: IMPALA-12899: Temporary workaround for BINARY in complex types
..


Patch Set 9: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21219
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0d834126c7d702a25e957bb6071ecbf0fda2c203
Gerrit-Change-Number: 21219
Gerrit-PatchSet: 9
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 03 Apr 2024 14:51:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5323: Support BINARY columns in Kudu tables

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18868 )

Change subject: IMPALA-5323: Support BINARY columns in Kudu tables
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15768/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
Gerrit-Change-Number: 18868
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Wed, 03 Apr 2024 11:36:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5323: Support BINARY columns in Kudu tables

2024-04-03 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18868 )

Change subject: IMPALA-5323: Support BINARY columns in Kudu tables
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18868/5/testdata/workloads/functional-query/queries/QueryTest/binary-type.test
File testdata/workloads/functional-query/queries/QueryTest/binary-type.test:

http://gerrit.cloudera.org:8080/#/c/18868/5/testdata/workloads/functional-query/queries/QueryTest/binary-type.test@170
PS5, Line 170: constants folding and non-ascii
> The reason for not having such a test is that in case of BINARY I cannot ha
Oops, what I wrote is incorrect.
I will add such a test in the next patch.



--
To view, visit http://gerrit.cloudera.org:8080/18868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
Gerrit-Change-Number: 18868
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Wed, 03 Apr 2024 11:23:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5323: Support BINARY columns in Kudu tables

2024-04-03 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18868 )

Change subject: IMPALA-5323: Support BINARY columns in Kudu tables
..


Patch Set 6:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/18868/5/be/src/exec/kudu/kudu-util-ir.cc
File be/src/exec/kudu/kudu-util-ir.cc:

http://gerrit.cloudera.org:8080/#/c/18868/5/be/src/exec/kudu/kudu-util-ir.cc@75
PS5, Line 75:
> nit: A bit out of the scope of this patch, but this string literal is used
Done


http://gerrit.cloudera.org:8080/#/c/18868/5/be/src/exprs/cast-functions-ir.cc
File be/src/exprs/cast-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/18868/5/be/src/exprs/cast-functions-ir.cc@378
PS5, Line 378: nyVa
> Could you create a Jira ticket for this and include its number in the comme
Did the optimization in the end; TruncateIfNecessary is now called in
cast( ... as varchar).


http://gerrit.cloudera.org:8080/#/c/18868/5/be/src/exprs/cast-functions-ir.cc@384
PS5, Line 384: // STRING -> BINARY
> Do we also cast to VARCHAR? If not, this line is not needed; if yes, please
Cleaned up the comments.


http://gerrit.cloudera.org:8080/#/c/18868/5/be/src/runtime/types.cc
File be/src/runtime/types.cc:

http://gerrit.cloudera.org:8080/#/c/18868/5/be/src/runtime/types.cc@263
PS5, Line 263:   return is_binary_ ? "BINARY" : "STRING";
> Optional: We could take this branch out of the SWITCH and before the creati
Rewrote all of it to use Substitute.
I don't know how much constructing an empty stringstream costs (probably not 
much), but my experience is that creating/deleting a lot of small stringstreams 
is slow.


http://gerrit.cloudera.org:8080/#/c/18868/5/fe/src/main/java/org/apache/impala/analysis/CastExpr.java
File fe/src/main/java/org/apache/impala/analysis/CastExpr.java:

http://gerrit.cloudera.org:8080/#/c/18868/5/fe/src/main/java/org/apache/impala/analysis/CastExpr.java@198
PS5, Line 198:
> Do you plan to resolve this in this patch?
I agree with Peter that it should be a separate patch.


http://gerrit.cloudera.org:8080/#/c/18868/5/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

http://gerrit.cloudera.org:8080/#/c/18868/5/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test@774
PS5, Line 774: Not
> Nit: Not valid? Or Non-utf-8 strings?
Done


http://gerrit.cloudera.org:8080/#/c/18868/5/testdata/workloads/functional-query/queries/QueryTest/binary-type.test
File testdata/workloads/functional-query/queries/QueryTest/binary-type.test:

http://gerrit.cloudera.org:8080/#/c/18868/5/testdata/workloads/functional-query/queries/QueryTest/binary-type.test@159
PS5, Line 159: simp
> typo
Done


http://gerrit.cloudera.org:8080/#/c/18868/5/testdata/workloads/functional-query/queries/QueryTest/binary-type.test@170
PS5, Line 170: ants folding and non-ascii char
> Is any of 1) constant folding 2) non-ascii characters enough for not being
The reason for not having such a test is that in the case of BINARY I cannot 
have a literal on the right side; it needs to be an explicit cast to binary.

The literal use case is only relevant for strings.



--
To view, visit http://gerrit.cloudera.org:8080/18868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
Gerrit-Change-Number: 18868
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Wed, 03 Apr 2024 11:22:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5323: Support BINARY columns in Kudu tables

2024-04-03 Thread Csaba Ringhofer (Code Review)
Hello Daniel Becker, Peter Rozsa, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/18868

to look at the new patch set (#6).

Change subject: IMPALA-5323: Support BINARY columns in Kudu tables
..

IMPALA-5323: Support BINARY columns in Kudu tables

The patch adds read and write support for BINARY columns in Kudu
tables.

Predicate push down is implemented, but is incomplete:
a constant binary argument will only be pushed down if
a: it is cast directly from a string literal, e.g. cast("a" as binary), or
b: constant folding never encounters non-ASCII strings,
   e.g. cast(hex("a") as binary).
See IMPALA-10349 for more details on this limitation.

The patch also changes casting BINARY <-> STRING from a no-op
to calling an actual function. While this may add some small
overhead, it allows the backend to know whether an expression
returns STRING or BINARY.

Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
---
M be/src/exec/kudu/kudu-util-ir.cc
M be/src/exec/kudu/kudu-util.cc
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/cast-functions.h
M be/src/runtime/types.cc
M be/src/runtime/types.h
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/binary-type.test
M tests/common/kudu_test_suite.py
M tests/query_test/test_kudu.py
M tests/query_test/test_scanners.py
16 files changed, 175 insertions(+), 80 deletions(-)
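
The push-down limitation described in the commit message above can be
modelled roughly as follows. This is only a sketch in Python, not the code
from the patch: the expression classes, the helper names
(impala_like_hex, impala_like_unhex, can_push_down_binary_cast) and the
folding logic are all hypothetical and exist only to illustrate cases (a)
and (b).

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class StringLiteral:
    """A string literal such as "a" (hypothetical expression node)."""
    value: str


@dataclass(frozen=True)
class Call:
    """A constant function call such as hex("a") (hypothetical node)."""
    func: Callable[[str], str]
    arg: "Expr"


Expr = Union[StringLiteral, Call]


def impala_like_hex(s: str) -> str:
    # Assumed stand-in for a hex() builtin: "a" -> "61".
    return s.encode("utf-8").hex().upper()


def impala_like_unhex(s: str) -> str:
    # Assumed stand-in for unhex(): may yield non-ASCII characters.
    return bytes.fromhex(s).decode("latin-1")


def fold(expr: Expr) -> str:
    """Constant-fold expr; raise ValueError on any non-ASCII string."""
    if isinstance(expr, StringLiteral):
        result = expr.value
    else:
        result = expr.func(fold(expr.arg))
    if not result.isascii():
        raise ValueError("non-ASCII string met during constant folding")
    return result


def can_push_down_binary_cast(child: Expr) -> bool:
    """Model whether cast(child as binary) would be pushed down."""
    if isinstance(child, StringLiteral):
        return True  # case (a): cast directly from a string literal
    try:
        fold(child)  # case (b): folding only ever sees ASCII strings
        return True
    except ValueError:
        return False
```

In this model cast("a" as binary) and cast(hex("a") as binary) are both
eligible, while a cast whose folding produces a non-ASCII intermediate
string is not.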


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/18868/6
--
To view, visit http://gerrit.cloudera.org:8080/18868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
Gerrit-Change-Number: 18868
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 


[Impala-ASF-CR] IMPALA-5323: Support BINARY columns in Kudu tables

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18868 )

Change subject: IMPALA-5323: Support BINARY columns in Kudu tables
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10485/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/18868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
Gerrit-Change-Number: 18868
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Wed, 03 Apr 2024 11:13:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12899: Temporary workaround for BINARY in complex types

2024-04-03 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21219 )

Change subject: IMPALA-12899: Temporary workaround for BINARY in complex types
..


Patch Set 9: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21219
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0d834126c7d702a25e957bb6071ecbf0fda2c203
Gerrit-Change-Number: 21219
Gerrit-PatchSet: 9
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 03 Apr 2024 10:54:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12899: Temporary workaround for BINARY in complex types

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21219 )

Change subject: IMPALA-12899: Temporary workaround for BINARY in complex types
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15767/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21219
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0d834126c7d702a25e957bb6071ecbf0fda2c203
Gerrit-Change-Number: 21219
Gerrit-PatchSet: 9
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 03 Apr 2024 10:09:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12899: Temporary workaround for BINARY in complex types

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21219 )

Change subject: IMPALA-12899: Temporary workaround for BINARY in complex types
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15766/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21219
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0d834126c7d702a25e957bb6071ecbf0fda2c203
Gerrit-Change-Number: 21219
Gerrit-PatchSet: 8
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 03 Apr 2024 10:01:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12899: Temporary workaround for BINARY in complex types

2024-04-03 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/21219 )

Change subject: IMPALA-12899: Temporary workaround for BINARY in complex types
..

IMPALA-12899: Temporary workaround for BINARY in complex types

The BINARY type is currently not supported inside complex types and a
cross-component decision is probably needed to support it (see
IMPALA-11491). We would like to enable EXPAND_COMPLEX_TYPES for Iceberg
metadata tables (IMPALA-12612), which requires that queries with BINARY
inside complex types don't fail. Enabling EXPAND_COMPLEX_TYPES has higher
priority than IMPALA-11491, so we have come up with a temporary
solution.

This change NULLs out BINARY values in complex types coming from Iceberg
metadata tables and logs a warning.

BINARYs in complex types from regular tables are not affected by this
change.

Testing:
 - Added test queries in iceberg-metadata-tables.test.

Change-Id: I0d834126c7d702a25e957bb6071ecbf0fda2c203
---
M be/src/exec/iceberg-metadata/iceberg-row-reader.cc
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
5 files changed, 64 insertions(+), 6 deletions(-)
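
The behaviour described in the commit message above (BINARY values inside
complex types are replaced by NULL and a warning is logged) can be
illustrated with a small Python sketch. It is not the code from the patch,
which lives in the C++ and Java files listed above; the function names, the
logger name and the use of dict/list/bytes to stand in for
struct/array/BINARY are assumptions made for illustration.

```python
import logging
from typing import Any

# Hypothetical logger name, used only in this sketch.
logger = logging.getLogger("iceberg_metadata_sketch")


def _null_out(value: Any, path: str) -> Any:
    """Recursively replace bytes leaves with None, logging each one."""
    if isinstance(value, (bytes, bytearray)):
        logger.warning(
            "BINARY inside a complex type at %s is not supported; "
            "returning NULL", path)
        return None
    if isinstance(value, dict):
        return {k: _null_out(v, f"{path}.{k}") for k, v in value.items()}
    if isinstance(value, list):
        return [_null_out(v, f"{path}[{i}]") for i, v in enumerate(value)]
    return value


def sanitize_complex_value(value: Any, path: str = "col") -> Any:
    """NULL out BINARY only when it is nested inside a complex value.

    A top-level scalar, including a top-level bytes value, is returned
    unchanged, mirroring the statement that only BINARY inside complex
    types is affected.
    """
    if isinstance(value, (dict, list)):
        return _null_out(value, path)
    return value
```

For example, sanitize_complex_value({"k": [b"\x00", "x"]}) returns
{"k": [None, "x"]}, while a top-level b"\x00" passes through untouched.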


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21219/9
--
To view, visit http://gerrit.cloudera.org:8080/21219
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0d834126c7d702a25e957bb6071ecbf0fda2c203
Gerrit-Change-Number: 21219
Gerrit-PatchSet: 9
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-12899: Temporary workaround for BINARY in complex types

2024-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21219 )

Change subject: IMPALA-12899: Temporary workaround for BINARY in complex types
..


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10484/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21219
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0d834126c7d702a25e957bb6071ecbf0fda2c203
Gerrit-Change-Number: 21219
Gerrit-PatchSet: 9
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 03 Apr 2024 09:45:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12899: Temporary workaround for BINARY in complex types

2024-04-03 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/21219 )

Change subject: IMPALA-12899: Temporary workaround for BINARY in complex types
..

IMPALA-12899: Temporary workaround for BINARY in complex types

The BINARY type is currently not supported inside complex types and a
cross-component decision is probably needed to support it (see
IMPALA-11491). We would like to enable EXPAND_COMPLEX_TYPES for Iceberg
metadata tables (IMPALA-12612), which requires that queries with BINARY
inside complex types don't fail. Enabling EXPAND_COMPLEX_TYPES has higher
priority than IMPALA-11491, so we have come up with a temporary
solution.

This change NULLs out BINARY values in complex types coming from Iceberg
metadata tables and logs a warning.

BINARYs in complex types from regular tables are not affected by this
change.

Testing:
 - Added test queries in iceberg-metadata-tables.test.

Change-Id: I0d834126c7d702a25e957bb6071ecbf0fda2c203
---
M be/src/exec/iceberg-metadata/iceberg-row-reader.cc
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
5 files changed, 64 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21219/8
--
To view, visit http://gerrit.cloudera.org:8080/21219
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0d834126c7d702a25e957bb6071ecbf0fda2c203
Gerrit-Change-Number: 21219
Gerrit-PatchSet: 8
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins