[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-04 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21218/9/fe/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java
File fe/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java:

http://gerrit.cloudera.org:8080/#/c/21218/9/fe/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java@194
PS9, Line 194: new TStatus(TErrorCode.INTERNAL_ERROR, 
Lists.newArrayList(e.getMessage(;
Why is this an internal error? Isn't it misconfiguration between table 
definition and remote source?


http://gerrit.cloudera.org:8080/#/c/21218/9/testdata/bin/clean-mysql-env.sh
File testdata/bin/clean-mysql-env.sh:

http://gerrit.cloudera.org:8080/#/c/21218/9/testdata/bin/clean-mysql-env.sh@38
PS9, Line 38: rm -f /tmp/mysql_jdbc_decimal_tbl.*
Why are these created here instead of testdata/data?



--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 9
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 04 Apr 2024 20:46:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-04 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21218/9/fe/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java
File fe/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java:

http://gerrit.cloudera.org:8080/#/c/21218/9/fe/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java@194
PS9, Line 194: new TStatus(TErrorCode.INTERNAL_ERROR, 
Lists.newArrayList(e.getMessage(;
> Why is this an internal error? Isn't it misconfiguration between table defi
The errors are caused by misconfiguration between JDBC table definition and 
remote tables in RDBMS, like upsupported column data type, invalid decimal 
precision and scale, etc. The error messages will be returned to client. We can 
define an error code for JDBC configuration.


http://gerrit.cloudera.org:8080/#/c/21218/9/testdata/bin/clean-mysql-env.sh
File testdata/bin/clean-mysql-env.sh:

http://gerrit.cloudera.org:8080/#/c/21218/9/testdata/bin/clean-mysql-env.sh@38
PS9, Line 38: rm -f /tmp/mysql_jdbc_decimal_tbl.*
> Why are these created here instead of testdata/data?
These files are created by testdata/bin/setup-mysql-env.sh when running 
unit-tests for MySQL. These are temp files used for creating table and loading 
data for MySQL when setting up MySQL server. We clean up these files after the 
MySQL server has been setup.



--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 9
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 04 Apr 2024 21:09:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-04 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21218/9/testdata/bin/clean-mysql-env.sh
File testdata/bin/clean-mysql-env.sh:

http://gerrit.cloudera.org:8080/#/c/21218/9/testdata/bin/clean-mysql-env.sh@38
PS9, Line 38: rm -f /tmp/mysql_jdbc_decimal_tbl.*
> These files are created by testdata/bin/setup-mysql-env.sh when running uni
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 9
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 04 Apr 2024 21:11:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-04 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..

IMPALA-12925: Fix decimal data type for external JDBC table

Decimal type is a primitive data type for Impala. Current code returns
wrong values for columns with decimal data type in external JDBC tables.

This patch fixes wrong values returned from JDBC data source, and
supports pushing down decimal type of predicates to remote database
and remote Impala.
The decimal precision and scale of the columns in external JDBC table
must be no less than the decimal precision and scale of the
corresponding columns in the table of remote database. Otherwise,
Impala fails with an error since it may cause truncation of decimal
data.

Testing:
 - Added Planner test for pushing down decimal type of predicates.
 - Added end-to-end unit-tests for tables with decimal type of columns
   for Postgres, MySQL, and Impala-to-Impala.
 - Passed core-tests.

Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
---
M common/thrift/generate_error_codes.py
M fe/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java
M 
fe/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java
M 
fe/src/main/java/org/apache/impala/extdatasource/jdbc/util/QueryConditionUtil.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M testdata/bin/clean-mysql-env.sh
M testdata/bin/create-ext-data-source-table.sql
M testdata/bin/load-ext-data-sources.sh
M testdata/bin/setup-mysql-env.sh
M 
testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
M 
testdata/workloads/functional-query/queries/QueryTest/impala-ext-jdbc-tables-predicates.test
M 
testdata/workloads/functional-query/queries/QueryTest/impala-ext-jdbc-tables.test
M testdata/workloads/functional-query/queries/QueryTest/jdbc-data-source.test
M 
testdata/workloads/functional-query/queries/QueryTest/mysql-ext-jdbc-tables.test
14 files changed, 576 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/21218/10
--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-04 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..


Patch Set 10: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 04 Apr 2024 22:24:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-04 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..


Patch Set 10:

Thanks Abhishek and Michael.


--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 04 Apr 2024 22:26:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10495/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 04 Apr 2024 22:26:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10494/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 04 Apr 2024 22:26:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15782/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 04 Apr 2024 22:44:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..


Patch Set 10: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10494/


--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Fri, 05 Apr 2024 03:33:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-04 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..


Patch Set 10:

Hit irrelevant IMPALA-11285. Try yo rerun


--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Fri, 05 Apr 2024 04:01:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10496/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Fri, 05 Apr 2024 04:03:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12905: Disk-based tuple caching

2024-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21171 )

Change subject: IMPALA-12905: Disk-based tuple caching
..


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/21171/7/be/src/exec/tuple-file-writer.cc
File be/src/exec/tuple-file-writer.cc:

http://gerrit.cloudera.org:8080/#/c/21171/7/be/src/exec/tuple-file-writer.cc@130
PS7, Line 130: Substitute("Write of size $0 would cause $1 to exceed 
the maximum file size of $2",
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/21171/7/tests/custom_cluster/test_tuple_cache.py
File tests/custom_cluster/test_tuple_cache.py:

http://gerrit.cloudera.org:8080/#/c/21171/7/tests/custom_cluster/test_tuple_cache.py@60
PS7, Line 60: }
flake8: E123 closing bracket does not match indentation of opening bracket's 
line


http://gerrit.cloudera.org:8080/#/c/21171/7/tests/custom_cluster/test_tuple_cache.py@118
PS7, Line 118: @
flake8: E303 too many blank lines (2)


http://gerrit.cloudera.org:8080/#/c/21171/7/tests/custom_cluster/test_tuple_cache.py@173
PS7, Line 173: e
flake8: F841 local variable 'e' is assigned to but never used



--
To view, visit http://gerrit.cloudera.org:8080/21171
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13a65c4c0559cad3559d5f714a074dd06e9cc9bf
Gerrit-Change-Number: 21171
Gerrit-PatchSet: 7
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Fri, 05 Apr 2024 05:58:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12905: Disk-based tuple caching

2024-04-04 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21171 )

Change subject: IMPALA-12905: Disk-based tuple caching
..


Patch Set 6:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/21171/6/be/src/exec/tuple-cache-node.h
File be/src/exec/tuple-cache-node.h:

http://gerrit.cloudera.org:8080/#/c/21171/6/be/src/exec/tuple-cache-node.h@27
PS6, Line 27: class TupleFileReader;
> There is a special provision for std::unique_ptr and std::shared_ptr in the
Here's what I think is happening:
1. We only ever need to instantiate TupleCacheNode from TupleCachePlanNode's 
CreateExecNode() in tuple-cache-node.cc.
2. At that point, TupleFileReader and TupleFileWriter are fully defined, 
because tuple-cache-node.cc includes tuple-file-reader.h and 
tuple-file-writer.h.

The compiler implicitly declares the destructor, and that happens immediately 
for anything that includes the header. The compiler implicitly defines the 
destructor when it is first needed, and that can happen in multiple translation 
units.

My guess is that this is pretty fragile. If we needed to instantiate 
TupleCacheNode anywhere other than tuple-cache-node.cc, I think those places 
would need to include tuple-cache-reader.h and tuple-cache-writer.h. The better 
way to do this is to declare the destructor in the .h file and then define it 
in the .cc file with = default.


http://gerrit.cloudera.org:8080/#/c/21171/6/be/src/exec/tuple-cache-node.cc
File be/src/exec/tuple-cache-node.cc:

http://gerrit.cloudera.org:8080/#/c/21171/6/be/src/exec/tuple-cache-node.cc@33
PS6, Line 33: DECLARE_bool(allow_tuple_caching);
> Is this flag actually used here?
No, removed


http://gerrit.cloudera.org:8080/#/c/21171/6/be/src/exec/tuple-cache-node.cc@61
PS6, Line 61:   SCOPED_TIMER(runtime_profile_->total_time_counter());
> runtime_profile() used above.
Good point, changed


http://gerrit.cloudera.org:8080/#/c/21171/6/be/src/exec/tuple-cache-node.cc@142
PS6, Line 142:   } else if (writer_->BytesWritten() > 
tuple_cache_mgr->MaxSize()) {
> Should check the size and not write if it is over.
Reworked the code so that the TupleFileWriter enforces the max file size. It 
calculates how many bytes it would write and returns an error if the resulting 
file would exceed the limit.


http://gerrit.cloudera.org:8080/#/c/21171/6/be/src/exec/tuple-cache-node.cc@180
PS6, Line 180: // TODO: is this needed for Reset too?
> Do we need to support Reset?
I don't think we need to support Reset(), so I changed it's definition to throw 
an error / DCHECK.

Separately, I moved the CompleteWrite() logic to happen in GetNext if it 
reaches eos.


http://gerrit.cloudera.org:8080/#/c/21171/6/be/src/exec/tuple-file-reader.h
File be/src/exec/tuple-file-reader.h:

http://gerrit.cloudera.org:8080/#/c/21171/6/be/src/exec/tuple-file-reader.h@30
PS6, Line 30: class MemTracker;
> Is this forward declaration needed?
It looks like we don't need it. buffer-pool.h includes 
runtime/mem-tracker-types.h, which does the same thing.

Removed


http://gerrit.cloudera.org:8080/#/c/21171/6/be/src/exec/tuple-file-reader.cc
File be/src/exec/tuple-file-reader.cc:

http://gerrit.cloudera.org:8080/#/c/21171/6/be/src/exec/tuple-file-reader.cc@77
PS6, Line 77:   reader_.read(reinterpret_cast(_data_len), 
sizeof(tuple_data_len));
> Check status after reads using if(!reader_) like below.
Changed to use Kudu's RWFile, which returns status for each operation (which we 
check and return).


http://gerrit.cloudera.org:8080/#/c/21171/6/be/src/exec/tuple-file-reader.cc@79
PS6, Line 79:   DCHECK_GE(tuple_data_len, 0);
> These may not be initialized if read fails.
Fixed


http://gerrit.cloudera.org:8080/#/c/21171/6/be/src/exec/tuple-file-reader.cc@89
PS6, Line 89: // TODO: more detailed error messages. May need to check for 
failure immediately
> This would be nice to address.
Good point, we should check after each read() call. I switch to use Kudu's 
RWFile which returns status after each call.

This also does additional sanity checks for the lengths to make sure the file 
is long enough to contain the necessary data, etc.


http://gerrit.cloudera.org:8080/#/c/21171/6/be/src/exec/tuple-file-writer.cc
File be/src/exec/tuple-file-writer.cc:

http://gerrit.cloudera.org:8080/#/c/21171/6/be/src/exec/tuple-file-writer.cc@79
PS6, Line 79:   // TODO: swap write order of data/offsets since all the 
functions use offsets first.
> I don't remember what this todo meant, we can probably get rid of it.
Removed


http://gerrit.cloudera.org:8080/#/c/21171/6/be/src/exec/tuple-file-writer.cc@89
PS6, Line 89:   if (!writer_) {
> I don't know of any particular issue with multiple failed writes.
Changed to Kudu's WritableFile and modified this to check status after each 
write.


http://gerrit.cloudera.org:8080/#/c/21171/6/be/src/exec/tuple-file-writer.cc@91
PS6, Line 91: // TODO: more detailed error messages. May need to check 

[Impala-ASF-CR] IMPALA-12905: Disk-based tuple caching

2024-04-04 Thread Joe McDonnell (Code Review)
Hello Kurt Deschler, Yida Wu, Alexey Serbin, Michael Smith, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21171

to look at the new patch set (#7).

Change subject: IMPALA-12905: Disk-based tuple caching
..

IMPALA-12905: Disk-based tuple caching

This implements on-disk caching for the tuple cache. The
TupleCacheNode uses the TupleFileWriter and TupleFileReader
to write and read back tuples from local files. The file format
uses RowBatch's standard serialization used for KRPC data streams.

The TupleCacheMgr is the daemon-level structure that coordinates
the state machine for cache entries, including eviction. When a
writer is adding an entry, it inserts an IN_PROGRESS entry before
starting to write data. This does not count towards cache capacity,
because the total size is not known yet. This IN_PROGRESS entry
prevents other writers from concurrently writing the same entry.
If the write is successful, the entry transitions to the COMPLETE
state and updates the total size of the entry. If the write is
unsuccessful and a new execution might succeed, then the entry is
removed. If the write is unsuccessful and won't succeed later
(e.g. if the total size of the entry exceeds the max size of an
entry), then it transitions to the TOMBSTONE state. TOMBSTONE
entries avoid the overhead of trying to write entries that are
too large.

Given these states, when a TupleCacheNode is doing its initial
Lookup() call, one of three things can happen:
 1. It can find a COMPLETE entry and read it.
 2. It can find an IN_PROGRESS/TOMBSTONE entry, which means it
cannot read or write the entry.
 3. It finds no entry and inserts its own IN_PROGRESS entry
to start a write.

The tuple cache is configured using the tuple_cache parameter,
which is a combination of the cache directory and the capacity
similar to the data_cache parameter. For example, /data/0:100GB
uses directory /data/0 for the cache with a total capacity of
100GB. This currently supports a single directory, but it can
be expanded to multiple directories later if needed. The cache
eviction policy can be specified via the tuple_cache_eviction_policy
parameter, which currently supports LRU or LIRS. The tuple_cache
parameter cannot be specified if allow_tuple_caching=false.

This contains contributions from Michael Smith, Yida Wu,
and Joe McDonnell.

Testing:
 - This adds basic custom cluster tests for the tuple cache.

Change-Id: I13a65c4c0559cad3559d5f714a074dd06e9cc9bf
---
M be/src/exec/CMakeLists.txt
M be/src/exec/tuple-cache-node.cc
M be/src/exec/tuple-cache-node.h
A be/src/exec/tuple-file-read-write-test.cc
A be/src/exec/tuple-file-reader.cc
A be/src/exec/tuple-file-reader.h
A be/src/exec/tuple-file-writer.cc
A be/src/exec/tuple-file-writer.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
A be/src/runtime/tuple-cache-mgr-test.cc
A be/src/runtime/tuple-cache-mgr.cc
A be/src/runtime/tuple-cache-mgr.h
M bin/start-impala-cluster.py
M common/thrift/metrics.json
A tests/custom_cluster/test_tuple_cache.py
17 files changed, 2,226 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/21171/7
--
To view, visit http://gerrit.cloudera.org:8080/21171
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I13a65c4c0559cad3559d5f714a074dd06e9cc9bf
Gerrit-Change-Number: 21171
Gerrit-PatchSet: 7
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12612: SELECT * queries expand complex type columns from Iceberg metadata tables

2024-04-04 Thread Gabor Kaszab (Code Review)
Hello Daniel Becker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21236

to look at the new patch set (#4).

Change subject: IMPALA-12612: SELECT * queries expand complex type columns from 
Iceberg metadata tables
..

IMPALA-12612: SELECT * queries expand complex type columns from Iceberg 
metadata tables

Similarly to how regular tables behave, the nested columns are omitted
when we do a SELECT * on Iceberg metadata tables and the user needs to
turn EXPAND_COMPLEX_TYPES on to include the nested columns into the
result. This patch changes this behaviour to unconditionally include
the nested columns from Iceberg metadata tables.
Note, the behavior of handling nested columns from regular tables
doesn't change with this patch.

Testing:
  - Adjusted the SELECT * metadata table queries to add the nested
columns into the results.
  - Added some new tests where both metadata tables and regular tables
were queried in the same query.

Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
---
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
2 files changed, 103 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/21236/4
--
To view, visit http://gerrit.cloudera.org:8080/21236
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
Gerrit-Change-Number: 21236
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-12612: SELECT * queries expand complex type columns from Iceberg metadata tables

2024-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21236 )

Change subject: IMPALA-12612: SELECT * queries expand complex type columns from 
Iceberg metadata tables
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10492/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21236
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
Gerrit-Change-Number: 21236
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 04 Apr 2024 09:01:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12612: SELECT * queries expand complex type columns from Iceberg metadata tables

2024-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21236 )

Change subject: IMPALA-12612: SELECT * queries expand complex type columns from 
Iceberg metadata tables
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21236
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
Gerrit-Change-Number: 21236
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 04 Apr 2024 10:09:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12612: SELECT * queries expand complex type columns from Iceberg metadata tables

2024-04-04 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21236 )

Change subject: IMPALA-12612: SELECT * queries expand complex type columns from 
Iceberg metadata tables
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21236/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
File 
testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test:

http://gerrit.cloudera.org:8080/#/c/21236/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test@1110
PS1, Line 1110: select readable_metrics.i.* from 
functional_parquet.iceberg_query_metadata.`files`;
> I changed this to expand readable_metrics.i
Sorry, I didn't notice that readable_metrics.i doesn't contain complex types, 
so expanding it is not relevant for this patch. We could try "select 
data_file.* from functional_parquet.iceberg_query_metadata.entries" instead, 
that contains some maps.



--
To view, visit http://gerrit.cloudera.org:8080/21236
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
Gerrit-Change-Number: 21236
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 04 Apr 2024 07:54:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12969: Release JNI array if DeserializeThriftMsg failed

2024-04-04 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21234 )

Change subject: IMPALA-12969: Release JNI array if DeserializeThriftMsg failed
..

IMPALA-12969: Release JNI array if DeserializeThriftMsg failed

Before this patch ReleaseByteArrayElements was not called in case
the deserialization failed (e.g. by hitting Thrift's MaxMessageSize).
This could potentially cause JVM/native heap leak, depending on how
the JVM handled the array allocation.

Change-Id: Id2c0335b12e9289ae851d0ec050765951a8ca6c7
Reviewed-on: http://gerrit.cloudera.org:8080/21234
Reviewed-by: Michael Smith 
Reviewed-by: Daniel Becker 
Tested-by: Impala Public Jenkins 
---
M be/src/rpc/jni-thrift-util.h
1 file changed, 3 insertions(+), 3 deletions(-)

Approvals:
  Michael Smith: Looks good to me, but someone else must approve
  Daniel Becker: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/21234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Id2c0335b12e9289ae851d0ec050765951a8ca6c7
Gerrit-Change-Number: 21234
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-12152: Add query option to wait for events sync up

2024-04-04 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20131 )

Change subject: IMPALA-12152: Add query option to wait for events sync up
..


Patch Set 14:

(1 comment)

Thank Michael for reviewing this again! I'm working on the new solution - 
coordinator tells what it needs and catalogd sends back the required metadata 
in response. I'll update this soon.

http://gerrit.cloudera.org:8080/#/c/20131/14/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/20131/14/be/src/service/client-request-state.cc@601
PS14, Line 601:   coord_->AddErrorLog(early_error_msg_);
> Can/should we clear early_error_msg_ here? Could at least free up the memor
I'll remove this in the new patch set.



--
To view, visit http://gerrit.cloudera.org:8080/20131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36ac941bb2c2217b09fcfa2eb567b011b38efa2a
Gerrit-Change-Number: 20131
Gerrit-PatchSet: 14
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 04 Apr 2024 06:57:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12612: SELECT * queries expand complex type columns from Iceberg metadata tables

2024-04-04 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21236 )

Change subject: IMPALA-12612: SELECT * queries expand complex type columns from 
Iceberg metadata tables
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21236
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
Gerrit-Change-Number: 21236
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 04 Apr 2024 09:15:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12612: SELECT * queries expand complex type columns from Iceberg metadata tables

2024-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21236 )

Change subject: IMPALA-12612: SELECT * queries expand complex type columns from 
Iceberg metadata tables
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15779/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21236
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
Gerrit-Change-Number: 21236
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 04 Apr 2024 09:24:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] [TEST][WIP]IMPALA-11499: Refactor UrlEncode function to handle special characters

2024-04-04 Thread Anonymous Coward (Code Review)
pranav.lo...@cloudera.com has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/21131 )

Change subject: [TEST][WIP]IMPALA-11499: Refactor UrlEncode function to handle 
special characters
..

[TEST][WIP]IMPALA-11499: Refactor UrlEncode function to handle special
characters

In this commit, a comprehensive mapping of special characters, including
Unicode, to their respective URL-encoded forms has been introduced. The
UrlEncode function has been modified to dynamically iterate through
this mapping, providing a more generic and scalable approach to special
character encoding. This change also enhances code readability and
maintainability by incorporating detailed comments explaining each
section of the updated implementation.

Testing: Some basic tests are provided in unicode-column-name.test.

Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb
---
M be/src/util/coding-util.cc
M testdata/workloads/functional-query/queries/QueryTest/unicode-column-name.test
2 files changed, 81 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/21131/4
--
To view, visit http://gerrit.cloudera.org:8080/21131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb
Gerrit-Change-Number: 21131
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zihao Ye 


[Impala-ASF-CR] [TEST][WIP]IMPALA-11499: Refactor UrlEncode function to handle special characters

2024-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21131 )

Change subject: [TEST][WIP]IMPALA-11499: Refactor UrlEncode function to handle 
special characters
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15780/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb
Gerrit-Change-Number: 21131
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Thu, 04 Apr 2024 10:39:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] [TEST][WIP]IMPALA-11499: Refactor UrlEncode function to handle special characters

2024-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21131 )

Change subject: [TEST][WIP]IMPALA-11499: Refactor UrlEncode function to handle 
special characters
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21131/4/be/src/util/coding-util.cc
File be/src/util/coding-util.cc:

http://gerrit.cloudera.org:8080/#/c/21131/4/be/src/util/coding-util.cc@74
PS4, Line 74: static inline void UrlEncode(const char* in, int in_len, 
std::string* out, bool hive_compat) {
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/21131/4/be/src/util/coding-util.cc@81
PS4, Line 81: // a) We are in Hive-compat mode and the character should be 
escaped according to specialCharacterMap
line too long (105 > 90)


http://gerrit.cloudera.org:8080/#/c/21131/4/be/src/util/coding-util.cc@82
PS4, Line 82: // b) We are not in Hive-compat mode and the character is not 
alphanumeric or one of the commonly excluded characters
line too long (121 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/21131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb
Gerrit-Change-Number: 21131
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Thu, 04 Apr 2024 10:14:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [TEST][WIP]IMPALA-11499: Refactor UrlEncode function to handle special characters

2024-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21131 )

Change subject: [TEST][WIP]IMPALA-11499: Refactor UrlEncode function to handle 
special characters
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10493/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb
Gerrit-Change-Number: 21131
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Thu, 04 Apr 2024 10:13:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in Streaming Aggregation

2024-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21235 )

Change subject: IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in 
Streaming Aggregation
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59205a4b06824ee1607a25e906db1f96dc4eda9f
Gerrit-Change-Number: 21235
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 04 Apr 2024 10:34:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] [TEST][WIP]IMPALA-11499: Refactor UrlEncode function to handle special characters

2024-04-04 Thread Anonymous Coward (Code Review)
pranav.lo...@cloudera.com has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21131 )

Change subject: [TEST][WIP]IMPALA-11499: Refactor UrlEncode function to handle 
special characters
..


Patch Set 4:

> Uploaded patch set 4.

This is just for testing some types, will address the comments in a new patch.


--
To view, visit http://gerrit.cloudera.org:8080/21131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb
Gerrit-Change-Number: 21131
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Thu, 04 Apr 2024 10:15:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in Streaming Aggregation

2024-04-04 Thread Yida Wu (Code Review)
Yida Wu has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21235 )

Change subject: IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in 
Streaming Aggregation
..

IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in Streaming Aggregation

This patch fixes a bug in the RowsPassedThrough metric within the
query profile while using Streaming Aggregation. The issue is from
the AddBatchStreaming() function's logic, where the number of rows
in the output batch isn't necessarily initialized to 0, while the
function uses num_rows() of the output batch directly to be the
actual number of rows returned and passed through of this specific
aggregator. This discrepancy can significantly impact the accuracy
of the returned and passed through numbers, as well as the
calculation of reduction rates during hash table expansion in
Streaming Aggregation. Huge differences can be observed especially
when using the rollup function.

The solution is to calculate the actual number of rows added
to the output batch within each round of the AddBatchStreaming()
function.

Tests:
Passed exhaustive tests.
Added a corresponding case in tpch-passthrough-aggregations.test.

Change-Id: I59205a4b06824ee1607a25e906db1f96dc4eda9f
Reviewed-on: http://gerrit.cloudera.org:8080/21235
Reviewed-by: Wenzhe Zhou 
Reviewed-by: Riza Suminto 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/grouping-aggregator.cc
M testdata/workloads/tpch/queries/tpch-passthrough-aggregations.test
2 files changed, 27 insertions(+), 2 deletions(-)

Approvals:
  Wenzhe Zhou: Looks good to me, but someone else must approve
  Riza Suminto: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/21235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I59205a4b06824ee1607a25e906db1f96dc4eda9f
Gerrit-Change-Number: 21235
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-04 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..

IMPALA-12925: Fix decimal data type for external JDBC table

Decimal type is a primitive data type for Impala. Current code returns
wrong values for columns with decimal data type in external JDBC tables.

This patch fixes wrong values returned from JDBC data source, and
supports pushing down decimal type of predicates to remote database
and remote Impala.
The decimal precision and scale of the columns in external JDBC table
must be no less than the decimal precision and scale of the
corresponding columns in the table of remote database. Otherwise,
Impala fails with an error since it may cause truncation of decimal
data.

Testing:
 - Added Planner test for pushing down decimal type of predicates.
 - Added end-to-end unit-tests for tables with decimal type of columns
   for Postgres, MySQL, and Impala-to-Impala.
 - Passed core-tests.

Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
---
M fe/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java
M 
fe/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java
M 
fe/src/main/java/org/apache/impala/extdatasource/jdbc/util/QueryConditionUtil.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M testdata/bin/clean-mysql-env.sh
M testdata/bin/create-ext-data-source-table.sql
M testdata/bin/load-ext-data-sources.sh
M testdata/bin/setup-mysql-env.sh
M 
testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
M 
testdata/workloads/functional-query/queries/QueryTest/impala-ext-jdbc-tables-predicates.test
M 
testdata/workloads/functional-query/queries/QueryTest/impala-ext-jdbc-tables.test
M testdata/workloads/functional-query/queries/QueryTest/jdbc-data-source.test
M 
testdata/workloads/functional-query/queries/QueryTest/mysql-ext-jdbc-tables.test
13 files changed, 568 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/21218/9
--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 9
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 


[Impala-ASF-CR] IMPALA-12925: Fix decimal data type for external JDBC table

2024-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21218 )

Change subject: IMPALA-12925: Fix decimal data type for external JDBC table
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15781/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21218
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Gerrit-Change-Number: 21218
Gerrit-PatchSet: 9
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 04 Apr 2024 16:38:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] [TEST][WIP]IMPALA-11499: Refactor UrlEncode function to handle special characters

2024-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21131 )

Change subject: [TEST][WIP]IMPALA-11499: Refactor UrlEncode function to handle 
special characters
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb
Gerrit-Change-Number: 21131
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Thu, 04 Apr 2024 14:55:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in Streaming Aggregation

2024-04-04 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21235 )

Change subject: IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in 
Streaming Aggregation
..


Patch Set 1:

Thanks Riza and Wenzhe for the review.


--
To view, visit http://gerrit.cloudera.org:8080/21235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59205a4b06824ee1607a25e906db1f96dc4eda9f
Gerrit-Change-Number: 21235
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 04 Apr 2024 15:27:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12612: SELECT * queries expand complex type columns from Iceberg metadata tables

2024-04-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21236 )

Change subject: IMPALA-12612: SELECT * queries expand complex type columns from 
Iceberg metadata tables
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21236
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
Gerrit-Change-Number: 21236
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 04 Apr 2024 14:04:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5323: Support BINARY columns in Kudu tables

2024-04-04 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18868 )

Change subject: IMPALA-5323: Support BINARY columns in Kudu tables
..


Patch Set 7:

(3 comments)

Looks good, thanks.

http://gerrit.cloudera.org:8080/#/c/18868/7/be/src/exec/kudu/kudu-util-ir.cc
File be/src/exec/kudu/kudu-util-ir.cc:

http://gerrit.cloudera.org:8080/#/c/18868/7/be/src/exec/kudu/kudu-util-ir.cc@56
PS7, Line 56:   const char* VALUE_ERROR_MSG = "Could not set Kudu row value.";
Could be constexpr:
constexpr const char* VALUE_ERROR_MSG = "...";


http://gerrit.cloudera.org:8080/#/c/18868/5/fe/src/main/java/org/apache/impala/analysis/CastExpr.java
File fe/src/main/java/org/apache/impala/analysis/CastExpr.java:

http://gerrit.cloudera.org:8080/#/c/18868/5/fe/src/main/java/org/apache/impala/analysis/CastExpr.java@198
PS5, Line 198:L
> I agree with Peter that it should be a separate patch.
Could you create a Jira ticket for it and reference the number here?


http://gerrit.cloudera.org:8080/#/c/18868/7/testdata/workloads/functional-query/queries/QueryTest/binary-type.test
File testdata/workloads/functional-query/queries/QueryTest/binary-type.test:

http://gerrit.cloudera.org:8080/#/c/18868/7/testdata/workloads/functional-query/queries/QueryTest/binary-type.test@170
PS7, Line 170: s
Nit: constant folding. See also on L182.



--
To view, visit http://gerrit.cloudera.org:8080/18868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
Gerrit-Change-Number: 18868
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Thu, 04 Apr 2024 13:50:16 +
Gerrit-HasComments: Yes