[Impala-ASF-CR] IMPALA-12431: Support reading compressed JSON file

2024-01-15 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20482 )

Change subject: IMPALA-12431: Support reading compressed JSON file
..


Patch Set 11: Code-Review+1

This looks good to me


--
To view, visit http://gerrit.cloudera.org:8080/20482
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2471855d97d4cdd51363b321055e6b06aa6d81e8
Gerrit-Change-Number: 20482
Gerrit-PatchSet: 11
Gerrit-Owner: Zihao Ye 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Tue, 16 Jan 2024 07:28:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12631: Improve count star performance for parquet scans

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20804 )

Change subject: IMPALA-12631: Improve count star performance for parquet scans
..


Patch Set 11: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10154/


--
To view, visit http://gerrit.cloudera.org:8080/20804

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9cd2448fe51a420d4559d0cc861c4d30822f4fd
Gerrit-Change-Number: 20804
Gerrit-PatchSet: 11
Gerrit-Owner: Yifan Zhang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 16 Jan 2024 07:31:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12054: Lazily check Kudu flags in tests

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20904 )

Change subject: IMPALA-12054: Lazily check Kudu flags in tests
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10155/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/20904

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic7a8282b59d72322085c21c70a5019c51b586a52
Gerrit-Change-Number: 20904
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Comment-Date: Tue, 16 Jan 2024 07:20:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12054: Lazily check Kudu flags in tests

2024-01-15 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20904 )

Change subject: IMPALA-12054: Lazily check Kudu flags in tests
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/20904

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic7a8282b59d72322085c21c70a5019c51b586a52
Gerrit-Change-Number: 20904
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Comment-Date: Tue, 16 Jan 2024 05:43:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12054: Lazily check Kudu flags in tests

2024-01-15 Thread Yifan Zhang (Code Review)
Yifan Zhang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20904 )

Change subject: IMPALA-12054: Lazily check Kudu flags in tests
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/20904

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic7a8282b59d72322085c21c70a5019c51b586a52
Gerrit-Change-Number: 20904
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Comment-Date: Tue, 16 Jan 2024 03:17:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12631: Improve count star performance for parquet scans

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20804 )

Change subject: IMPALA-12631: Improve count star performance for parquet scans
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10154/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/20804

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9cd2448fe51a420d4559d0cc861c4d30822f4fd
Gerrit-Change-Number: 20804
Gerrit-PatchSet: 11
Gerrit-Owner: Yifan Zhang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 16 Jan 2024 03:10:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12054: Lazily check Kudu flags in tests

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20904 )

Change subject: IMPALA-12054: Lazily check Kudu flags in tests
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14958/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20904

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic7a8282b59d72322085c21c70a5019c51b586a52
Gerrit-Change-Number: 20904
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 16 Jan 2024 02:21:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12362: Improve Linux packaging support.

2024-01-15 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20263 )

Change subject: IMPALA-12362: Improve Linux packaging support.
..


Patch Set 6:

(8 comments)

Thanks for improving the packaging support! There are several independent 
topics in this patch, e.g. CMake file refactoring, script refactoring, 
default configuration changes, adding more binaries, etc. To make review 
easier, it'd be nice to split this into several smaller patches.

http://gerrit.cloudera.org:8080/#/c/20263/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20263/6//COMMIT_MSG@18
PS6, Line 18:  - add support for admissiond service.
It'd be nice to also add impala-profile-tool so users can parse the thrift 
profiles. The location is be/build/release/util/impala-profile-tool


http://gerrit.cloudera.org:8080/#/c/20263/6/package/CMakeLists.txt
File package/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/20263/6/package/CMakeLists.txt@28
PS6, Line 28: install(FILES ${STATESTORED_SYMLINK} DESTINATION ${IMPALA_INSTALLDIR}/sbin)
: install(FILES ${CATALOGD_SYMLINK} DESTINATION ${IMPALA_INSTALLDIR}/sbin)
: install(FILES ${ADMISSIOND_SYMLINK} DESTINATION ${IMPALA_INSTALLDIR}/sbin)
: install(TARGETS impalad DESTINATION ${IMPALA_INSTALLDIR}/sbin)
We already have these in be/src/service/CMakeLists.txt. Do we need to duplicate 
them here?


http://gerrit.cloudera.org:8080/#/c/20263/6/package/bin/impala.sh
File package/bin/impala.sh:

http://gerrit.cloudera.org:8080/#/c/20263/6/package/bin/impala.sh@21
PS6, Line 21: custom
nit: customize


http://gerrit.cloudera.org:8080/#/c/20263/6/package/bin/impala.sh@22
PS6, Line 22: custom
nit: customize


http://gerrit.cloudera.org:8080/#/c/20263/6/package/bin/impala.sh@49
PS6, Line 49:   else
nit: don't need "else"


http://gerrit.cloudera.org:8080/#/c/20263/6/package/bin/impala.sh@109
PS6, Line 109:   sleep 1
Any reason for removing the logic of wait_for_ready? I think it's helpful when 
launching Impala on a large cluster. Admins can know when the launch really 
finishes.


http://gerrit.cloudera.org:8080/#/c/20263/6/package/conf/catalogd_flags
File package/conf/catalogd_flags:

http://gerrit.cloudera.org:8080/#/c/20263/6/package/conf/catalogd_flags@a9
PS6, Line 9:
Why do we remove this? Without the correct doc root, the web UI might not 
render.


http://gerrit.cloudera.org:8080/#/c/20263/6/package/conf/catalogd_flags@20
PS6, Line 20: # -v=1
We should set -v=1 explicitly. Otherwise, no INFO logs will be shown. The 
default of glog is -v=0.



--
To view, visit http://gerrit.cloudera.org:8080/20263

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If3914dcda69f81a735cdf70d76c59fa09454777b
Gerrit-Change-Number: 20263
Gerrit-PatchSet: 6
Gerrit-Owner: Xiang Yang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 16 Jan 2024 02:16:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12054: Lazily check Kudu flags in tests

2024-01-15 Thread Quanlong Huang (Code Review)
Quanlong Huang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/20904 )


Change subject: IMPALA-12054: Lazily check Kudu flags in tests
..

IMPALA-12054: Lazily check Kudu flags in tests

I usually shut down Kudu in my dev env to save some resources. However,
tests that import skip.py will fail if the Kudu cluster is not running
locally, even if the tests are unrelated to Kudu. The cause is that Kudu
web pages are accessed when the module is imported, and that access fails
if the Kudu cluster is not running.

This patch exposes the decorators of SkipIfKudu as methods just like
what we did in SkipIfCatalogV2, so Kudu web pages can be checked lazily
when needed.
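
The difference between checking at import time and checking lazily can be
sketched as follows. This is a minimal illustration, not the code in
tests/common/skip.py: fetch_kudu_flag, the flag name, and the use of
unittest.skipIf (the real suite may use pytest markers instead) are all
assumptions made for the example.

```python
import unittest

# Records every simulated access to the Kudu web UI.
probe_calls = []


def fetch_kudu_flag(name):
    """Hypothetical stand-in for reading a flag from the Kudu web pages."""
    probe_calls.append(name)
    return "false"


class SkipIfKudu:
    # An eager version would evaluate fetch_kudu_flag(...) here, in the class
    # body, so merely importing this module would contact Kudu.

    @classmethod
    def no_hybrid_clock(cls):
        # Lazy version: the probe runs only when a test applies the
        # decorator, so modules that never use it never touch Kudu.
        return unittest.skipIf(
            fetch_kudu_flag("use_hybrid_clock") == "false",
            "Test relies on the Kudu hybrid clock")


# Defining the class above did not trigger any probe.
calls_after_import = list(probe_calls)


@SkipIfKudu.no_hybrid_clock()
def test_needs_hybrid_clock():
    pass
```

With this shape, a Kudu test writes @SkipIfKudu.no_hybrid_clock() (note the
call parentheses), while tests unrelated to Kudu can import the module
without a running Kudu cluster.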

Tests:
 - Ran Kudu tests.
 - Ran some Kudu-unrelated tests without launching the Kudu cluster.

Change-Id: Ic7a8282b59d72322085c21c70a5019c51b586a52
---
M tests/common/skip.py
M tests/query_test/test_kudu.py
2 files changed, 65 insertions(+), 54 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/20904/1
--
To view, visit http://gerrit.cloudera.org:8080/20904

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic7a8282b59d72322085c21c70a5019c51b586a52
Gerrit-Change-Number: 20904
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 


[Impala-ASF-CR] IMPALA-12714: Fix test_reduced_cardinality_by_filter for non-HDFS

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/20902 )

Change subject: IMPALA-12714: Fix test_reduced_cardinality_by_filter for 
non-HDFS
..

IMPALA-12714: Fix test_reduced_cardinality_by_filter for non-HDFS

test_reduced_cardinality_by_filter failed in non-HDFS environments
because it asserts the existence of '00:SCAN HDFS' in ExecSummary. This
patch changes that assertion to ignore the type of scan node in the test
query. It also marks the test with the SkipIfNotHdfsMinicluster.plans
decorator.
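
A storage-agnostic assertion of this kind could look like the sketch below.
It is only an illustration: the helper name, the regular expression, and the
sample operator labels other than '00:SCAN HDFS' are assumptions, not the
actual change in tests/query_test/test_observability.py.

```python
import re

# Matches the first scan operator whatever the storage type is, instead of
# hard-coding "00:SCAN HDFS". The pattern is an assumption for illustration.
SCAN_NODE_RE = re.compile(r"\b00:SCAN \w+")


def has_first_scan_node(exec_summary_text):
    """Return True if the summary text mentions a '00:SCAN <type>' operator."""
    return SCAN_NODE_RE.search(exec_summary_text) is not None
```

A test would then assert has_first_scan_node(summary) rather than checking
for one specific scan node label.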

Testing:
- Pass test_reduced_cardinality_by_filter

Change-Id: Icbf72687cc3c5a99aa0a0a74e229ed8c88ed06ef
Reviewed-on: http://gerrit.cloudera.org:8080/20902
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M tests/query_test/test_observability.py
1 file changed, 2 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/20902

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Icbf72687cc3c5a99aa0a0a74e229ed8c88ed06ef
Gerrit-Change-Number: 20902
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tamas Mate 


[Impala-ASF-CR] IMPALA-12714: Fix test_reduced_cardinality_by_filter for non-HDFS

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20902 )

Change subject: IMPALA-12714: Fix test_reduced_cardinality_by_filter for 
non-HDFS
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/20902

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbf72687cc3c5a99aa0a0a74e229ed8c88ed06ef
Gerrit-Change-Number: 20902
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Mon, 15 Jan 2024 22:53:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12642: Support query options for Impala external JDBC table

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20837 )

Change subject: IMPALA-12642: Support query options for Impala external JDBC 
table
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14957/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20837

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I47687b7a93e90cea8ebd5f3fc280c9135bd97992
Gerrit-Change-Number: 20837
Gerrit-PatchSet: 8
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Comment-Date: Mon, 15 Jan 2024 19:12:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12708: An UPDATE creates 2 new snapshots in Iceberg tables

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20903 )

Change subject: IMPALA-12708: An UPDATE creates 2 new snapshots in Iceberg 
tables
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14956/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20903

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2ceb80b939c644388707b21061bf55451234dcd3
Gerrit-Change-Number: 20903
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Mon, 15 Jan 2024 18:56:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12642: Support query options for Impala external JDBC table

2024-01-15 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/20837 )

Change subject: IMPALA-12642: Support query options for Impala external JDBC 
table
..

IMPALA-12642: Support query options for Impala external JDBC table

This patch uses the JDBC connection string to apply query options to the
Impala server by setting the properties in "jdbc.properties" when
creating a JDBC external DataSource table.
jdbc.properties is specified as a comma-delimited key=value string, like
"MEM_LIMIT=10, ENABLED_RUNTIME_FILTER_TYPES=\"BLOOM,MIN_MAX\"".
jdbc.properties can also be used for other databases like Postgres and
MySQL to set additional properties.
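
To illustrate why the quoting in the example matters, here is a small sketch
of splitting such a string on commas that are outside double quotes. It is
written in Python for brevity and is not the parser in the patch (which is
Java); the function name is hypothetical, and whether Impala strips the
quotes or passes them through is an assumption here.

```python
def parse_jdbc_properties(text):
    """Split 'k1=v1, k2="a,b"' into a dict, honoring double-quoted values."""
    tokens = []
    current = []
    in_quotes = False
    for ch in text:
        if ch == '"':
            # Toggle quoting; in this sketch the quote itself is dropped.
            in_quotes = not in_quotes
        elif ch == "," and not in_quotes:
            tokens.append("".join(current))
            current = []
        else:
            current.append(ch)
    tokens.append("".join(current))

    props = {}
    for token in tokens:
        token = token.strip()
        if not token:
            continue
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError("expected key=value, got %r" % token)
        props[key.strip()] = value.strip()
    return props
```

A naive text.split(",") would break ENABLED_RUNTIME_FILTER_TYPES into two
entries, which is why the value containing a comma is quoted.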

Testing:
 - Added end-to-end tests for setting query options on Impala JDBC
   tables.
 - Passed core tests.

Change-Id: I47687b7a93e90cea8ebd5f3fc280c9135bd97992
---
M 
java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/conf/JdbcStorageConfig.java
M 
java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java
M 
java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/ImpalaDatabaseAccessor.java
M 
java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/MySqlDatabaseAccessor.java
M 
java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/PostgresDatabaseAccessor.java
M 
testdata/workloads/functional-query/queries/QueryTest/impala-ext-jdbc-tables.test
M testdata/workloads/functional-query/queries/QueryTest/jdbc-data-source.test
M 
testdata/workloads/functional-query/queries/QueryTest/mysql-ext-jdbc-tables.test
M tests/custom_cluster/test_ext_data_sources.py
9 files changed, 115 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/20837/8
--
To view, visit http://gerrit.cloudera.org:8080/20837

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I47687b7a93e90cea8ebd5f3fc280c9135bd97992
Gerrit-Change-Number: 20837
Gerrit-PatchSet: 8
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yifan Zhang 


[Impala-ASF-CR] IMPALA-12714: Fix test_reduced_cardinality_by_filter for non-HDFS

2024-01-15 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20902 )

Change subject: IMPALA-12714: Fix test_reduced_cardinality_by_filter for 
non-HDFS
..


Patch Set 1:

> Patch Set 1:
>
> I'm double checking if the following stats assertion remains true in Ozone. 
> Will run GVO after I confirm that.

This test passed in Ozone.


--
To view, visit http://gerrit.cloudera.org:8080/20902

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbf72687cc3c5a99aa0a0a74e229ed8c88ed06ef
Gerrit-Change-Number: 20902
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Mon, 15 Jan 2024 18:29:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12708: An UPDATE creates 2 new snapshots in Iceberg tables

2024-01-15 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/20903 )


Change subject: IMPALA-12708: An UPDATE creates 2 new snapshots in Iceberg 
tables
..

IMPALA-12708: An UPDATE creates 2 new snapshots in Iceberg tables

The current implementation of UPDATE creates the delete file(s) and the
new data file(s) for the updated row(s). These files are committed in
one Iceberg transaction, but the transaction adds two snapshots to the
table. The first contains the delete file(s), the second adds the new
data file(s) of the updated row(s). Only the final snapshot (which
holds the consistent table state) is observable by concurrent readers,
but still, the commit history can look strange with these "phantom
snapshots".

So instead of doing a RowDelta and AppendFiles operation in a single
transaction, with this change we are doing a single RowDelta operation
only.

Another issue was that we also committed empty operations (e.g. UPDATEs
with zero records). These created redundant snapshots in the table
history. This patch also fixes that.
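
The effect on the table history can be pictured with a toy model. This is
not Iceberg or Impala code: the class and function names are hypothetical,
and the only behavior modeled is that each operation committed in a
transaction adds one snapshot.

```python
class ToyIcebergTable:
    """Toy model: every committed operation appends one snapshot."""

    def __init__(self):
        self.history = []

    def commit_transaction(self, operations):
        for op_name, files in operations:
            self.history.append((op_name, tuple(files)))


def update_before_patch(table, delete_files, data_files):
    # Two operations in one transaction, hence two snapshots. Empty updates
    # are committed as well.
    table.commit_transaction([
        ("RowDelta", delete_files),
        ("AppendFiles", data_files),
    ])


def update_after_patch(table, delete_files, data_files):
    # Nothing to commit: leave the history untouched.
    if not delete_files and not data_files:
        return
    # A single RowDelta carries both the delete files and the new data files.
    table.commit_transaction([
        ("RowDelta", list(delete_files) + list(data_files)),
    ])
```

In this model an UPDATE that touches rows adds two history entries before
the patch and one after it, and an UPDATE with zero records adds none.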

Testing:
 * added e2e test that checks the table history

Change-Id: I2ceb80b939c644388707b21061bf55451234dcd3
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M tests/query_test/test_iceberg.py
4 files changed, 128 insertions(+), 60 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/20903/1
--
To view, visit http://gerrit.cloudera.org:8080/20903

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2ceb80b939c644388707b21061bf55451234dcd3
Gerrit-Change-Number: 20903
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12714: Fix test_reduced_cardinality_by_filter for non-HDFS

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20902 )

Change subject: IMPALA-12714: Fix test_reduced_cardinality_by_filter for 
non-HDFS
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/20902

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbf72687cc3c5a99aa0a0a74e229ed8c88ed06ef
Gerrit-Change-Number: 20902
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Mon, 15 Jan 2024 18:29:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12714: Fix test_reduced_cardinality_by_filter for non-HDFS

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20902 )

Change subject: IMPALA-12714: Fix test_reduced_cardinality_by_filter for 
non-HDFS
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10153/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/20902

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbf72687cc3c5a99aa0a0a74e229ed8c88ed06ef
Gerrit-Change-Number: 20902
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Mon, 15 Jan 2024 18:29:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12704: Fix NPE when quering empty iceberg table's metadata

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20890 )

Change subject: IMPALA-12704: Fix NPE when quering empty iceberg table's 
metadata
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/20890

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b4d4fb81a45214045b8809a4bdd910a1f1f3843
Gerrit-Change-Number: 20890
Gerrit-PatchSet: 5
Gerrit-Owner: Zihao Ye 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Mon, 15 Jan 2024 17:46:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12714: Fix test_reduced_cardinality_by_filter for non-HDFS

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20902 )

Change subject: IMPALA-12714: Fix test_reduced_cardinality_by_filter for 
non-HDFS
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14955/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20902

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbf72687cc3c5a99aa0a0a74e229ed8c88ed06ef
Gerrit-Change-Number: 20902
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Mon, 15 Jan 2024 17:35:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12714: Fix test_reduced_cardinality_by_filter for non-HDFS

2024-01-15 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20902 )

Change subject: IMPALA-12714: Fix test_reduced_cardinality_by_filter for 
non-HDFS
..


Patch Set 1:

I'm double checking if the following stats assertion remains true in Ozone. 
Will run GVO after I confirm that.


--
To view, visit http://gerrit.cloudera.org:8080/20902

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbf72687cc3c5a99aa0a0a74e229ed8c88ed06ef
Gerrit-Change-Number: 20902
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Mon, 15 Jan 2024 17:21:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12714: Fix test_reduced_cardinality_by_filter for non-HDFS

2024-01-15 Thread Tamas Mate (Code Review)
Tamas Mate has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20902 )

Change subject: IMPALA-12714: Fix test_reduced_cardinality_by_filter for 
non-HDFS
..


Patch Set 1: Code-Review+2

Thanks for the quick fix Riza! LGTM!


--
To view, visit http://gerrit.cloudera.org:8080/20902

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbf72687cc3c5a99aa0a0a74e229ed8c88ed06ef
Gerrit-Change-Number: 20902
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Mon, 15 Jan 2024 17:13:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12714: Fix test_reduced_cardinality_by_filter for non-HDFS

2024-01-15 Thread Riza Suminto (Code Review)
Riza Suminto has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/20902 )


Change subject: IMPALA-12714: Fix test_reduced_cardinality_by_filter for 
non-HDFS
..

IMPALA-12714: Fix test_reduced_cardinality_by_filter for non-HDFS

test_reduced_cardinality_by_filter failed in non-HDFS environments
because it asserts the existence of '00:SCAN HDFS' in ExecSummary. This
patch changes that assertion to ignore the type of scan node in the test
query. It also marks the test with the SkipIfNotHdfsMinicluster.plans
decorator.

Testing:
- Pass test_reduced_cardinality_by_filter

Change-Id: Icbf72687cc3c5a99aa0a0a74e229ed8c88ed06ef
---
M tests/query_test/test_observability.py
1 file changed, 2 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/20902/1
--
To view, visit http://gerrit.cloudera.org:8080/20902

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Icbf72687cc3c5a99aa0a0a74e229ed8c88ed06ef
Gerrit-Change-Number: 20902
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 


[Impala-ASF-CR] IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky tests

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20898 )

Change subject: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky 
tests
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/20898

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Gerrit-Change-Number: 20898
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 15 Jan 2024 16:45:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky tests

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/20898 )

Change subject: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky 
tests
..

IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky tests

The introduction of check_deleted_file_fd() in IMPALA-12681 aimed
to detect a bug related to remote spilling where local temporary file
handles were not being released after deletion. However, the tests
that use this function appear flaky in exhaustive builds:
occasionally, some HDFS files are not promptly released after
deletion. Locally, I observed that these files are eventually
removed from /proc/xx/fd within a few minutes; the reason is not yet
clear.

To fix the flaky build failure, this patch confines the scope of
check_deleted_file_fd() to files containing the keyword "scratch"
only. Given that the HDFS files eventually get removed and this does
not seem urgent, a separate Jira will be filed to track and
investigate this behavior further.
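
The restricted check can be sketched roughly as below. This is an
illustration under assumptions, not the code in
tests/custom_cluster/test_scratch_disk.py: the helper names are hypothetical,
and it relies on the Linux convention that a link in /proc/<pid>/fd to a
removed file ends with " (deleted)".

```python
import os


def open_fd_targets(pid):
    """Return the link targets of all open file descriptors of a process."""
    fd_dir = "/proc/%d/fd" % pid
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            pass
    return targets


def deleted_scratch_targets(link_targets):
    """Keep only deleted files whose path contains the keyword 'scratch'."""
    return [t for t in link_targets
            if t.endswith(" (deleted)") and "scratch" in t]
```

A test would then assert that deleted_scratch_targets(open_fd_targets(pid))
is empty, so lingering handles to deleted HDFS files no longer fail it.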

Testing:
Reran the tests a couple of times; they passed.

Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Reviewed-on: http://gerrit.cloudera.org:8080/20898
Reviewed-by: Csaba Ringhofer 
Tested-by: Impala Public Jenkins 
---
M tests/custom_cluster/test_scratch_disk.py
1 file changed, 6 insertions(+), 1 deletion(-)

Approvals:
  Csaba Ringhofer: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/20898

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Gerrit-Change-Number: 20898
Gerrit-PatchSet: 4
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12704: Fix NPE when quering empty iceberg table's metadata

2024-01-15 Thread Tamas Mate (Code Review)
Tamas Mate has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20890 )

Change subject: IMPALA-12704: Fix NPE when quering empty iceberg table's 
metadata
..


Patch Set 4: Code-Review+2

Thanks Zihao Ye! LGTM!


--
To view, visit http://gerrit.cloudera.org:8080/20890

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b4d4fb81a45214045b8809a4bdd910a1f1f3843
Gerrit-Change-Number: 20890
Gerrit-PatchSet: 4
Gerrit-Owner: Zihao Ye 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Mon, 15 Jan 2024 13:06:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12704: Fix NPE when quering empty iceberg table's metadata

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20890 )

Change subject: IMPALA-12704: Fix NPE when quering empty iceberg table's 
metadata
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/20890
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b4d4fb81a45214045b8809a4bdd910a1f1f3843
Gerrit-Change-Number: 20890
Gerrit-PatchSet: 5
Gerrit-Owner: Zihao Ye 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Mon, 15 Jan 2024 13:06:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12704: Fix NPE when quering empty iceberg table's metadata

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20890 )

Change subject: IMPALA-12704: Fix NPE when quering empty iceberg table's 
metadata
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10152/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/20890
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b4d4fb81a45214045b8809a4bdd910a1f1f3843
Gerrit-Change-Number: 20890
Gerrit-PatchSet: 5
Gerrit-Owner: Zihao Ye 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Mon, 15 Jan 2024 13:06:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12582: Fix crash when enabling MIN MAX RuntimeFilter in Nested Loop Join

2024-01-15 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20891 )

Change subject: IMPALA-12582: Fix crash when enabling MIN_MAX RuntimeFilter in 
Nested Loop Join
..


Patch Set 2: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20891/2/tests/query_test/test_join_queries.py
File tests/query_test/test_join_queries.py:

http://gerrit.cloudera.org:8080/#/c/20891/2/tests/query_test/test_join_queries.py@116
PS2, Line 116:   def test_nested_loop_joins_with_min_max_runtime_filter(self, vector, unique_database):
Checked the original patch that added the bug and it did add a test with nested 
loop join + min max filter:
https://gerrit.cloudera.org/#/c/17706/34/testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters_on_sorted_columns.test

Do you know what the difference is, i.e. why this query triggers the issue but 
the old query does not?


http://gerrit.cloudera.org:8080/#/c/20891/2/tests/query_test/test_join_queries.py@120
PS2, Line 120: "CREATE TABLE {0} (id int) PARTITIONED BY(dt string) "
optional: I think that this would be more readable if it was moved to an 
existing .test file or a new one.



--
To view, visit http://gerrit.cloudera.org:8080/20891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iba951796d52f109c419587c444840adbb2d44f5d
Gerrit-Change-Number: 20891
Gerrit-PatchSet: 2
Gerrit-Owner: Zihao Ye 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 15 Jan 2024 12:58:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12631: Improve count star performance for parquet scans

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20804 )

Change subject: IMPALA-12631: Improve count star performance for parquet scans
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14954/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20804
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9cd2448fe51a420d4559d0cc861c4d30822f4fd
Gerrit-Change-Number: 20804
Gerrit-PatchSet: 11
Gerrit-Owner: Yifan Zhang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 15 Jan 2024 12:56:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12642: Support query options for Impala external JDBC table

2024-01-15 Thread Yifan Zhang (Code Review)
Yifan Zhang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20837 )

Change subject: IMPALA-12642: Support query options for Impala external JDBC 
table
..


Patch Set 7: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/20837
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I47687b7a93e90cea8ebd5f3fc280c9135bd97992
Gerrit-Change-Number: 20837
Gerrit-PatchSet: 7
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Comment-Date: Mon, 15 Jan 2024 12:36:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12631: Improve count star performance for parquet scans

2024-01-15 Thread Yifan Zhang (Code Review)
Hello Riza Suminto, Zoltan Borok-Nagy, Zihao Ye, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20804

to look at the new patch set (#11).

Change subject: IMPALA-12631: Improve count star performance for parquet scans
..

IMPALA-12631: Improve count star performance for parquet scans

The backend function HdfsParquetScanner::GetNextInternal() computes count
star from the Parquet RowGroup.num_rows field, so it still has to find
every row group and sum all RowGroup.num_rows values. This patch uses the
'num_rows' field in the Parquet file metadata instead. That avoids the
NextRowGroup() function calls, and only one footer range per file is
generated and processed.

A new query option, parquet_count_star_use_file_metadata, is added for
forward compatibility. Its default value is true. If any inconsistency
between FileMetaData.num_rows and RowGroup.num_rows is found, it can be
set to false to get the same results as older versions.
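
The difference between the two counting paths described above can be sketched
in Python as follows. This is only an illustration, not the C++ scanner code:
the `FileMetaData` and `RowGroup` classes are hypothetical stand-ins for the
Parquet footer structures, and the `use_file_metadata` argument is an assumed
analogue of the parquet_count_star_use_file_metadata query option.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RowGroup:
    # Hypothetical stand-in for one Parquet row group's metadata.
    num_rows: int


@dataclass
class FileMetaData:
    # Hypothetical stand-in for the Parquet footer: a file-level row count
    # plus the per-row-group metadata.
    num_rows: int
    row_groups: List[RowGroup] = field(default_factory=list)


def count_star(footer: FileMetaData, use_file_metadata: bool = True) -> int:
    """Return the row count of one file, computed from its footer."""
    if use_file_metadata:
        # New path: read the single file-level field.
        return footer.num_rows
    # Old path: visit every row group and sum its row count.
    return sum(rg.num_rows for rg in footer.row_groups)


footer = FileMetaData(num_rows=6, row_groups=[RowGroup(1), RowGroup(2), RowGroup(3)])
print(count_star(footer), count_star(footer, use_file_metadata=False))
```

If a writer produced a footer whose file-level count disagrees with its row
groups, the two paths return different values, which is the case the option
is meant to cover.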

The following table shows a performance comparison before and after
the patch. The primitive_count_star_multiblock query is a modified
primitive_count_star query that targets a multi-block
tpch10_parquet.lineitem table. The files of the table are generated
by the command `hdfs dfs -Ddfs.block.size=1048576 -cp -f -d`.

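+-------------------+---------------------------------+-----------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+--------+
| Workload          | Query                           | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval   |
+-------------------+---------------------------------+-----------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+--------+
| TPCDS(10)         | TPCDS-Q_COUNT_OPTIMIZED         | parquet / none / none | 0.17   | 0.16        | +2.58%     | * 29.53% * | * 27.16% *     | 30    | +1.20%         | 0.58    | 0.35   |
| TPCDS(10)         | TPCDS-Q_COUNT_UNOPTIMIZED       | parquet / none / none | 0.27   | 0.26        | +2.96%     | 8.97%      | 9.94%          | 30    | +0.16%         | 0.44    | 1.19   |
| TPCDS(10)         | TPCDS-Q_COUNT_ZERO_SLOT         | parquet / none / none | 0.18   | 0.18        | -0.69%     | 1.65%      | 1.99%          | 30    | -0.34%         | -1.55   | -1.47  |
| TARGETED-PERF(10) | primitive_count_star_multiblock | parquet / none / none | 0.06   | 0.12        | I -49.88%  | 4.11%      | 3.53%          | 30    | I -99.97%      | -6.54   | -66.81 |
+-------------------+---------------------------------+-----------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+--------+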

Testing:
- Ran PlannerTest#testParquetStatsAgg
- Added new test cases to query_test/test_aggregation.py

Change-Id: Ib9cd2448fe51a420d4559d0cc861c4d30822f4fd
---
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-plain-count-star-optimization.test
A testdata/workloads/functional-query/queries/QueryTest/parquet-stats-agg-default.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-stats-agg.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q_count_optimized.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q_count_unoptimized.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q_count_zero_slot.test
M tests/query_test/test_aggregation.py
13 files changed, 331 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/20804/11
--
To view, visit http://gerrit.cloudera.org:8080/20804
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib9cd2448fe51a420d4559d0cc861c4d30822f4fd
Gerrit-Change-Number: 20804
Gerrit-PatchSet: 11
Gerrit-Owner: Yifan Zhang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12636: Reload filemetadata for AlterTable event of type truncate

2024-01-15 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20887 )

Change subject: IMPALA-12636: Reload filemetadata for AlterTable event of type 
truncate
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/20887/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/20887/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2481
PS3, Line 2481:   // force reload truncated partition events
nit: catalogd reloads the partitions, not the partition events


http://gerrit.cloudera.org:8080/#/c/20887/3/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/20887/3/tests/custom_cluster/test_events_custom_configs.py@1094
PS3, Line 1094: self.execute_query("create database if not exists {0}".format(unique_database))
This shouldn't be necessary with unique_database


http://gerrit.cloudera.org:8080/#/c/20887/3/tests/custom_cluster/test_events_custom_configs.py@1099
PS3, Line 1099: " ".join
Using join here looks strange to me - shouldn't we simply use a format 
string? e.g.
partitioned_str = " partitioned by (year int) " if is_partitioned else ''
create_query = "create table `{}`.`{}` (i int) {} {}".format(
    unique_database, tbl_name, partitioned_str, values)
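
A self-contained sketch of the format-string approach suggested above, for 
illustration only. The helper name `build_create_query` and its 
`tblproperties` parameter (standing in for `values`) are hypothetical and are 
not taken from the test file:

```python
def build_create_query(db, tbl_name, is_partitioned, tblproperties=""):
    """Build a CREATE TABLE statement with one format string.

    Hypothetical helper: `tblproperties` plays the role of the `values`
    argument in the suggestion above.
    """
    # The optional clause is either the full text or an empty string, so
    # the same format string works for both table kinds.
    partitioned_str = "partitioned by (year int)" if is_partitioned else ""
    return "create table `{}`.`{}` (i int) {} {}".format(
        db, tbl_name, partitioned_str, tblproperties)


print(build_create_query("test_db", "tbl", True))
print(build_create_query("test_db", "tbl", False))
```

An empty optional clause leaves extra spaces in the statement, which SQL 
parsers ignore.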


http://gerrit.cloudera.org:8080/#/c/20887/3/tests/custom_cluster/test_events_custom_configs.py@1101
PS3, Line 1101: self.__get_transactional_tblproperties(is_transactional)])
nit: +2 indentation on line breaks


http://gerrit.cloudera.org:8080/#/c/20887/3/tests/custom_cluster/test_events_custom_configs.py@1105
PS3, Line 1105:   self.run_stmt_in_hive(insert_query.format(unique_database, tbl_name))
As the test's goal is to check handling of the truncate event, shouldn't we 
wait here for the event processor to process the insert event?

There are 2 cases we should avoid to ensure that the test provides proper 
coverage:

1. the table is reloaded, not because of the truncate event but because of the 
insert event - as the reload happens after the truncate has already happened, 
catalogd would see an empty table during reload

2. both events are ignored because the table is an IncompleteTable in 
catalogd, and the table load is initiated by the select query, not the 
event processor

The problem with 1 and 2 is that the test would pass even without the truncate 
event handling.

So I think that the best way is to call REFRESH in Impala before the 
truncate, to ensure that the state with 3 rows is cached and the truncate event 
handling is really needed to trigger the reload.
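
The reasoning above can be illustrated with a toy model of case 2. This is 
not Impala code: `FakeCatalog` and `run_test` are hypothetical names, and the 
model only assumes that an unloaded table is loaded on first access, while a 
cached table changes only on an explicit refresh or a handled event.

```python
class FakeCatalog:
    """Toy model of a metadata cache; not Impala's catalogd."""

    def __init__(self, hive_rows, handles_truncate_event):
        self.hive_rows = hive_rows  # table name -> row count in "Hive"
        self.cache = {}             # table name -> cached row count
        self.handles_truncate_event = handles_truncate_event

    def refresh(self, table):
        self.cache[table] = self.hive_rows[table]

    def on_truncate_event(self, table):
        # Events for tables that are not loaded are ignored.
        if self.handles_truncate_event and table in self.cache:
            self.refresh(table)

    def select_count(self, table):
        # An unloaded table is loaded by the first query that touches it.
        if table not in self.cache:
            self.refresh(table)
        return self.cache[table]


def run_test(handles_truncate_event, refresh_before_truncate):
    """Return True if the test's final 'table is empty' check passes."""
    hive = {"t": 3}                      # three rows inserted through Hive
    catalog = FakeCatalog(hive, handles_truncate_event)
    if refresh_before_truncate:
        catalog.refresh("t")             # cache the 3-row state first
    hive["t"] = 0                        # truncate through Hive
    catalog.on_truncate_event("t")
    return catalog.select_count("t") == 0


print(run_test(False, False), run_test(False, True), run_test(True, True))
```

In this model, a test without the refresh passes even when the truncate event 
is not handled, whereas a test with the refresh passes only when it is.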



--
To view, visit http://gerrit.cloudera.org:8080/20887
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I53bb80c294623eec7c79d9f30f410771386c6b75
Gerrit-Change-Number: 20887
Gerrit-PatchSet: 3
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Mon, 15 Jan 2024 12:31:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12698: Restrict check deleted file fd() for fixing flaky tests

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20898 )

Change subject: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky 
tests
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10151/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/20898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Gerrit-Change-Number: 20898
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 15 Jan 2024 12:14:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12698: Restrict check deleted file fd() for fixing flaky tests

2024-01-15 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20898 )

Change subject: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky 
tests
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/20898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Gerrit-Change-Number: 20898
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 15 Jan 2024 11:51:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12704: Fix NPE when quering empty iceberg table's metadata

2024-01-15 Thread Zihao Ye (Code Review)
Zihao Ye has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20890 )

Change subject: IMPALA-12704: Fix NPE when quering empty iceberg table's 
metadata
..


Patch Set 3:

(1 comment)

Hi Tamas, thank you for your review!

http://gerrit.cloudera.org:8080/#/c/20890/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20890/3//COMMIT_MSG@12
PS3, Line 12:
> On the Impala project we usually add a short: "Tests" or "Testing" section
Done



--
To view, visit http://gerrit.cloudera.org:8080/20890
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b4d4fb81a45214045b8809a4bdd910a1f1f3843
Gerrit-Change-Number: 20890
Gerrit-PatchSet: 3
Gerrit-Owner: Zihao Ye 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Mon, 15 Jan 2024 11:31:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12704: Fix NPE when quering empty iceberg table's metadata

2024-01-15 Thread Zihao Ye (Code Review)
Hello Tamas Mate, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20890

to look at the new patch set (#4).

Change subject: IMPALA-12704: Fix NPE when quering empty iceberg table's 
metadata
..

IMPALA-12704: Fix NPE when quering empty iceberg table's metadata

Currently, when querying some metadata tables of an empty iceberg table,
a null pointer exception occurs. This patch fixes the issue and adds
corresponding test cases in test_metadata_tables.

Testing:
 - Added E2E test to cover this case

Change-Id: I6b4d4fb81a45214045b8809a4bdd910a1f1f3843
---
M fe/src/main/java/org/apache/impala/util/IcebergMetadataScanner.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
2 files changed, 25 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/20890/4
--
To view, visit http://gerrit.cloudera.org:8080/20890
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6b4d4fb81a45214045b8809a4bdd910a1f1f3843
Gerrit-Change-Number: 20890
Gerrit-PatchSet: 4
Gerrit-Owner: Zihao Ye 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 


[Impala-ASF-CR] IMPALA-12698: Restrict check deleted file fd() for fixing flaky tests

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20898 )

Change subject: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky 
tests
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14953/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Gerrit-Change-Number: 20898
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 15 Jan 2024 11:26:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12698: Restrict check deleted file fd() for fixing flaky tests

2024-01-15 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20898 )

Change subject: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky 
tests
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20898/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20898/2//COMMIT_MSG@10
PS2, Line 10: spillin
> nit: typo in 'spilling'
Thanks. Done.



--
To view, visit http://gerrit.cloudera.org:8080/20898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Gerrit-Change-Number: 20898
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 15 Jan 2024 10:59:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12698: Restrict check deleted file fd() for fixing flaky tests

2024-01-15 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/20898 )

Change subject: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky 
tests
..

IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky tests

The introduction of check_deleted_file_fd() in IMPALA-12681 aimed
to detect a bug related to remote spilling where local temporary file
handles were not being released after deletion. However, the tests
associated with this function seem flaky in exhaustive builds:
occasionally, some HDFS files are not promptly released after
deletion. Locally, I observed that these files are eventually
removed from /proc/xx/fd within a few minutes. The reason is not yet
clear.

To fix the flaky build failure, this patch confines the scope of
check_deleted_file_fd() to detect files containing the keyword
"scratch" only. Given that HDFS files eventually get removed and
this does not seem urgent, a separate Jira will be filed to track
and investigate this behavior further.
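
The narrowed check described above can be sketched as a simple filter. This
is a minimal illustration, not the code in test_scratch_disk.py: the function
name `find_deleted_scratch_fds` and the sample paths are made up, and the
input is assumed to be the link targets read from /proc/<pid>/fd.

```python
def find_deleted_scratch_fds(fd_targets):
    """Return the link targets that name a deleted scratch file.

    `fd_targets` is an iterable of strings such as the targets of the
    symlinks under /proc/<pid>/fd (hypothetical input for this sketch).
    """
    # Both keywords must be present, so deleted files that are not
    # scratch files (e.g. HDFS files) are no longer reported.
    return [t for t in fd_targets if "scratch" in t and "(deleted)" in t]


# Made-up sample data for illustration only.
sample = [
    "/tmp/impala-scratch/file1 (deleted)",
    "/data/dfs/blk_123 (deleted)",
    "/tmp/impala-scratch/file2",
]
print(find_deleted_scratch_fds(sample))
```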

Testing:
Reran the tests a couple of times and they passed.

Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
---
M tests/custom_cluster/test_scratch_disk.py
1 file changed, 6 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/20898/3
--
To view, visit http://gerrit.cloudera.org:8080/20898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Gerrit-Change-Number: 20898
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12698: Restrict check deleted file fd() for fixing flaky tests

2024-01-15 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20898 )

Change subject: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky 
tests
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20898/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20898/2//COMMIT_MSG@10
PS2, Line 10: splling
nit: typo in 'spilling'



--
To view, visit http://gerrit.cloudera.org:8080/20898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Gerrit-Change-Number: 20898
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 15 Jan 2024 10:29:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12704: Fix NPE when quering empty iceberg table's metadata

2024-01-15 Thread Tamas Mate (Code Review)
Tamas Mate has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20890 )

Change subject: IMPALA-12704: Fix NPE when quering empty iceberg table's 
metadata
..


Patch Set 3:

(1 comment)

Hi Zihao Ye, thank you for catching and fixing this issue.
The change looks good to me and I can +2 it after a small commit message change.

http://gerrit.cloudera.org:8080/#/c/20890/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20890/3//COMMIT_MSG@12
PS3, Line 12:
On the Impala project we usually add a short: "Tests" or "Testing" section to 
the commit message to highlight how the testing was done. In this case a simple:

```
Testing:
 - Added E2E test to cover this case
```

would suffice in my opinion.



--
To view, visit http://gerrit.cloudera.org:8080/20890
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b4d4fb81a45214045b8809a4bdd910a1f1f3843
Gerrit-Change-Number: 20890
Gerrit-PatchSet: 3
Gerrit-Owner: Zihao Ye 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Mon, 15 Jan 2024 10:20:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12698: Restrict check deleted file fd() for fixing flaky tests

2024-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20898 )

Change subject: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky 
tests
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14952/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Gerrit-Change-Number: 20898
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 15 Jan 2024 10:15:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12698: Restrict check deleted file fd() for fixing flaky tests

2024-01-15 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20898 )

Change subject: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky 
tests
..


Patch Set 2:

(1 comment)

Thanks Csaba for the review.

http://gerrit.cloudera.org:8080/#/c/20898/1/tests/custom_cluster/test_scratch_disk.py
File tests/custom_cluster/test_scratch_disk.py:

http://gerrit.cloudera.org:8080/#/c/20898/1/tests/custom_cluster/test_scratch_disk.py@286
PS1, Line 286: # Look for the files with keywords 'scratch' and '(deleted)'.
> Can you mention IMPALA-12698 and that there are temporary files like this c
Done



--
To view, visit http://gerrit.cloudera.org:8080/20898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Gerrit-Change-Number: 20898
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 15 Jan 2024 09:49:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12698: Restrict check deleted file fd() for fixing flaky tests

2024-01-15 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/20898 )

Change subject: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky 
tests
..

IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky tests

The introduction of check_deleted_file_fd() in IMPALA-12681 aimed
to detect a bug related to remote splling where local temporary file
handles were not being released after deletion. However, the tests
associated with this function seem flaky in exhaustive builds:
occasionally, some HDFS files are not promptly released after
deletion. Locally, I observed that these files are eventually
removed from /proc/xx/fd within a few minutes. The reason is not yet
clear.

To fix the flaky build failure, this patch confines the scope of
check_deleted_file_fd() to detect files containing the keyword
"scratch" only. Given that HDFS files eventually get removed and
this does not seem urgent, a separate Jira will be filed to track
and investigate this behavior further.

Testing:
Reran the tests a couple of times and they passed.

Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
---
M tests/custom_cluster/test_scratch_disk.py
1 file changed, 6 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/20898/2
--
To view, visit http://gerrit.cloudera.org:8080/20898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Gerrit-Change-Number: 20898
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-12698: Restrict check deleted file fd() for fixing flaky tests

2024-01-15 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20898 )

Change subject: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky 
tests
..


Patch Set 1: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20898/1/tests/custom_cluster/test_scratch_disk.py
File tests/custom_cluster/test_scratch_disk.py:

http://gerrit.cloudera.org:8080/#/c/20898/1/tests/custom_cluster/test_scratch_disk.py@286
PS1, Line 286: # Look for the files with keywords 'scratch' and '(deleted)'.
Can you mention IMPALA-12698 and that there are temporary files like this 
caused by HDFS?



--
To view, visit http://gerrit.cloudera.org:8080/20898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Gerrit-Change-Number: 20898
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 15 Jan 2024 09:09:46 +
Gerrit-HasComments: Yes