[Impala-ASF-CR] IMPALA-12933: Avoid fetching unneccessary events of unwanted types

2024-04-11 Thread Anonymous Coward (Code Review)
k.venureddy2...@gmail.com has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21186 )

Change subject: IMPALA-12933: Avoid fetching unneccessary events of unwanted 
types
..


Patch Set 8: Code-Review+1

(1 comment)

Just a nit. Looks good to me.

http://gerrit.cloudera.org:8080/#/c/21186/8/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
File 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/21186/8/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@1336
PS8, Line 1336: currentEventId, 
MetastoreEvents.AlterTableEvent.ALTER_TABLE_EVENT_TYPE,
nit: Can we use getNextMetastoreEventsInBatchesForTable() here and also in 
createTableCore() ?



--
To view, visit http://gerrit.cloudera.org:8080/21186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieabe714328aa2cc605cb62b85ae8aa4bd537dbe9
Gerrit-Change-Number: 21186
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Fri, 12 Apr 2024 03:57:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12933: Avoid fetching unneccessary events of unwanted types

2024-04-11 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21186 )

Change subject: IMPALA-12933: Avoid fetching unneccessary events of unwanted 
types
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21186/8/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
File 
fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/21186/8/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@3293
PS8, Line 3293:   .getNextMetastoreEventsInBatches(catalog_, 
beforeDropEventId,
> Nit: this can also point to the newly refactored method getNextMetastoreEve
This is fetching events for db which don't specify the table name. If we pass 
in a null table name, we need changes in 
getNextMetastoreEventsInBatchesForTable to check for that. I think it's ok to 
leave it as-is.



--
To view, visit http://gerrit.cloudera.org:8080/21186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieabe714328aa2cc605cb62b85ae8aa4bd537dbe9
Gerrit-Change-Number: 21186
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Fri, 12 Apr 2024 00:27:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12998: Add SHOW METADATA TABLES to ignored DDL

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21290 )

Change subject: IMPALA-12998: Add SHOW_METADATA_TABLES to ignored DDL
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15877/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21290
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I69f7de9756aa730d70cd9187c9f869d5bcf67fce
Gerrit-Change-Number: 21290
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 11 Apr 2024 22:40:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12998: Add SHOW METADATA TABLES to ignored DDL

2024-04-11 Thread Michael Smith (Code Review)
Michael Smith has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21290


Change subject: IMPALA-12998: Add SHOW_METADATA_TABLES to ignored DDL
..

IMPALA-12998: Add SHOW_METADATA_TABLES to ignored DDL

Adds SHOW_METADATA_TABLES to the list of ignored DDL in workload
management. Fixes DCHECK failure when running Impala's full test suite
with 'enable_workload_mgmt'.

Change-Id: I69f7de9756aa730d70cd9187c9f869d5bcf67fce
---
M be/src/service/workload-management.cc
1 file changed, 1 insertion(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/21290/1
--
To view, visit http://gerrit.cloudera.org:8080/21290
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I69f7de9756aa730d70cd9187c9f869d5bcf67fce
Gerrit-Change-Number: 21290
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 


[Impala-ASF-CR] IMPALA-12810: Simplify IcebergDeleteNode and IcebergDeleteBuilder

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21258 )

Change subject: IMPALA-12810: Simplify IcebergDeleteNode and 
IcebergDeleteBuilder
..

IMPALA-12810: Simplify IcebergDeleteNode and IcebergDeleteBuilder

Now that we have the DIRECTED distribution mode, some parts of
IcebergDeleteNode and IcebergDeleteBuilder became dead code. It is
time to simplify the above classes.

IcebergDeleteBuilder and KrpcDataStreamSender now also tolerate
NULL file paths which are also not an error in the hash join mode.

Change-Id: I3ba02b33433990950b49628f11e732e01ed8a34d
Reviewed-on: http://gerrit.cloudera.org:8080/21258
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/iceberg-delete-builder.cc
M be/src/exec/iceberg-delete-builder.h
M be/src/exec/iceberg-delete-node.cc
M be/src/exec/iceberg-delete-node.h
M be/src/runtime/krpc-data-stream-sender.cc
M testdata/data/README
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/data/6348b186d3705f6b-370ecfbb_152551971_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/data/delete_null_first.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/data/delete_null_first_and_last.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/data/delete_null_last.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/data/delete_null_single.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/data/delete_three_nulls.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/data/same_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/metadata/3a813d5e-fc0b-485f-bbba-010972a9f20a-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/metadata/e90d28aa-cd17-4655-ad04-aa3711792576-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/metadata/snap-5852039568708655222-1-3a813d5e-fc0b-485f-bbba-010972a9f20a.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/metadata/v1.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/metadata/version-hint.text
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M tests/query_test/test_iceberg.py
21 files changed, 219 insertions(+), 104 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3ba02b33433990950b49628f11e732e01ed8a34d
Gerrit-Change-Number: 21258
Gerrit-PatchSet: 12
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12810: Simplify IcebergDeleteNode and IcebergDeleteBuilder

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21258 )

Change subject: IMPALA-12810: Simplify IcebergDeleteNode and 
IcebergDeleteBuilder
..


Patch Set 11: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ba02b33433990950b49628f11e732e01ed8a34d
Gerrit-Change-Number: 21258
Gerrit-PatchSet: 11
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Apr 2024 21:27:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12980: Translate CpuAsk into admission control slots

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21257 )

Change subject: IMPALA-12980: Translate CpuAsk into admission control slots
..


Patch Set 11: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I338ca96555bfe8d07afce0320b3688a0861663f2
Gerrit-Change-Number: 21257
Gerrit-PatchSet: 11
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 11 Apr 2024 18:36:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12963: Return parent PID when children spawned

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21278 )

Change subject: IMPALA-12963: Return parent PID when children spawned
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15876/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21278
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I214e79507c717340863d27f68f6ea54c169e4090
Gerrit-Change-Number: 21278
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 11 Apr 2024 18:34:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12986: Base64Encode fails if the 'out len' output parameter is passed with certain values

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21271 )

Change subject: IMPALA-12986: Base64Encode fails if the 'out_len' output 
parameter is passed with certain values
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21271
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35ae59fc9b3280f89ea4f7d95d27d2f21751001f
Gerrit-Change-Number: 21271
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Apr 2024 18:24:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12986: Base64Encode fails if the 'out len' output parameter is passed with certain values

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21271 )

Change subject: IMPALA-12986: Base64Encode fails if the 'out_len' output 
parameter is passed with certain values
..

IMPALA-12986: Base64Encode fails if the 'out_len' output parameter is passed 
with certain values

The Base64Encode function in coding-util.h with signature

  bool Base64Encode(const char* in, int64_t in_len, int64_t out_max,
  char* out, int64_t* out_len);

fails if '*out_len', when passed to the function, contains a value that
does not fit in a 32 bit integer.

Internally we use the

  int sasl_encode64(const char *in, unsigned inlen, char *out, unsigned
  outmax, unsigned *outlen);

function and explicitly cast 'out_len' to 'unsigned*'.

The error is that the called sasl_encode64() function only sets the four
lower bytes of '*out_len' (assuming that 'unsigned' is a 32 bit
integer), and if the upper bytes are not all zero, the resulting value
of '*out_len' will be incorrect.

This change changes the type of 'out_len' from 'int64_t*' to 'unsigned*'
to match the type that sasl_encode64() expects.

Base64Decode() is also updated to use 'unsigned*'. Before this change it
used an intermediate 32 bit local variable to avoid this issue.

Testing:
 - added a regression test in coding-util-test.cc

Change-Id: I35ae59fc9b3280f89ea4f7d95d27d2f21751001f
Reviewed-on: http://gerrit.cloudera.org:8080/21271
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/text-converter.inline.h
M be/src/exprs/string-functions-ir.cc
M be/src/util/coding-util-test.cc
M be/src/util/coding-util.cc
M be/src/util/coding-util.h
M be/src/util/runtime-profile.cc
6 files changed, 43 insertions(+), 16 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21271
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I35ae59fc9b3280f89ea4f7d95d27d2f21751001f
Gerrit-Change-Number: 21271
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12963: Return parent PID when children spawned

2024-04-11 Thread Michael Smith (Code Review)
Hello Riza Suminto, Yida Wu, Jason Fehr, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21278

to look at the new patch set (#3).

Change subject: IMPALA-12963: Return parent PID when children spawned
..

IMPALA-12963: Return parent PID when children spawned

Returns the original PID for a command rather than any children that may
be active. This happens during graceful shutdown in UBSAN tests. Also
updates 'kill' to use the version of 'get_pid' that logs details to help
with debugging.

Moves try block in test_query_log.py to after client2 has been
initialized. Removes 'drop table' on unique_database, since test suite
already handles cleanup.

Change-Id: I214e79507c717340863d27f68f6ea54c169e4090
---
M tests/common/impala_cluster.py
M tests/custom_cluster/test_query_log.py
2 files changed, 72 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/21278/3
--
To view, visit http://gerrit.cloudera.org:8080/21278
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I214e79507c717340863d27f68f6ea54c169e4090
Gerrit-Change-Number: 21278
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12991: Eliminate unnecessary SORT for Iceberg DELETEs

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21285 )

Change subject: IMPALA-12991: Eliminate unnecessary SORT for Iceberg DELETEs
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I94a691e7990228a1ec2de03e6ad90ebb97931581
Gerrit-Change-Number: 21285
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Apr 2024 17:35:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12990: Fix impala-shell handling of unset rows deleted

2024-04-11 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21284 )

Change subject: IMPALA-12990: Fix impala-shell handling of unset rows_deleted
..


Patch Set 5: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21284/2/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/21284/2/shell/impala_client.py@980
PS2, Line 980:   self._check_hs2_rpc_status(resp.status)
> About TestHS2FaultInjection::test_close_dml:
Ah, I assumed the comment referred to num_deleted_rows but looks like it was 
never updated to include that.

It'd be nice to have StrictHS2Client::close_dml match the return layout even if 
not used. But not required.


http://gerrit.cloudera.org:8080/#/c/21284/2/tests/shell/test_shell_commandline.py
File tests/shell/test_shell_commandline.py:

http://gerrit.cloudera.org:8080/#/c/21284/2/tests/shell/test_shell_commandline.py@932
PS2, Line 932:   def _test_dml_reporting(self, vector, create_sql, db, is_kudu):
> Yes ... but as I learned the Python3 + beeswax combination is only run in e
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5812b8006b9cacf34a7a0dbbc89a486d8b454438
Gerrit-Change-Number: 21284
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Thu, 11 Apr 2024 17:12:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12810: Simplify IcebergDeleteNode and IcebergDeleteBuilder

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21258 )

Change subject: IMPALA-12810: Simplify IcebergDeleteNode and 
IcebergDeleteBuilder
..


Patch Set 11: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ba02b33433990950b49628f11e732e01ed8a34d
Gerrit-Change-Number: 21258
Gerrit-PatchSet: 11
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Apr 2024 16:17:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12810: Simplify IcebergDeleteNode and IcebergDeleteBuilder

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21258 )

Change subject: IMPALA-12810: Simplify IcebergDeleteNode and 
IcebergDeleteBuilder
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10535/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ba02b33433990950b49628f11e732e01ed8a34d
Gerrit-Change-Number: 21258
Gerrit-PatchSet: 11
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Apr 2024 16:17:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12810: Simplify IcebergDeleteNode and IcebergDeleteBuilder

2024-04-11 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21258 )

Change subject: IMPALA-12810: Simplify IcebergDeleteNode and 
IcebergDeleteBuilder
..


Patch Set 10: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ba02b33433990950b49628f11e732e01ed8a34d
Gerrit-Change-Number: 21258
Gerrit-PatchSet: 10
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Apr 2024 16:10:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11495: Add glibc version and effective locale to the Web UI

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21252 )

Change subject: IMPALA-11495: Add glibc version and effective locale to the Web 
UI
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15875/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia69c4d63df4beae29f5261691a8dcdd04b931de7
Gerrit-Change-Number: 21252
Gerrit-PatchSet: 2
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Saurabh Katiyal 
Gerrit-Comment-Date: Thu, 11 Apr 2024 16:11:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12810: Simplify IcebergDeleteNode and IcebergDeleteBuilder

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21258 )

Change subject: IMPALA-12810: Simplify IcebergDeleteNode and 
IcebergDeleteBuilder
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15874/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ba02b33433990950b49628f11e732e01ed8a34d
Gerrit-Change-Number: 21258
Gerrit-PatchSet: 10
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Apr 2024 15:52:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12973,IMPALA-11491,IMPALA-12651: Support BINARY nested in complex types in select list

2024-04-11 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21269 )

Change subject: IMPALA-12973,IMPALA-11491,IMPALA-12651: Support BINARY nested 
in complex types in select list
..


Patch Set 3: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21269/1/be/src/exec/iceberg-metadata/iceberg-row-reader.cc
File be/src/exec/iceberg-metadata/iceberg-row-reader.cc:

http://gerrit.cloudera.org:8080/#/c/21269/1/be/src/exec/iceberg-metadata/iceberg-row-reader.cc@135
PS1, Line 135:   // Skip the unsupported type and set it to NULL
> I was wrong here, DATE is present in for example the 'all_files' metadata t
good catch!


http://gerrit.cloudera.org:8080/#/c/21269/3/testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
File 
testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test:

http://gerrit.cloudera.org:8080/#/c/21269/3/testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test@801
PS3, Line 801: 
row_regex:'{2147483546:"aGRmczovL2xvY2FsaG9zdDoyMDUwMC90ZXN0LXdhcmVob3VzZS9pY2ViZXJnX3Rlc3QvaGFkb29wX2NhdGFsb2cvaWNlL2ljZWJlcmdfcXVlcnlfbWV0YWRhdGEvZGF0YS8.*",2147483545:"AAA="}','{2147483546:"aGRmczovL2xvY2FsaG9zdDoyMDUwMC90ZXN0LXdhcmVob3VzZS9pY2ViZXJnX3Rlc3QvaGFkb29wX2NhdGFsb2cvaWNlL2ljZWJlcmdfcXVlcnlfbWV0YWRhdGEvZGF0YS8.*",2147483545:"AAA="}'
> We have file paths here and the file names will be different for each data
Can't we provide fixed file names by using a table with copied files/manifests?
I am more worried about the prefix of the path, e.g. S3:// vs HDFS://



--
To view, visit http://gerrit.cloudera.org:8080/21269
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7b1d7fa332a901f05a46e0199e13fb841d2687c2
Gerrit-Change-Number: 21269
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Thu, 11 Apr 2024 15:47:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11495: Add glibc version and effective locale to the Web UI

2024-04-11 Thread Saurabh Katiyal (Code Review)
Saurabh Katiyal has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/21252 )

Change subject: IMPALA-11495: Add glibc version and effective locale to the Web 
UI
..

IMPALA-11495: Add glibc version and effective locale to the Web UI

Added a new section "Other Info" in root page for WebUI of catalog,
coordinator and statestore displaying effective locale and glibc version.

Change-Id: Ia69c4d63df4beae29f5261691a8dcdd04b931de7
---
M be/src/util/default-path-handlers.cc
M tests/webserver/test_web_pages.py
M www/root.tmpl
3 files changed, 17 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/21252/2
--
To view, visit http://gerrit.cloudera.org:8080/21252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia69c4d63df4beae29f5261691a8dcdd04b931de7
Gerrit-Change-Number: 21252
Gerrit-PatchSet: 2
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Saurabh Katiyal 


[Impala-ASF-CR] IMPALA-12973,IMPALA-11491,IMPALA-12651: Support BINARY nested in complex types in select list

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21269 )

Change subject: IMPALA-12973,IMPALA-11491,IMPALA-12651: Support BINARY nested 
in complex types in select list
..


Patch Set 3:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/15873/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/21269
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7b1d7fa332a901f05a46e0199e13fb841d2687c2
Gerrit-Change-Number: 21269
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Thu, 11 Apr 2024 15:33:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12810: Simplify IcebergDeleteNode and IcebergDeleteBuilder

2024-04-11 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21258 )

Change subject: IMPALA-12810: Simplify IcebergDeleteNode and 
IcebergDeleteBuilder
..


Patch Set 7:

(1 comment)

Thanks for the comment!

http://gerrit.cloudera.org:8080/#/c/21258/7/tests/query_test/test_iceberg.py
File tests/query_test/test_iceberg.py:

http://gerrit.cloudera.org:8080/#/c/21258/7/tests/query_test/test_iceberg.py@1455
PS7, Line 1455: if 
vector.get_value('exec_option')['disable_optimized_iceberg_v2_read'] == 0:
> Here the code says that "if we don't disable the V2 read optimisations then
Ah, those negations... you're right.

Anyway, it revealed a bug in the test code that I fixed in the new PS.



--
To view, visit http://gerrit.cloudera.org:8080/21258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ba02b33433990950b49628f11e732e01ed8a34d
Gerrit-Change-Number: 21258
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Apr 2024 15:28:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12810: Simplify IcebergDeleteNode and IcebergDeleteBuilder

2024-04-11 Thread Zoltan Borok-Nagy (Code Review)
Hello Daniel Becker, Gabor Kaszab, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21258

to look at the new patch set (#10).

Change subject: IMPALA-12810: Simplify IcebergDeleteNode and 
IcebergDeleteBuilder
..

IMPALA-12810: Simplify IcebergDeleteNode and IcebergDeleteBuilder

Now that we have the DIRECTED distribution mode, some parts of
IcebergDeleteNode and IcebergDeleteBuilder became dead code. It is
time to simplify the above classes.

IcebergDeleteBuilder and KrpcDataStreamSender now also tolerate
NULL file paths which are also not an error in the hash join mode.

Change-Id: I3ba02b33433990950b49628f11e732e01ed8a34d
---
M be/src/exec/iceberg-delete-builder.cc
M be/src/exec/iceberg-delete-builder.h
M be/src/exec/iceberg-delete-node.cc
M be/src/exec/iceberg-delete-node.h
M be/src/runtime/krpc-data-stream-sender.cc
M testdata/data/README
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/data/6348b186d3705f6b-370ecfbb_152551971_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/data/delete_null_first.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/data/delete_null_first_and_last.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/data/delete_null_last.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/data/delete_null_single.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/data/delete_three_nulls.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/data/same_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/metadata/3a813d5e-fc0b-485f-bbba-010972a9f20a-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/metadata/e90d28aa-cd17-4655-ad04-aa3711792576-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/metadata/snap-5852039568708655222-1-3a813d5e-fc0b-485f-bbba-010972a9f20a.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/metadata/v1.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_null_delete_record/metadata/version-hint.text
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M tests/query_test/test_iceberg.py
21 files changed, 219 insertions(+), 104 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/21258/10
--
To view, visit http://gerrit.cloudera.org:8080/21258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ba02b33433990950b49628f11e732e01ed8a34d
Gerrit-Change-Number: 21258
Gerrit-PatchSet: 10
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12973,IMPALA-11491,IMPALA-12651: Support BINARY nested in complex types in select list

2024-04-11 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/21269 )

Change subject: IMPALA-12973,IMPALA-11491,IMPALA-12651: Support BINARY nested 
in complex types in select list
..

IMPALA-12973,IMPALA-11491,IMPALA-12651: Support BINARY nested in complex types 
in select list

Binary fields in complex types are currently not supported at all for
regular tables (an error is returned). For Iceberg metadata tables,
IMPALA-12899 added a temporary workaround to allow queries that contain
these fields to succeed by NULLing them out. This change adds support
for displaying them with base64 encoding for both regular and Iceberg
metadata tables.

Complex types are displayed in JSON format, so simply inserting the
bytes of the binary fields is not acceptable as it would produce invalid
JSON. Base64 is a widely used encoding that allows representing
arbitrary binary information using only a limited set of ASCII
characters.

This change also adds support for top level binary columns in Iceberg
metadata tables. However, these are not base64 encoded but are returned
in raw byte format - this is consistent with how top level binary
columns from regular (non-metadata) tables are handled.

Testing:
 - added test queries in iceberg-metadata-tables.test referencing both
   nested and top level binary fields; also updated existing queries
 - moved relevant tests (queries extracting binary fields from within
   complex types) from nested-types-scanner-basic.test to a new
   binary-in-complex-type.test file and also added a query that selects
   the containing complex types; this new test file is run from
   test_scanners.py::TestBinaryInComplexType::\
 test_binary_in_complex_type
 - moved negative tests in AnalyzerTest.TestUnsupportedTypes() to
   AnalyzeStmtsTest.TestComplexTypesInSelectList() and converted them to
   positive tests (expecting success); a negative test already in
   AnalyzeStmtsTest.TestComplexTypesInSelectList() was also converted

Change-Id: I7b1d7fa332a901f05a46e0199e13fb841d2687c2
---
M be/src/exec/iceberg-metadata/iceberg-metadata-scanner.cc
M be/src/exec/iceberg-metadata/iceberg-metadata-scanner.h
M be/src/exec/iceberg-metadata/iceberg-row-reader.cc
M be/src/exec/iceberg-metadata/iceberg-row-reader.h
M be/src/rpc/jni-thrift-util.h
M be/src/runtime/complex-value-writer.inline.h
M be/src/util/jni-util.cc
M be/src/util/jni-util.h
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/util/IcebergMetadataScanner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
M testdata/data/README
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_with_key_metadata/data/0-0-data-danielbecker_20240408174043_c3737eaf-db30-4b88-aafb-f23c0f3c1dd3-job_17125053806420_0002-1-1.parquet
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_with_key_metadata/metadata/64da0e56-efa3-4025-bef1-1047fdd9a2b0-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_with_key_metadata/metadata/snap-3079551887386250470-1-64da0e56-efa3-4025-bef1-1047fdd9a2b0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_with_key_metadata/metadata/v1.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_with_key_metadata/metadata/v2.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_with_key_metadata/metadata/version-hint.txt
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
A 
testdata/workloads/functional-query/queries/QueryTest/binary-in-complex-type.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
M 
testdata/workloads/functional-query/queries/QueryTest/nested-types-scanner-basic.test
M tests/query_test/test_scanners.py
26 files changed, 440 insertions(+), 153 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/21269/3
--
To view, visit http://gerrit.cloudera.org:8080/21269
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7b1d7fa332a901f05a46e0199e13fb841d2687c2
Gerrit-Change-Number: 21269
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 


[Impala-ASF-CR] IMPALA-12973,IMPALA-11491,IMPALA-12651: Support BINARY nested in complex types in select list

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21269 )

Change subject: IMPALA-12973,IMPALA-11491,IMPALA-12651: Support BINARY nested 
in complex types in select list
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21269/3/be/src/util/jni-util.h
File be/src/util/jni-util.h:

http://gerrit.cloudera.org:8080/#/c/21269/3/be/src/util/jni-util.h@115
PS3, Line 115: /// is more restricted, see 
https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/functions.html#GetPrimitiveArrayCritical_ReleasePrimitiveArrayCritical
line too long (162 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/21269
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7b1d7fa332a901f05a46e0199e13fb841d2687c2
Gerrit-Change-Number: 21269
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Thu, 11 Apr 2024 15:11:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12990: Fix impala-shell handling of unset rows deleted

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21284 )

Change subject: IMPALA-12990: Fix impala-shell handling of unset rows_deleted
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15872/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5812b8006b9cacf34a7a0dbbc89a486d8b454438
Gerrit-Change-Number: 21284
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Thu, 11 Apr 2024 14:46:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12990: Fix impala-shell handling of unset rows deleted

2024-04-11 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21284 )

Change subject: IMPALA-12990: Fix impala-shell handling of unset rows_deleted
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21284/4/tests/shell/test_shell_commandline.py
File tests/shell/test_shell_commandline.py:

http://gerrit.cloudera.org:8080/#/c/21284/4/tests/shell/test_shell_commandline.py@961
PS4, Line 961: Kude
> typo: Kudu
Done



--
To view, visit http://gerrit.cloudera.org:8080/21284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5812b8006b9cacf34a7a0dbbc89a486d8b454438
Gerrit-Change-Number: 21284
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Thu, 11 Apr 2024 14:23:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12990: Fix impala-shell handling of unset rows deleted

2024-04-11 Thread Csaba Ringhofer (Code Review)
Hello Peter Rozsa, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21284

to look at the new patch set (#5).

Change subject: IMPALA-12990: Fix impala-shell handling of unset rows_deleted
..

IMPALA-12990: Fix impala-shell handling of unset rows_deleted

The issue occurred in Python 3 when 0 rows were deleted from Iceberg.
It could also happen in other DMLs with older Impala servers where
TDmlResult.rows_deleted was not set. See the Jira for details of
the error.

Testing:
Extended shell tests for Kudu DML reporting to also cover Iceberg.

Change-Id: I5812b8006b9cacf34a7a0dbbc89a486d8b454438
---
M shell/impala_client.py
M tests/custom_cluster/test_hs2_fault_injection.py
M tests/shell/test_shell_commandline.py
3 files changed, 65 insertions(+), 51 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/21284/5
--
To view, visit http://gerrit.cloudera.org:8080/21284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5812b8006b9cacf34a7a0dbbc89a486d8b454438
Gerrit-Change-Number: 21284
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Peter Rozsa 


[Impala-ASF-CR] IMPALA-12980: Translate CpuAsk into admission control slots

2024-04-11 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21257 )

Change subject: IMPALA-12980: Translate CpuAsk into admission control slots
..


Patch Set 11:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21257/11//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21257/11//COMMIT_MSG@19
PS11, Line 19:  rather
 : than sum of it (48)
Can this get higher than the amount of slots per executor? Based on IMPALA-8998 
the query is refused in this case.


http://gerrit.cloudera.org:8080/#/c/21257/11//COMMIT_MSG@28
PS11, Line 28: which will be a closer resemblance of maximum
 : parallel execution of fragment instances.
Does PLANNER_CPU_ASK always calculate a greater or equal number of slots than 
LARGEST_FRAGMENT?


http://gerrit.cloudera.org:8080/#/c/21257/11/testdata/workloads/functional-query/queries/QueryTest/processing_cost_admission_slots.test
File 
testdata/workloads/functional-query/queries/QueryTest/processing_cost_admission_slots.test:

http://gerrit.cloudera.org:8080/#/c/21257/11/testdata/workloads/functional-query/queries/QueryTest/processing_cost_admission_slots.test@2
PS11, Line 2:  QUERY
It would be very nice to add an explain for the same query with the same query 
options to show plan. At the moment it is hard for me to think through how 
exactly we got to the results below - seeing a plan + having a few pointers in 
comments could help a lot.

Besides documentation, this would also ensure that if the plan changes in the 
future, the test case with the explain will break, making it much easier to 
investigate the cause of the failure.


http://gerrit.cloudera.org:8080/#/c/21257/11/testdata/workloads/functional-query/queries/QueryTest/processing_cost_admission_slots.test@31
PS11, Line 31:  RESULTS
Are the results actually important here?
If the goal is to verify that slot_count_strategy doesn't change the results of 
the queries, then it would be better to add it to the dimensions of some large 
query suite, e.g. test_queries.py


http://gerrit.cloudera.org:8080/#/c/21257/11/tests/custom_cluster/test_executor_groups.py
File tests/custom_cluster/test_executor_groups.py:

http://gerrit.cloudera.org:8080/#/c/21257/11/tests/custom_cluster/test_executor_groups.py@1245
PS11, Line 1245: #   CoreCount={total=16 
trace=F15:3+F01:1+F14:3+F03:1+F13:3+F05:1+F12:3+F07:1},
This is very useful to help in understanding the patch, but it would be better 
to point to specific query plan, as the plan (and fragment numbers) may change 
in future.


http://gerrit.cloudera.org:8080/#/c/21257/11/tests/query_test/test_processing_cost.py
File tests/query_test/test_processing_cost.py:

http://gerrit.cloudera.org:8080/#/c/21257/11/tests/query_test/test_processing_cost.py@42
PS11, Line 42: add_mandatory_exec_option(cls, 'slot_count_strategy', 
'planner_cpu_ask')
IMO moving these to SET statements in the .test file would be clearer. It would 
also allow running the same query with both slot_count_strategy and 
planner_cpu_ask to show that it leads to different estimates.



--
To view, visit http://gerrit.cloudera.org:8080/21257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I338ca96555bfe8d07afce0320b3688a0861663f2
Gerrit-Change-Number: 21257
Gerrit-PatchSet: 11
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 11 Apr 2024 14:21:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12657: Improve ProcessingCost of ScanNode and NonGroupingAggregator

2024-04-11 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21279 )

Change subject: IMPALA-12657: Improve ProcessingCost of ScanNode and 
NonGroupingAggregator
..


Patch Set 7:

(5 comments)

I took the liberty of editing few nits along with the rebases.
I leave my comments here for some that is not quite straightforward and need 
discussion.

http://gerrit.cloudera.org:8080/#/c/21279/7/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
File fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java:

http://gerrit.cloudera.org:8080/#/c/21279/7/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java@72
PS7, Line 72:   private static final double AGG_EXPR_COST_COEFFICIENT = 0.07;
:   // Cost per input row with single grouping expr with single agg 
expr
:   private static final double 
AGG_INPUT_SINGLE_GROUP_COST_COEFFICIENT = 0.2925;
:   // Cost per output row produced with single grouping expr and 
single agg expr
:   private static final double 
AGG_OUTPUT_SINGLE_GROUP_COST_COEFFICIENT = 2.6072;
:   // Cost per input row with multiple grouping exprs and single 
agg expr
:   private static final double 
AGG_INPUT_MULTI_GROUP_COST_COEFFICIENT = 1.3741;
:   // Cost per output row produced with multiple grouping exprs 
and single agg expr
:   private static final double 
AGG_OUTPUT_MULTI_GROUP_COST_COEFFICIENT = 4.5285;
nit: Personally, I prefer related constant names to have common prefix, so 
these names becomes:

COST_COEFFICIENT_AGG_EXPR
COST_COEFFICIENT_AGG_INPUT
COST_COEFFICIENT_AGG_OUTPUT
COST_COEFFICIENT_AGG_INPUT_MULTI_GROUP
COST_COEFFICIENT_AGG_OUTPUT_MULTI_GROUP

I think it is easier to distinguish them if they have common prefix. Same 
comment for other coefficient constants.


http://gerrit.cloudera.org:8080/#/c/21279/7/fe/src/main/java/org/apache/impala/planner/AggregationNode.java
File fe/src/main/java/org/apache/impala/planner/AggregationNode.java:

http://gerrit.cloudera.org:8080/#/c/21279/7/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@582
PS7, Line 582: fragment_.getNumInstances()
I'm not sure about using getNumInstances() here. traverseEffectiveParallelism() 
will change parallelism/num_instances later. If traverseEffectiveParallelism() 
increase/decrease num instances, intermediateOutputCardinality will be 
underestimated/overestimated. I understand this is a chicken-and-egg problem.

I wonder if simply using Analyzer.getMinParallelismPerNode() * getNumNodes() is 
enough. That would be the approximate lower bound for 
intermediateOutputCardinality. What is your observation during benchmarking on 
this issue?


http://gerrit.cloudera.org:8080/#/c/21279/7/fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
File fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java:

http://gerrit.cloudera.org:8080/#/c/21279/7/fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java@116
PS7, Line 116: fragment_.getNumInstances()
Another chicken-and-egg problem of using getNumInstances() before 
traverseEffectiveParallelism().


http://gerrit.cloudera.org:8080/#/c/21279/7/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

http://gerrit.cloudera.org:8080/#/c/21279/7/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@633
PS7, Line 633: getNumInstances
Another chicken-and-egg problem of using getNumInstances() before 
traverseEffectiveParallelism()?


http://gerrit.cloudera.org:8080/#/c/21279/7/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test:

http://gerrit.cloudera.org:8080/#/c/21279/7/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test@132
PS7, Line 132:HDFS partitions=1824/1824 files=1824 size=199.18MB
 :stored statistics:
 :  table: rows=2.88M size=199.18MB
nit: unnecessary change. Small bytes variability that is not asserted in 
Planner test.



--
To view, visit http://gerrit.cloudera.org:8080/21279
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icf1edd48d4ae255b7b3b7f5b228800d7bac7d2ca
Gerrit-Change-Number: 21279
Gerrit-PatchSet: 7
Gerrit-Owner: David Rorke 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 11 Apr 2024 14:14:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12980: Translate CpuAsk into admission control slots

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21257 )

Change subject: IMPALA-12980: Translate CpuAsk into admission control slots
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15871/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I338ca96555bfe8d07afce0320b3688a0861663f2
Gerrit-Change-Number: 21257
Gerrit-PatchSet: 11
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 11 Apr 2024 13:49:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12980: Translate CpuAsk into admission control slots

2024-04-11 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21257 )

Change subject: IMPALA-12980: Translate CpuAsk into admission control slots
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21257/10/testdata/workloads/functional-query/queries/QueryTest/processing_cost_admission_slots.test
File 
testdata/workloads/functional-query/queries/QueryTest/processing_cost_admission_slots.test:

http://gerrit.cloudera.org:8080/#/c/21257/10/testdata/workloads/functional-query/queries/QueryTest/processing_cost_admission_slots.test@31
PS10, Line 31:  RESULTS
 : 'AAABBAAA'
> This assertion failed in dockerised environment.
Done



--
To view, visit http://gerrit.cloudera.org:8080/21257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I338ca96555bfe8d07afce0320b3688a0861663f2
Gerrit-Change-Number: 21257
Gerrit-PatchSet: 11
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 11 Apr 2024 13:26:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12980: Translate CpuAsk into admission control slots

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21257 )

Change subject: IMPALA-12980: Translate CpuAsk into admission control slots
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10534/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I338ca96555bfe8d07afce0320b3688a0861663f2
Gerrit-Change-Number: 21257
Gerrit-PatchSet: 11
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 11 Apr 2024 13:28:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12980: Translate CpuAsk into admission control slots

2024-04-11 Thread Riza Suminto (Code Review)
Hello Kurt Deschler, Abhishek Rawat, Csaba Ringhofer, Wenzhe Zhou, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21257

to look at the new patch set (#11).

Change subject: IMPALA-12980: Translate CpuAsk into admission control slots
..

IMPALA-12980: Translate CpuAsk into admission control slots

Impala has a concept of "admission control slots" - the amount of
parallelism that should be allowed on an Impala daemon. This defaults to
the number of processors per executor and can be overridden with
-–admission_control_slots flag.

Admission control slot accounting is described in IMPALA-8998. It
compute 'slots_to_use' for each backend based on the max number of
instances of any fragment on that backend. This can lead to
underestimation, because multiple non-blocking fragments with the same
number of instance counts, say 4 non-blocking fragments each with 12
instances, only request the max instance (12) admission slots rather
than sum of it (48), making each of 12 cores oversubscribed by 4x.

When COMPUTE_PROCESSING_COST is enabled, Planner will generate a CpuAsk
number that represents the cpu requirement of that query over a
particular executor group set. This number is an estimation of the
largest number of query fragment instances that can run in parallel
without waiting, given by the blocking operator analysis. Therefore, the
fragment trace that sums into that CpuAsk number can be translated into
'slots_to_use' as well, which will be a closer resemblance of maximum
parallel execution of fragment instances.

This patch adds a new query option called SLOT_COUNT_STRATEGY to control
which admission control slot accounting to use. There are two possible
values:
- LARGEST_FRAGMENT, which is the original algorithm from IMPALA-8998.
  This is still the default value for the SLOT_COUNT_STRATEGY option.
- PLANNER_CPU_ASK, which will follow fragment trace that contributes
  towards CpuAsk number.

To do the PLANNER_CPU_ASK strategy, the Planner will mark fragments that
contribute to CpuAsk as dominant fragments. It also passes
max_slot_per_executor information that it knows about the executor group
set to the scheduler.

AvgAdmissionSlotsPerExecutor counter is added to describe what Planner
thinks the average 'slots_to_use' per backend will be, which follows
this formula:

  AvgAdmissionSlotsPerExecutor = ceil(CpuAsk / num_executors)

Actual 'slots_to_use' in each backend may differ than
AvgAdmissionSlotsPerExecutor, depending on what is scheduled on that
backend. 'slots_to_use' will be shown as 'AdmissionSlots' counter under
each executor profile node.

Testing:
- Update test_executors.py with AvgAdmissionSlotsPerExecutor assertion.
- Pass test_tpcds_queries.py::TestTpcdsQueryWithProcessingCost.

Change-Id: I338ca96555bfe8d07afce0320b3688a0861663f2
---
M be/src/scheduling/admission-controller-test.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/scheduler.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Planner.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/common/Id.java
M fe/src/main/java/org/apache/impala/planner/CoreCount.java
M fe/src/main/java/org/apache/impala/planner/CostingSegment.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
A 
testdata/workloads/functional-query/queries/QueryTest/processing_cost_admission_slots.test
M tests/custom_cluster/test_executor_groups.py
A tests/query_test/test_processing_cost.py
M tests/query_test/test_tpcds_queries.py
19 files changed, 534 insertions(+), 96 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/21257/11
--
To view, visit http://gerrit.cloudera.org:8080/21257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I338ca96555bfe8d07afce0320b3688a0861663f2
Gerrit-Change-Number: 21257
Gerrit-PatchSet: 11
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-12980: Translate CpuAsk into admission control slots

2024-04-11 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21257 )

Change subject: IMPALA-12980: Translate CpuAsk into admission control slots
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21257/10/testdata/workloads/functional-query/queries/QueryTest/processing_cost_admission_slots.test
File 
testdata/workloads/functional-query/queries/QueryTest/processing_cost_admission_slots.test:

http://gerrit.cloudera.org:8080/#/c/21257/10/testdata/workloads/functional-query/queries/QueryTest/processing_cost_admission_slots.test@31
PS10, Line 31: row_regex: .* AdmissionSlots: 8 .*
 : row_regex: .* AdmissionSlots: 4 .*
This assertion failed in dockerised environment.
In regular 3 node minicluster, fragment assignment is 8, 4, 4.
In dockerised env, fragment assignment is 7, 4, 5.
Will remove this. The assertion in L26 should be sufficient.



--
To view, visit http://gerrit.cloudera.org:8080/21257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I338ca96555bfe8d07afce0320b3688a0861663f2
Gerrit-Change-Number: 21257
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 11 Apr 2024 13:21:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12986: Base64Encode fails if the 'out len' output parameter is passed with certain values

2024-04-11 Thread Peter Rozsa (Code Review)
Peter Rozsa has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21271 )

Change subject: IMPALA-12986: Base64Encode fails if the 'out_len' output 
parameter is passed with certain values
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21271
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35ae59fc9b3280f89ea4f7d95d27d2f21751001f
Gerrit-Change-Number: 21271
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Apr 2024 13:19:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12986: Base64Encode fails if the 'out len' output parameter is passed with certain values

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21271 )

Change subject: IMPALA-12986: Base64Encode fails if the 'out_len' output 
parameter is passed with certain values
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10533/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21271
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35ae59fc9b3280f89ea4f7d95d27d2f21751001f
Gerrit-Change-Number: 21271
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Apr 2024 13:20:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12986: Base64Encode fails if the 'out len' output parameter is passed with certain values

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21271 )

Change subject: IMPALA-12986: Base64Encode fails if the 'out_len' output 
parameter is passed with certain values
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21271
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35ae59fc9b3280f89ea4f7d95d27d2f21751001f
Gerrit-Change-Number: 21271
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Apr 2024 13:20:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: Bump Impyla version to 0.20a1

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21286 )

Change subject: WIP: Bump Impyla version to 0.20a1
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21286
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I297a2a34f3e688555ce8572b6c7fffbd34423f2d
Gerrit-Change-Number: 21286
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 11 Apr 2024 12:57:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12991: Eliminate unnecessary SORT for Iceberg DELETEs

2024-04-11 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21285 )

Change subject: IMPALA-12991: Eliminate unnecessary SORT for Iceberg DELETEs
..


Patch Set 2: Code-Review+2

(1 comment)

Thanks for the comment! Carry +2

http://gerrit.cloudera.org:8080/#/c/21285/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21285/1//COMMIT_MSG@9
PS1, Line 9: using
> nit: using
Done



--
To view, visit http://gerrit.cloudera.org:8080/21285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I94a691e7990228a1ec2de03e6ad90ebb97931581
Gerrit-Change-Number: 21285
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Apr 2024 12:33:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12991: Eliminate unnecessary SORT for Iceberg DELETEs

2024-04-11 Thread Zoltan Borok-Nagy (Code Review)
Hello Gabor Kaszab, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21285

to look at the new patch set (#2).

Change subject: IMPALA-12991: Eliminate unnecessary SORT for Iceberg DELETEs
..

IMPALA-12991: Eliminate unnecessary SORT for Iceberg DELETEs

Since we are using IcebergBufferedDeleteSink, which sorts the data
before flushing, there is no need to add a SORT node before the sink.

Testing:
 * updated planner tests

Change-Id: I94a691e7990228a1ec2de03e6ad90ebb97931581
---
M fe/src/main/java/org/apache/impala/planner/Planner.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test
2 files changed, 7 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/21285/2
--
To view, visit http://gerrit.cloudera.org:8080/21285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I94a691e7990228a1ec2de03e6ad90ebb97931581
Gerrit-Change-Number: 21285
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-12980: Translate CpuAsk into admission control slots

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21257 )

Change subject: IMPALA-12980: Translate CpuAsk into admission control slots
..


Patch Set 10: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10530/


--
To view, visit http://gerrit.cloudera.org:8080/21257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I338ca96555bfe8d07afce0320b3688a0861663f2
Gerrit-Change-Number: 21257
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 11 Apr 2024 12:36:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12991: Eliminate unnecessary SORT for Iceberg DELETEs

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21285 )

Change subject: IMPALA-12991: Eliminate unnecessary SORT for Iceberg DELETEs
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10532/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I94a691e7990228a1ec2de03e6ad90ebb97931581
Gerrit-Change-Number: 21285
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Apr 2024 12:33:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12990: Fix impala-shell handling of unset rows deleted

2024-04-11 Thread Peter Rozsa (Code Review)
Peter Rozsa has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21284 )

Change subject: IMPALA-12990: Fix impala-shell handling of unset rows_deleted
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21284/2/tests/shell/test_shell_commandline.py
File tests/shell/test_shell_commandline.py:

http://gerrit.cloudera.org:8080/#/c/21284/2/tests/shell/test_shell_commandline.py@925
PS2, Line 925:   def test_iceberg_dml_reporting(self, vector, unique_database):
> Changed the function to make the Kudu / Iceberg differences more explicit a
Thank you, it's clearer now.


http://gerrit.cloudera.org:8080/#/c/21284/4/tests/shell/test_shell_commandline.py
File tests/shell/test_shell_commandline.py:

http://gerrit.cloudera.org:8080/#/c/21284/4/tests/shell/test_shell_commandline.py@961
PS4, Line 961: Kude
typo: Kudu



--
To view, visit http://gerrit.cloudera.org:8080/21284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5812b8006b9cacf34a7a0dbbc89a486d8b454438
Gerrit-Change-Number: 21284
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Thu, 11 Apr 2024 12:25:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12990: Fix impala-shell handling of unset rows deleted

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21284 )

Change subject: IMPALA-12990: Fix impala-shell handling of unset rows_deleted
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15870/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5812b8006b9cacf34a7a0dbbc89a486d8b454438
Gerrit-Change-Number: 21284
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Thu, 11 Apr 2024 11:30:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12990: Fix impala-shell handling of unset rows deleted

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21284 )

Change subject: IMPALA-12990: Fix impala-shell handling of unset rows_deleted
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15869/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5812b8006b9cacf34a7a0dbbc89a486d8b454438
Gerrit-Change-Number: 21284
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Thu, 11 Apr 2024 11:20:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12990: Fix impala-shell handling of unset rows deleted

2024-04-11 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21284 )

Change subject: IMPALA-12990: Fix impala-shell handling of unset rows_deleted
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21284/2/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/21284/2/shell/impala_client.py@980
PS2, Line 980:   self._check_hs2_rpc_status(resp.status)
> This seems right given how it's used in impala_shell.py.
About TestHS2FaultInjection::test_close_dml:
The "return" was never reached there as the expectation is that an exception is 
thrown during the RPC. Updated the code as it was quite misleading.

about ImpalaBeeswaxClient::close_dml:
thanks for catching it,  updated the beeswax case

StrictHS2Client::close_dml:
TDmlResult is only returned in Impala, so the strict client doesn't use it. 
Hive has a different way to report number of modified rows, which is not used 
in Impala. If interested, IMPALA-12647 has details about the Impala and Hive 
solution

>Also the comment describing ImpalaClient::close_dml
The comment I found was about row errors, not deleted rows, and for row errors 
this is still true. Updated the comment as it was stale and didn't mention 
deleted rows.


http://gerrit.cloudera.org:8080/#/c/21284/2/tests/shell/test_shell_commandline.py
File tests/shell/test_shell_commandline.py:

http://gerrit.cloudera.org:8080/#/c/21284/2/tests/shell/test_shell_commandline.py@925
PS2, Line 925:   def test_iceberg_dml_reporting(self, vector, unique_database):
> Reusing the Kudu DML sequence makes this new test compact but a bit complex
Changed the function to make the Kudu / Iceberg differences more explicit and 
added some comments. If you think that it is still too complex, I can split the 
function to two.


http://gerrit.cloudera.org:8080/#/c/21284/2/tests/shell/test_shell_commandline.py@932
PS2, Line 932:   def _test_dml_reporting(self, vector, create_sql, db, is_kudu):
> Do these get run with both HS2 and Beeswax?
Yes ... but as I learned the Python3 + beeswax combination is only run in 
exhaustive tests so the tests I ran didn't catch it.

This is a bit unlucky situation, I think that we should run all protocols with 
Python2 and Python3 in core tests.



--
To view, visit http://gerrit.cloudera.org:8080/21284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5812b8006b9cacf34a7a0dbbc89a486d8b454438
Gerrit-Change-Number: 21284
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Thu, 11 Apr 2024 11:16:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12990: Fix impala-shell handling of unset rows deleted

2024-04-11 Thread Csaba Ringhofer (Code Review)
Hello Peter Rozsa, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21284

to look at the new patch set (#4).

Change subject: IMPALA-12990: Fix impala-shell handling of unset rows_deleted
..

IMPALA-12990: Fix impala-shell handling of unset rows_deleted

The issue occurred in Python 3 when 0 rows were deleted from Iceberg.
It could also happen in other DMLs with older Impala servers where
TDmlResult.rows_deleted was not set. See the Jira for details of
the error.

Testing:
Extended shell tests for Kudu DML reporting to also cover Iceberg.

Change-Id: I5812b8006b9cacf34a7a0dbbc89a486d8b454438
---
M shell/impala_client.py
M tests/custom_cluster/test_hs2_fault_injection.py
M tests/shell/test_shell_commandline.py
3 files changed, 65 insertions(+), 51 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/21284/4
--
To view, visit http://gerrit.cloudera.org:8080/21284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5812b8006b9cacf34a7a0dbbc89a486d8b454438
Gerrit-Change-Number: 21284
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Peter Rozsa 


[Impala-ASF-CR] IMPALA-12990: Fix impala-shell handling of unset rows deleted

2024-04-11 Thread Csaba Ringhofer (Code Review)
Hello Peter Rozsa, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21284

to look at the new patch set (#3).

Change subject: IMPALA-12990: Fix impala-shell handling of unset rows_deleted
..

IMPALA-12990: Fix impala-shell handling of unset rows_deleted

The issue occurred in Python 3 when 0 rows were deleted from Iceberg.
It could also happen in other DMLs with older Impala servers where
TDmlResult.rows_deleted was not set. See the Jira for details of
the error.

Testing:
Extended shell tests for Kudu DML reporting to also cover Iceberg.

Change-Id: I5812b8006b9cacf34a7a0dbbc89a486d8b454438
---
M shell/impala_client.py
M tests/custom_cluster/test_hs2_fault_injection.py
M tests/shell/test_shell_commandline.py
3 files changed, 61 insertions(+), 48 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/21284/3
--
To view, visit http://gerrit.cloudera.org:8080/21284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5812b8006b9cacf34a7a0dbbc89a486d8b454438
Gerrit-Change-Number: 21284
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Peter Rozsa 


[Impala-ASF-CR] [TEST][WIP]IMPALA-11499: Refactor UrlEncode function to handle special characters

2024-04-11 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21131 )

Change subject: [TEST][WIP]IMPALA-11499: Refactor UrlEncode function to handle 
special characters
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21131/3/be/src/util/coding-util.cc
File be/src/util/coding-util.cc:

http://gerrit.cloudera.org:8080/#/c/21131/3/be/src/util/coding-util.cc@83
PS3, Line 83:   if (!hive_compat) {
> That was previously possible. The following works for me before this change
The exception is thrown here in the catalogd:
https://github.com/apache/impala/blob/70c35425d3f8ac68b23fb8e0d08e12ee763965d7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1770
It is a Preconditions check. The partition name is not found in the map. 
'partName' is "p=12?!@%23$%25^&%2A()|%3F%2F.~`", while 'nameToPartitionMap_' 
contains "p=12?!@%23$%25%5E&%2A()|%3F%2F.~`". The difference is the caret ("^").
What I think happens is that the HMS encodes it to "%5E", and that is what is 
present in the 'nameToPartitionMap_' map. But we don't encode it, hence 
'partName' contains "^".

Indeed, if I add the mapping {'^', "%5E"} in 'specialCharacterMap' in this 
file, there is no error.



--
To view, visit http://gerrit.cloudera.org:8080/21131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb
Gerrit-Change-Number: 21131
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Thu, 11 Apr 2024 10:38:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12990: Fix impala-shell handling of unset rows deleted

2024-04-11 Thread Peter Rozsa (Code Review)
Peter Rozsa has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21284 )

Change subject: IMPALA-12990: Fix impala-shell handling of unset rows_deleted
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21284/2/tests/shell/test_shell_commandline.py
File tests/shell/test_shell_commandline.py:

http://gerrit.cloudera.org:8080/#/c/21284/2/tests/shell/test_shell_commandline.py@925
PS2, Line 925:   def test_iceberg_dml_reporting(self, vector, unique_database):
Reusing the Kudu DML sequence makes this new test compact but a bit complex. 
Maybe creating a separate test sequence for Iceberg should simplify it, at the 
cost of adding duplicate rows, but these duplicates, in my opinion, can be 
reasonable as they have different semantics.



--
To view, visit http://gerrit.cloudera.org:8080/21284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5812b8006b9cacf34a7a0dbbc89a486d8b454438
Gerrit-Change-Number: 21284
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Thu, 11 Apr 2024 09:31:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12810: Simplify IcebergDeleteNode and IcebergDeleteBuilder

2024-04-11 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21258 )

Change subject: IMPALA-12810: Simplify IcebergDeleteNode and 
IcebergDeleteBuilder
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21258/7/tests/query_test/test_iceberg.py
File tests/query_test/test_iceberg.py:

http://gerrit.cloudera.org:8080/#/c/21258/7/tests/query_test/test_iceberg.py@1455
PS7, Line 1455: if 
vector.get_value('exec_option')['disable_optimized_iceberg_v2_read'] == 0:
> We only test DIRECTED mode + V2 operator (which go in hand with each other)
Here the code says that "if we don't disable the V2 read optimisations then 
skip the test" or in other words "if we have V2 read opts then skip".
I might misread this code or we don't test the DIRECTED mode here only the old 
ANTI JOIN case.



--
To view, visit http://gerrit.cloudera.org:8080/21258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ba02b33433990950b49628f11e732e01ed8a34d
Gerrit-Change-Number: 21258
Gerrit-PatchSet: 9
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Apr 2024 09:02:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: Bump Impyla version to 0.20a1

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21286 )

Change subject: WIP: Bump Impyla version to 0.20a1
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15868/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21286
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I297a2a34f3e688555ce8572b6c7fffbd34423f2d
Gerrit-Change-Number: 21286
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 11 Apr 2024 08:16:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: Bump Impyla version to 0.20a1

2024-04-11 Thread Csaba Ringhofer (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21286

to look at the new patch set (#2).

Change subject: WIP: Bump Impyla version to 0.20a1
..

WIP: Bump Impyla version to 0.20a1

Change-Id: I297a2a34f3e688555ce8572b6c7fffbd34423f2d
---
M infra/python/deps/requirements.txt
M tests/common/impala_connection.py
2 files changed, 3 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/21286/2
--
To view, visit http://gerrit.cloudera.org:8080/21286
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I297a2a34f3e688555ce8572b6c7fffbd34423f2d
Gerrit-Change-Number: 21286
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] WIP: Bump Impyla version to 0.20a1

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21286 )

Change subject: WIP: Bump Impyla version to 0.20a1
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10531/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21286
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I297a2a34f3e688555ce8572b6c7fffbd34423f2d
Gerrit-Change-Number: 21286
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 11 Apr 2024 07:52:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12970: Fix ConcurrentModificationException for Iceberg table scans

2024-04-11 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21267 )

Change subject: IMPALA-12970: Fix ConcurrentModificationException for Iceberg 
table scans
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21267/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java:

http://gerrit.cloudera.org:8080/#/c/21267/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@107
PS1, Line 107:   fileDescs_ = new ArrayList<>(fileDescs_);
 :   Collections.sort(fileDescs_);
> I'm experimenting with your suggestion and see that it would bring too much
I see, thanks Gabor for investigating this.



--
To view, visit http://gerrit.cloudera.org:8080/21267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafe57f05ffa0fa6a0875c141cfafd5ee1607a5c3
Gerrit-Change-Number: 21267
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Apr 2024 07:43:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12657: Improve ProcessingCost of ScanNode and NonGroupingAggregator

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21279 )

Change subject: IMPALA-12657: Improve ProcessingCost of ScanNode and 
NonGroupingAggregator
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15867/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21279
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icf1edd48d4ae255b7b3b7f5b228800d7bac7d2ca
Gerrit-Change-Number: 21279
Gerrit-PatchSet: 6
Gerrit-Owner: David Rorke 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 11 Apr 2024 07:37:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12980: Translate CpuAsk into admission control slots

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21257 )

Change subject: IMPALA-12980: Translate CpuAsk into admission control slots
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15865/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I338ca96555bfe8d07afce0320b3688a0861663f2
Gerrit-Change-Number: 21257
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 11 Apr 2024 07:36:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12988: Calculate an unbounded version of CpuAsk

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21277 )

Change subject: IMPALA-12988: Calculate an unbounded version of CpuAsk
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15866/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5441e31088f90761062af35862be4ce09d116923
Gerrit-Change-Number: 21277
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 11 Apr 2024 07:36:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12920: Support ai generate text built-in function for OpenAI's chat completion API

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21168 )

Change subject: IMPALA-12920: Support ai_generate_text built-in function for 
OpenAI's chat completion API
..

IMPALA-12920: Support ai_generate_text built-in function for OpenAI's chat 
completion API

Added support for following built-in functions:
- ai_generate_text_default(prompt)
- ai_generate_text(ai_endpoint, prompt, ai_model,
  ai_api_key_jceks_secret, additional_params)

'ai_endpoint', 'ai_model' and 'ai_api_key_jceks_secret' are flagfile
options. 'ai_generate_text_default(prompt)' syntax expects all these
to be set to proper values. The other syntax, will try to use the
provided input parameter values, but fallback to instance level values
if the inputs are NULL or empty.

Only public OpenAI (api.openai.com) and Azure OpenAI (openai.azure.com)
API endpoints are currently supported.

Exposed these functions in FunctionContext so that they can also be
called from UDFs:
- ai_generate_text_default(context, model)
- ai_generate_text(context, ai_endpoint, prompt, ai_model,
  ai_api_key_jceks_secret, additional_params)

Testing:
- Added unit tests for AiGenerateTextInternal function
- Added fe test for JniFrontend::getSecretFromKeyStore
- Ran manual tests to make sure Impala can talk with OpenAI LLMs using
'ai_generate_text' built-in function. Example sql:
select ai_generate_text("https://api.openai.com/v1/chat/completions;,
"hello", "gpt-3.5-turbo", "open-ai-key",
'{"temperature": 0.9, "model": "gpt-4"}')
- Tested using standalone UDF SDK and made sure that the UDFs can invoke
  BuiltInFunctions (ai_generate_text and ai_generate_text_default)

Change-Id: Id4446957f6030bab1f985fdd69185c3da07d7c4b
Reviewed-on: http://gerrit.cloudera.org:8080/21168
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/exprs/CMakeLists.txt
A be/src/exprs/ai-functions-ir.cc
A be/src/exprs/ai-functions.h
A be/src/exprs/ai-functions.inline.h
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/runtime/exec-env.cc
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/udf/udf-internal.h
M be/src/udf/udf.cc
M be/src/udf/udf.h
M be/src/udf_samples/udf-sample.cc
M be/src/udf_samples/udf-sample.h
M be/src/util/jni-util.h
M bin/load-data.py
M bin/rat_exclude_files.txt
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.py
A testdata/jceks/.gitkeep
22 files changed, 784 insertions(+), 6 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Id4446957f6030bab1f985fdd69185c3da07d7c4b
Gerrit-Change-Number: 21168
Gerrit-PatchSet: 19
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12920: Support ai generate text built-in function for OpenAI's chat completion API

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21168 )

Change subject: IMPALA-12920: Support ai_generate_text built-in function for 
OpenAI's chat completion API
..


Patch Set 18: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id4446957f6030bab1f985fdd69185c3da07d7c4b
Gerrit-Change-Number: 21168
Gerrit-PatchSet: 18
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 11 Apr 2024 07:25:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12657: Improve ProcessingCost of ScanNode and NonGroupingAggregator

2024-04-11 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21279 )

Change subject: IMPALA-12657: Improve ProcessingCost of ScanNode and 
NonGroupingAggregator
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21279/4/fe/src/main/java/org/apache/impala/planner/BaseProcessingCost.java
File fe/src/main/java/org/apache/impala/planner/BaseProcessingCost.java:

http://gerrit.cloudera.org:8080/#/c/21279/4/fe/src/main/java/org/apache/impala/planner/BaseProcessingCost.java@78
PS4, Line 78: total-cost="
> Replace with "total-cost="
Done


http://gerrit.cloudera.org:8080/#/c/21279/4/tests/custom_cluster/test_executor_groups.py
File tests/custom_cluster/test_executor_groups.py:

http://gerrit.cloudera.org:8080/#/c/21279/4/tests/custom_cluster/test_executor_groups.py@1105
PS4, Line 1105:
> flake8: E261 at least two spaces before inline comment
Done



--
To view, visit http://gerrit.cloudera.org:8080/21279
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icf1edd48d4ae255b7b3b7f5b228800d7bac7d2ca
Gerrit-Change-Number: 21279
Gerrit-PatchSet: 6
Gerrit-Owner: David Rorke 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 11 Apr 2024 07:21:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12980: Translate CpuAsk into admission control slots

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21257 )

Change subject: IMPALA-12980: Translate CpuAsk into admission control slots
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10530/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I338ca96555bfe8d07afce0320b3688a0861663f2
Gerrit-Change-Number: 21257
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 11 Apr 2024 07:23:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12991: Eliminate unnecessary SORT for Iceberg DELETEs

2024-04-11 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21285 )

Change subject: IMPALA-12991: Eliminate unnecessary SORT for Iceberg DELETEs
..


Patch Set 1: Code-Review+2

(1 comment)

Thanks! +2 with a nit

http://gerrit.cloudera.org:8080/#/c/21285/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21285/1//COMMIT_MSG@9
PS1, Line 9: useing
nit: using



--
To view, visit http://gerrit.cloudera.org:8080/21285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I94a691e7990228a1ec2de03e6ad90ebb97931581
Gerrit-Change-Number: 21285
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 11 Apr 2024 07:19:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12988: Calculate an unbounded version of CpuAsk

2024-04-11 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21277 )

Change subject: IMPALA-12988: Calculate an unbounded version of CpuAsk
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21277/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21277/1//COMMIT_MSG@16
PS1, Line 16: MAX_FRAGMENT_INSTANCES_PER_NODE options accordingly.
> The EG selection should only need the unbounded number for all EG. The last
Done


http://gerrit.cloudera.org:8080/#/c/21277/2/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/21277/2/fe/src/main/java/org/apache/impala/service/Frontend.java@2300
PS2, Line 2300:
> Might want to use max(cores_requirement, cores_requirement_unbounded) here,
I go ahead with Math.max(cores_requirement, cores_requirement_unbounded).



--
To view, visit http://gerrit.cloudera.org:8080/21277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5441e31088f90761062af35862be4ce09d116923
Gerrit-Change-Number: 21277
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 11 Apr 2024 07:18:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12980: Translate CpuAsk into admission control slots

2024-04-11 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21257 )

Change subject: IMPALA-12980: Translate CpuAsk into admission control slots
..


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21257/9/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/21257/9/be/src/scheduling/scheduler.cc@1212
PS9, Line 1212:   be_max_instances =
> CPC cpu costing also apply to coordinator fragments, but excluded during EG
Add tests/query_test/test_processing_cost.py in ps10


http://gerrit.cloudera.org:8080/#/c/21257/9/fe/src/main/java/org/apache/impala/planner/CoreCount.java
File fe/src/main/java/org/apache/impala/planner/CoreCount.java:

http://gerrit.cloudera.org:8080/#/c/21257/9/fe/src/main/java/org/apache/impala/planner/CoreCount.java@148
PS9, Line 148:   }
> I think totalWithoutCoordinator() method will be useful. I'll test this.
Done



--
To view, visit http://gerrit.cloudera.org:8080/21257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I338ca96555bfe8d07afce0320b3688a0861663f2
Gerrit-Change-Number: 21257
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 11 Apr 2024 07:14:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12980: Translate CpuAsk into admission control slots

2024-04-11 Thread Riza Suminto (Code Review)
Hello Kurt Deschler, Abhishek Rawat, Csaba Ringhofer, Wenzhe Zhou, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21257

to look at the new patch set (#10).

Change subject: IMPALA-12980: Translate CpuAsk into admission control slots
..

IMPALA-12980: Translate CpuAsk into admission control slots

Impala has a concept of "admission control slots" - the amount of
parallelism that should be allowed on an Impala daemon. This defaults to
the number of processors per executor and can be overridden with
-–admission_control_slots flag.

Admission control slot accounting is described in IMPALA-8998. It
compute 'slots_to_use' for each backend based on the max number of
instances of any fragment on that backend. This can lead to
underestimation, because multiple non-blocking fragments with the same
number of instance counts, say 4 non-blocking fragments each with 12
instances, only request the max instance (12) admission slots rather
than sum of it (48), making each of 12 cores oversubscribed by 4x.

When COMPUTE_PROCESSING_COST is enabled, Planner will generate a CpuAsk
number that represents the cpu requirement of that query over a
particular executor group set. This number is an estimation of the
largest number of query fragment instances that can run in parallel
without waiting, given by the blocking operator analysis. Therefore, the
fragment trace that sums into that CpuAsk number can be translated into
'slots_to_use' as well, which will be a closer resemblance of maximum
parallel execution of fragment instances.

This patch adds a new query option called SLOT_COUNT_STRATEGY to control
which admission control slot accounting to use. There are two possible
values:
- LARGEST_FRAGMENT, which is the original algorithm from IMPALA-8998.
  This is still the default value for the SLOT_COUNT_STRATEGY option.
- PLANNER_CPU_ASK, which will follow fragment trace that contributes
  towards CpuAsk number.

To do the PLANNER_CPU_ASK strategy, the Planner will mark fragments that
contribute to CpuAsk as dominant fragments. It also passes
max_slot_per_executor information that it knows about the executor group
set to the scheduler.

AvgAdmissionSlotsPerExecutor counter is added to describe what Planner
thinks the average 'slots_to_use' per backend will be, which follows
this formula:

  AvgAdmissionSlotsPerExecutor = ceil(CpuAsk / num_executors)

Actual 'slots_to_use' in each backend may differ than
AvgAdmissionSlotsPerExecutor, depending on what is scheduled on that
backend. 'slots_to_use' will be shown as 'AdmissionSlots' counter under
each executor profile node.

Testing:
- Update test_executors.py with AvgAdmissionSlotsPerExecutor assertion.
- Pass test_tpcds_queries.py::TestTpcdsQueryWithProcessingCost.

Change-Id: I338ca96555bfe8d07afce0320b3688a0861663f2
---
M be/src/scheduling/admission-controller-test.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/scheduler.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Planner.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/common/Id.java
M fe/src/main/java/org/apache/impala/planner/CoreCount.java
M fe/src/main/java/org/apache/impala/planner/CostingSegment.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
A 
testdata/workloads/functional-query/queries/QueryTest/processing_cost_admission_slots.test
M tests/custom_cluster/test_executor_groups.py
A tests/query_test/test_processing_cost.py
M tests/query_test/test_tpcds_queries.py
19 files changed, 536 insertions(+), 96 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/21257/10
--
To view, visit http://gerrit.cloudera.org:8080/21257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I338ca96555bfe8d07afce0320b3688a0861663f2
Gerrit-Change-Number: 21257
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-12657: Improve ProcessingCost of ScanNode and NonGroupingAggregator

2024-04-11 Thread Riza Suminto (Code Review)
Riza Suminto has uploaded a new patch set (#6) to the change originally created 
by David Rorke. ( http://gerrit.cloudera.org:8080/21279 )

Change subject: IMPALA-12657: Improve ProcessingCost of ScanNode and 
NonGroupingAggregator
..

IMPALA-12657: Improve ProcessingCost of ScanNode and NonGroupingAggregator

This patch improves the accuracy of the CPU ProcessingCost estimates for
several of the CPU intensive operators by basing the costs on benchmark
data. The general approach for a given operator was to run a set of queries
that exercised the operator under various conditions (e.g. large vs small
row sizes and row counts, varying NDV, different file formats, etc) and
capture the CPU time spent per unit of work (the unit of work might be
measured as some number of rows, some number of bytes, some number of
predicates evaluated, or some combination of these). The data was then
analyzed in an attempt to fit a simple model that would allow us to
predict CPU consumption of a given operator based on information available
at planning time.

For example, the CPU ProcessingCost for a Parquet scan is estimated as:
TotalCost = (0.0144 * BytesMaterialized) + (0.0281 * Rows * Predicate Count)

The coefficients  (0.0144 and 0.0281) are derived from benchmarking
scans under a variety of conditions. Similar cost functions and coefficients
were derived for all of the benchmarked operators. The coefficients for all
the operators are normalized such that a single unit of cost equates to
roughly 100 nanoseconds of CPU time on a r5d.4xlarge instance. So we would
predict an operator with a cost of 10,000,000 would complete in roughly one
second on a single core.

Limitations:
* Costing only addresses CPU time spent and doesn't account for any IO
  or other wait time.
* Benchmarking scenarios didn't provide comprehensive coverage of the
  full range of data types, distributions, etc. More thorough
  benchmarking could improve the costing estimates further.
* This initial patch only covers a subset of the operators, focusing
  on those that are most common and most CPU intensive. Specifically
  the following operators are covered by this patch. All others
  continue to use the previous ProcessingCost code:
  AggregationNode
  DataStreamSink (exchange sender)
  ExchangeNode
  HashJoinNode
  HdfsScanNode
  HdfsTableSink
  NestedLoopJoinNode
  SortNode
  UnionNode

Benchmark-based costing of the remaining operators will be covered by
a future patch.

Future patches will automate the collection and analysis of the benchmark
data and the computation of the cost coefficients to simplify maintenance
of the costing as performance changes over time.

Change-Id: Icf1edd48d4ae255b7b3b7f5b228800d7bac7d2ca
---
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/BaseProcessingCost.java
M fe/src/main/java/org/apache/impala/planner/CostingSegment.java
M fe/src/main/java/org/apache/impala/planner/DataStreamSink.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ProcessingCost.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q01.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q02.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q03.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q06.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q08.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q09.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q10a.test

[Impala-ASF-CR] IMPALA-12988: Calculate an unbounded version of CpuAsk

2024-04-11 Thread Riza Suminto (Code Review)
Hello Kurt Deschler, Abhishek Rawat, David Rorke, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21277

to look at the new patch set (#4).

Change subject: IMPALA-12988: Calculate an unbounded version of CpuAsk
..

IMPALA-12988: Calculate an unbounded version of CpuAsk

Planner calculates CpuAsk through a recursive call beginning at
Planner.computeBlockingAwareCores(), which is called after
Planner.computeEffectiveParallelism(). It does blocking operator
analysis over the selected degree of parallelism that was decided during
computeEffectiveParallelism() traversal. That selected degree of
parallelism, however, is already bounded by min and max parallelism
config, derived from PROCESSING_COST_MIN_THREADS and
MAX_FRAGMENT_INSTANCES_PER_NODE options accordingly.

This patch calculates an unbounded version of CpuAsk that is not bounded
by min and max parallelism config. It is purely based on the fragment's
ProcessingCost and query plan relationship constraint (for example, the
number of JOIN BUILDER fragments should equal the number of destination
JOIN fragments for partitioned join). During executor group set
selection, Frontend should use the maximum between bounded CpuAsk and
unbounded CpuAsk numbers to avoid assigning a query to a small executor
group set too soon. The last executor group set stays as the "catch-all"
executor group set.

After this patch, the "max-parallelism" field in the query plan will all
be set with maximum parallelism based on ProcessingCost. The CpuAsk
counter is changed to show the maximum between bounded CpuAsk and
unbounded CpuAsk after scaling against the QUERY_CPU_COUNT_DIVISOR
option. A new counter CpuAskBounded shows the bounded CpuAsk after
scaling. The EffectiveParallelism counter remains unchanged, showing
bounded CpuAsk before scaling.

Testing:
- Update and pass FE test TpcdsCpuCostPlannerTest and
  PlannerTest#testProcessingCost.
- Pass EE test tests/query_test/test_tpcds_queries.py
- Pass custom cluster test tests/custom_cluster/test_executor_groups.py

Change-Id: I5441e31088f90761062af35862be4ce09d116923
---
M be/src/scheduling/scheduler.cc
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/CostingSegment.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q06.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q08.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q09.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q17.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q20.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q22.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q24a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q24b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q28.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q29.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q31.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q32.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q36.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q37.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q38.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q40.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q43.test
M 

[Impala-ASF-CR] IMPALA-11871: Skip permissions check on HDFS files if Ranger is enabled

2024-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20221 )

Change subject: IMPALA-11871: Skip permissions check on HDFS files if Ranger is 
enabled
..


Patch Set 10: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/20221
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id33c400fbe0c918b6b65d713b09009512835a4c9
Gerrit-Change-Number: 20221
Gerrit-PatchSet: 10
Gerrit-Owner: Halim Kim 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Halim Kim 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 11 Apr 2024 06:27:29 +
Gerrit-HasComments: No