[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20986 )

Change subject: IMPALA-12782: Show info of the event processing in /events webUI
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15507/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Gerrit-Change-Number: 20986
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 14 Mar 2024 04:17:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI

2024-03-13 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20986 )

Change subject: IMPALA-12782: Show info of the event processing in /events webUI
..


Patch Set 8:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/20986/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20986/8//COMMIT_MSG@12
PS8, Line 12: processing
> nit: processed
Done


http://gerrit.cloudera.org:8080/#/c/20986/8/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/20986/8/be/src/catalog/catalog-server.cc@980
PS8, Line 980: last_synced_event_id
> idea to reduce the chattiness of the code:
Good point! Added such a wrapper in be/src/util/json-util.h


http://gerrit.cloudera.org:8080/#/c/20986/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/20986/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1192
PS3, Line 1192:* @throws MetastoreNotificationException
> One member when this seems problematic is currentEventIndex_ - if there is
Starting from PS6, resetProgressInfo() is called in start(), so I think the 
metrics are correct after the event processor is restarted.

I also thought about adding a new field to track the last failed event and 
always showing it in the WebUI. But that would be confusing when the event 
processor is actually active (after a restart). The current "Error Message" 
section is moved to the top of the /events page to indicate the status of the 
event processor. So I think it would be better to add a new section about the 
event processing history. In that section we can show the top-10 most expensive 
events, the top-10 most expensive tables, the 10 most recent failures, etc. Do 
you think it's OK to do that in a new JIRA?
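
For illustration only, the proposed "event processing history" section could be
backed by a small bounded structure like the sketch below. The class name,
method names and record shape are hypothetical and are not part of this patch
or of Impala; it only tracks events (not tables) to stay short.

```python
import heapq
from collections import deque


class EventHistory:
    """Bounded history of processed events (hypothetical helper, not Impala code)."""

    def __init__(self, top_n=10, recent_failures=10):
        self._top_n = top_n
        # Min-heap of (duration_ms, event_id); the cheapest tracked event is at index 0.
        self._expensive = []
        # Only the most recent failures are kept; older ones are dropped automatically.
        self._failures = deque(maxlen=recent_failures)

    def record(self, event_id, duration_ms, error=None):
        entry = (duration_ms, event_id)
        if len(self._expensive) < self._top_n:
            heapq.heappush(self._expensive, entry)
        elif entry > self._expensive[0]:
            # Evict the cheapest tracked event in favour of the new, more expensive one.
            heapq.heapreplace(self._expensive, entry)
        if error is not None:
            self._failures.append((event_id, error))

    def top_expensive(self):
        """Return (event_id, duration_ms) pairs, most expensive first."""
        return [(eid, ms) for ms, eid in sorted(self._expensive, reverse=True)]

    def recent_failures(self):
        """Return (event_id, error) pairs, oldest first."""
        return list(self._failures)
```

Memory stays bounded by top_n and recent_failures regardless of how many events
are processed, which is the main reason to prefer this over an unbounded log.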


http://gerrit.cloudera.org:8080/#/c/20986/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/20986/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1146
PS8, Line 1146:  in case they are replaced
  : // when being used below
> "in case they are replaced concurrently"?
Done


http://gerrit.cloudera.org:8080/#/c/20986/8/tests/custom_cluster/test_web_pages.py
File tests/custom_cluster/test_web_pages.py:

http://gerrit.cloudera.org:8080/#/c/20986/8/tests/custom_cluster/test_web_pages.py@444
PS8, Line 444: catalogd_event_processing_delay
> Probably out of the scope of this patch, but a potential alternative would
Good point! Filed IMPALA-12901 for this.



--
To view, visit http://gerrit.cloudera.org:8080/20986

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Gerrit-Change-Number: 20986
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 14 Mar 2024 03:54:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI

2024-03-13 Thread Quanlong Huang (Code Review)
Hello k.venureddy2...@gmail.com, Sai Hemanth Gantasala, Csaba Ringhofer, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20986

to look at the new patch set (#9).

Change subject: IMPALA-12782: Show info of the event processing in /events webUI
..

IMPALA-12782: Show info of the event processing in /events webUI

The /events page of catalogd shows the metrics and status of the
event-processor. This patch adds more info to this page, including
 - lag info
 - current event batch that's being processed
See the screenshot attached to the JIRA for what it looks like.

Also moves the error message to the top to highlight the error status.

Adds a debug action, catalogd_event_processing_delay, to inject a sleep
while processing an event, so that the web page can be captured more
easily.
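
As a rough sketch of how such a sleep-injecting debug action can work, the
Python below parses an action string and sleeps when the label matches. The
"label:ACTION@arg" syntax with "|" as separator, and both function names, are
assumptions made for this example; the real handling lives in DebugUtils.java
and is not reproduced here.

```python
import time


def parse_debug_actions(spec):
    """Parse 'label:ACTION@arg|label2:ACTION@arg' into {label: (action, arg)}.

    The syntax is an assumption for this sketch.
    """
    actions = {}
    for item in filter(None, (spec or "").split("|")):
        label, _, rest = item.partition(":")
        action, _, arg = rest.partition("@")
        actions[label.strip().lower()] = (action.strip().upper(), arg.strip())
    return actions


def maybe_sleep(spec, label, sleep_fn=time.sleep):
    """Sleep if `spec` holds a SLEEP action for `label`; return the delay in ms."""
    action, arg = parse_debug_actions(spec).get(label.lower(), ("", ""))
    if action != "SLEEP":
        return 0
    delay_ms = int(arg)
    sleep_fn(delay_ms / 1000.0)
    return delay_ms
```

Passing sleep_fn as a parameter lets a test record the requested delay without
actually waiting.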

Also adds a missing test for showing the error message of
event-processing in the /events page.

Tests:
 - Add e2e test to verify the content of the page.

Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
---
M be/src/catalog/catalog-server.cc
M be/src/util/json-util.h
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M tests/custom_cluster/test_web_pages.py
M www/events.tmpl
8 files changed, 286 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/20986/9
--
To view, visit http://gerrit.cloudera.org:8080/20986

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Gerrit-Change-Number: 20986
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20986 )

Change subject: IMPALA-12782: Show info of the event processing in /events webUI
..


Patch Set 9:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/20986/9/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/20986/9/be/src/catalog/catalog-server.cc@982
PS9, Line 982:   progress_info_obj.AddMember("last_synced_event_time_s", 
progress_info.last_synced_event_time_s);
line too long (98 > 90)


http://gerrit.cloudera.org:8080/#/c/20986/9/be/src/catalog/catalog-server.cc@1008
PS9, Line 1008: progress_info_obj.AddMember("min_event_time", 
ToStringFromUnix(progress_info.min_event_time_s));
line too long (100 > 90)


http://gerrit.cloudera.org:8080/#/c/20986/9/be/src/catalog/catalog-server.cc@1009
PS9, Line 1009: progress_info_obj.AddMember("max_event_time", 
ToStringFromUnix(progress_info.max_event_time_s));
line too long (100 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/20986

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Gerrit-Change-Number: 20986
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 14 Mar 2024 03:55:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21143 )

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15506/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21143

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 2
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 14 Mar 2024 01:19:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-13 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21143

to look at the new patch set (#2).

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..

IMPALA-12856: Event processor should ignore processing partition
with empty partition values

While processing partition-related events, the Event Processor (EP)
hits an IllegalStateException if a partition fetched from HMS has
empty partition values. Even though returning partitions with empty
values is a bug in HMS, EP should ignore such partitions instead
of throwing IllegalStateException.
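
The intended behaviour can be sketched as follows: skip and log malformed
partitions rather than fail. The function name and the dict-based partition
shape are hypothetical; the actual fix is in the Java frontend files listed
below.

```python
import logging

LOG = logging.getLogger("event_processor_sketch")


def usable_partitions(partitions, event_id):
    """Return partitions that have values, logging and skipping the rest.

    Each partition is a dict with a 'values' list (hypothetical shape).
    """
    kept = []
    for part in partitions:
        if not part.get("values"):
            # Malformed object from the metastore: ignore it instead of raising.
            LOG.warning("EventId %s: skipping partition with empty values: %r",
                        event_id, part)
            continue
        kept.append(part)
    return kept
```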

Note: Added a debug option 'mock_empty_partition_values' to add
malformed partition objects.

Testing:
- Manually verified the test provided in the JIRA details in a local env.
- Added a unit test that returns empty partition values and verifies the EP state.

Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
6 files changed, 128 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/2
--
To view, visit http://gerrit.cloudera.org:8080/21143

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 2
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-13 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21143 )

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21143/1/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/21143/1/be/src/catalog/catalog-server.cc@192
PS1, Line 192: DECLARE_string(state_store_host);
> We don't need this for FE tests. We can use the debug_actions flag and add
Ack


http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2846
PS1, Line 2846:   public int reloadPartitionsFromNames(long eventId, String 
eventType,
> nit: it seems this is only used by the event-processor. We can pass in the
Ack


http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2858
PS1, Line 2858: erro
> nit: error() seems more suitable. This can also be simplified as
Ack


http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2891
PS1, Line 2891:*/
> nit: we can pass in the eventId and eventType to improve the logging
Only eventId is available, so I'm passing that for now.


http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:

http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/service/BackendConfig.java@461
PS1, Line 461:   public String debugActions() { return 
backendCfg_.debug_actions; }
> We can add setDebugAction() for the new FE test.
Ack


http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
File fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java:

http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@212
PS1, Line 212: if 
(DebugUtils.hasDebugAction(BackendConfig.INSTANCE.debugActions(),
> and use the debug action here like DebugUtils.hasDebugAction(BackendConfig.
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21143

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 2
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 14 Mar 2024 00:55:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15505/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21118

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Gerrit-Change-Number: 21118
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 14 Mar 2024 00:16:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-03-13 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/21118/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21118/3//COMMIT_MSG@18
PS3, Line 18: d on storage
> Can you make this more explicit? This means partition columns and Kudu scan
Done


http://gerrit.cloudera.org:8080/#/c/21118/3//COMMIT_MSG@18
PS3, Line 18: ilters that are eva
> nit: filters that are evaluated
Done


http://gerrit.cloudera.org:8080/#/c/21118/3//COMMIT_MSG@21
PS3, Line 21: will keep using JoinNode.computeGenericJoinCardinality().
:
> Will check other planner tests.
Done


http://gerrit.cloudera.org:8080/#/c/21118/3/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/21118/3/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@840
PS3, Line 840: filter
> nit: filters
Done



--
To view, visit http://gerrit.cloudera.org:8080/21118

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Gerrit-Change-Number: 21118
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 13 Mar 2024 23:55:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-03-13 Thread Riza Suminto (Code Review)
Hello Daniel Becker, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21118

to look at the new patch set (#4).

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..

IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

IMPALA-12018 added reduceCardinalityForScanNode to lower the cardinality
estimate when a runtime filter is involved. It calls
JoinNode.computeGenericJoinCardinality(). However, if the originating
join node has an FK-PK conjunct, it should be possible to obtain a lower
cardinality estimate by calling JoinNode.getFkPkJoinCardinality()
instead.

This patch adds that analysis and calls
JoinNode.getFkPkJoinCardinality() when possible. This is, however,
limited to runtime filters that are evaluated at the storage layer, such
as partition filters and pushed-down Kudu filters, to avoid severe
underestimation. Runtime filters that are evaluated at the scan node at
row level will keep using JoinNode.computeGenericJoinCardinality().
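
The selection rule described above can be sketched as below. The two estimator
formulas are simplified textbook approximations chosen for this example, not
the actual bodies of JoinNode.getFkPkJoinCardinality() or
JoinNode.computeGenericJoinCardinality(), and all function and parameter names
are hypothetical.

```python
def generic_join_cardinality(lhs_rows, rhs_rows, lhs_ndv, rhs_ndv):
    """Textbook equi-join estimate: |L| * |R| / max(ndv(L.key), ndv(R.key))."""
    return lhs_rows * rhs_rows / max(lhs_ndv, rhs_ndv, 1)


def fk_pk_join_cardinality(lhs_rows, rhs_rows, rhs_total_rows):
    """Simplified FK-PK estimate: scale |L| by the selectivity on the PK side."""
    return lhs_rows * rhs_rows / max(rhs_total_rows, 1)


def reduced_scan_cardinality(scan_rows, build_rows, build_total_rows,
                             scan_ndv, build_ndv,
                             has_fk_pk_conjunct, evaluated_at_storage_layer):
    """Pick the estimator following the rule in the commit message."""
    if has_fk_pk_conjunct and evaluated_at_storage_layer:
        estimate = fk_pk_join_cardinality(scan_rows, build_rows, build_total_rows)
    else:
        # Row-level filters keep the generic estimate to avoid underestimation.
        estimate = generic_join_cardinality(scan_rows, build_rows, scan_ndv, build_ndv)
    # A runtime filter can only reduce the scan's output, never increase it.
    return min(scan_rows, estimate)
```

With scan_rows=1000000, build_rows=100, build_total_rows=1000, scan_ndv=500 and
build_ndv=100, these simplified formulas give 100000 rows on the FK-PK path and
200000 rows on the generic path.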

Testing:
- Update TpcdsCpuCostPlannerTest.
- Pass FE tests.

Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
---
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/outer-to-inner-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-cardinality-reduction-on-kudu.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-cardinality-reduction.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q17.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q25.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q29.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q33.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q34.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q42.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q46.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q48.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q49.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q52.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q53.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q55.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q56.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q60.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q61.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q63.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q64.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q66.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q75.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q89.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q17.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q18.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q19.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q25.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q26.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q27.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q29.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q31.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q33.test
M 
testdata/work

[Impala-ASF-CR] IMPALA-12872: Use Calcite for ...

2024-03-13 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21109 )

Change subject: IMPALA-12872: Use Calcite for ...
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21109/5/fe/pom.xml
File fe/pom.xml:

http://gerrit.cloudera.org:8080/#/c/21109/5/fe/pom.xml@616
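PS5, Line 616: <dependency>
 :   <groupId>org.apache.calcite</groupId>
 :   <artifactId>calcite-core</artifactId>
 :   <version>1.36.0</version>
 : </dependency>
 : <dependency>
 :   <groupId>org.apache.calcite.avatica</groupId>
 :   <artifactId>avatica-core</artifactId>
 :   <version>1.23.0</version>
 : </dependency>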
I think fe doesn't actually need these dependencies, and they could be declared 
in experimental-planner's pom.xml.

Separately, the versions should be the variables declared in java/pom.xml (e.g. 
${calcite.version} or ${calcite.avatica.version} ). Make sure the values in 
bin/impala-config.sh match what you want (right now bin/impala-config.sh's 
version for Calcite is older).



--
To view, visit http://gerrit.cloudera.org:8080/21109

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I453fd75b7b705f4d7de1ed73c3e24cafad0b8c98
Gerrit-Change-Number: 21109
Gerrit-PatchSet: 5
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Wed, 13 Mar 2024 23:52:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12872: Use Calcite for ...

2024-03-13 Thread Steve Carlin (Code Review)
Steve Carlin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21109 )

Change subject: IMPALA-12872: Use Calcite for ...
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21109/5/testdata/workloads/functional-query/queries/QueryTest/calcite.test
File testdata/workloads/functional-query/queries/QueryTest/calcite.test:

http://gerrit.cloudera.org:8080/#/c/21109/5/testdata/workloads/functional-query/queries/QueryTest/calcite.test@3
PS5, Line 3: # Regression test for IMPALA-938
Get rid of this


http://gerrit.cloudera.org:8080/#/c/21109/5/testdata/workloads/functional-query/queries/QueryTest/calcite.test@7
PS5, Line 7: # Regression test for IMPALA-938
Get rid of this



--
To view, visit http://gerrit.cloudera.org:8080/21109

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I453fd75b7b705f4d7de1ed73c3e24cafad0b8c98
Gerrit-Change-Number: 21109
Gerrit-PatchSet: 5
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Wed, 13 Mar 2024 21:04:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

2024-03-13 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
..

IMPALA-12896: Avoid JDBC table to be set as transactional table

In some deployment environments, JDBC tables are set as transactional
tables by default. This causes catalogd to fail to load the metadata for
JDBC tables. This patch explicitly adds the table property
"transactional=false" for JDBC tables to prevent them from being set as
transactional tables.

Operations on JDBC tables are processed only on the coordinator. The
processed rows should be estimated as 0 for DataSourceScanNode by the
planner so that coordinator-only query plans are generated for simple
queries on JDBC tables and queries can be executed without invoking
executor nodes. Also adds Preconditions.check to make sure numNodes
equals 1 for DataSourceScanNode.

Updates the FileSystemUtil.copyFileFromUriToLocal() function to write a
log message for all types of exceptions.

Testing:
 - Fixed planner tests for data source tables.
 - Ran end-to-end tests of JDBC tables with query option
   'exec_single_node_rows_threshold' as default value 100.
 - Passed core-tests.

Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Reviewed-on: http://gerrit.cloudera.org:8080/21141
Reviewed-by: Riza Suminto 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/small-query-opt.test
M tests/custom_cluster/test_ext_data_sources.py
M tests/query_test/test_ext_data_sources.py
7 files changed, 42 insertions(+), 13 deletions(-)

Approvals:
  Riza Suminto: Looks good to me, approved
  Impala Public Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/21141

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-12602: Unregister queries on idle timeout

2024-03-13 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21074 )

Change subject: IMPALA-12602: Unregister queries on idle timeout
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21074/5/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/21074/5/be/src/service/impala-server.cc@2880
PS5, Line 2880:
Is it possible for the session to be closed at this point? Wouldn't this mean 
that we'll add the query to idle_query_statuses_ but never remove it, as we 
couldn't erase it during the closing of the session?


http://gerrit.cloudera.org:8080/#/c/21074/3/tests/custom_cluster/test_query_expiration.py
File tests/custom_cluster/test_query_expiration.py:

http://gerrit.cloudera.org:8080/#/c/21074/3/tests/custom_cluster/test_query_expiration.py@140
PS3, Line 140: 'Invalid or unknown query handle' in str(e)
> __expect_expired fetches from short_timeout_expire_handle and time_limit_ex
Could this be made stricter by checking whether the handle is in 
[short_timeout_expire_handle, time_limit_expire_handle] and expecting an error 
message based on that?



--
To view, visit http://gerrit.cloudera.org:8080/21074

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iacfc285ed3587892c7ec6f7df3b5f71c9e41baf0
Gerrit-Change-Number: 21074
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Wed, 13 Mar 2024 18:59:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12894: (part 1) Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21139 )

Change subject: IMPALA-12894: (part 1) Turn off the count(*) optimisation for 
V2 Iceberg tables
..

IMPALA-12894: (part 1) Turn off the count(*) optimisation for V2 Iceberg tables

This is a part 1 change that turns off the count(*) optimisations for
V2 tables as there is a correctness issue with it. The reason is that
Spark compaction may leave some dangling delete files that mess up
the logic in Impala.
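
In outline, the change amounts to a guard like the one sketched below. The
function names and parameters are hypothetical and only illustrate the
decision; the actual change is in SelectStmt.java.

```python
def can_use_count_star_optimisation(format_version):
    """Allow the metadata-based count(*) shortcut only for V1 tables.

    V2 tables may carry delete files, so the shortcut is disabled for them.
    """
    return format_version < 2


def count_star(format_version, metadata_row_count, scan_and_count):
    """Answer count(*) from metadata when allowed, otherwise by scanning."""
    if can_use_count_star_optimisation(format_version):
        return metadata_row_count
    return scan_and_count()
```

Here scan_and_count stands in for whatever executes the regular scan, so the
slower but correct path is only taken when the shortcut is disabled.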

Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Reviewed-on: http://gerrit.cloudera.org:8080/21139
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M testdata/data/README
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/0-8-7d506ac2-9987-4514-8310-505eb02c528a-1.parquet
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/2b4453538b945045-7ba1864b_1900113267_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/3549308fee10b145-141d9f69_502574269_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/delete-3549308fee10b145-141d9f69_1919298510_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/delete-ca41ed5edf889878-632c88f10001_1119661503_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/52100098-3c71-4111-8d7e-1c02e8343a0e-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/a69c2096-fc8b-4365-8b7b-3b561afdd7e2-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/a69c2096-fc8b-4365-8b7b-3b561afdd7e2-m1.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/aa501eb1-924a-4460-a2a0-ad577de8aef5-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/aa501eb1-924a-4460-a2a0-ad577de8aef5-m1.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/aa501eb1-924a-4460-a2a0-ad577de8aef5-m2.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/aa501eb1-924a-4460-a2a0-ad577de8aef5-m3.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/f6475cdb-128e-4438-ab63-2251736670ad-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/snap-1208327814823543579-1-52100098-3c71-4111-8d7e-1c02e8343a0e.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/snap-37664836060851883-1-f6475cdb-128e-4438-ab63-2251736670ad.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/snap-5278394901353853232-1-aa501eb1-924a-4460-a2a0-ad577de8aef5.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/snap-6274599306850878811-1-a69c2096-fc8b-4365-8b7b-3b561afdd7e2.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v1.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v2.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v3.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v4.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v5.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/version-hint.text
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables-hash-join.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes-orc.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes.test
M tests/query_test/test_iceberg.py
32 files changed, 1,009 insertions(+), 244 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21139
To unsubscribe, visit http:

[Impala-ASF-CR] IMPALA-12894: (part 1) Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21139 )

Change subject: IMPALA-12894: (part 1) Turn off the count(*) optimisation for 
V2 Iceberg tables
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Gerrit-Change-Number: 21139
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 13 Mar 2024 18:59:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-13 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 46:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20770/46/be/src/service/workload-management.h
File be/src/service/workload-management.h:

http://gerrit.cloudera.org:8080/#/c/20770/46/be/src/service/workload-management.h@519
PS46, Line 519: std::condition_variable completed_queries_cv_;
These all rely on workload-management.h only being included by 
workload-management.cc (since they're not marked extern).

This would all be simpler to make safe if it were a class rather than a 
namespace.



--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 46
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 13 Mar 2024 18:54:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-13 Thread Michael Smith (Code Review)
Michael Smith has removed a vote on this change.

Change subject: IMPALA-12426: Query History Table
..


Removed Code-Review+1 by Michael Smith 
--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 46
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12626: Capture tables in query for log

2024-03-13 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20886 )

Change subject: IMPALA-12626: Capture tables in query for log
..


Patch Set 14:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20886/14/tests/util/workload_management.py
File tests/util/workload_management.py:

http://gerrit.cloudera.org:8080/#/c/20886/14/tests/util/workload_management.py@603
PS14, Line 603: assert data[index] != ""
> Please update this assert to check that the data in the query table is accu
That's not trivial to check, as it's not included in the profile. I guess we 
could also add it to the profile.



--
To view, visit http://gerrit.cloudera.org:8080/20886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
Gerrit-Change-Number: 20886
Gerrit-PatchSet: 14
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Wed, 13 Mar 2024 18:29:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-13 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 46:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20770/46/tests/util/workload_management.py
File tests/util/workload_management.py:

http://gerrit.cloudera.org:8080/#/c/20770/46/tests/util/workload_management.py@684
PS46, Line 684:   assert sql_results.column_labels[index] == "PLAN"
Since you defined all the variables at the top, it'd make sense to use them for 
the comparison too.
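
A minimal sketch of the suggestion, with hypothetical names (QUERY_ID, PLAN
and assert_column_label are illustrative and not taken from the patch): define
each column label once and reuse the same variable in the assertion instead of
repeating the string literal.

```python
# Column labels defined once at the top of the module (hypothetical names).
QUERY_ID = "QUERY_ID"
PLAN = "PLAN"


def assert_column_label(column_labels, index, expected):
    """Fail with a readable message if the label at `index` is not `expected`."""
    actual = column_labels[index]
    assert actual == expected, "column %d: expected %r, got %r" % (
        index, expected, actual)


column_labels = [QUERY_ID, PLAN]
# The comparison uses the variable, not a second copy of the literal "PLAN".
assert_column_label(column_labels, 1, PLAN)
```

With this shape a renamed column only has to be updated in one place.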



--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 46
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 13 Mar 2024 18:28:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-13 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 46: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 46
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 13 Mar 2024 18:25:47 +
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-12900: Build binutils with -O3

2024-03-13 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21145 )

Change subject: IMPALA-12900: Build binutils with -O3
..


Patch Set 3: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21145
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e75db0759b4d3d4e6cc2ce929b1741808f1b771
Gerrit-Change-Number: 21145
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Wed, 13 Mar 2024 18:11:48 +
Gerrit-HasComments: No


[native-toolchain-CR] Omnibus: Add new versions of several libraries

2024-03-13 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21144 )

Change subject: Omnibus: Add new versions of several libraries
..


Patch Set 3: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2a900bc9b0c939391a3c6f970b1b5d99fbb42fcf
Gerrit-Change-Number: 21144
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Wed, 13 Mar 2024 18:11:27 +
Gerrit-HasComments: No


[native-toolchain-CR] Omnibus: Add new versions of several libraries

2024-03-13 Thread Joe McDonnell (Code Review)
Hello Michael Smith,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21144

to look at the new patch set (#3).

Change subject: Omnibus: Add new versions of several libraries
..

Omnibus: Add new versions of several libraries

To economize on toolchain builds, this adds new versions for
several libraries while keeping the existing version. This
allows us to test and upgrade the libraries without a new
toolchain build for each upgrade.

Here are the included new versions:
1. RE2 2023-03-01 (for IMPALA-12363)
2. ZLib 1.3.1
3. Cloudflare Zlib 7aa510344e
4. Gperftools 2.15
5. LZ4 1.9.4 (for IMPALA-12368)
6. ZSTD 1.5.5 (for IMPALA-12369)
7. Libunwind 1.8.1
8. CCTZ 2.3 and 2.4
9. jwt-cpp 0.6.0 and 0.7.0
10. Arrow 13.0.0
11. Mold 2.4.1
12. Python 3.8.18

Since Arrow and Mold have not been used by Impala yet, this removes
the code to build the old version. All others are additional builds
without removing the old version. These are small libraries, so
this has minimal impact on build time.

Testing:
 - Ran x86_64 / ARM toolchain builds

Change-Id: I2a900bc9b0c939391a3c6f970b1b5d99fbb42fcf
---
M buildall.sh
D 
source/arrow/arrow-9.0.0-patches/0001-ARROW-17847-C-Support-unquoted-decimal-in-JSON-parse.patch
D 
source/arrow/arrow-9.0.0-patches/0002-ARROW-17995-C-Fix-json-decimals-not-being-rescaled-b.patch
M source/arrow/build.sh
M source/re2/build.sh
5 files changed, 23 insertions(+), 469 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/44/21144/3
--
To view, visit http://gerrit.cloudera.org:8080/21144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2a900bc9b0c939391a3c6f970b1b5d99fbb42fcf
Gerrit-Change-Number: 21144
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-12626: Capture tables in query for log

2024-03-13 Thread Jason Fehr (Code Review)
Jason Fehr has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20886 )

Change subject: IMPALA-12626: Capture tables in query for log
..


Patch Set 14:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20886/7/be/src/service/query-state.cc
File be/src/service/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/20886/7/be/src/service/query-state.cc@182
PS7, Line 182:
> Oops, I was thinking of a different problem. But the list of tables is alre
On second thought, there is no sense in sorting the list of tables since the 
end users will sort (or not sort) according to their needs.


http://gerrit.cloudera.org:8080/#/c/20886/14/tests/util/workload_management.py
File tests/util/workload_management.py:

http://gerrit.cloudera.org:8080/#/c/20886/14/tests/util/workload_management.py@603
PS14, Line 603: assert data[index] != ""
Please update this assert to check that the data in the query table is accurate.
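
One possible shape for such a check, assuming the test already knows which
tables the query touched and that the column holds a comma-separated list
(both are assumptions; parse_tables and assert_tables_column are hypothetical
helpers, not part of the patch). Order is deliberately ignored.

```python
def parse_tables(column_value):
    """Split a comma-separated table list, dropping blanks and whitespace."""
    return {name.strip() for name in column_value.split(",") if name.strip()}


def assert_tables_column(column_value, expected_tables):
    """Check the stored value names exactly the expected tables, in any order."""
    actual = parse_tables(column_value)
    expected = set(expected_tables)
    assert actual == expected, "tables mismatch: expected %s, got %s" % (
        sorted(expected), sorted(actual))
```

Compared with `assert data[index] != ""`, this fails when a table is missing
or an unexpected one appears, not only when the column is empty.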



--
To view, visit http://gerrit.cloudera.org:8080/20886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
Gerrit-Change-Number: 20886
Gerrit-PatchSet: 14
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Wed, 13 Mar 2024 17:58:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-03-13 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21118/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21118/3//COMMIT_MSG@19
PS3, Line 19: Runtime filters that evaluate at scan node at
: row level will keep using 
JoinNode.computeGenericJoinCardinality().
> I don't get the reasoning behind this - if the runtime filter is estimated
In IMPALA-12018, I did not distinguish between runtime filters that still 
need to materialize rows in the scan node (row-level filters) and those that 
do not (partition filters).

I initially tried to replace computeGenericJoinCardinality() entirely with 
getFkPkJoinCardinality(), but then I saw underestimation in a few cases. It 
looks like the row-level runtime filter either arrives late or is disabled by 
the scan node due to ineffectiveness. This is why this patch uses 
getFkPkJoinCardinality() (which results in a lower cardinality estimate) only 
for partition filters and pushed-down Kudu filters.


http://gerrit.cloudera.org:8080/#/c/21118/3//COMMIT_MSG@21
PS3, Line 21:
: Testi
> All the test changes I checked are about HDFS tables. As the change should
Will check other planner tests.


http://gerrit.cloudera.org:8080/#/c/21118/3/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test:

http://gerrit.cloudera.org:8080/#/c/21118/3/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test@a178
PS3, Line 178:
> What change leads to skipping the "filtered from" part in the plan?
I think it was off-by-1 due to rounding.



--
To view, visit http://gerrit.cloudera.org:8080/21118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Gerrit-Change-Number: 21118
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 13 Mar 2024 17:30:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15504/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21138
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 13 Mar 2024 17:26:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-03-13 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21118/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21118/3//COMMIT_MSG@18
PS3, Line 18: ilter that evaluate
nit: filters that are evaluated


http://gerrit.cloudera.org:8080/#/c/21118/3//COMMIT_MSG@18
PS3, Line 18: storage layer
Can you make this more explicit? This means partition columns and Kudu scanners 
based on the code.


http://gerrit.cloudera.org:8080/#/c/21118/3//COMMIT_MSG@19
PS3, Line 19: Runtime filters that evaluate at scan node at
: row level will keep using 
JoinNode.computeGenericJoinCardinality().
I don't get the reasoning behind this - if the runtime filter is estimated on 
the storage level, then the effect on the plan should be even larger.


http://gerrit.cloudera.org:8080/#/c/21118/3//COMMIT_MSG@21
PS3, Line 21:
: Testi
All the test changes I checked are about HDFS tables. As the change should also 
affect Kudu tables, it would be nice if there were a targeted planner test 
that shows how this change affects Kudu tables.


http://gerrit.cloudera.org:8080/#/c/21118/3/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/21118/3/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@840
PS3, Line 840: filter
nit: filters


http://gerrit.cloudera.org:8080/#/c/21118/3/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test:

http://gerrit.cloudera.org:8080/#/c/21118/3/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test@a178
PS3, Line 178:
What change leads to skipping the "filtered from" part in the plan?



--
To view, visit http://gerrit.cloudera.org:8080/21118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Gerrit-Change-Number: 21118
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 13 Mar 2024 17:04:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-13 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..

IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the 
coordinator

On clusters with dedicated coordinators and executors the Iceberg
metadata scanner fragment(s) can be scheduled to executors, for example
during a join. The fragment in this case will fail a precondition check,
because either the 'frontend_' object or the table will not be present.

This change forces Iceberg metadata scanner fragments to be scheduled on
the coordinator. It is not enough to set the DataPartition type to
UNPARTITIONED, because unpartitioned fragments can still be scheduled on
executors. This change introduces a new flag in the TPlanFragment thrift
struct - if it is true, the fragment is always scheduled on the
coordinator.
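
The scheduling rule described above can be sketched roughly as follows. This
is a simplified model, not the code in scheduler.cc; the field name
is_coordinator_only and the helper pick_hosts are hypothetical, and placing an
unpartitioned fragment on the first executor is only a stand-in for "may land
on an executor".

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    name: str
    unpartitioned: bool = False
    # Hypothetical name for the new flag on the plan fragment.
    is_coordinator_only: bool = False


def pick_hosts(fragment, coordinator, executors):
    """Return the hosts a fragment may run on in this simplified model."""
    if fragment.is_coordinator_only:
        # The flag pins the fragment to the coordinator unconditionally.
        return [coordinator]
    if fragment.unpartitioned:
        # A single instance, but it may still be placed on an executor.
        return [executors[0]] if executors else [coordinator]
    return list(executors) if executors else [coordinator]
```

In this model, marking a fragment as unpartitioned alone still lets it run on
an executor, which is why a separate flag is needed to force the coordinator.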

Testing:
 - Added a regression test in test_coordinators.py.

Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
---
M be/src/scheduling/schedule-state.cc
M be/src/scheduling/schedule-state.h
M be/src/scheduling/scheduler.cc
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M tests/custom_cluster/test_coordinators.py
7 files changed, 63 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/21138/2
--
To view, visit http://gerrit.cloudera.org:8080/21138
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[native-toolchain-CR] Omnibus: Add new versions of several libraries

2024-03-13 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/21144 )

Change subject: Omnibus: Add new versions of several libraries
..

Omnibus: Add new versions of several libraries

To economize on toolchain builds, this adds new versions for
several libraries while keeping the existing version. This
allows us to test and upgrade the libraries without a new
toolchain build for each upgrade.

Here are the included new versions:
1. RE2 2023-03-01 (for IMPALA-12363)
2. ZLib 1.3.1
3. Cloudflare Zlib 7aa510344e
4. Gperftools 2.15
5. LZ4 1.9.4 (for IMPALA-12368)
6. ZSTD 1.5.5 (for IMPALA-12369)
7. Libunwind 1.8.1
8. CCTZ 2.3 and 2.4
9. jwt-cpp 0.6.0 and 0.7.0
10. Arrow 13.0.0
11. Mold 2.4.1
12. Python 3.8.18

Since Arrow is currently not used by Impala, this removes the
code to build the old version. All others are additional builds
without removing the old version. These are small libraries, so
this has minimal impact on build time.

Testing:
 - Ran x86_64 / ARM toolchain builds

Change-Id: I2a900bc9b0c939391a3c6f970b1b5d99fbb42fcf
---
M buildall.sh
D 
source/arrow/arrow-9.0.0-patches/0001-ARROW-17847-C-Support-unquoted-decimal-in-JSON-parse.patch
D 
source/arrow/arrow-9.0.0-patches/0002-ARROW-17995-C-Fix-json-decimals-not-being-rescaled-b.patch
M source/arrow/build.sh
M source/re2/build.sh
5 files changed, 23 insertions(+), 468 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/44/21144/2
--
To view, visit http://gerrit.cloudera.org:8080/21144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2a900bc9b0c939391a3c6f970b1b5d99fbb42fcf
Gerrit-Change-Number: 21144
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 


[native-toolchain-CR] Omnibus: Add new versions of several libraries

2024-03-13 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21144 )


Change subject: Omnibus: Add new versions of several libraries
..

Omnibus: Add new versions of several libraries

To economize on toolchain builds, this adds new versions for
several libraries while keeping the existing version. This
allows us to test and upgrade the libraries without a new
toolchain build for each upgrade.

Here are the included new versions:
1. RE2 2023-03-01 (for IMPALA-12363)
2. ZLib 1.3.1
3. Cloudflare Zlib 7aa510344e
4. Gperftools 2.15
5. LZ4 1.9.4 (for IMPALA-12368)
6. ZSTD 1.5.5 (for IMPALA-12369)
7. Libunwind 1.8.1
8. CCTZ 2.3 and 2.4
9. jwt-cpp 0.6.0 and 0.7.0
10. Arrow 13.0.0
11. Mold 2.4.1
12. Python 3.8.18

Since Arrow is currently not used by Impala, this removes the
code to build the old version. All others are additional builds
without removing the old version. These are small libraries, so
this has minimal impact on build time.

Testing:
 - Ran x86_64 / ARM toolchain builds

Change-Id: I2a900bc9b0c939391a3c6f970b1b5d99fbb42fcf
---
M buildall.sh
D 
source/arrow/arrow-9.0.0-patches/0001-ARROW-17847-C-Support-unquoted-decimal-in-JSON-parse.patch
D 
source/arrow/arrow-9.0.0-patches/0002-ARROW-17995-C-Fix-json-decimals-not-being-rescaled-b.patch
M source/arrow/build.sh
A 
source/libunwind/libunwind-1.8.1-patches/0001-libunwind-trace-cache-destructor.patch
M source/re2/build.sh
6 files changed, 55 insertions(+), 468 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/44/21144/1
--
To view, visit http://gerrit.cloudera.org:8080/21144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2a900bc9b0c939391a3c6f970b1b5d99fbb42fcf
Gerrit-Change-Number: 21144
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 


[native-toolchain-CR] IMPALA-12900: Build binutils with -O3

2024-03-13 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21145 )


Change subject: IMPALA-12900: Build binutils with -O3
..

IMPALA-12900: Build binutils with -O3

The binutils build happens before we have switched over to
using the toolchain compiler. This means that it also does
not set CFLAGS/CXXFLAGS. The default optimization level
for binutils is -O2. It is possible that we could get a bit
extra speed by using -O3, so this sets CFLAGS/CXXFLAGS to use
-O3 for binutils.

Testing:
 - Toolchain builds on x86_64 and ARM

Change-Id: I2e75db0759b4d3d4e6cc2ce929b1741808f1b771
---
M source/binutils/build.sh
1 file changed, 6 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/45/21145/1
--
To view, visit http://gerrit.cloudera.org:8080/21145
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2e75db0759b4d3d4e6cc2ce929b1741808f1b771
Gerrit-Change-Number: 21145
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 46:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15503/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 46
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 13 Mar 2024 16:25:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-13 Thread Jason Fehr (Code Review)
Hello Andrew Sherman, Riza Suminto, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20770

to look at the new patch set (#46).

Change subject: IMPALA-12426: Query History Table
..

IMPALA-12426: Query History Table

Adds the ability for users to specify that Impala will create and
maintain an internal Iceberg table that contains data about all
completed queries. This table is automatically created at startup by
each coordinator if it does not exist. Then, most completed queries are
queued in memory and flushed to the query history table at a set
interval (either minutes or number of records). Set, use, and show
queries are not written to this table. This commit leverages the
InternalServer class to maintain the query history table.
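
The queue-and-flush behaviour described above can be sketched as follows.
This is an illustrative model only: the class name, the parameter names, and
the injected write_batch callback are assumptions, not the implementation in
workload-management.cc.

```python
import time


class CompletedQueryBuffer:
    """Buffers records and flushes when a size or age threshold is reached."""

    def __init__(self, write_batch, max_records, max_age_seconds,
                 clock=time.monotonic):
        self._write_batch = write_batch
        self._max_records = max_records
        self._max_age_seconds = max_age_seconds
        self._clock = clock
        self._records = []
        self._last_flush = clock()

    def add(self, record):
        """Queue one completed-query record, flushing if a threshold is hit."""
        self._records.append(record)
        self.maybe_flush()

    def maybe_flush(self):
        """Flush if enough records are queued or enough time has passed."""
        too_many = len(self._records) >= self._max_records
        too_old = self._clock() - self._last_flush >= self._max_age_seconds
        if self._records and (too_many or too_old):
            self.flush()

    def flush(self):
        """Hand the queued records to the writer and reset the timer."""
        batch, self._records = self._records, []
        self._last_flush = self._clock()
        self._write_batch(batch)
```

Injecting the clock keeps the time-based trigger testable without sleeping; a
real writer would insert the batch into the query history table.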

Ctest unit tests have been added to assert the various pieces of code.
New custom cluster tests have been added to assert the query history
table is properly populated with completed queries.

Negative testing consists of attempting sql injection attacks and
syntactically incorrect queries.

Impala built-in string functions benchmarks have been updated to include
the new built-in functions.

Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
---
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M be/src/service/CMakeLists.txt
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/internal-server-test.cc
M be/src/service/internal-server.cc
M be/src/service/internal-server.h
M be/src/service/query-state-record.cc
A be/src/service/workload-management.cc
A be/src/service/workload-management.h
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
A be/src/util/sql-util-test.cc
A be/src/util/sql-util.cc
A be/src/util/sql-util.h
M common/thrift/BackendGflags.thrift
M common/thrift/metrics.json
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/CatalogBlacklistUtils.java
M tests/authorization/test_provider.py
M tests/beeswax/impala_beeswax.py
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_connection.py
A tests/custom_cluster/test_query_log.py
A tests/util/assert_time.py
A tests/util/memory.py
A tests/util/retry.py
A tests/util/workload_management.py
33 files changed, 3,589 insertions(+), 72 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/20770/46
--
To view, visit http://gerrit.cloudera.org:8080/20770

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 46
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI

2024-03-13 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20986 )

Change subject: IMPALA-12782: Show info of the event processing in /events webUI
..


Patch Set 8:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/20986/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20986/8//COMMIT_MSG@12
PS8, Line 12: processing
nit: processed


http://gerrit.cloudera.org:8080/#/c/20986/8/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/20986/8/be/src/catalog/catalog-server.cc@980
PS8, Line 980: last_synced_event_id
idea to reduce the chattiness of the code:

a struct like ObjWrapper could be created with a template function 
AddMember(const char* name, const T& val), specialized for std::string.

This would avoid creating local Value variables and passing the 
allocator in each AddMember() call.

So this could look like:
ObjWrapper progress_info_obj(allocator);
progress_info_obj.AddMember("last_synced_event_id", 
progress_info.last_synced_event_id);
...
document->AddMember("progress-info", progress_info_obj.value, allocator);

Note that RapidJSON uses move semantics, so after Value::AddMember("s", val, 
allocator), 'val' is set to null, so there is no need to keep it on the stack.
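
As a rough illustration of the wrapper idea above, the following C++ sketch shows one possible shape for it. Everything in it is an assumption made for the example: FakeAllocator and FakeValue are simplified stand-ins (not the RapidJSON types) so that the snippet compiles on its own, and the ObjWrapper layout is hypothetical rather than code from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Stand-ins for the JSON library's allocator and value types. They only mimic
// the two properties discussed above (an allocator passed on every call, and
// AddMember() leaving the source value null). They are NOT the RapidJSON API.
struct FakeAllocator {};

struct FakeValue {
  enum Kind { kNull, kInt, kString, kObject };
  Kind kind = kNull;
  int64_t int_val = 0;
  std::string str_val;
  std::vector<std::pair<std::string, FakeValue>> members;

  void AddMember(const char* name, FakeValue& val, FakeAllocator&) {
    members.emplace_back(name, std::move(val));
    val = FakeValue();  // mimic move semantics: the source becomes null
  }
};

// Hypothetical wrapper: it remembers the allocator so that call sites neither
// create local values nor repeat the allocator argument.
struct ObjWrapper {
  explicit ObjWrapper(FakeAllocator& alloc) : allocator(alloc) {
    value.kind = FakeValue::kObject;
  }

  // Generic version, intended for integral members.
  template <typename T>
  void AddMember(const char* name, const T& val) {
    assert(name != nullptr);
    FakeValue v;
    v.kind = FakeValue::kInt;
    v.int_val = static_cast<int64_t>(val);
    value.AddMember(name, v, allocator);
  }

  FakeValue value;
  FakeAllocator& allocator;
};

// Specialization for std::string, which needs a string-typed value.
template <>
inline void ObjWrapper::AddMember<std::string>(const char* name,
                                               const std::string& val) {
  assert(name != nullptr);
  FakeValue v;
  v.kind = FakeValue::kString;
  v.str_val = val;
  value.AddMember(name, v, allocator);
}
```

With a wrapper of this kind, each member takes a single call, and string members are passed as std::string so that the specialization is selected.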


http://gerrit.cloudera.org:8080/#/c/20986/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/20986/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1192
PS3, Line 1192:* @throws MetastoreNotificationException
> I remove calling it in finally since it will reset currentEvent_ for except
One member where this seems problematic is currentEventIndex_: if there is an 
exception then it won't be reset, so after restarting the event processor 
currentEventIndex_ won't start from zero in the first batch.

What do you think about calling resetProgress() in finally, but in case of an 
exception, saving the current event to a member like 
lastEventDuringException_? This could be returned instead of 
currentFilteredEvent_ in case currentFilteredEvent_ is null and the status is 
error. It could also be useful to display the last problematic event in the 
webUI, even if the event processor was restarted.


http://gerrit.cloudera.org:8080/#/c/20986/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/20986/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1146
PS8, Line 1146:  in case they are replaced
  : // when being used below
"in case they are replaced concurrently"?

It could also be mentioned that the consistency of the members in progressInfo 
is not guaranteed in case of parallel modifications.


http://gerrit.cloudera.org:8080/#/c/20986/8/tests/custom_cluster/test_web_pages.py
File tests/custom_cluster/test_web_pages.py:

http://gerrit.cloudera.org:8080/#/c/20986/8/tests/custom_cluster/test_web_pages.py@444
PS8, Line 444: catalogd_event_processing_delay
Probably out of the scope of this patch, but a potential alternative would be 
to use table properties to inject debug actions to the event processor, so that 
the debug action would be triggered only for events from the specific table. 
This would allow these tests to run without using a custom cluster.



--
To view, visit http://gerrit.cloudera.org:8080/20986

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Gerrit-Change-Number: 20986
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 13 Mar 2024 16:02:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 45:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/15502/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/20770

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 45
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 13 Mar 2024 15:45:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-13 Thread Jason Fehr (Code Review)
Jason Fehr has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 45:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/20770/43/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/20770/43/be/src/service/impala-server.h@1285
PS43, Line 1285:   /// Pass a session secret for validation.
> nit: can this be a struct instead of pair?
Done


http://gerrit.cloudera.org:8080/#/c/20770/43/be/src/service/workload-management.cc
File be/src/service/workload-management.cc:

http://gerrit.cloudera.org:8080/#/c/20770/43/be/src/service/workload-management.cc@212
PS43, Line 212: /// Generates the first portion of the DML that inserts records 
into the completed queries
> nit: we're already in workload-management.cc, might make sense to reference
Copy-paste error.  This variable declaration came from the header file.  Fixed.


http://gerrit.cloudera.org:8080/#/c/20770/43/be/src/service/workload-management.cc@470
PS43, Line 470: impala::worklo
> nit: Add comment mentioning that this is an attempt increment.
I just saw that too while reviewing my code.  Fixed.


http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py
File tests/custom_cluster/test_query_log.py:

http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@168
PS41, Line 168: "--shutdown_grace_period_s=10 "
  : "--shutdown_deadline_s=60",
  : catalogd_args="--enable_workload_mgmt",
  :
> In that case, please give distinctive comment for each of them.
Done



--
To view, visit http://gerrit.cloudera.org:8080/20770

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 45
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 13 Mar 2024 15:22:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-13 Thread Jason Fehr (Code Review)
Hello Andrew Sherman, Riza Suminto, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20770

to look at the new patch set (#45).

Change subject: IMPALA-12426: Query History Table
..

IMPALA-12426: Query History Table

Adds the ability for users to specify that Impala will create and
maintain an internal Iceberg table that contains data about all
completed queries. This table is automatically created at startup by
each coordinator if it does not exist. Then, most completed queries are
queued in memory and flushed to the query history table at a set
interval (either minutes or number of records). Set, use, and show
queries are not written to this table. This commit leverages the
InternalServer class to maintain the query history table.

Ctest unit tests have been added to assert the various pieces of code.
New custom cluster tests have been added to assert the query history
table is properly populated with completed queries.

Negative testing consists of attempting sql injection attacks and
syntactically incorrect queries.

Impala built-in string functions benchmarks have been updated to include
the new built-in functions.

Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
---
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M be/src/service/CMakeLists.txt
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/internal-server-test.cc
M be/src/service/internal-server.cc
M be/src/service/internal-server.h
M be/src/service/query-state-record.cc
A be/src/service/workload-management.cc
A be/src/service/workload-management.h
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
A be/src/util/sql-util-test.cc
A be/src/util/sql-util.cc
A be/src/util/sql-util.h
M common/thrift/BackendGflags.thrift
M common/thrift/metrics.json
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/CatalogBlacklistUtils.java
M tests/authorization/test_provider.py
M tests/beeswax/impala_beeswax.py
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_connection.py
A tests/custom_cluster/test_query_log.py
A tests/util/assert_time.py
A tests/util/memory.py
A tests/util/retry.py
A tests/util/workload_management.py
33 files changed, 3,589 insertions(+), 72 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/20770/45
--
To view, visit http://gerrit.cloudera.org:8080/20770

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 45
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 44:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15501/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20770

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 44
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 13 Mar 2024 14:54:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-13 Thread Jason Fehr (Code Review)
Hello Andrew Sherman, Riza Suminto, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20770

to look at the new patch set (#44).

Change subject: IMPALA-12426: Query History Table
..

IMPALA-12426: Query History Table

Adds the ability for users to specify that Impala will create and
maintain an internal Iceberg table that contains data about all
completed queries. This table is automatically created at startup by
each coordinator if it does not exist. Then, most completed queries are
queued in memory and flushed to the query history table at a set
interval (either minutes or number of records). Set, use, and show
queries are not written to this table. This commit leverages the
InternalServer class to maintain the query history table.

Ctest unit tests have been added to assert the various pieces of code.
New custom cluster tests have been added to assert the query history
table is properly populated with completed queries.

Negative testing consists of attempting sql injection attacks and
syntactically incorrect queries.

Impala built-in string functions benchmarks have been updated to include
the new built-in functions.

Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
---
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M be/src/service/CMakeLists.txt
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/internal-server-test.cc
M be/src/service/internal-server.cc
M be/src/service/internal-server.h
M be/src/service/query-state-record.cc
A be/src/service/workload-management.cc
A be/src/service/workload-management.h
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
A be/src/util/sql-util-test.cc
A be/src/util/sql-util.cc
A be/src/util/sql-util.h
M common/thrift/BackendGflags.thrift
M common/thrift/metrics.json
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/CatalogBlacklistUtils.java
M tests/authorization/test_provider.py
M tests/beeswax/impala_beeswax.py
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_connection.py
A tests/custom_cluster/test_query_log.py
A tests/util/assert_time.py
A tests/util/memory.py
A tests/util/retry.py
A tests/util/workload_management.py
33 files changed, 3,578 insertions(+), 72 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/20770/44
--
To view, visit http://gerrit.cloudera.org:8080/20770

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 44
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12894: (part 1) Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21139 )

Change subject: IMPALA-12894: (part 1) Turn off the count(*) optimisation for 
V2 Iceberg tables
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15500/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21139

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Gerrit-Change-Number: 21139
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 13 Mar 2024 14:23:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12894: (part 1) Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-13 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21139 )

Change subject: IMPALA-12894: (part 1) Turn off the count(*) optimisation for 
V2 Iceberg tables
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21139

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Gerrit-Change-Number: 21139
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 13 Mar 2024 14:13:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12894: (part 1) Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-13 Thread Gabor Kaszab (Code Review)
Hello Daniel Becker, Zoltan Borok-Nagy, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21139

to look at the new patch set (#4).

Change subject: IMPALA-12894: (part 1) Turn off the count(*) optimisation for 
V2 Iceberg tables
..

IMPALA-12894: (part 1) Turn off the count(*) optimisation for V2 Iceberg tables

This is a part 1 change that turns off the count(*) optimisation for
V2 tables, as it has a correctness issue. The reason is that
Spark compaction may leave some dangling delete files that mess up
the logic in Impala.

Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
---
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M testdata/data/README
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/0-8-7d506ac2-9987-4514-8310-505eb02c528a-1.parquet
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/2b4453538b945045-7ba1864b_1900113267_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/3549308fee10b145-141d9f69_502574269_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/delete-3549308fee10b145-141d9f69_1919298510_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/delete-ca41ed5edf889878-632c88f10001_1119661503_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/52100098-3c71-4111-8d7e-1c02e8343a0e-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/a69c2096-fc8b-4365-8b7b-3b561afdd7e2-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/a69c2096-fc8b-4365-8b7b-3b561afdd7e2-m1.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/aa501eb1-924a-4460-a2a0-ad577de8aef5-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/aa501eb1-924a-4460-a2a0-ad577de8aef5-m1.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/aa501eb1-924a-4460-a2a0-ad577de8aef5-m2.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/aa501eb1-924a-4460-a2a0-ad577de8aef5-m3.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/f6475cdb-128e-4438-ab63-2251736670ad-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/snap-1208327814823543579-1-52100098-3c71-4111-8d7e-1c02e8343a0e.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/snap-37664836060851883-1-f6475cdb-128e-4438-ab63-2251736670ad.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/snap-5278394901353853232-1-aa501eb1-924a-4460-a2a0-ad577de8aef5.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/snap-6274599306850878811-1-a69c2096-fc8b-4365-8b7b-3b561afdd7e2.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v1.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v2.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v3.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v4.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v5.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/version-hint.text
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables-hash-join.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes-orc.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes.test
M tests/query_test/test_iceberg.py
32 files changed, 1,009 insertions(+), 244 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/21139/4
--
To view, visit http://gerrit.cloudera.org:8080/21139

[Impala-ASF-CR] IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-13 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21139 )

Change subject: IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg 
tables
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21139/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21139/3//COMMIT_MSG@7
PS3, Line 7:
nit: maybe you could include "part 1" in the title



--
To view, visit http://gerrit.cloudera.org:8080/21139

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Gerrit-Change-Number: 21139
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 13 Mar 2024 14:11:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12894: (part 1) Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21139 )

Change subject: IMPALA-12894: (part 1) Turn off the count(*) optimisation for 
V2 Iceberg tables
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21139

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Gerrit-Change-Number: 21139
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 13 Mar 2024 14:14:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12894: (part 1) Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21139 )

Change subject: IMPALA-12894: (part 1) Turn off the count(*) optimisation for 
V2 Iceberg tables
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10375/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21139

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Gerrit-Change-Number: 21139
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 13 Mar 2024 14:14:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-13 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21139 )

Change subject: IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg 
tables
..


Patch Set 3: Code-Review+1

Thanks.


--
To view, visit http://gerrit.cloudera.org:8080/21139

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Gerrit-Change-Number: 21139
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 13 Mar 2024 14:07:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-13 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21139 )

Change subject: IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg 
tables
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21139/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21139/2//COMMIT_MSG@11
PS2, Line 11:  u
> Nit: mess (plural).
Done


http://gerrit.cloudera.org:8080/#/c/21139/2/testdata/data/README
File testdata/data/README:

http://gerrit.cloudera.org:8080/#/c/21139/2/testdata/data/README@1093
PS2, Line 1093: iceberg_spark_compaction_with_dangling_delete:
> Could you also provide the SQL commands? It would then be possible to exact
Done



--
To view, visit http://gerrit.cloudera.org:8080/21139

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Gerrit-Change-Number: 21139
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 13 Mar 2024 14:05:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12264: Add limit on number of HS2 sessions per user.

2024-03-13 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21128 )

Change subject: IMPALA-12264: Add limit on number of HS2 sessions per user.
..


Patch Set 1:

(1 comment)

> Patch Set 1:
>
> CDPD-58529 / IMPALA-12264 requests that we should “prevent a rogue 
> application from repeatedly connecting to and monopolizing Impala”, and 
> suggests “we can add configurations to limit concurrent connections”. The 
> structure of thrift is such that it is hard to limit actual connections. For 
> now the plan is to limit the number of HS2 sessions per user. This is a 
> fairly straightforward change that I believe implements the main idea which 
> is to prevent rogue applications as the jira says. As with some other 
> features (e.g. -disconnected_session_timeout) this is just implemented for 
> HiveServer2. As Beeswax is now deprecated it is suggested that customers 
> disable beeswax access for clients by setting the --beeswax_port (Impala 
> Daemon Beeswax Port) flag to 0.

An HS2 session can be used across different connections. It seems a rogue 
application can still open lots of connections and saturate the fe service 
threads. I haven't looked into the thrift structure yet. Do we know the user 
info when creating the connection? Maybe we can reject connections there?

http://gerrit.cloudera.org:8080/#/c/21128/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/21128/1/be/src/service/impala-server.cc@282
PS1, Line 282: DEFINE_int32(max_hs2_sessions_per_user, 0, "The maximum allowed 
number of HiveServer2 "
nit: maybe -1 is better since it's more commonly used to mean "unlimited", 
e.g. query_log_size, catalog_operation_log_size
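
To illustrate the convention being suggested, here is a small C++ sketch of a limit check in which a negative flag value means unlimited. The helper name MayOpenSession and its parameters are hypothetical and do not come from the patch. How 0 should behave under a -1 default is also an assumption here (it is treated as allowing no sessions).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the patch under review.
// Returns true if a user who already holds 'current_sessions' sessions may
// open one more, given the configured per-user limit.
// Assumption: a negative limit (e.g. -1) means "unlimited", and 0 means that
// no sessions are allowed.
bool MayOpenSession(int32_t max_sessions_per_user, int64_t current_sessions) {
  assert(current_sessions >= 0);
  if (max_sessions_per_user < 0) return true;  // unlimited
  return current_sessions < max_sessions_per_user;
}
```

Under these assumptions, MayOpenSession(-1, n) is true for any n, while MayOpenSession(2, 2) is false.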



--
To view, visit http://gerrit.cloudera.org:8080/21128

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd28edc352102d89774f6ece5376e7c79ae41aa8
Gerrit-Change-Number: 21128
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 13 Mar 2024 14:05:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-13 Thread Gabor Kaszab (Code Review)
Hello Daniel Becker, Zoltan Borok-Nagy, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21139

to look at the new patch set (#3).

Change subject: IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg 
tables
..

IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables

This is a part 1 change that turns off the count(*) optimisation for
V2 tables, as it has a correctness issue. The reason is that
Spark compaction may leave some dangling delete files that mess up
the logic in Impala.

Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
---
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M testdata/data/README
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/0-8-7d506ac2-9987-4514-8310-505eb02c528a-1.parquet
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/2b4453538b945045-7ba1864b_1900113267_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/3549308fee10b145-141d9f69_502574269_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/delete-3549308fee10b145-141d9f69_1919298510_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/delete-ca41ed5edf889878-632c88f10001_1119661503_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/52100098-3c71-4111-8d7e-1c02e8343a0e-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/a69c2096-fc8b-4365-8b7b-3b561afdd7e2-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/a69c2096-fc8b-4365-8b7b-3b561afdd7e2-m1.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/aa501eb1-924a-4460-a2a0-ad577de8aef5-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/aa501eb1-924a-4460-a2a0-ad577de8aef5-m1.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/aa501eb1-924a-4460-a2a0-ad577de8aef5-m2.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/aa501eb1-924a-4460-a2a0-ad577de8aef5-m3.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/f6475cdb-128e-4438-ab63-2251736670ad-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/snap-1208327814823543579-1-52100098-3c71-4111-8d7e-1c02e8343a0e.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/snap-37664836060851883-1-f6475cdb-128e-4438-ab63-2251736670ad.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/snap-5278394901353853232-1-aa501eb1-924a-4460-a2a0-ad577de8aef5.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/snap-6274599306850878811-1-a69c2096-fc8b-4365-8b7b-3b561afdd7e2.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v1.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v2.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v3.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v4.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v5.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/version-hint.text
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables-hash-join.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes-orc.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes.test
M tests/query_test/test_iceberg.py
32 files changed, 1,009 insertions(+), 244 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/21139/3
--
To view, visit http://gerrit.cloudera.org:8080/21139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project:

[Impala-ASF-CR] IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21139 )

Change subject: IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg 
tables
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15499/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Gerrit-Change-Number: 21139
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 13 Mar 2024 14:03:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-13 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21139 )

Change subject: IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg 
tables
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21139/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21139/2//COMMIT_MSG@11
PS2, Line 11: es
Nit: mess (plural).


http://gerrit.cloudera.org:8080/#/c/21139/2/testdata/data/README
File testdata/data/README:

http://gerrit.cloudera.org:8080/#/c/21139/2/testdata/data/README@1093
PS2, Line 1093: iceberg_spark_compaction_with_dangling_delete:
Could you also provide the SQL commands? It would then be possible to exactly 
reproduce the table.



--
To view, visit http://gerrit.cloudera.org:8080/21139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Gerrit-Change-Number: 21139
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 13 Mar 2024 13:50:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20986 )

Change subject: IMPALA-12782: Show info of the event processing in /events webUI
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15498/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Gerrit-Change-Number: 20986
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 13 Mar 2024 13:51:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner can be scheduled to executors

2024-03-13 Thread Noemi Pap-Takacs (Code Review)
Noemi Pap-Takacs has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner can be scheduled 
to executors
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/21138/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21138/1//COMMIT_MSG@7
PS1, Line 7: scanner can be scheduled to executors
I find the title a bit confusing. Could you make it clearer whether it 
describes the incorrect or the desired behavior?
For example "...scan crashes when scheduled to executors" or "...scanner should 
be scheduled to coordinator"


http://gerrit.cloudera.org:8080/#/c/21138/1/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/21138/1/be/src/scheduling/scheduler.cc@474
PS1, Line 474: fragment.must_run_on_coordinator
The comments describe similar use cases for the first two flags. I wonder if 
this flag could be merged into 'is_coord_fragment' somehow.


http://gerrit.cloudera.org:8080/#/c/21138/1/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

http://gerrit.cloudera.org:8080/#/c/21138/1/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@709
PS1, Line 709: [%s]
> If 'mustRunOnCoord_' is true, do you think we should mention it here? It ma
I think we should not introduce an inconsistent label if we cannot display a 
valid value for every case.


http://gerrit.cloudera.org:8080/#/c/21138/1/tests/custom_cluster/test_coordinators.py
File tests/custom_cluster/test_coordinators.py:

http://gerrit.cloudera.org:8080/#/c/21138/1/tests/custom_cluster/test_coordinators.py@296
PS1, Line 296:   def test_iceberg_metadata_scan_on_coord(self):
Could you add a test where you join the metadata table with a normal table - if 
this use case makes sense?



--
To view, visit http://gerrit.cloudera.org:8080/21138
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 13 Mar 2024 13:41:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-13 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21139 )

Change subject: IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg 
tables
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21139/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21139/1//COMMIT_MSG@12
PS1, Line 12: logic
> logic?
Done


http://gerrit.cloudera.org:8080/#/c/21139/1/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/21139/1/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@1473
PS1, Line 1473:
> nit: missing space
Done



--
To view, visit http://gerrit.cloudera.org:8080/21139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Gerrit-Change-Number: 21139
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 13 Mar 2024 13:39:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-13 Thread Gabor Kaszab (Code Review)
Hello Daniel Becker, Zoltan Borok-Nagy, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21139

to look at the new patch set (#2).

Change subject: IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg 
tables
..

IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables

This is a part 1 change that turns off the count(*) optimisations for
V2 tables as there is a correctness issue with it. The reason is that
Spark compaction may leave some dangling delete files that messes up
the logic in Impala.

Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
---
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M testdata/data/README
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/0-8-7d506ac2-9987-4514-8310-505eb02c528a-1.parquet
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/2b4453538b945045-7ba1864b_1900113267_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/3549308fee10b145-141d9f69_502574269_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/delete-3549308fee10b145-141d9f69_1919298510_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/data/delete-ca41ed5edf889878-632c88f10001_1119661503_data.0.parq
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/52100098-3c71-4111-8d7e-1c02e8343a0e-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/a69c2096-fc8b-4365-8b7b-3b561afdd7e2-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/a69c2096-fc8b-4365-8b7b-3b561afdd7e2-m1.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/aa501eb1-924a-4460-a2a0-ad577de8aef5-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/aa501eb1-924a-4460-a2a0-ad577de8aef5-m1.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/aa501eb1-924a-4460-a2a0-ad577de8aef5-m2.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/aa501eb1-924a-4460-a2a0-ad577de8aef5-m3.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/f6475cdb-128e-4438-ab63-2251736670ad-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/snap-1208327814823543579-1-52100098-3c71-4111-8d7e-1c02e8343a0e.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/snap-37664836060851883-1-f6475cdb-128e-4438-ab63-2251736670ad.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/snap-5278394901353853232-1-aa501eb1-924a-4460-a2a0-ad577de8aef5.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/snap-6274599306850878811-1-a69c2096-fc8b-4365-8b7b-3b561afdd7e2.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v1.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v2.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v3.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v4.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/v5.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_spark_compaction_with_dangling_delete/metadata/version-hint.text
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables-hash-join.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes-orc.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes.test
M tests/query_test/test_iceberg.py
32 files changed, 998 insertions(+), 244 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/21139/2
--
To view, visit http://gerrit.cloudera.org:8080/21139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project:

[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI

2024-03-13 Thread Quanlong Huang (Code Review)
Hello k.venureddy2...@gmail.com, Sai Hemanth Gantasala, Csaba Ringhofer, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20986

to look at the new patch set (#8).

Change subject: IMPALA-12782: Show info of the event processing in /events webUI
..

IMPALA-12782: Show info of the event processing in /events webUI

The /events page of catalogd shows the metrics and status of the
event-processor. This patch adds more info in this page, including
 - lag info
 - current event batch that's being processing
See the screenshot attached in the JIRA for how it looks like.

Also moves the error message to the top to highlight the error status.

Adds a debug action, catalogd_event_processing_delay, to inject a sleep
while processing an event. So the web page can be captured more easily.

Also adds a missing test for showing the error message of
event-processing in the /events page.

Tests:
 - Add e2e test to verify the content of the page.

Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
---
M be/src/catalog/catalog-server.cc
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M tests/custom_cluster/test_web_pages.py
M www/events.tmpl
7 files changed, 296 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/20986/8
--
To view, visit http://gerrit.cloudera.org:8080/20986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Gerrit-Change-Number: 20986
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI

2024-03-13 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20986 )

Change subject: IMPALA-12782: Show info of the event processing in /events webUI
..


Patch Set 8:

PS8 adds a new test to verify the error message for exceptions.


--
To view, visit http://gerrit.cloudera.org:8080/20986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Gerrit-Change-Number: 20986
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 13 Mar 2024 13:29:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-13 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21143 )

Change subject: [WIP] IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..


Patch Set 1:

(6 comments)

LGTM, just have some minor comments to make this a FE-only change.

http://gerrit.cloudera.org:8080/#/c/21143/1/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/21143/1/be/src/catalog/catalog-server.cc@192
PS1, Line 192: DEFINE_bool_hidden(is_return_empty_partition_values, false, 
"This configuration is used "
We don't need this for FE tests. We can use the debug_actions flag and add an 
action appropriately.


http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2846
PS1, Line 2846:   public int reloadPartitionsFromNames(IMetaStoreClient client,
nit: it seems this is only used by the event-processor. We can pass in the 
eventId and eventType to improve the logging if that's not too hard.


http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2858
PS1, Line 2858: warn
nit: error() seems more suitable. This can also be simplified as

  LOG.error("Received partition with empty values: {}. \nIgnoring" +
  " reloading the partition.", partition);


http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2891
PS1, Line 2891:   public int reloadPartitionsFromEvent(IMetaStoreClient client,
nit: we can pass in the eventId and eventType to improve the logging


http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:

http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/service/BackendConfig.java@461
PS1, Line 461:   public String debugActions() { return 
backendCfg_.debug_actions; }
We can add setDebugAction() for the new FE test.


http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
File fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java:

http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@212
PS1, Line 212: if 
(BackendConfig.INSTANCE.getIsReturnEmptyPartitionValues()) {
and use the debug action here, like

  DebugUtils.hasDebugAction(BackendConfig.INSTANCE.debugActions(),
      DebugUtils.MOCK_EMPTY_PARTITION_VALUES)
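
For readers unfamiliar with the debug-action approach suggested above, here is a minimal stand-alone sketch of what such a lookup could look like. It is not Impala's DebugUtils code. The class name, the label value, and the assumed flag format (entries separated by '|', each entry starting with a label followed by ':') are all assumptions made for illustration.

```java
// Hypothetical stand-in for a debug-action lookup; not the real DebugUtils.
final class DebugActionSketch {
  // Assumed label value; the review only names the constant, not its value.
  static final String MOCK_EMPTY_PARTITION_VALUES = "mock_empty_partition_values";

  // Returns true if any '|'-separated entry starts with the given label.
  // The "label:action" entry layout is an assumption of this sketch.
  static boolean hasDebugAction(String debugActions, String label) {
    if (debugActions == null || debugActions.isEmpty()) return false;
    for (String entry : debugActions.split("\\|")) {
      String entryLabel = entry.split(":", 2)[0].trim();
      if (entryLabel.equalsIgnoreCase(label)) return true;
    }
    return false;
  }

  public static void main(String[] args) {
    String flag = "some_other_action:x|mock_empty_partition_values:true";
    System.out.println(hasDebugAction(flag, MOCK_EMPTY_PARTITION_VALUES));
  }
}
```

The appeal of this style is that a test-only behavior is switched on through one existing string flag rather than a new dedicated backend flag, which is what would keep the change FE-only.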



--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 13 Mar 2024 12:44:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20986 )

Change subject: IMPALA-12782: Show info of the event processing in /events webUI
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15497/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Gerrit-Change-Number: 20986
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 13 Mar 2024 12:02:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI

2024-03-13 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20986 )

Change subject: IMPALA-12782: Show info of the event processing in /events webUI
..


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/20986/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/20986/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1192
PS3, Line 1192:* @throws MetastoreNotificationException
> Sorry i didn't realise that resetProgress() was called in finally. It is no
I removed the call in finally since it would reset currentEvent_ on exceptions, 
which leaves no event details to show in the error message. So we still need 
this here.


http://gerrit.cloudera.org:8080/#/c/20986/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/20986/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1234
PS6, Line 1234:   String desc = String.format("Processing %s on %s, 
eventId=%d",
> Nit: we could probably remove L#1195 and L#1226 and add that here.
It's intentional not to reset the progress info when hitting exceptions, so 
currentEvent_ won't be null and can be used in the error message at L1001.
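
The effect described in the two replies above can be reproduced in isolation. The following is a simplified sketch, not the MetastoreEventsProcessor code: the class, the method names, and the string used for the event are hypothetical, and only the field stands in for currentEvent_.

```java
// Hypothetical model of "reset in finally" versus "reset on success only".
final class ProgressResetSketch {
  private String currentEvent;  // stands in for currentEvent_

  private void process(String event, boolean fail, boolean resetInFinally) {
    currentEvent = event;
    try {
      if (fail) throw new IllegalStateException("simulated failure");
      if (!resetInFinally) currentEvent = null;  // reset on success only
    } finally {
      if (resetInFinally) currentEvent = null;   // also runs when an exception is thrown
    }
  }

  // Builds the error message after the exception has propagated, as a caller would.
  String run(String event, boolean fail, boolean resetInFinally) {
    try {
      process(event, fail, resetInFinally);
      return "OK";
    } catch (IllegalStateException e) {
      return "Error while processing event: " + currentEvent;
    }
  }

  public static void main(String[] args) {
    System.out.println(new ProgressResetSketch().run("ALTER_TABLE id=42", true, true));
    System.out.println(new ProgressResetSketch().run("ALTER_TABLE id=42", true, false));
  }
}
```

With the reset in finally, the field is already cleared by the time the caller formats the message, so the event details are lost. Resetting only on the success path keeps them available.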


http://gerrit.cloudera.org:8080/#/c/20986/6/www/events.tmpl
File www/events.tmpl:

http://gerrit.cloudera.org:8080/#/c/20986/6/www/events.tmpl@24
PS6, Line 24: {{event_processor_error_msg}}
> nit: will we have event-id and decompressed event message in the event mess
Yes, but with the compressed event message. There is an example in 
IMPALA-12053. This is an existing feature added by IMPALA-12053; this patch 
just moves it to the top to highlight the error.


http://gerrit.cloudera.org:8080/#/c/20986/6/www/events.tmpl@43
PS6, Line 43: Latest Event in Metastore
> nit: should we say "Latest Event in Metastore"?
Done



--
To view, visit http://gerrit.cloudera.org:8080/20986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Gerrit-Change-Number: 20986
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 13 Mar 2024 11:39:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI

2024-03-13 Thread Quanlong Huang (Code Review)
Hello k.venureddy2...@gmail.com, Sai Hemanth Gantasala, Csaba Ringhofer, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20986

to look at the new patch set (#7).

Change subject: IMPALA-12782: Show info of the event processing in /events webUI
..

IMPALA-12782: Show info of the event processing in /events webUI

The /events page of catalogd shows the metrics and status of the
event-processor. This patch adds more info in this page, including
 - lag info
 - current event batch that's being processing
See the screenshot attached in the JIRA for how it looks like.

Also moves the error message to the top to highlight the error status.

Adds a debug action, catalogd_event_processing_delay, to inject a sleep
while processing an event. So the web page can be captured more easily.

Tests:
 - Add e2e test to verify the content of the page.

Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
---
M be/src/catalog/catalog-server.cc
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M tests/custom_cluster/test_web_pages.py
M www/events.tmpl
7 files changed, 279 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/20986/7
--
To view, visit http://gerrit.cloudera.org:8080/20986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Gerrit-Change-Number: 20986
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-13 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21019 )

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG@14
PS6, Line 14: =true
nit: remove "=true" ?


http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG@15
PS6, Line 15: Also introduced a small
: optimization to skip reloading of table schema
nit: this is stale now


http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1936
PS6, Line 1936:   "whitelisted config: {}, value before: {}, value 
after: {}", dbName_,
nit: let's also add "So file metadata should be reloaded" here.


http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1971
PS6, Line 1971:   return false;
So we return false for other unknown cases. In the future if a new field is 
added to the definition of SD (e.g. in a higher Hive version) and the change of 
it is non-trivial (requires file metadata reloading), it will be skipped here 
(considered as trivial change). This is the case mentioned by Csaba that we 
need to address.

As this method is only used as !isNonTrivialSdPropsChanged() at line 1886, we 
can convert it to isTrivialSdPropsChanged() and only return true for known 
cases (i.e. changes on location, input/output format, serde, 
storedAsSubDirectories).

The difference is that canSkipFileMetadataReload() will only return true for 
known cases.
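
The "return true only for known cases" idea can be sketched independently of Hive's StorageDescriptor. In the sketch below a descriptor is modelled as a plain map from field name to value. The class name and the field names are placeholders rather than real StorageDescriptor fields, and which fields belong in the known set is an assumption left open here.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical allowlist check; a descriptor is modelled as field name -> value.
final class TrivialChangeSketch {
  // Placeholder name for a field whose change is known to need no file metadata reload.
  static final Set<String> KNOWN_TRIVIAL_FIELDS = Set.of("known_trivial_field");

  // Returns true only if every field that differs is in the allowlist.
  static boolean isTrivialChange(Map<String, String> before, Map<String, String> after) {
    Set<String> fields = new HashSet<>(before.keySet());
    fields.addAll(after.keySet());
    for (String field : fields) {
      boolean changed = !Objects.equals(before.get(field), after.get(field));
      // Any changed field outside the allowlist, including one added later, forces a reload.
      if (changed && !KNOWN_TRIVIAL_FIELDS.contains(field)) return false;
    }
    return true;
  }

  public static void main(String[] args) {
    Map<String, String> before = Map.of("known_trivial_field", "a");
    Map<String, String> after = Map.of("known_trivial_field", "a", "field_added_later", "v");
    System.out.println(isTrivialChange(before, after));
  }
}
```

The design point is the default: a field nobody has classified yet falls on the "reload" side, so a future addition to the descriptor cannot silently be treated as trivial.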


http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3050
PS6, Line 3050: assertNotEquals(fileMetadataLoadAfter + 1, 
fileMetadataLoadBefore);
Why do we change this? fileMetadataLoadAfter >= fileMetadataLoadBefore is 
always true, so fileMetadataLoadAfter + 1 != fileMetadataLoadBefore is always 
true. The assertion becomes meaningless.

Do you want to use assertEquals(fileMetadataLoadAfter, fileMetadataLoadBefore + 
1) instead?
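
The point about the assertion can be checked with plain arithmetic. This sketch avoids any JUnit dependency and models the two assertions as boolean checks; the class and method names are hypothetical, and the counter values are made up.

```java
// Hypothetical stand-alone comparison of the two assertion shapes.
final class ReloadCounterAssertionSketch {
  // Mirrors assertNotEquals(after + 1, before): passes whenever the two values differ.
  static boolean weakCheck(long before, long after) {
    return after + 1 != before;
  }

  // Mirrors assertEquals(after, before + 1): passes only for exactly one reload.
  static boolean strictCheck(long before, long after) {
    return after == before + 1;
  }

  public static void main(String[] args) {
    long before = 10;
    for (long reloads = 0; reloads <= 3; reloads++) {
      long after = before + reloads;
      System.out.printf("reloads=%d weak=%b strict=%b%n",
          reloads, weakCheck(before, after), strictCheck(before, after));
    }
  }
}
```

Because the counter never decreases, the weak form passes for zero, one, or many reloads, so it cannot detect a regression. The strict form distinguishes exactly one reload from every other outcome.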



--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 6
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 13 Mar 2024 11:13:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12693: [DOCS] Typo in link for ltrim in string functions docs

2024-03-13 Thread Daniel Becker (Code Review)
Daniel Becker has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21123 )

Change subject: IMPALA-12693: [DOCS] Typo in link for ltrim in string functions 
docs
..

IMPALA-12693: [DOCS] Typo in link for ltrim in string functions docs

Fixed documentation typo for LTRIM string function, from LTRI to LTRIM.

Change-Id: If4345fc6d19f04d0c0c6feef3e0c8598271224fe
Reviewed-on: http://gerrit.cloudera.org:8080/21123
Tested-by: Impala Public Jenkins 
Reviewed-by: Daniel Becker 
---
M docs/topics/impala_string_functions.xml
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Verified
  Daniel Becker: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/21123

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If4345fc6d19f04d0c0c6feef3e0c8598271224fe
Gerrit-Change-Number: 21123
Gerrit-PatchSet: 5
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-12693: [DOCS] Typo in link for ltrim in string functions docs

2024-03-13 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21123 )

Change subject: IMPALA-12693: [DOCS] Typo in link for ltrim in string functions 
docs
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21123

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4345fc6d19f04d0c0c6feef3e0c8598271224fe
Gerrit-Change-Number: 21123
Gerrit-PatchSet: 4
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 13 Mar 2024 09:20:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12693: [DOCS] Typo in link for ltrim in string functions docs

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21123 )

Change subject: IMPALA-12693: [DOCS] Typo in link for ltrim in string functions 
docs
..


Patch Set 4: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/757/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/21123

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4345fc6d19f04d0c0c6feef3e0c8598271224fe
Gerrit-Change-Number: 21123
Gerrit-PatchSet: 4
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 13 Mar 2024 08:36:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12693: [DOCS] Typo in link for ltrim in string functions docs

2024-03-13 Thread Saurabh Katiyal (Code Review)
Saurabh Katiyal has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/21123 )

Change subject: IMPALA-12693: [DOCS] Typo in link for ltrim in string functions 
docs
..

IMPALA-12693: [DOCS] Typo in link for ltrim in string functions docs

Fixed documentation typo for LTRIM string function, from LTRI to LTRIM.

Change-Id: If4345fc6d19f04d0c0c6feef3e0c8598271224fe
---
M docs/topics/impala_string_functions.xml
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/21123/4
--
To view, visit http://gerrit.cloudera.org:8080/21123

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If4345fc6d19f04d0c0c6feef3e0c8598271224fe
Gerrit-Change-Number: 21123
Gerrit-PatchSet: 4
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-12693: [DOCS] Typo in link for ltrim in string functions docs

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21123 )

Change subject: IMPALA-12693: [DOCS] Typo in link for ltrim in string functions 
docs
..


Patch Set 4:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/757/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/21123

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4345fc6d19f04d0c0c6feef3e0c8598271224fe
Gerrit-Change-Number: 21123
Gerrit-PatchSet: 4
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 13 Mar 2024 08:29:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12693: [DOCS] Typo in link for ltrim in string functions docs

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21123 )

Change subject: IMPALA-12693: [DOCS] Typo in link for ltrim in string functions 
docs
..


Patch Set 2: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/756/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/21123

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4345fc6d19f04d0c0c6feef3e0c8598271224fe
Gerrit-Change-Number: 21123
Gerrit-PatchSet: 2
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 13 Mar 2024 08:26:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12693: [DOCS] Typo in link for ltrim in string functions docs

2024-03-13 Thread Saurabh Katiyal (Code Review)
Saurabh Katiyal has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/21123 )

Change subject: IMPALA-12693: [DOCS] Typo in link for ltrim in string functions 
docs
..

IMPALA-12693: [DOCS] Typo in link for ltrim in string functions docs

fixed documentation typo for LTRIM string function, from LTRI to LTRIM

Change-Id: If4345fc6d19f04d0c0c6feef3e0c8598271224fe
---
M docs/topics/impala_string_functions.xml
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/21123/2
--
To view, visit http://gerrit.cloudera.org:8080/21123

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If4345fc6d19f04d0c0c6feef3e0c8598271224fe
Gerrit-Change-Number: 21123
Gerrit-PatchSet: 2
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-12693: [DOCS] Typo in link for ltrim in string functions docs

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21123 )

Change subject: IMPALA-12693: [DOCS] Typo in link for ltrim in string functions 
docs
..


Patch Set 2: -Verified

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/756/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/21123

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4345fc6d19f04d0c0c6feef3e0c8598271224fe
Gerrit-Change-Number: 21123
Gerrit-PatchSet: 2
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 13 Mar 2024 08:19:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21141

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 13 Mar 2024 08:01:50 +
Gerrit-HasComments: No