[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 13: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21138
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 13
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 29 Mar 2024 04:40:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..

IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

On clusters with dedicated coordinators and executors the Iceberg
metadata scanner fragment(s) can be scheduled to executors, for example
during a join. The fragment in this case will fail a precondition check,
because either the 'frontend_' object or the table will not be present.

This change forces Iceberg metadata scanner fragments to be scheduled on
the coordinator. It is not enough to set the DataPartition type to
UNPARTITIONED, because unpartitioned fragments can still be scheduled on
executors. This change introduces a new flag in the TPlanFragment thrift
struct - if it is true, the fragment is always scheduled on the
coordinator.
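
The placement rule described above can be illustrated with a small toy
model. This is only a sketch under stated assumptions, not the Impala
scheduler: the Fragment class, the coordinator_only field name and the
choose_hosts function are hypothetical names invented for this example,
and the real flag lives in the TPlanFragment thrift struct.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fragment:
    """Simplified stand-in for a plan fragment (not the real TPlanFragment)."""
    name: str
    unpartitioned: bool = False
    # Hypothetical name for the new flag described in the commit message.
    coordinator_only: bool = False


def choose_hosts(fragment: Fragment, coordinator: str,
                 executors: List[str]) -> List[str]:
    """Return the hosts a fragment would run on in this toy model."""
    if fragment.coordinator_only:
        # The flag overrides every other placement rule.
        return [coordinator]
    if fragment.unpartitioned:
        # Unpartitioned means a single instance, but not necessarily on the
        # coordinator: with dedicated executors it may land on one of them.
        return [executors[0]] if executors else [coordinator]
    # Partitioned fragments are spread over the executors.
    return list(executors) if executors else [coordinator]
```

In this model an unpartitioned fragment without the flag still ends up on
an executor, which is the failure case the change addresses. Setting the
flag pins the fragment to the coordinator regardless of its partitioning.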

Testing:
 - Added a regression test in test_coordinators.py.
 - Added a new planner test with two metadata tables and a regular table
   joined together.

Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Reviewed-on: http://gerrit.cloudera.org:8080/21138
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/scheduling/schedule-state.cc
M be/src/scheduling/schedule-state.h
M be/src/scheduling/scheduler.cc
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-joined-with-regular-table.test
M tests/custom_cluster/test_coordinators.py
9 files changed, 175 insertions(+), 15 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 14
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15730/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 12
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 29 Mar 2024 00:02:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-28 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..

IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

On clusters with dedicated coordinators and executors the Iceberg
metadata scanner fragment(s) can be scheduled to executors, for example
during a join. The fragment in this case will fail a precondition check,
because either the 'frontend_' object or the table will not be present.

This change forces Iceberg metadata scanner fragments to be scheduled on
the coordinator. It is not enough to set the DataPartition type to
UNPARTITIONED, because unpartitioned fragments can still be scheduled on
executors. This change introduces a new flag in the TPlanFragment thrift
struct - if it is true, the fragment is always scheduled on the
coordinator.

Testing:
 - Added a regression test in test_coordinators.py.
 - Added a new planner test with two metadata tables and a regular table
   joined together.

Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
---
M be/src/scheduling/schedule-state.cc
M be/src/scheduling/schedule-state.h
M be/src/scheduling/scheduler.cc
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-joined-with-regular-table.test
M tests/custom_cluster/test_coordinators.py
9 files changed, 175 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/21138/12
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 12
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 13:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10461/ 
DRY_RUN=false


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 13
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 28 Mar 2024 23:39:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 13: Code-Review+2


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 13
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 28 Mar 2024 23:39:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 11: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10451/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 11
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 28 Mar 2024 18:12:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15708/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 10
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 28 Mar 2024 13:41:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-28 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 10: Code-Review+2

(1 comment)

Carrying +2.

http://gerrit.cloudera.org:8080/#/c/21138/10/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/21138/10/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@1358
PS10, Line 1358: "functional_parquet", options);
Removed the option PlannerTestOption.VALIDATE_CARDINALITY as the test kept 
failing because of small variations in cardinality. We are mostly interested in 
the number of hosts on which the metadata scanning fragments run, and those are 
still checked.
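
A check on the number of hosts per fragment, independent of cardinality,
could look roughly like the sketch below. It assumes fragment header
lines of the form "F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1".
The sample plan text is made up for illustration and is not taken from
the test file, and hosts_per_fragment is a hypothetical helper name.

```python
import re
from typing import Dict

# Assumed header format; the sample text below is invented for illustration.
FRAGMENT_HEADER = re.compile(
    r"^(F\d+):PLAN FRAGMENT \[([^\]]*)\] hosts=(\d+) instances=(\d+)")


def hosts_per_fragment(plan_text: str) -> Dict[str, int]:
    """Map each fragment id to the host count found in its header line."""
    hosts = {}
    for line in plan_text.splitlines():
        match = FRAGMENT_HEADER.match(line.strip())
        if match:
            hosts[match.group(1)] = int(match.group(3))
    return hosts


# Invented example: one partitioned scan and several single-host fragments.
SAMPLE_PLAN = """\
F03:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
"""
```

Comparing only these counts keeps the check stable when cardinality
estimates drift slightly between runs.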



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 10
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 28 Mar 2024 11:33:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-28 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..

IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

On clusters with dedicated coordinators and executors the Iceberg
metadata scanner fragment(s) can be scheduled to executors, for example
during a join. The fragment in this case will fail a precondition check,
because either the 'frontend_' object or the table will not be present.

This change forces Iceberg metadata scanner fragments to be scheduled on
the coordinator. It is not enough to set the DataPartition type to
UNPARTITIONED, because unpartitioned fragments can still be scheduled on
executors. This change introduces a new flag in the TPlanFragment thrift
struct - if it is true, the fragment is always scheduled on the
coordinator.

Testing:
 - Added a regression test in test_coordinators.py.
 - Added a new planner test with two metadata tables and a regular table
   joined together.

Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
---
M be/src/scheduling/schedule-state.cc
M be/src/scheduling/schedule-state.h
M be/src/scheduling/scheduler.cc
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-joined-with-regular-table.test
M tests/custom_cluster/test_coordinators.py
9 files changed, 174 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/21138/10
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 10
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10451/ 
DRY_RUN=false


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 11
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 28 Mar 2024 11:33:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 11: Code-Review+2


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 11
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 28 Mar 2024 11:33:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 9: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10439/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 9
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 27 Mar 2024 19:51:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15691/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 9
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 27 Mar 2024 15:10:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-27 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..

IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

On clusters with dedicated coordinators and executors the Iceberg
metadata scanner fragment(s) can be scheduled to executors, for example
during a join. The fragment in this case will fail a precondition check,
because either the 'frontend_' object or the table will not be present.

This change forces Iceberg metadata scanner fragments to be scheduled on
the coordinator. It is not enough to set the DataPartition type to
UNPARTITIONED, because unpartitioned fragments can still be scheduled on
executors. This change introduces a new flag in the TPlanFragment thrift
struct - if it is true, the fragment is always scheduled on the
coordinator.

Testing:
 - Added a regression test in test_coordinators.py.
 - Added a new planner test with two metadata tables and a regular table
   joined together.

Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
---
M be/src/scheduling/schedule-state.cc
M be/src/scheduling/schedule-state.h
M be/src/scheduling/scheduler.cc
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-joined-with-regular-table.test
M tests/custom_cluster/test_coordinators.py
9 files changed, 175 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/21138/9
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 9
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10439/ 
DRY_RUN=false


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 9
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 27 Mar 2024 14:48:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 7: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10420/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 7
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 25 Mar 2024 14:43:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15647/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 6
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 25 Mar 2024 09:58:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-25 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 6: Code-Review+2

Carrying +2.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 6
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 25 Mar 2024 09:35:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-25 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..

IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

On clusters with dedicated coordinators and executors the Iceberg
metadata scanner fragment(s) can be scheduled to executors, for example
during a join. The fragment in this case will fail a precondition check,
because either the 'frontend_' object or the table will not be present.

This change forces Iceberg metadata scanner fragments to be scheduled on
the coordinator. It is not enough to set the DataPartition type to
UNPARTITIONED, because unpartitioned fragments can still be scheduled on
executors. This change introduces a new flag in the TPlanFragment thrift
struct - if it is true, the fragment is always scheduled on the
coordinator.

Testing:
 - Added a regression test in test_coordinators.py.
 - Added a new planner test with two metadata tables and a regular table
   joined together.

Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
---
M be/src/scheduling/schedule-state.cc
M be/src/scheduling/schedule-state.h
M be/src/scheduling/scheduler.cc
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-joined-with-regular-table.test
M tests/custom_cluster/test_coordinators.py
9 files changed, 175 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/21138/6
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 6
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10420/ 
DRY_RUN=false


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 7
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 25 Mar 2024 09:36:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 7: Code-Review+2


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 7
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 25 Mar 2024 09:36:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10412/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 22 Mar 2024 20:47:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10412/ 
DRY_RUN=false


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 22 Mar 2024 15:39:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15627/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 22 Mar 2024 15:39:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 5: Code-Review+2


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 22 Mar 2024 15:39:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-22 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 4: Code-Review+2


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 22 Mar 2024 15:21:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-22 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..

IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the 
coordinator

On clusters with dedicated coordinators and executors the Iceberg
metadata scanner fragment(s) can be scheduled to executors, for example
during a join. The fragment in this case will fail a precondition check,
because either the 'frontend_' object or the table will not be present.

This change forces Iceberg metadata scanner fragments to be scheduled on
the coordinator. It is not enough to set the DataPartition type to
UNPARTITIONED, because unpartitioned fragments can still be scheduled on
executors. This change introduces a new flag in the TPlanFragment thrift
struct - if it is true, the fragment is always scheduled on the
coordinator.
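
The placement rule described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the scheduler code from be/src/scheduling/scheduler.cc: the Fragment class, the candidate_hosts() helper, the host names, and the coordinator_only field name are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fragment:
    name: str
    unpartitioned: bool             # stands in for DataPartition.UNPARTITIONED
    coordinator_only: bool = False  # stands in for the new TPlanFragment flag (name assumed)


def candidate_hosts(fragment: Fragment, coordinator: str, executors: List[str]) -> List[str]:
    """Hosts this toy scheduler may pick for one instance of the fragment."""
    if fragment.coordinator_only:
        # The flag pins the fragment to the coordinator.
        return [coordinator]
    # Without the flag, even an unpartitioned (single-instance) fragment
    # may be placed on any executor.
    return list(executors)


executors = ["executor-1", "executor-2"]
unflagged = Fragment("iceberg-metadata-scan", unpartitioned=True)
flagged = Fragment("iceberg-metadata-scan", unpartitioned=True, coordinator_only=True)

print(candidate_hosts(unflagged, "coordinator", executors))
print(candidate_hosts(flagged, "coordinator", executors))
```

In this model the unflagged fragment can only land on an executor even though it is unpartitioned, which is the failure case described above; the flagged fragment has the coordinator as its only candidate.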

Testing:
 - Added a regression test in test_coordinators.py.
 - Added a new planner test with two metadata tables and a regular table
   joined together.

Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
---
M be/src/scheduling/schedule-state.cc
M be/src/scheduling/schedule-state.h
M be/src/scheduling/scheduler.cc
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-joined-with-regular-table.test
M tests/custom_cluster/test_coordinators.py
9 files changed, 175 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/21138/4
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-22 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21138/3/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

http://gerrit.cloudera.org:8080/#/c/21138/3/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@192
PS3, Line 192: // Coordinator-only fragments must be unpartitioned as there is only one instance of
 : // them.
> Could you please add a comment for this?
Done



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 22 Mar 2024 15:15:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-22 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 4:

Also added a planner test.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 22 Mar 2024 15:15:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-22 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 3:

(1 comment)

Just quickly went over the code. Looks good overall, but could you please add 
planner tests?

http://gerrit.cloudera.org:8080/#/c/21138/3/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

http://gerrit.cloudera.org:8080/#/c/21138/3/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@192
PS3, Line 192: Preconditions.checkState(!coordinatorOnly ||
 : dataPartition_.equals(DataPartition.UNPARTITIONED));
Could you please add a comment for this?



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 22 Mar 2024 10:54:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15624/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 22 Mar 2024 10:26:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-22 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..

IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the 
coordinator

On clusters with dedicated coordinators and executors the Iceberg
metadata scanner fragment(s) can be scheduled to executors, for example
during a join. The fragment in this case will fail a precondition check,
because either the 'frontend_' object or the table will not be present.

This change forces Iceberg metadata scanner fragments to be scheduled on
the coordinator. It is not enough to set the DataPartition type to
UNPARTITIONED, because unpartitioned fragments can still be scheduled on
executors. This change introduces a new flag in the TPlanFragment thrift
struct - if it is true, the fragment is always scheduled on the
coordinator.

Testing:
 - Added a regression test in test_coordinators.py.

Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
---
M be/src/scheduling/schedule-state.cc
M be/src/scheduling/schedule-state.h
M be/src/scheduling/scheduler.cc
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M tests/custom_cluster/test_coordinators.py
7 files changed, 64 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/21138/3
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-22 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21138/2/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

http://gerrit.cloudera.org:8080/#/c/21138/2/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@93
PS2, Line 93: mustRunOnCoord_
> nit: 'coordinatorOnly_' ?
Done



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 22 Mar 2024 10:02:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-21 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/21138/2/be/src/scheduling/schedule-state.h
File be/src/scheduling/schedule-state.h:

http://gerrit.cloudera.org:8080/#/c/21138/2/be/src/scheduling/schedule-state.h@107
PS2, Line 107:   bool is_root_coord_fragment;
Now that we have two similar things, 'is_root_coord_fragment' and 
'fragment.must_run_on_coordinator', could you add a comment here to make it 
clearer what this is for?


http://gerrit.cloudera.org:8080/#/c/21138/2/common/thrift/Planner.thrift
File common/thrift/Planner.thrift:

http://gerrit.cloudera.org:8080/#/c/21138/2/common/thrift/Planner.thrift@51
PS2, Line 51:   15: required bool must_run_on_coordinator
I know the order of the IDs is confusing now, but I feel this would be better 
at the end of this struct.


http://gerrit.cloudera.org:8080/#/c/21138/2/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

http://gerrit.cloudera.org:8080/#/c/21138/2/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@93
PS2, Line 93: mustRunOnCoord_
nit: 'coordinatorOnly_' ?


http://gerrit.cloudera.org:8080/#/c/21138/2/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@193
PS2, Line 193: outputPartition_.equals(DataPartition.UNPARTITIONED));
'outputPartition_' is unconditionally set to UNPARTITIONED in this function, so 
this check doesn't make much sense. Also, I'm not sure we want to make this a 
constructor param. Can we default it to false and set it to true in a 
setMustRunOnCoord() or similar? We can then assert on the 'outputPartition_' there.
Or did you mean 'dataPartition_'?
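
The setter-based alternative suggested here could look roughly like the sketch below. It is a simplified Python model, not the Java code in PlanFragment.java; the class layout, the set_coordinator_only() name, and the HASH_PARTITIONED member are assumptions made for illustration. The check is placed on the data partition, which is one reading of the question above.

```python
from enum import Enum


class DataPartition(Enum):
    UNPARTITIONED = "UNPARTITIONED"
    HASH_PARTITIONED = "HASH_PARTITIONED"  # illustrative second kind


class PlanFragment:
    def __init__(self, data_partition: DataPartition) -> None:
        self.data_partition = data_partition
        self.coordinator_only = False  # defaults to false, as suggested above

    def set_coordinator_only(self) -> None:
        # A coordinator-only fragment has exactly one instance, so it
        # must be unpartitioned; reject anything else.
        if self.data_partition is not DataPartition.UNPARTITIONED:
            raise ValueError("coordinator-only fragments must be UNPARTITIONED")
        self.coordinator_only = True


fragment = PlanFragment(DataPartition.UNPARTITIONED)
fragment.set_coordinator_only()
print(fragment.coordinator_only)
```

Keeping the flag out of the constructor means existing call sites stay unchanged, and the invariant is checked at the single place where the flag is turned on.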



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 21 Mar 2024 15:04:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-19 Thread Noemi Pap-Takacs (Code Review)
Noemi Pap-Takacs has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 2: Code-Review+1

Thanks, Daniel!
I like the new 'is_root_coord_fragment' naming; it makes the use case 
clearer.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 19 Mar 2024 09:45:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15504/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 13 Mar 2024 17:26:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

2024-03-13 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner should always be 
scheduled to the coordinator
..

IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the 
coordinator

On clusters with dedicated coordinators and executors the Iceberg
metadata scanner fragment(s) can be scheduled to executors, for example
during a join. The fragment in this case will fail a precondition check,
because either the 'frontend_' object or the table will not be present.

This change forces Iceberg metadata scanner fragments to be scheduled on
the coordinator. It is not enough to set the DataPartition type to
UNPARTITIONED, because unpartitioned fragments can still be scheduled on
executors. This change introduces a new flag in the TPlanFragment thrift
struct - if it is true, the fragment is always scheduled on the
coordinator.

Testing:
 - Added a regression test in test_coordinators.py.

Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
---
M be/src/scheduling/schedule-state.cc
M be/src/scheduling/schedule-state.h
M be/src/scheduling/scheduler.cc
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M tests/custom_cluster/test_coordinators.py
7 files changed, 63 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/21138/2
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy