[
https://issues.apache.org/jira/browse/IMPALA-14638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18048257#comment-18048257
]
ASF subversion and git services commented on IMPALA-14638:
----------------------------------------------------------
Commit 3a5a6f612a332fc509cfdc73c4566356a00ac730 in impala's branch
refs/heads/master from Michael Smith
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=3a5a6f612 ]
IMPALA-14638: Schedule union of iceberg metadata scanner to coordinator
On clusters with dedicated coordinators and executors the Iceberg
metadata scanner fragment(s) must be scheduled to coordinators.
IMPALA-12809 ensured this for most plans, but if the Iceberg metadata
scanner is part of a union of unpartitioned fragments, a new fragment is
created for the union that subsumes the existing fragments and loses the
coordinatorOnly flag.
Fixes cases where a multi-fragment plan includes a union of iceberg
metadata scans by setting coordinatorOnly on the new union fragment.
Adds new planner and runtime tests for this case.
Change-Id: If2f19945037b4a7a6433cd9c6e7e2b352fae7356
Reviewed-on: http://gerrit.cloudera.org:8080/23803
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
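The fix described in the commit message above can be sketched roughly as follows. This is a hypothetical illustration only: the `Fragment` and `UnionPlanner` names and the `createUnionFragment` method are invented for this sketch and are not Impala's actual planner API. The idea is that when a new union fragment subsumes unpartitioned input fragments, it must inherit the coordinator-only property if any input carries it.

```java
import java.util.List;

// Hypothetical sketch; class and method names are illustrative,
// NOT Impala's actual planner API.
class Fragment {
    final String id;
    final boolean coordinatorOnly;  // true for e.g. an Iceberg metadata scan

    Fragment(String id, boolean coordinatorOnly) {
        this.id = id;
        this.coordinatorOnly = coordinatorOnly;
    }
}

class UnionPlanner {
    // When a union of unpartitioned fragments is collapsed into a single
    // new fragment, that fragment must remain coordinator-only if any of
    // the subsumed inputs was coordinator-only; otherwise the scheduler
    // is free to place it on a dedicated executor (the bug above).
    static Fragment createUnionFragment(List<Fragment> inputs) {
        boolean coordOnly = inputs.stream().anyMatch(f -> f.coordinatorOnly);
        return new Fragment("F_union", coordOnly);
    }

    public static void main(String[] args) {
        Fragment metadataScan = new Fragment("F01", true);  // SCAN ICEBERG METADATA
        Fragment otherInput = new Fragment("F02", false);
        Fragment union = createUnionFragment(List.of(metadataScan, otherInput));
        System.out.println(union.coordinatorOnly); // prints "true"
    }
}
```

Without the `anyMatch` propagation, the new union fragment would default to schedulable-anywhere, which matches the F03-on-executor behavior in the logs quoted below.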
> Iceberg metadata scan can still be scheduled to dedicated executor
> ------------------------------------------------------------------
>
> Key: IMPALA-14638
> URL: https://issues.apache.org/jira/browse/IMPALA-14638
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 4.4.0
> Reporter: Michael Smith
> Assignee: Michael Smith
> Priority: Major
> Attachments: profile_794f7e240f4d2a7b_895a9d4200000000.txt
>
>
> IMPALA-12809 attempted to ensure that fragments containing SCAN ICEBERG METADATA
> are always scheduled to a coordinator. However, that fix appears to be
> incomplete.
> With {{start-impala-cluster.py --num_coordinators=1
> --use_exclusive_coordinators}}, the following query still causes a metadata
> scan to be scheduled to an executor, crashing it:
> {code}
> CREATE TABLE default.iceberg_meta_join (foo BIGINT) STORED BY ICEBERG;
> INSERT INTO default.iceberg_meta_join (foo) VALUES (1);
>
> WITH meta_union AS (
>   SELECT 1 AS snapshot_id FROM default.iceberg_meta_join.snapshots
>   UNION
>   SELECT 1 AS snapshot_id FROM default.iceberg_meta_join.snapshots)
> SELECT * FROM default.iceberg_meta_join AS t
>   INNER JOIN meta_union ON t.foo = meta_union.snapshot_id;
> {code}
> I've attached the profile, showing that some fragments are scheduled to ports
> 27001 and 27002. In the coordinator logs (log_level=3) I see:
> {code}
> I20251218 13:58:28.711380 595794 scheduler.cc:585] d4485d8fa32b7e79:989f109800000000] Computing exec params for unpartitioned fragment F03
> I20251218 13:58:28.711390 595794 scheduler.cc:625] d4485d8fa32b7e79:989f109800000000] Scheduled unpartitioned fragment on 127.0.0.1:27002
> I20251218 13:58:28.711396 595794 scheduler.cc:645] d4485d8fa32b7e79:989f109800000000] Computing exec params for scan and/or union fragment.
> I20251218 13:58:28.711410 595794 scheduler.cc:585] d4485d8fa32b7e79:989f109800000000] Computing exec params for root coordinator fragment F04
> I20251218 13:58:28.711412 595794 scheduler.cc:625] d4485d8fa32b7e79:989f109800000000] Scheduled unpartitioned fragment on 127.0.0.1:27000
> I20251218 13:58:28.711416 595794 scheduler.cc:337] d4485d8fa32b7e79:989f109800000000] Computing exec params for fragment F04
> I20251218 13:58:28.711417 595794 scheduler.cc:337] d4485d8fa32b7e79:989f109800000000] Computing exec params for fragment F00
> I20251218 13:58:28.711421 595794 scheduler.cc:337] d4485d8fa32b7e79:989f109800000000] Computing exec params for fragment F03
> {code}
> so I think in this case the coordinatorOnly flag isn't getting set on F03,
> probably because the union creates a new fragment.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]