[jira] [Resolved] (IMPALA-14638) Iceberg metadata scan can still be scheduled to dedicated executor

Michael Smith (Jira) Mon, 29 Dec 2025 09:46:26 -0800


     [ 
https://issues.apache.org/jira/browse/IMPALA-14638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Michael Smith resolved IMPALA-14638.
------------------------------------
    Fix Version/s: Impala 5.0.0
       Resolution: Fixed

> Iceberg metadata scan can still be scheduled to dedicated executor
> ------------------------------------------------------------------
>
>                 Key: IMPALA-14638
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14638
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 4.4.0
>            Reporter: Michael Smith
>            Assignee: Michael Smith
>            Priority: Major
>             Fix For: Impala 5.0.0
>
>         Attachments: profile_794f7e240f4d2a7b_895a9d4200000000.txt
>
>
> IMPALA-12809 attempted to ensure that fragments containing SCAN ICEBERG METADATA
> are always scheduled to a coordinator. However, that fix appears to be
> incomplete.
> With {{start-impala-cluster.py --num_coordinators=1
> --use_exclusive_coordinators}}, the following query still causes a metadata
> scan to be scheduled to a dedicated executor, which then crashes:
> {code}
> CREATE TABLE default.iceberg_meta_join (foo BIGINT) STORED BY ICEBERG;
> INSERT INTO default.iceberg_meta_join (foo) VALUES (1);
> WITH meta_union AS (
>   SELECT 1 as snapshot_id FROM default.iceberg_meta_join.snapshots
>   UNION
>   SELECT 1 as snapshot_id FROM default.iceberg_meta_join.snapshots)
> SELECT * FROM default.iceberg_meta_join AS t
>   INNER JOIN meta_union ON t.foo = meta_union.snapshot_id;
> {code}
> I've attached the profile, which shows that some fragments are scheduled to
> the executors on ports 27001 and 27002. In the coordinator logs (log_level=3) I see:
> {code}
> I20251218 13:58:28.711380 595794 scheduler.cc:585] d4485d8fa32b7e79:989f109800000000] Computing exec params for unpartitioned fragment F03
> I20251218 13:58:28.711390 595794 scheduler.cc:625] d4485d8fa32b7e79:989f109800000000] Scheduled unpartitioned fragment on 127.0.0.1:27002
> I20251218 13:58:28.711396 595794 scheduler.cc:645] d4485d8fa32b7e79:989f109800000000] Computing exec params for scan and/or union fragment.
> I20251218 13:58:28.711410 595794 scheduler.cc:585] d4485d8fa32b7e79:989f109800000000] Computing exec params for root coordinator fragment F04
> I20251218 13:58:28.711412 595794 scheduler.cc:625] d4485d8fa32b7e79:989f109800000000] Scheduled unpartitioned fragment on 127.0.0.1:27000
> I20251218 13:58:28.711416 595794 scheduler.cc:337] d4485d8fa32b7e79:989f109800000000] Computing exec params for fragment F04
> I20251218 13:58:28.711417 595794 scheduler.cc:337] d4485d8fa32b7e79:989f109800000000] Computing exec params for fragment F00
> I20251218 13:58:28.711421 595794 scheduler.cc:337] d4485d8fa32b7e79:989f109800000000] Computing exec params for fragment F03
> {code}
> so I think the coordinatorOnly flag isn't being set on F03 in this case,
> probably because the UNION creates a new fragment.
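
The placement visible in the scheduler log excerpt can also be checked mechanically. The following is a minimal sketch, not part of Impala: the helper names (unpartitioned_placements, off_coordinator) are hypothetical, the regular expressions are assumptions based only on the message wording quoted above, and port 27000 is treated as the coordinator because the root coordinator fragment F04 is placed there in the excerpt.

```python
import re

# Assumed message shapes, taken from the log excerpt in the issue description.
COMPUTING = re.compile(
    r"Computing exec params for (?:unpartitioned|root coordinator) fragment (F\d+)")
SCHEDULED = re.compile(r"Scheduled unpartitioned fragment on (\S+):(\d+)")


def unpartitioned_placements(log_text):
    """Pair each 'Computing exec params for ... fragment Fxx' line with the
    'Scheduled unpartitioned fragment on host:port' line that follows it."""
    placements = {}
    pending = None  # fragment id still waiting for its 'Scheduled' line
    for line in log_text.splitlines():
        match = COMPUTING.search(line)
        if match:
            pending = match.group(1)
            continue
        match = SCHEDULED.search(line)
        if match and pending is not None:
            placements[pending] = (match.group(1), int(match.group(2)))
            pending = None
    return placements


def off_coordinator(placements, coordinator_port):
    """Return the fragment ids placed on any port other than the coordinator's."""
    return sorted(frag for frag, (_, port) in placements.items()
                  if port != coordinator_port)


# Log lines copied from the issue description, with the email wrapping undone.
SAMPLE_LOG = """\
I20251218 13:58:28.711380 595794 scheduler.cc:585] d4485d8fa32b7e79:989f109800000000] Computing exec params for unpartitioned fragment F03
I20251218 13:58:28.711390 595794 scheduler.cc:625] d4485d8fa32b7e79:989f109800000000] Scheduled unpartitioned fragment on 127.0.0.1:27002
I20251218 13:58:28.711396 595794 scheduler.cc:645] d4485d8fa32b7e79:989f109800000000] Computing exec params for scan and/or union fragment.
I20251218 13:58:28.711410 595794 scheduler.cc:585] d4485d8fa32b7e79:989f109800000000] Computing exec params for root coordinator fragment F04
I20251218 13:58:28.711412 595794 scheduler.cc:625] d4485d8fa32b7e79:989f109800000000] Scheduled unpartitioned fragment on 127.0.0.1:27000
"""

placements = unpartitioned_placements(SAMPLE_LOG)
print(placements)
print(off_coordinator(placements, coordinator_port=27000))
```

On this excerpt the check reports F03 as the only unpartitioned fragment placed away from the coordinator, which matches the observation in the description. It says nothing about why the flag is missing; it only confirms where the fragment landed.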



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
