[
https://issues.apache.org/jira/browse/IMPALA-10973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Quanlong Huang updated IMPALA-10973:
------------------------------------
Fix Version/s: Impala 4.1.0
> Empty scan nodes are scheduled to the (exclusive) coordinator
> -------------------------------------------------------------
>
> Key: IMPALA-10973
> URL: https://issues.apache.org/jira/browse/IMPALA-10973
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Csaba Ringhofer
> Assignee: Csaba Ringhofer
> Priority: Critical
> Labels: scalability, scheduler
> Fix For: Impala 4.1.0
>
>
> Currently, fragments with scan nodes that have no scan ranges are scheduled to
> the coordinator, even if it is an exclusive coordinator:
> https://github.com/apache/impala/blob/master/be/src/scheduling/scheduler.cc#L805
> As "parent" fragments are often scheduled to be collocated with their
> children, the condition of "being scheduled to the coordinator" can spread
> through the plan tree.
> This can be disastrous to scalability in clusters with many executors but
> few coordinators, and it is also very counter-intuitive, as scanning an empty
> table shouldn't have a major effect on the query.
>
> To reproduce locally:
> bin/start-impala-cluster.py --use_exclusive_coordinators -c 1
> in Impala shell:
> select id from functional.alltypes;
> profile; -- scan nodes will be scheduled to 2 hosts
> select f2 from functional.emptytable union all select id from
> functional.alltypes;
> profile; -- scan nodes will be scheduled to 3 hosts
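
The spreading effect described above can be illustrated with a small toy model. This is only a sketch and not Impala's scheduler code: the host names, the Fragment class, the schedule() function, and the flag empty_scan_to_coordinator are all hypothetical. It assumes that a parent fragment runs on the union of its children's hosts, and it compares the reported behavior with a hypothetical alternative policy that places an empty scan on an executor instead.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Hypothetical host names mirroring the local repro: 1 coordinator, 2 executors.
COORDINATOR = "coord"
EXECUTORS = ["exec-1", "exec-2"]


@dataclass
class Fragment:
    """Toy plan fragment; a fragment without children is a scan fragment."""
    name: str
    num_scan_ranges: int = 0
    children: List["Fragment"] = field(default_factory=list)


def schedule(frag: Fragment, empty_scan_to_coordinator: bool,
             out: Dict[str, Set[str]]) -> Set[str]:
    """Assign hosts to frag and its descendants, recording them in out."""
    if not frag.children:
        if frag.num_scan_ranges > 0:
            # Assumption: a non-empty scan is spread over all executors.
            hosts = set(EXECUTORS)
        elif empty_scan_to_coordinator:
            # Behavior described in the report: empty scan lands on the coordinator.
            hosts = {COORDINATOR}
        else:
            # Hypothetical alternative policy: pick some executor instead.
            hosts = {EXECUTORS[0]}
    else:
        # Assumption: the parent is collocated with all of its children.
        hosts = set()
        for child in frag.children:
            hosts |= schedule(child, empty_scan_to_coordinator, out)
    out[frag.name] = hosts
    return hosts


def union_plan() -> Fragment:
    """Toy plan for: empty-table scan UNION ALL non-empty scan."""
    return Fragment("union", children=[
        Fragment("scan emptytable", num_scan_ranges=0),
        Fragment("scan alltypes", num_scan_ranges=4),
    ])


if __name__ == "__main__":
    for flag in (True, False):
        placement: Dict[str, Set[str]] = {}
        schedule(union_plan(), flag, placement)
        print(flag, {name: sorted(h) for name, h in placement.items()})
```

In this model the coordinator ends up in the host set of the union fragment only because one child scan was empty, which is the same shape as the 2-host versus 3-host difference in the reproduction steps.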