[jira] [Created] (IMPALA-14006) Scheduler::CreateInputCollocatedInstances may overparallelize

Riza Suminto (Jira) Mon, 28 Apr 2025 10:01:43 -0700

Riza Suminto created IMPALA-14006:
-------------------------------------

             Summary: Scheduler::CreateInputCollocatedInstances may 
overparallelize
                 Key: IMPALA-14006
                 URL: https://issues.apache.org/jira/browse/IMPALA-14006
             Project: IMPALA
          Issue Type: Bug
          Components: Distributed Exec
    Affects Versions: Impala 5.0.0
            Reporter: Riza Suminto
         Attachments: profile_614699efbc5e2ed9_cb9d6aca00000000.txt


IMPALA-11604 (part 2) change how many instances to create in 
Scheduler::CreateInputCollocatedInstances.

[https://github.com/apache/impala/blame/3c24706c72818a1668159a428d4f2afcadea9f27/be/src/scheduling/scheduler.cc#L915-L920]

This work when left child fragment of a parent fragment is distributed across 
nodes. However, there is a possible corner case where the left child fragment 
instance is limited to only 1 node, like SELECT fragment in 
[^profile_614699efbc5e2ed9_cb9d6aca00000000.txt]
{noformat}
F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
Per-Instance Resources: mem-estimate=4.27MB mem-reservation=4.00MB 
thread-reservation=1
max-parallelism=1 segment-costs=[17820]
04:SELECT
|  predicates: dense_rank() < CAST(10 AS BIGINT)
|  mem-estimate=0B mem-reservation=0B thread-reservation=0
|  tuple-ids=7,6 row-size=12B cardinality=730 cost=7300
|  in pipelines: 02(GETNEXT)
|
03:ANALYTIC
|  functions: dense_rank()
|  order by: id ASC
|  window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
|  mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB 
thread-reservation=0
|  tuple-ids=7,6 row-size=12B cardinality=7.30K cost=7300
|  in pipelines: 02(GETNEXT)
|
06:MERGING-EXCHANGE [UNPARTITIONED]
|  order by: id ASC
|  mem-estimate=33.50KB mem-reservation=0B thread-reservation=0
|  tuple-ids=7 row-size=4B cardinality=7.30K cost=1904
|  in pipelines: 02(GETNEXT){noformat}
Due to existence of MERGING-EXCHANGE, this fragment is limited to just 1 
instance in 1 node. If its parent fragment has high cost, the parent fragment 
might schedule hundreds of instances, but  
Scheduler::CreateInputCollocatedInstances will force them to colocate in single 
executor node where the SELECT fragment scheduled.

This, in turn, will hit sanity check in Scheduler::CheckEffectiveInstanceCount, 
which enforce that no fragment should be parallelized over 128 instances per 
executor node.

[https://github.com/apache/impala/blob/3c24706c72818a1668159a428d4f2afcadea9f27/be/src/scheduling/scheduler.cc#L456-L463]
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Created] (IMPALA-14006) Scheduler::CreateInputCollocatedInstances may overparallelize

Reply via email to