[ 
https://issues.apache.org/jira/browse/IMPALA-14006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Rawat updated IMPALA-14006:
------------------------------------
    Priority: Critical  (was: Major)

> Scheduler::CreateInputCollocatedInstances may overparallelize
> -------------------------------------------------------------
>
>                 Key: IMPALA-14006
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14006
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Distributed Exec
>    Affects Versions: Impala 5.0.0
>            Reporter: Riza Suminto
>            Priority: Critical
>         Attachments: profile_614699efbc5e2ed9_cb9d6aca00000000.txt
>
>
> IMPALA-11604 (part 2) change how many instances to create in 
> Scheduler::CreateInputCollocatedInstances.
> [https://github.com/apache/impala/blame/3c24706c72818a1668159a428d4f2afcadea9f27/be/src/scheduling/scheduler.cc#L915-L920]
> This work when left child fragment of a parent fragment is distributed across 
> nodes. However, there is a possible corner case where the left child fragment 
> instance is limited to only 1 node, like SELECT fragment in 
> [^profile_614699efbc5e2ed9_cb9d6aca00000000.txt]
> {noformat}
> F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> Per-Instance Resources: mem-estimate=4.27MB mem-reservation=4.00MB 
> thread-reservation=1
> max-parallelism=1 segment-costs=[17820]
> 04:SELECT
> |  predicates: dense_rank() < CAST(10 AS BIGINT)
> |  mem-estimate=0B mem-reservation=0B thread-reservation=0
> |  tuple-ids=7,6 row-size=12B cardinality=730 cost=7300
> |  in pipelines: 02(GETNEXT)
> |
> 03:ANALYTIC
> |  functions: dense_rank()
> |  order by: id ASC
> |  window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
> |  mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB 
> thread-reservation=0
> |  tuple-ids=7,6 row-size=12B cardinality=7.30K cost=7300
> |  in pipelines: 02(GETNEXT)
> |
> 06:MERGING-EXCHANGE [UNPARTITIONED]
> |  order by: id ASC
> |  mem-estimate=33.50KB mem-reservation=0B thread-reservation=0
> |  tuple-ids=7 row-size=4B cardinality=7.30K cost=1904
> |  in pipelines: 02(GETNEXT){noformat}
> Due to existence of MERGING-EXCHANGE, this fragment is limited to just 1 
> instance in 1 node. If its parent fragment has high cost, the parent fragment 
> might schedule hundreds of instances, but  
> Scheduler::CreateInputCollocatedInstances will force them to colocate in 
> single executor node where the SELECT fragment scheduled.
> This, in turn, will hit sanity check in 
> Scheduler::CheckEffectiveInstanceCount, which enforce that no fragment should 
> be parallelized over 128 instances per executor node.
> [https://github.com/apache/impala/blob/3c24706c72818a1668159a428d4f2afcadea9f27/be/src/scheduling/scheduler.cc#L456-L463]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

Reply via email to