[jira] [Updated] (IMPALA-9637) Scan range load-balancing within backend

Tim Armstrong (Jira) Thu, 09 Apr 2020 13:58:26 -0700


     [ https://issues.apache.org/jira/browse/IMPALA-9637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Tim Armstrong updated IMPALA-9637:
----------------------------------
    Description: 
Currently the scheduler statically divides scan ranges between fragment 
instances. Since IMPALA-9015 it statically load-balances scan ranges based on 
file size using the LPT (longest processing time first) algorithm.
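
As a rough illustration of static LPT assignment (a minimal sketch, not Impala's scheduler code; the function name `lpt_assign` and the sample sizes are made up for this example), each scan range, taken largest first, goes to whichever instance currently has the fewest bytes assigned:

```python
import heapq


def lpt_assign(range_sizes, num_instances):
    """Assign scan ranges to instances, largest first (LPT).

    range_sizes: byte sizes of the scan ranges (hypothetical input).
    Returns one list of sizes per instance.
    """
    # Min-heap of (bytes_assigned_so_far, instance_index).
    heap = [(0, i) for i in range(num_instances)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_instances)]
    for size in sorted(range_sizes, reverse=True):
        # The least-loaded instance takes the next-largest range.
        load, idx = heapq.heappop(heap)
        assignment[idx].append(size)
        heapq.heappush(heap, (load + size, idx))
    return assignment


sizes = [90, 70, 50, 40, 30, 20]  # made-up byte sizes
assignment = lpt_assign(sizes, 2)
loads = [sum(a) for a in assignment]
print(assignment)  # [[90, 40, 20], [70, 50, 30]]
print(loads)       # [150, 150]
```

The assignment is balanced by byte size only, and it is fixed before any scan work starts.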

This has various pitfalls:
 * It interacts badly with dynamic partition pruning, which can filter out a 
bunch of scan ranges and unbalance the load
 * Different files that have the same byte size may involve different amounts 
of work to process for any number of reasons.
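
To make the first pitfall concrete, here is a small sketch with made-up numbers (the range names, sizes, and the helper `remaining_load` are all hypothetical): a static assignment that is balanced by bytes becomes lopsided once a runtime filter drops some ranges.

```python
def remaining_load(assignment, pruned):
    """Bytes left per instance after dropping the pruned scan ranges."""
    return [
        sum(size for name, size in ranges if name not in pruned)
        for ranges in assignment
    ]


# Hypothetical static assignment: (range name, byte size) per instance.
static_assignment = [
    [("a", 90), ("d", 40), ("f", 20)],
    [("b", 70), ("c", 50), ("e", 30)],
]

before = remaining_load(static_assignment, pruned=set())
after = remaining_load(static_assignment, pruned={"b", "c"})
print(before)  # [150, 150]
print(after)   # [150, 30]
```

Because the assignment cannot change after scheduling, the second instance goes idle early while the first still has most of its work left.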

Those can cause both inter-node load balance problems and intra-node load 
balance problems. This Jira is about fixing the intra-node load balance 
problem, so that the situation is no worse than before mt_dop.

The proposed solution is to have a queue of scan ranges per backend, sorted 
from largest to smallest, and have each instance pull scan ranges off that 
queue. The DiskIOMgr ReaderContext is probably already sufficient to solve this 
problem, but we'll need to add a different mechanism for Kudu, HBase, etc.
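
The per-backend queue idea could be sketched as the following deterministic simulation. This is an assumption-laden illustration, not the DiskIOMgr mechanism: `simulate_shared_queue`, the `(byte_size, actual_cost)` tuples, and the cost numbers are invented here. Ranges sit in one queue ordered largest to smallest, and whichever instance becomes free first pulls the next range.

```python
import heapq
from collections import deque


def simulate_shared_queue(ranges, num_instances):
    """Simulate instances pulling scan ranges from one shared queue.

    ranges: list of (byte_size, actual_cost) tuples (hypothetical units).
    Returns the total busy time of each instance.
    """
    # Shared queue, largest byte size first (stable for equal sizes).
    queue = deque(sorted(ranges, key=lambda r: r[0], reverse=True))
    # Min-heap of (time_when_free, instance_index).
    free_at = [(0, i) for i in range(num_instances)]
    heapq.heapify(free_at)
    busy = [0] * num_instances
    while queue:
        # The instance that frees up first takes the next range.
        now, idx = heapq.heappop(free_at)
        _, cost = queue.popleft()
        busy[idx] += cost
        heapq.heappush(free_at, (now + cost, idx))
    return busy


# Six ranges of equal byte size; the first is 10x more expensive to process.
ranges = [(100, 10)] + [(100, 1)] * 5
busy = simulate_shared_queue(ranges, 2)
print(busy)       # [10, 5]
print(max(busy))  # 10
```

With these numbers, a static split that alternates the equally sized ranges between the two instances would give busy times of 12 and 3, so the slowest instance finishes at 12 instead of 10. The shared queue lets the second instance absorb the cheap ranges while the first is stuck on the expensive one.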

  was:
Currently the scheduler statically divides scan ranges between fragment 
instances, Since IMPALA-9015 it statically load-balances scan ranges based on 
file size using the LPT algorithm in the schedule.

This has various pitfalls:
 * It interacts badly with dynamic partition pruning, which can filter
 * Different files that have the same byte size may involve different amounts 
of work to process for any number of reasons.

Those can cause both inter-node load balance problems and intra-node load 
balance problems. This Jira is about fixing the intra-node load balance 
problem, so that the situation is no worse than before mt_dop.

The proposed solution is to have a queue of scan ranges per backend, sorted 
from largest to smallest, and have each instance pull scan ranges off that 
queue. The DiskIOMgr ReaderContext probably is already sufficient to solve this 
problem, and we'll need to add a different mechanism for Kudu, Hbase, etc.


> Scan range load-balancing within backend
> ----------------------------------------
>
>                 Key: IMPALA-9637
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9637
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Distributed Exec
>    Affects Versions: Impala 4.0
>            Reporter: Tim Armstrong
>            Priority: Major
>              Labels: multithreading, performance
>
> Currently the scheduler statically divides scan ranges between fragment 
> instances. Since IMPALA-9015 it statically load-balances scan ranges based on 
> file size using the LPT (longest processing time first) algorithm.
> This has various pitfalls:
>  * It interacts badly with dynamic partition pruning, which can filter out a 
> bunch of scan ranges and unbalance the load
>  * Different files that have the same byte size may involve different amounts 
> of work to process for any number of reasons.
> Those can cause both inter-node load balance problems and intra-node load 
> balance problems. This Jira is about fixing the intra-node load balance 
> problem, so that the situation is no worse than before mt_dop.
> The proposed solution is to have a queue of scan ranges per backend, sorted 
> from largest to smallest, and have each instance pull scan ranges off that 
> queue. The DiskIOMgr ReaderContext is probably already sufficient to solve 
> this problem, but we'll need to add a different mechanism for Kudu, HBase, 
> etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
