[
https://issues.apache.org/jira/browse/IMPALA-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16710273#comment-16710273
]
Peter Ebert commented on IMPALA-2424:
-------------------------------------
This is becoming increasingly important for scaling and for separating storage
from compute. If Impala is installed on a subset of nodes, or on distinct
compute-only nodes, remote reads are essentially random and cross-rack links
can become saturated; this is especially a problem at large scale, where
network over-subscription is common. With rack-aware scheduling and a proper
distribution of Impala and storage nodes per rack, traffic could be kept within
the top-of-rack (TOR) switches, improving performance.
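
As a rough sketch of the placement policy being asked for (not Impala's actual scheduler code; the class name, the executor/replica lists, and the host-to-rack map below are hypothetical), a rack-aware fallback would try a collocated executor first, then any executor in the same rack as a replica, and only then fall back to a random pick:

{code:java}
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Random;

// Hypothetical illustration of rack-aware fallback selection; names and types
// are made up for this sketch and do not reflect Impala's scheduler internals.
public class RackAwarePlacement {
    private final Map<String, String> hostToRack;  // e.g. "host3" -> "/rack1"
    private final Random random = new Random();

    public RackAwarePlacement(Map<String, String> hostToRack) {
        this.hostToRack = hostToRack;
    }

    /** Pick an executor for a scan range whose replicas live on replicaHosts. */
    public String pickExecutor(List<String> replicaHosts, List<String> executors) {
        // 1. Prefer a collocated executor (data-local read).
        Optional<String> local =
            executors.stream().filter(replicaHosts::contains).findFirst();
        if (local.isPresent()) return local.get();

        // 2. Otherwise prefer an executor in the same rack as any replica, so the
        //    remote read stays behind the same top-of-rack switch.
        for (String replica : replicaHosts) {
            String rack = hostToRack.get(replica);
            if (rack == null) continue;
            for (String exec : executors) {
                if (rack.equals(hostToRack.get(exec))) return exec;
            }
        }

        // 3. Fall back to a random executor (the current behavior for remote reads).
        return executors.get(random.nextInt(executors.size()));
    }
}
{code}

The host-to-rack map could be populated from the same topology information HDFS already maintains, so the scheduler would not need a new metadata source.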
> Rack-aware scheduling
> ---------------------
>
> Key: IMPALA-2424
> URL: https://issues.apache.org/jira/browse/IMPALA-2424
> Project: IMPALA
> Issue Type: Improvement
> Components: Distributed Exec
> Affects Versions: Impala 2.2.4
> Reporter: Marcel Kornacker
> Priority: Minor
> Labels: scalability, scheduling
>
> Currently, Impala makes an effort to schedule plan fragments local to the
> data that is being scanned; when no collocated impalad is available, the plan
> fragment is placed randomly.
> In order to support configurations where Impala is run on a subset of the
> nodes in a cluster, we should schedule fragments within the same rack that
> holds the assigned scan ranges (if a collocated impalad isn't available).
> See https://issues.apache.org/jira/browse/HADOOP-692 for details of how rack
> locality is recorded in HDFS.
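
For reference, the rack locality recorded in HDFS (per HADOOP-692) is exposed to clients through the block-location API; a minimal Java example, assuming a reachable HDFS and a file path passed as the first argument, could read it like this:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RackLocality {
    public static void main(String[] args) throws Exception {
        // Uses the cluster configuration on the classpath (core-site.xml, hdfs-site.xml).
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path file = new Path(args[0]);  // e.g. a data file backing a scan range
        FileStatus status = fs.getFileStatus(file);

        // Each BlockLocation carries its replicas' network-topology paths
        // (rack plus datanode name, e.g. "/rack1/10.0.0.3:50010").
        for (BlockLocation loc : fs.getFileBlockLocations(status, 0, status.getLen())) {
            System.out.printf("block at offset %d:%n", loc.getOffset());
            for (String topologyPath : loc.getTopologyPaths()) {
                System.out.println("  replica at " + topologyPath);
            }
        }
        fs.close();
    }
}
{code}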