[ 
https://issues.apache.org/jira/browse/IMPALA-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463372#comment-16463372
 ] 

Lars Volker commented on IMPALA-6404:
-------------------------------------

I created a design document for this Jira here: 
https://docs.google.com/document/d/1ymiZaZHKDUwDdrxaEB9BpWQYi2GIHqececs0ohKfZmA/edit

> Evenly distribute local and remote scan ranges across Impalad(s) when 100% 
> locality is not achievable
> -----------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-6404
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6404
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Distributed Exec
>            Reporter: Mostafa Mokhtar
>            Assignee: Lars Volker
>            Priority: Major
>              Labels: scheduler
>
> The current scheduler tries to assign as many local reads as possible. This 
> works well if 100% locality is achievable, but when some nodes have locality 
> and others don't, scan ranges are assigned unevenly to the backends, 
> which results in execution skew.
> Ideally the scheduler should create an even distribution of local and remote 
> scan ranges to avoid skew.
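
To make the skew concrete, here is a minimal Python sketch contrasting the two
policies described above. It is not Impala's scheduler code: the function names,
the input shape (a list of (range_id, replica_hosts) pairs), and the per-backend
cap are all assumptions made for illustration only.

```python
def assign_local_first(ranges, backends):
    """Hypothetical locality-first policy: always pick a backend holding a
    local replica when one exists, breaking ties by current load."""
    load = {b: 0 for b in backends}
    for _, replicas in ranges:
        local = [b for b in backends if b in replicas]
        candidates = local or backends  # fall back to a remote read
        target = min(candidates, key=lambda b: load[b])
        load[target] += 1
    return load


def assign_balanced(ranges, backends):
    """Hypothetical balanced policy: prefer local replicas only while the
    backend is below an even share, otherwise read remotely."""
    load = {b: 0 for b in backends}
    cap = -(-len(ranges) // len(backends))  # ceil(ranges / backends)
    for _, replicas in ranges:
        local = [b for b in backends if b in replicas and load[b] < cap]
        candidates = local or backends  # under-cap local first, else anyone
        target = min(candidates, key=lambda b: load[b])
        load[target] += 1
    return load


# Three backends, but only "a" and "b" hold replicas of the six scan ranges.
backends = ["a", "b", "c"]
ranges = [(i, {"a", "b"}) for i in range(6)]

local_first = assign_local_first(ranges, backends)
balanced = assign_balanced(ranges, backends)
print(local_first)  # {'a': 3, 'b': 3, 'c': 0}
print(balanced)     # {'a': 2, 'b': 2, 'c': 2}
```

In this toy setup the locality-first policy leaves backend "c" idle, while the
capped policy trades two local reads for remote ones to even out the load.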



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

