[
https://issues.apache.org/jira/browse/IMPALA-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463372#comment-16463372
]
Lars Volker commented on IMPALA-6404:
-------------------------------------
I created a design document for this Jira here:
https://docs.google.com/document/d/1ymiZaZHKDUwDdrxaEB9BpWQYi2GIHqececs0ohKfZmA/edit
> Evenly distribute local and remote scan ranges across Impalad(s) when 100%
> locality is not achievable
> -----------------------------------------------------------------------------------------------------
>
> Key: IMPALA-6404
> URL: https://issues.apache.org/jira/browse/IMPALA-6404
> Project: IMPALA
> Issue Type: Improvement
> Components: Distributed Exec
> Reporter: Mostafa Mokhtar
> Assignee: Lars Volker
> Priority: Major
> Labels: scheduler
>
> The current scheduler tries to assign as many local reads as possible. This
> works well if 100% locality is achievable, but when some nodes have locality
> and others do not, scan ranges are assigned unevenly to the backends, which
> results in execution skew.
> Ideally the scheduler should create an even distribution of local and remote
> scan ranges to avoid skew.
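
To make the skew described above concrete, here is a minimal sketch in Python. It is not Impala's scheduler code and not the algorithm from the design document. The function names, the per-backend cap, and the toy cluster are all assumptions chosen for illustration. It compares a locality-first assignment with one that caps each backend at an even share and lets the overflow become remote reads.

```python
import math


def assign_locality_first(scan_ranges, backends):
    """Hypothetical locality-first policy: always prefer a backend that
    holds a local replica; read remotely only if no backend has one."""
    load = {b: 0 for b in backends}
    for replicas in scan_ranges.values():
        local = [b for b in backends if b in replicas]
        candidates = local if local else backends
        # Pick the least-loaded candidate (ties go to the first in list order).
        chosen = min(candidates, key=lambda b: load[b])
        load[chosen] += 1
    return load


def assign_balanced(scan_ranges, backends):
    """Hypothetical balanced policy: cap each backend at an even share of
    the ranges; once all local backends are at the cap, read remotely."""
    cap = math.ceil(len(scan_ranges) / len(backends))
    load = {b: 0 for b in backends}
    remote_reads = 0
    for replicas in scan_ranges.values():
        # Local backends that still have room under the cap.
        local = [b for b in backends if b in replicas and load[b] < cap]
        if local:
            chosen = min(local, key=lambda b: load[b])
        else:
            # Every local backend is full, so the least-loaded backend
            # overall is necessarily a remote one.
            chosen = min(backends, key=lambda b: load[b])
            remote_reads += 1
        load[chosen] += 1
    return load, remote_reads


# Toy cluster (assumed): three backends, but every replica lives on a or b.
backends = ["a", "b", "c"]
scan_ranges = {i: {"a", "b"} for i in range(6)}

print(assign_locality_first(scan_ranges, backends))
print(assign_balanced(scan_ranges, backends))
```

In this toy setup the locality-first policy leaves backend c idle while a and b each scan three ranges. The capped policy gives every backend two ranges at the cost of two remote reads, which is the trade-off the issue proposes.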