[
https://issues.apache.org/jira/browse/IMPALA-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lars Volker updated IMPALA-6834:
--------------------------------
Labels: scheduler (was: )
> Enforce consistent, pseudo-random replica order during local, non-random
> scheduling
> -----------------------------------------------------------------------------------
>
> Key: IMPALA-6834
> URL: https://issues.apache.org/jira/browse/IMPALA-6834
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend, Frontend
> Affects Versions: Impala 2.12.0
> Reporter: Lars Volker
> Priority: Major
> Labels: scheduler
>
> When scheduling local, non-cached reads, the scheduler breaks ties between
> otherwise equivalent replicas by picking the first one in the list of
> candidates as reported by the frontend
> ([scheduler.h:375|https://github.com/apache/impala/blob/master/be/src/scheduling/scheduler.h#L375]).
> We do this to optimize the use of OS buffer caches. We also provide the
> random_replica query option to improve cases where the default behavior leads
> to CPU hotspots.
> The frontend merely passes the replicas of a block to the backend in the
> order they are returned from HDFS. This can result in poor load distribution
> for scans with many small scan ranges, depending on the order in which HDFS
> returns block locations.
> To alleviate this we should add another level of pseudo-random shuffling. The
> main idea is to change the order in which replicas are picked, but to keep
> the order consistent across backends.
> For example, we can compute a hash for each replica over (replica address,
> THdfsFileSplit::file_name, THdfsFileSplit::offset) and use this as a key to
> find the min_element in the list of candidates
> [scheduler.cc:L772|https://github.com/apache/impala/blob/master/be/src/scheduling/scheduler.cc#L772].
> Then we use this element instead of the first one.
> We need to make sure that we run sufficiently comprehensive performance tests
> on this change.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]