[ https://issues.apache.org/jira/browse/IMPALA-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869735#comment-16869735 ]
Todd Lipcon commented on IMPALA-8685:
-------------------------------------
If we set it to 1 and that does cause scheduling skew, is there an easy metric
in the profile that would let us detect the skew? I guess we have the
per-scan-node range counts and bytes read, which should be sufficient, right?
If so, we can document in our perf-tuning docs that if you see scanner skew on
a remote data store, setting NUM_REMOTE_EXECUTOR_CANDIDATES to a higher value
can reduce the skew, at the expense of decreasing effective cache capacity.

I wonder if a rename of NUM_REMOTE_EXECUTOR_CANDIDATES would also be useful, if
it's not too late (has this been released already?)
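
As a rough illustration of the skew check suggested above, here is a minimal
sketch. It assumes the per-executor bytes-read counters for one scan node have
already been pulled out of the profile into a dict by hand; the function name
and the max/mean ratio are illustrative choices, not an existing Impala metric
or API.

```python
from statistics import mean


def scan_skew(bytes_read_by_executor):
    """Return max/mean of bytes read per executor for one scan node.

    1.0 means the work was spread evenly; larger values mean one executor
    read disproportionately more than the average. The input dict is assumed
    to be extracted manually from the query profile (hypothetical format).
    """
    values = list(bytes_read_by_executor.values())
    if not values:
        raise ValueError("need at least one executor")
    avg = mean(values)
    if avg == 0:
        # Nothing was read anywhere, so there is no skew to report.
        return 1.0
    return max(values) / avg


# Made-up numbers, for illustration only.
sample = {"executor-1": 900, "executor-2": 100, "executor-3": 200}
ratio = scan_skew(sample)
```

The same ratio could be computed over per-executor range counts instead of
bytes read. What threshold should prompt raising
NUM_REMOTE_EXECUTOR_CANDIDATES is not something this thread settles.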
> Evaluate default configuration of NUM_REMOTE_EXECUTOR_CANDIDATES
> ----------------------------------------------------------------
>
> Key: IMPALA-8685
> URL: https://issues.apache.org/jira/browse/IMPALA-8685
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Michael Ho
> Assignee: Joe McDonnell
> Priority: Critical
>
> The query option {{NUM_REMOTE_EXECUTOR_CANDIDATES}} is set to 3 by default.
> This means that there are potentially 3 different executors which can process
> a remote scan range. Over time, the data of a given remote scan range will be
> spread across these 3 executors. My understanding of why this is not set to 1
> is to avoid hot spots in pathological cases. On the other hand, this means
> that we may not maximize the utilization of the file handle cache and data
> cache. Also, for small clusters (e.g. a 3-node cluster), the default value
> may render deterministic remote scan range scheduling ineffective. We may
> want to re-evaluate the default value of {{NUM_REMOTE_EXECUTOR_CANDIDATES}}.
> One possible idea is to set it to min(3, half of cluster size) so it works
> okay with small clusters, which may be rather common for demo purposes.
> However, it doesn't address the problem of cache effectiveness in larger
> clusters as the footprint of the cache is still amplified by
> {{NUM_REMOTE_EXECUTOR_CANDIDATES}}. There may also be other criteria for
> evaluating the default value.
> cc'ing [~joemcdonnell], [~tlipcon] and [~drorke]
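
To make the trade-off in the description concrete, here is a small sketch of
the proposed min(3, half of cluster size) default and of how the cache
footprint scales with the candidate count. The helper names, the use of integer
division for "half", and the floor of 1 are assumptions made for illustration.
None of this is Impala's actual scheduler code.

```python
def proposed_default_candidates(cluster_size, cap=3):
    """min(cap, half of cluster size), as floated in the issue description.

    Assumptions: "half" is rounded down, and the result never drops below 1
    so that a single-node cluster still has one candidate.
    """
    return max(1, min(cap, cluster_size // 2))


def worst_case_cache_footprint(working_set_bytes, num_candidates):
    """Upper bound on cluster-wide cache usage for a given working set.

    Assumes every remote scan range eventually gets cached on each of its
    candidate executors, which is the amplification the description mentions.
    """
    return working_set_bytes * num_candidates


# Candidate counts for a few cluster sizes under these assumptions.
defaults = {n: proposed_default_candidates(n) for n in (1, 3, 4, 6, 100)}
```

Under these assumptions a 3-node cluster gets a single candidate, while any
cluster with 6 or more nodes still gets 3, so the footprint of a working set in
larger clusters can be up to three times its size, as the description notes.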