[ https://issues.apache.org/jira/browse/IMPALA-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869757#comment-16869757 ]
Michael Ho edited comment on IMPALA-8685 at 6/21/19 7:26 PM:
-------------------------------------------------------------

Yes, we can definitely see that in the profile by looking at the bytes scanned in the scan node instances, and we can treat the query option as a workaround instead of penalizing cache effectiveness by default. Also, the metric {{impala-server.io-mgr.bytes-read}} can definitely show IO skew among executors.

I guess what I am trying to understand is whether, in a workload in which all scan ranges are remote, the hash ring will provide enough randomness / fairness that the skew is not an issue. Would there be unexpected side effects if we mix remote and local scan ranges, and can they be avoided? Just thinking out loud here, but I should probably go look at the code more :-).

> Evaluate default configuration of NUM_REMOTE_EXECUTOR_CANDIDATES
> ----------------------------------------------------------------
>
>                 Key: IMPALA-8685
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8685
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Michael Ho
>            Assignee: Joe McDonnell
>            Priority: Critical
>
> The query option {{NUM_REMOTE_EXECUTOR_CANDIDATES}} is set to 3 by default.
> This means that there are potentially 3 different executors which can
> process a remote scan range. Over time, the data of a given remote scan
> range will be spread across these 3 executors. My understanding of why this
> is not set to 1 is to avoid hot spots in pathological cases. On the other
> hand, this may mean that we do not maximize the utilization of the file
> handle cache and data cache. Also, for small clusters (e.g. a 3 node
> cluster), the default value may render deterministic remote scan range
> scheduling ineffective. We may want to re-evaluate the default value of
> {{NUM_REMOTE_EXECUTOR_CANDIDATES}}.
> One possible idea is to set it to min(3, half of cluster size) so it works
> okay with small clusters, which may be rather common for demo purposes.
> However, it doesn't address the problem of cache effectiveness in larger
> clusters, as the footprint of the cache is still amplified by
> {{NUM_REMOTE_EXECUTOR_CANDIDATES}}. There may also be other criteria for
> evaluating the default value.
> cc'ing [~joemcdonnell], [~tlipcon] and [~drorke]
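
To make the trade-off in the description concrete, here is a minimal Python sketch, not Impala's scheduler code. All function names are hypothetical. It assumes "half of cluster size" means integer division with a floor of 1, and it uses rendezvous hashing as a simple stand-in for the hash ring, whose real implementation is not shown in this issue.

```python
import hashlib


def default_num_candidates(cluster_size, cap=3):
    """Proposed default: min(cap, half of cluster size).

    Assumption: "half" is integer division, and the result never drops below 1.
    """
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return max(1, min(cap, cluster_size // 2))


def candidates_for(scan_range_key, executors, num_candidates):
    """Pick the executors allowed to process one remote scan range.

    Stand-in technique: rendezvous hashing. Each executor is scored by a hash
    of (scan range, executor) and the lowest scores win, so the same scan
    range always maps to the same candidate set.
    """
    def score(executor):
        digest = hashlib.sha256(f"{scan_range_key}|{executor}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    ranked = sorted(executors, key=score)
    return ranked[:min(num_candidates, len(ranked))]


def worst_case_cache_footprint(distinct_bytes, num_candidates):
    """Upper bound on cluster-wide cache usage for a remote working set.

    Each scan range may end up cached on every one of its candidates.
    """
    return distinct_bytes * num_candidates


if __name__ == "__main__":
    executors = ["exec-1", "exec-2", "exec-3"]
    # With 3 candidates on a 3 node cluster, every executor is a candidate,
    # so the candidate set no longer narrows where a scan range can run.
    print(candidates_for("file-a:0", executors, 3))
    # The proposed default narrows this to a single executor.
    print(candidates_for("file-a:0", executors, default_num_candidates(3)))
    # 100 GB of distinct remote data may occupy up to 300 GB of cache.
    print(worst_case_cache_footprint(100, 3))
```

Under these assumptions a 3 node cluster gets 1 candidate and a cluster of 6 or more nodes keeps 3, which is why the proposal helps small clusters but leaves the cache footprint in larger clusters amplified by the same factor.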