[ https://issues.apache.org/jira/browse/IMPALA-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869688#comment-16869688 ]
Todd Lipcon commented on IMPALA-8685: ------------------------------------- I think we need to be careful not to do anything that is surprising to end user/admin expectations. In particular I think people might have the following assumptions: *If my working set is N GB of data, if I allocate N GB of cache across the cluster I should be able to get 100% cache hit ratio. Some small fixed percentage of overhead would be acceptable (eg if we say we recommend 120G of cache for 100G fo data) but I think asking for 2x or 3x of working set is unacceptable.* This is my big fear about any default configuration other than 1 for NUM_REMOTE_EXECUTOR_CANDIDATES. *Given a cluster with N nodes (with some per-node cache configuration), a cluster with N+1 nodes should have same or better cache rate* This assumption would break if we go with something like {{min(3, half of cluster size)}}, right? eg assuming we round down for "half of cluster size", a 3-node cluster would use 1 cache replica, but a 4-node cluster would use 2 cache replicas. Thus the cacheable working set for 3 nodes (100GB cache each) would be 300GB whereas the cacheable working set for 4 nodes (100GB each) would be 200GB, resulting in a reduced cache hit rate from an increased resource allocation. > Evaluate default configuration of NUM_REMOTE_EXECUTOR_CANDIDATES > ---------------------------------------------------------------- > > Key: IMPALA-8685 > URL: https://issues.apache.org/jira/browse/IMPALA-8685 > Project: IMPALA > Issue Type: Improvement > Components: Backend > Reporter: Michael Ho > Priority: Critical > > The query option {{NUM_REMOTE_EXECUTOR_CANDIDATES}} is set to 3 by default. > This means that there are potentially 3 different executors which can process > a remote scan range. Over time, the data of a given remote scan range will be > spread across these 3 executors. My understanding of why this is not set to 1 > is to avoid hot spots in pathological cases. On the other hand, this may mean > that we may not maximize the utilization of the file handle cache and data > cache. Also, for small clusters (e.g. a 3 node cluster), the default value > may render deterministic remote scan range scheduling ineffective. We may > want to re-evaluate the default value of {{NUM_REMOTE_EXECUTOR_CANDIDATES}}. > One idea is to set it to min(3, half of cluster size) so it works okay with > small cluster, which may be rather common for demo purposes. There may also > be other criteria for evaluating the default value. > cc'ing [~joemcdonnell], [~tlipcon] and [~drorke] -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org