[ 
https://issues.apache.org/jira/browse/IMPALA-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869711#comment-16869711
 ] 

Michael Ho commented on IMPALA-8685:
------------------------------------

Yes, I agree that setting NUM_REMOTE_EXECUTOR_CANDIDATES to anything other than 
1 will reduce the effectiveness of the cache in general. I need to better 
understand the issue of skew and scheduling, and see whether there are 
workarounds other than setting it to 3.

> Evaluate default configuration of NUM_REMOTE_EXECUTOR_CANDIDATES
> ----------------------------------------------------------------
>
>                 Key: IMPALA-8685
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8685
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Michael Ho
>            Priority: Critical
>
> The query option {{NUM_REMOTE_EXECUTOR_CANDIDATES}} is set to 3 by default. 
> This means that there are potentially 3 different executors which can process 
> a remote scan range. Over time, the data of a given remote scan range will be 
> spread across these 3 executors. My understanding of why this is not set to 1 
> is to avoid hot spots in pathological cases. On the other hand, this may mean 
> that we may not maximize the utilization of the file handle cache and data 
> cache. Also, for small clusters (e.g. a 3 node cluster), the default value 
> may render deterministic remote scan range scheduling ineffective. We may 
> want to re-evaluate the default value of {{NUM_REMOTE_EXECUTOR_CANDIDATES}}. 
> One idea is to set it to min(3, half of cluster size) so it works okay with 
> small clusters, which may be rather common for demo purposes. There may also 
> be other criteria for evaluating the default value.
> cc'ing [~joemcdonnell], [~tlipcon] and [~drorke]
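
To make the min(3, half of cluster size) idea from the description concrete, here is a minimal Python sketch. It is not Impala's scheduler code: the function names, the floor-division reading of "half of cluster size", the clamp to at least 1, and the hash-ranking used to pick candidates are all assumptions made for illustration only.

```python
import hashlib
from typing import List


def default_num_candidates(cluster_size: int, cap: int = 3) -> int:
    """Hypothetical default: min(cap, cluster_size // 2), never below 1.

    Assumes "half of cluster size" means integer (floor) division.
    """
    return max(1, min(cap, cluster_size // 2))


def candidate_executors(scan_range_key: str,
                        executors: List[str],
                        num_candidates: int) -> List[str]:
    """Deterministically pick candidate executors for a remote scan range.

    Illustrative only: ranks executors by a hash of (scan range, executor)
    and keeps the first num_candidates. Impala's real scheduler may differ.
    """
    ranked = sorted(
        executors,
        key=lambda e: hashlib.sha256(
            f"{scan_range_key}|{e}".encode("utf-8")).hexdigest(),
    )
    n = max(1, min(num_candidates, len(executors)))
    return ranked[:n]


if __name__ == "__main__":
    nodes = ["exec-1", "exec-2", "exec-3"]  # hypothetical 3 node cluster
    key = "file.parquet:0"                  # hypothetical scan range id

    # With 3 candidates on 3 nodes, every executor is a candidate, so the
    # deterministic placement no longer narrows anything down.
    print(candidate_executors(key, nodes, 3))

    # With the proposed default, the same cluster gets a single candidate.
    print(candidate_executors(key, nodes, default_num_candidates(len(nodes))))
```

Under these assumptions, a 3 node cluster would get 1 candidate per remote scan range instead of 3, while clusters of 6 or more nodes would keep the current value of 3.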



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
