[ 
https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180135#comment-16180135
 ] 

liyunzhang_intel commented on HIVE-17545:
-----------------------------------------

[~lirui]:  {quote}

If a user turns on combining equivalent works and turns off RDD caching, then 
there won't be a perf improvement, right?
{quote}
If a user turns on combining equivalent works, duplicated map/reduce works will be 
removed. Performance will not change whether RDD caching is enabled or not. 
 
In HoS, caching is enabled only when the parent Spark work has more than 
[one child|https://github.com/kellyzly/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java#L264].
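
To illustrate the condition described above, here is a minimal sketch. It is not the actual SparkPlanGenerator code: the class CacheDecisionSketch and its methods connect and shouldCache are hypothetical names made up for this example, and the work graph is modeled as a plain map from work name to child names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Spark work graph; this is NOT Hive's SparkWork class.
class CacheDecisionSketch {
    // Maps each work name to the names of its child works.
    private final Map<String, List<String>> children = new HashMap<>();

    // Records an edge from a parent work to one of its child works.
    public void connect(String parent, String child) {
        children.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
        children.computeIfAbsent(child, k -> new ArrayList<>());
    }

    // Assumed rule from the comment: cache only when the work has more than one child.
    public boolean shouldCache(String work) {
        return children.getOrDefault(work, Collections.emptyList()).size() > 1;
    }

    public static void main(String[] args) {
        CacheDecisionSketch plan = new CacheDecisionSketch();
        plan.connect("Map 1", "Reducer 2");
        plan.connect("Map 1", "Reducer 3");
        plan.connect("Reducer 2", "Reducer 4");
        System.out.println(plan.shouldCache("Map 1"));     // two children
        System.out.println(plan.shouldCache("Reducer 2")); // one child
    }
}
```

In this sketch, "Map 1" feeds two children and so would be a caching candidate, while "Reducer 2" feeds only one and would not.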
 
Please correct me if my understanding is wrong.




> Make HoS RDD Cacheing Optimization Configurable
> -----------------------------------------------
>
>                 Key: HIVE-17545
>                 URL: https://issues.apache.org/jira/browse/HIVE-17545
>             Project: Hive
>          Issue Type: Improvement
>          Components: Physical Optimizer, Spark
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>         Attachments: HIVE-17545.1.patch, HIVE-17545.2.patch
>
>
> The RDD caching optimization added in HIVE-10550 is enabled by default. We 
> should make it configurable in case users want to disable it. We can leave it 
> on by default to preserve backwards compatibility.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
