[ https://issues.apache.org/jira/browse/HIVE-8536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224299#comment-14224299 ]
Rui Li commented on HIVE-8536:
------------------------------
It seems the dependency task is created in {{GenSparkProcContext}}, and we
always add it for the move task. I suspect this is unnecessary. Here's how MR
decides whether to create the dependency task in {{GenMRProcContext}}:
{code}
public DependencyCollectionTask getDependencyTaskForMultiInsert() {
  if (dependencyTaskForMultiInsert == null) {
    if (conf.getBoolVar(ConfVars.HIVE_MULTI_INSERT_MOVE_TASKS_SHARE_DEPENDENCIES)) {
      dependencyTaskForMultiInsert =
          (DependencyCollectionTask) TaskFactory.get(new DependencyCollectionWork(), conf);
    }
  }
  return dependencyTaskForMultiInsert;
}
{code}
I'm writing a patch to do it the MR way and will run some tests. If the only
diffs are in the query plans, we can open another JIRA to fix it.
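For illustration only, here is a rough, self-contained sketch of the lazy, flag-gated pattern that the MR code above uses. It is not the patch: {{SketchProcContext}}, {{DependencyTask}} and the {{shareDependencies}} boolean are hypothetical stand-ins, where the real code would go through {{HiveConf}} and {{TaskFactory}}.

```java
// Hypothetical stand-in types; none of these are Hive classes.
final class SketchProcContext {

    /** Stand-in for DependencyCollectionTask. */
    static final class DependencyTask {}

    // Stand-in for conf.getBoolVar(HIVE_MULTI_INSERT_MOVE_TASKS_SHARE_DEPENDENCIES).
    private final boolean shareDependencies;

    // Created lazily, at most once per context.
    private DependencyTask dependencyTask;

    SketchProcContext(boolean shareDependencies) {
        this.shareDependencies = shareDependencies;
    }

    /**
     * Returns the shared dependency task, creating it on first use,
     * or null when the flag is off (no dependency task is created).
     */
    DependencyTask getDependencyTask() {
        if (dependencyTask == null && shareDependencies) {
            dependencyTask = new DependencyTask();
        }
        return dependencyTask;
    }
}
```

With the flag off the getter returns null, so a caller wiring up a move task would have to null-check the result and skip the dependency task in that case.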
[~csun] do you have any comments on this since it seems related to multi-insert?
> Enable SkewJoinResolver for spark [Spark Branch]
> ------------------------------------------------
>
> Key: HIVE-8536
> URL: https://issues.apache.org/jira/browse/HIVE-8536
> Project: Hive
> Issue Type: Improvement
> Components: Spark
> Reporter: Rui Li
> Assignee: Rui Li
> Attachments: HIVE-8536.1-spark.patch, HIVE-8536.2-spark.patch
>
>
> Sub-task of HIVE-8406