Xuefu Zhang created HIVE-8220:
---------------------------------

             Summary: Refactor multi-insert code such that plan splitting and 
task generation are modular and reusable [Spark Branch]
                 Key: HIVE-8220
                 URL: https://issues.apache.org/jira/browse/HIVE-8220
             Project: Hive
          Issue Type: Improvement
          Components: Spark
            Reporter: Xuefu Zhang


This is a follow-up to HIVE-7053. Currently the code that splits the operator 
tree and the code that generates tasks are intermingled, which makes both hard 
to understand and maintain. Logically the two seem independent, so this can be 
improved by modularizing each. The following outline might be helpful:
{code}
  @Override
  protected void generateTaskTree(List<Task<? extends Serializable>> rootTasks, ParseContext pCtx,
      List<Task<MoveWork>> mvTask, Set<ReadEntity> inputs, Set<WriteEntity> outputs)
      throws SemanticException {
    // 1. Identify if the plan is for multi-insert and split the plan if necessary.
    List<Set<Operator>> operatorSets = multiInsertSplit(...);
    // 2. For each operator set, generate a task.
    for (Set<Operator> topOps : operatorSets) {
      SparkTask task = generateTask(topOps);
      ...
    }
    // 3. Wire up the tasks.
    ...
  }
{code}
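
To make the intended separation concrete, here is a minimal, self-contained 
sketch that uses toy types rather than Hive's classes. Everything in it is an 
assumption made for illustration: the Op and Task classes, the method names 
splitPlan, generateTask and generateTaskTree, the splitting rule (cut at the 
first operator with more than one child), and the wiring rule (the shared 
prefix task runs before every branch task). It is not the actual Spark branch 
code, only one way the three steps could be kept independent.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration; none of these are Hive classes.
final class MultiInsertSketch {

    /** Toy operator: a name plus its child operators. */
    static final class Op {
        final String name;
        final List<Op> children = new ArrayList<>();
        Op(String name) { this.name = name; }
        Op add(Op child) { children.add(child); return this; }
    }

    /** Toy task: the operators it runs and the tasks that run after it. */
    static final class Task {
        final Set<Op> ops;
        final List<Task> dependents = new ArrayList<>();
        Task(Set<Op> ops) { this.ops = ops; }
    }

    // Step 1: plan splitting only. Assumed rule: follow the single-child chain
    // from the root; the first operator with several children is the split point.
    // The first set is the shared prefix, followed by one set per branch.
    static List<Set<Op>> splitPlan(Op root) {
        List<Set<Op>> sets = new ArrayList<>();
        Set<Op> prefix = new LinkedHashSet<>();
        Op cur = root;
        prefix.add(cur);
        while (cur.children.size() == 1) {
            cur = cur.children.get(0);
            prefix.add(cur);
        }
        sets.add(prefix);
        for (Op branch : cur.children) {
            Set<Op> branchOps = new LinkedHashSet<>();
            collect(branch, branchOps);
            sets.add(branchOps);
        }
        return sets;
    }

    // Adds op and all of its descendants to out.
    private static void collect(Op op, Set<Op> out) {
        out.add(op);
        for (Op child : op.children) {
            collect(child, out);
        }
    }

    // Step 2: task generation only. It knows nothing about how sets were produced.
    static Task generateTask(Set<Op> ops) {
        return new Task(ops);
    }

    // Driver: split, generate one task per set, then wire the tasks together.
    static Task generateTaskTree(Op root) {
        List<Set<Op>> operatorSets = splitPlan(root);
        List<Task> tasks = new ArrayList<>();
        for (Set<Op> ops : operatorSets) {
            tasks.add(generateTask(ops));
        }
        // Step 3: wiring. The shared-prefix task precedes every branch task.
        Task rootTask = tasks.get(0);
        for (int i = 1; i < tasks.size(); i++) {
            rootTask.dependents.add(tasks.get(i));
        }
        return rootTask;
    }

    public static void main(String[] args) {
        // TS -> FIL -> { FS1, SEL -> FS2 }: one scan feeding two inserts.
        Op filter = new Op("FIL")
            .add(new Op("FS1"))
            .add(new Op("SEL").add(new Op("FS2")));
        Op scan = new Op("TS").add(filter);

        Task rootTask = generateTaskTree(scan);
        System.out.println("root task operators: " + rootTask.ops.size());
        System.out.println("dependent tasks: " + rootTask.dependents.size());
    }
}
```

With this shape, splitPlan can be tested on a bare operator tree and 
generateTask on a bare operator set, and either one can be replaced without 
touching the other.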



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
