Github user jihoonson commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/402#discussion_r26217412
  
    --- Diff: tajo-core/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java ---
    @@ -1371,32 +1373,57 @@ public LogicalNode visitUnion(GlobalPlanContext context, LogicalPlan plan, Logic
           LogicalPlan.QueryBlock rightQueryBlock = plan.getBlock(node.getRightChild());
           LogicalNode rightChild = visit(context, plan, rightQueryBlock, rightQueryBlock.getRoot(), stack);
           stack.pop();
    +      
    +      MasterPlan masterPlan = context.getPlan();
     
           List<ExecutionBlock> unionBlocks = Lists.newArrayList();
           List<ExecutionBlock> queryBlockBlocks = Lists.newArrayList();
     
           ExecutionBlock leftBlock = context.execBlockMap.remove(leftChild.getPID());
           ExecutionBlock rightBlock = context.execBlockMap.remove(rightChild.getPID());
    +      
    +      boolean leftUnion = (leftChild.getType() == NodeType.TABLE_SUBQUERY) &&
    +          (((TableSubQueryNode)leftChild).getSubQuery().getType() == NodeType.UNION);
    +      boolean rightUnion = (rightChild.getType() == NodeType.TABLE_SUBQUERY) &&
    +          (((TableSubQueryNode)rightChild).getSubQuery().getType() == NodeType.UNION);
           if (leftChild.getType() == NodeType.UNION) {
             unionBlocks.add(leftBlock);
           } else {
    -        queryBlockBlocks.add(leftBlock);
    +        if (leftUnion) {
    +          node.setLeftChild(
    +              createScanNodeWithTableSubQuery(masterPlan, 
    --- End diff --
    
    Please check whether I understand your patch correctly.
    When a union has a subquery child that itself has a union child, the 
subquery and its child union should be removed. Consider an example union 
```u1```. It has a child subquery ```s1```, which has another union child, 
```u2```. Finally, ```u2``` has a scan child, ```scan1```. In this case, 
```u1``` should be directly connected to ```scan1```.
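    
    To make the expected shape concrete, here is a minimal sketch of the 
flattening described above. It uses hypothetical toy classes (```Node```, 
```Kind```, ```flattenUnionInputs```) that are not Tajo's ```LogicalNode``` 
API and not code from the patch; it only illustrates which nodes I would 
expect ```u1``` to read from.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy plan nodes for illustration only; these are NOT Tajo classes.
class UnionFlattenSketch {
  enum Kind { UNION, TABLE_SUBQUERY, SCAN }

  static final class Node {
    final Kind kind;
    final String name;
    final List<Node> children = new ArrayList<>();

    Node(Kind kind, String name, Node... kids) {
      this.kind = kind;
      this.name = name;
      for (Node k : kids) {
        children.add(k);
      }
    }
  }

  // Returns the nodes a union should read from directly, skipping nested
  // unions and any subquery whose single child is a union.
  static List<Node> flattenUnionInputs(Node union) {
    List<Node> inputs = new ArrayList<>();
    collect(union, inputs);
    return inputs;
  }

  private static void collect(Node node, List<Node> out) {
    boolean isUnion = node.kind == Kind.UNION;
    boolean isSubqueryOverUnion = node.kind == Kind.TABLE_SUBQUERY
        && node.children.size() == 1
        && node.children.get(0).kind == Kind.UNION;
    if (isUnion || isSubqueryOverUnion) {
      // Look through this node and keep descending.
      for (Node child : node.children) {
        collect(child, out);
      }
    } else {
      // Anything else (for example a scan) is a direct input of the union.
      out.add(node);
    }
  }

  public static void main(String[] args) {
    // The example from this comment: u1 - s1 - u2 - scan1.
    Node scan1 = new Node(Kind.SCAN, "scan1");
    Node u2 = new Node(Kind.UNION, "u2", scan1);
    Node s1 = new Node(Kind.TABLE_SUBQUERY, "s1", u2);
    Node u1 = new Node(Kind.UNION, "u1", s1);

    for (Node input : flattenUnionInputs(u1)) {
      System.out.println(input.name); // prints "scan1"
    }
  }
}
```

    In this sketch ```u1``` ends up with ```scan1``` as its only direct 
input, with no ```s1``` or ```u2``` in between.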
    
    However, with your patch, the example query becomes (```scan1``` - 
```u2``` - ```s1```) - (```scan2``` - ```u1```). Parentheses denote query 
blocks, and ```scan2``` is created at this line. In this case, I think it is 
difficult to avoid unnecessary materialization of intermediate data between 
query blocks.
    
    If I have misunderstood something, please tell me.

