[ 
https://issues.apache.org/jira/browse/DRILL-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15228484#comment-15228484
 ] 

ASF GitHub Bot commented on DRILL-4539:
---------------------------------------

Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/drill/pull/462#discussion_r58728331
  
    --- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java
 ---
    @@ -169,4 +176,223 @@ private static boolean containIdentity(List<? extends 
RexNode> exps,
         }
         return true;
       }
    +
    +  /**
    +   * Copied from {@link RelOptUtil#splitJoinCondition(RelNode, RelNode, 
RexNode, List, List)}. Modified to rewrite
    --- End diff --
    
    Agree that we ideally should leverage the Calcite code..especially since 
this method is pretty heavily used and modified periodically so keeping Drill's 
version of this method in sync will be difficult. 


> Add support for Null Equality Joins
> -----------------------------------
>
>                 Key: DRILL-4539
>                 URL: https://issues.apache.org/jira/browse/DRILL-4539
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Jacques Nadeau
>            Assignee: Venki Korukanti
>
> Tableau frequently generates queries similar to this:
> {code}
> SELECT `t0`.`city` AS `city`,
>   `t2`.`X_measure__B` AS `max_Calculation_DFIDBHHAIIECCJFDAG_ok`,
>   `t0`.`state` AS `state`,
>   `t0`.`sum_stars_ok` AS `sum_stars_ok`
> FROM (
>   SELECT `business`.`city` AS `city`,
>     `business`.`state` AS `state`,
>     SUM(`business`.`stars`) AS `sum_stars_ok`
>   FROM `mongo.academic`.`business` `business`
>   GROUP BY `business`.`city`,
>     `business`.`state`
> ) `t0`
>   INNER JOIN (
>   SELECT MAX(`t1`.`X_measure__A`) AS `X_measure__B`,
>     `t1`.`city` AS `city`,
>     `t1`.`state` AS `state`
>   FROM (
>     SELECT `business`.`city` AS `city`,
>       `business`.`state` AS `state`,
>       `business`.`business_id` AS `business_id`,
>       SUM(`business`.`stars`) AS `X_measure__A`
>     FROM `mongo.academic`.`business` `business`
>     GROUP BY `business`.`city`,
>       `business`.`state`,
>       `business`.`business_id`
>   ) `t1`
>   GROUP BY `t1`.`city`,
>     `t1`.`state`
> ) `t2` ON (((`t0`.`city` = `t2`.`city`) OR ((`t0`.`city` IS NULL) AND 
> (`t2`.`city` IS NULL))) AND ((`t0`.`state` = `t2`.`state`) OR ((`t0`.`state` 
> IS NULL) AND (`t2`.`state` IS NULL))))
> {code}
> If you look at the join condition, you'll note that the join condition is an 
> equality condition which also allows null=null. We should add a planning 
> rewrite rule and execution join option to allow null equality so that we 
> don't treat this as a cartesian join.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to