Github user baibaichen commented on the issue:

    https://github.com/apache/spark/pull/18652
  
    @viirya, @jiangxb1987, @gatorsmile
    
    In general, Hive doesn't take non-determinism into account in join conditions.
    
    Some terms:
    
    1. Equi-join conditions on keys, e.g. `a.key = b.key`, denoted **Joink**
    2. Filter conditions, e.g. `a.key = 2` or `a.key > 1`, denoted **JoinF**
    
    Prior to 2.2.0, Hive didn't support OR in the join condition, so the condition looks like this:
    
    > _Joink_ **and** _Joink_ **and** _JoinF_
    
    For **Joink**, the keys are extracted and later used for hashing (in a reduce-side or map-side join). For **JoinF**, the filters are pushed down according to [OuterJoinBehavior](https://cwiki.apache.org/confluence/display/Hive/OuterJoinBehavior).
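
    The split between **Joink** and **JoinF** can be illustrated with a minimal sketch. This is a toy model, not Hive's actual code: the `Cmp` class, `is_column`, and `split_conjuncts` are hypothetical names, and the rule "a column reference looks like `alias.name`" is an assumption made only for this example.

```python
from dataclasses import dataclass

# Hypothetical, simplified model of one conjunct in an ON clause.
# This is NOT how Hive represents conditions internally.
@dataclass(frozen=True)
class Cmp:
    left: str   # e.g. "a.key"
    op: str     # "=", ">", ...
    right: str  # a column reference like "b.key" or a literal like "2"

def is_column(term: str) -> bool:
    # Assumption for this sketch: column references look like "<alias>.<name>".
    alias, dot, name = term.partition(".")
    return bool(dot) and alias.isidentifier() and name.isidentifier()

def alias_of(term: str) -> str:
    return term.partition(".")[0]

def split_conjuncts(conjuncts):
    """Split AND-ed conjuncts into equi-join keys (Joink) and filters (JoinF)."""
    join_keys, filters = [], []
    for c in conjuncts:
        is_equi_join = (
            c.op == "="
            and is_column(c.left)
            and is_column(c.right)
            and alias_of(c.left) != alias_of(c.right)  # columns from two different tables
        )
        (join_keys if is_equi_join else filters).append(c)
    return join_keys, filters

# ON a.key = b.key AND a.id = b.id AND a.key > 1
cond = [Cmp("a.key", "=", "b.key"), Cmp("a.id", "=", "b.id"), Cmp("a.key", ">", "1")]
keys, filters = split_conjuncts(cond)
```

    Here `keys` holds the two equi-join conjuncts and `filters` holds `a.key > 1`, which is the part that would be a candidate for pushdown.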
    
    All of this code is in [`SemanticAnalyzer.parseJoinCondition`](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L2854). Predicate pushdown starts at line [2902](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L2902).
    
    Since 2.2.0 (with [HIVE-15211](https://issues.apache.org/jira/browse/HIVE-15211), [HIVE-15251](https://issues.apache.org/jira/browse/HIVE-15251)), Hive supports complex expressions in ON clauses, but it still doesn't take non-determinism into account.
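
    To show why ignoring non-determinism matters, here is a small sketch of how pushing a non-deterministic filter below a join changes how often the predicate is evaluated. It is a toy in-memory model, not Hive or Spark code: the tables, `join`, and `make_predicate` are all made up for illustration.

```python
import random

def make_predicate(seed):
    """Return a rand()-like predicate plus a counter of how often it was called."""
    rng = random.Random(seed)
    calls = {"n": 0}
    def pred(row):
        calls["n"] += 1
        return rng.random() < 0.5  # non-deterministic: ignores the row's content
    return pred, calls

# Toy tables: every key in `a` matches two rows in `b`.
a = [{"key": k} for k in (1, 2, 3)]
b = [{"key": k} for k in (1, 1, 2, 2, 3, 3)]

def join(left, right):
    # Naive nested-loop equi-join on "key".
    return [(l, r) for l in left for r in right if l["key"] == r["key"]]

# Plan 1: evaluate the filter after the join, once per joined row.
pred1, calls1 = make_predicate(0)
after_join = [(l, r) for (l, r) in join(a, b) if pred1(l)]

# Plan 2: push the filter down to `a`, once per row of `a`.
pred2, calls2 = make_predicate(0)
pushed = join([l for l in a if pred2(l)], b)
```

    In plan 1 the predicate runs 6 times and each joined row is kept or dropped independently. In plan 2 it runs 3 times, so both matches for a given row of `a` are kept or dropped together. The two plans are therefore not equivalent when the predicate is non-deterministic.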
    
    Hive simply pushes filters down whenever possible. Given that, I agree with @viirya's suggestion.


