Re: Review Request 69663: HIVE-16976

2019-01-11 Thread Deepak Jaiswal


> On Jan. 11, 2019, 6:07 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 35 (patched)
> > 
> >
> > This is not needed?

Yes. That is correct. I will remove it before committing.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211896
---


On Jan. 9, 2019, 5:50 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 9, 2019, 5:50 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d7f069eaa7 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out f900a01be4 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 00bc6cec55 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out 1cf281afbd 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out 6255abdd70 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 69663: HIVE-16976

2019-01-11 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211896
---




ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
Lines 35 (patched)


This is not needed?


- Jesús Camacho Rodríguez


On Jan. 9, 2019, 5:50 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 9, 2019, 5:50 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d7f069eaa7 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out f900a01be4 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 00bc6cec55 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out 1cf281afbd 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out 6255abdd70 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 69663: HIVE-16976

2019-01-09 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/
---

(Updated Jan. 9, 2019, 5:50 p.m.)


Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
and Jason Dere.


Changes
---

Updated the patch with review comments.


Bugs: HIVE-16976
https://issues.apache.org/jira/browse/HIVE-16976


Repository: hive-git


Description
---

DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN

The patch supports predicates on non-equi joins and provides an interface for 
storage handler to decide if it can use this optimization.
Work to integrate this with DPP and semijoin will be done in separate JIRA.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d7f069eaa7 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
2ebb149354 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
 a1401aac72 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
676dfc9421 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
e97e44796f 
  ql/src/test/results/clientpositive/llap/cross_prod_1.q.out f900a01be4 
  ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
de74af6dff 
  ql/src/test/results/clientpositive/llap/semijoin.q.out 00bc6cec55 
  ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
  ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
  ql/src/test/results/clientpositive/llap/subquery_scalar.q.out 1cf281afbd 
  ql/src/test/results/clientpositive/llap/subquery_select.q.out 6255abdd70 


Diff: https://reviews.apache.org/r/69663/diff/2/

Changes: https://reviews.apache.org/r/69663/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



Re: Review Request 69663: HIVE-16976

2019-01-08 Thread Jesús Camacho Rodríguez


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 330 (patched)
> > 
> >
> > Should we still return null if function text is not recognized?
> 
> Deepak Jaiswal wrote:
> Yes, that helps recognize unsupported functions. For eg,
> 
> ExprNodeGenericFuncDesc funcDesc = (ExprNodeGenericFuncDesc) 
> filter;
> // filter should be of type <, >, <= or >=
> if (getFuncText(funcDesc.getFuncText(), 1) == null) {
>   // unsupported
>   continue;
> }
> 
> I am open to better ways, hence the TODO.
> 
> Jesús Camacho Rodríguez wrote:
> Sorry, I did not express myself properly. Within the if (srcPos == 0) {, 
> shouldn't we return null if function text is not recognized (similar to what 
> we do below that you pointed out)?
> 
> Deepak Jaiswal wrote:
> That would require verifying the function text which is done in switch 
> case anyway.
> Inorder for non-equi join's synthetic joins to work properly, if the 
> switch case cant get a valid inversion text then it is not supported.
> That is why I used "1" to make sure it goes through the switch case. This 
> eliminates duplicating similar logic.
> 
> Jesús Camacho Rodríguez wrote:
> OK, I was getting confused by the semantics of the srcPos parameter (an 
> 'invert' boolean would have been clearer).
> Tbh, I think it is better to create two methods: one internal in 
> SyntheticJoinPredicate that would return whether a function is supported or 
> not, and a utility method in FunctionRegistry that would return the inverse 
> of a given function. Overhead is neglibible and there will be clear different 
> semantics.
> 
> Deepak Jaiswal wrote:
> Having two functions could create a maintenance headache. As mentioned 
> above, the function will go to FunctionRegistry. There is already a comment,
> return null; // helps identify unsupported functions
> 
> I can expand the comment to make things clearer.
> Leaving the function as it is keeps things short and sweet and involves 
> much less maintenance.

I disagree this is a maintenance headache. Inverse/reverse method is not 
dependent from the logic in SyntheticJoinPredicate; in fact, we have a similar 
utility method for Calcite expressions:
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rex/RexUtil.java#L1830
This method can be handy when you want to swap the inputs of a given 
expression. It may support other operations in the future, but you do not want 
to affect whether this is a valid predicate for SyntheticJoinPredicate, e.g., 
'=' with inverse '='.


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: 

Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Deepak Jaiswal


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 284 (patched)
> > 
> >
> > We should add a call to extended version here as we did above for 
> > equality predicates. The only required change seems to be in 
> > _addParentReduceSink_ called from _createDerivatives_, which would receive 
> > the comparison operator from here. All the rest should already work as 
> > expected.
> > 
> > I believe this could be addressed in this JIRA since it is not a lot of 
> > code. However, if it is not addressed, please create follow-up and leave a 
> > TODO.
> 
> Deepak Jaiswal wrote:
> Will add the extended version. Thanks for bringing this up.

The existing logic for extension works for equality. I am planning to do this 
later. HIVE-21098 tracks it.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Deepak Jaiswal


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 330 (patched)
> > 
> >
> > Should we still return null if function text is not recognized?
> 
> Deepak Jaiswal wrote:
> Yes, that helps recognize unsupported functions. For eg,
> 
> ExprNodeGenericFuncDesc funcDesc = (ExprNodeGenericFuncDesc) 
> filter;
> // filter should be of type <, >, <= or >=
> if (getFuncText(funcDesc.getFuncText(), 1) == null) {
>   // unsupported
>   continue;
> }
> 
> I am open to better ways, hence the TODO.
> 
> Jesús Camacho Rodríguez wrote:
> Sorry, I did not express myself properly. Within the if (srcPos == 0) {, 
> shouldn't we return null if function text is not recognized (similar to what 
> we do below that you pointed out)?
> 
> Deepak Jaiswal wrote:
> That would require verifying the function text which is done in switch 
> case anyway.
> Inorder for non-equi join's synthetic joins to work properly, if the 
> switch case cant get a valid inversion text then it is not supported.
> That is why I used "1" to make sure it goes through the switch case. This 
> eliminates duplicating similar logic.
> 
> Jesús Camacho Rodríguez wrote:
> OK, I was getting confused by the semantics of the srcPos parameter (an 
> 'invert' boolean would have been clearer).
> Tbh, I think it is better to create two methods: one internal in 
> SyntheticJoinPredicate that would return whether a function is supported or 
> not, and a utility method in FunctionRegistry that would return the inverse 
> of a given function. Overhead is neglibible and there will be clear different 
> semantics.

Having two functions could create a maintenance headache. As mentioned above, 
the function will go to FunctionRegistry. There is already a comment,
return null; // helps identify unsupported functions

I can expand the comment to make things clearer.
Leaving the function as it is keeps things short and sweet and involves much 
less maintenance.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Deepak Jaiswal


> On Jan. 7, 2019, 10:40 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 254 (patched)
> > 
> >
> > srcPos and targetPos do not seem to refer to function, but rather the 
> > inputs being joined. In addition, they do not change between loop 
> > iterations. Below, they are used to retrieve the child expression from the 
> > function, which does not seem correct.
> 
> Jesús Camacho Rodríguez wrote:
> OK, seeing your comment above, I understood better the code here. You may 
> be inverting the source and target, that is why you access the function 
> expression using them. Could you leave a comment explaining it?
> My comment above about the value change for srcPos and targetPos between 
> iterations still seems valid, the check could be done before to skip the the 
> loop in line 242.

Thanks for the tip. Yes, it makes sense to have the check before the loop 
begins as srcPos and targetPos do not change. We can skip the whole logic with 
this condition even before the if condition.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211745
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Jesús Camacho Rodríguez


> On Jan. 7, 2019, 10:40 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 254 (patched)
> > 
> >
> > srcPos and targetPos do not seem to refer to function, but rather the 
> > inputs being joined. In addition, they do not change between loop 
> > iterations. Below, they are used to retrieve the child expression from the 
> > function, which does not seem correct.

OK, seeing your comment above, I understood better the code here. You may be 
inverting the source and target, that is why you access the function expression 
using them. Could you leave a comment explaining it?
My comment above about the value change for srcPos and targetPos between 
iterations still seems valid, the check could be done before to skip the the 
loop in line 242.


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211745
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Jesús Camacho Rodríguez


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 330 (patched)
> > 
> >
> > Should we still return null if function text is not recognized?
> 
> Deepak Jaiswal wrote:
> Yes, that helps recognize unsupported functions. For eg,
> 
> ExprNodeGenericFuncDesc funcDesc = (ExprNodeGenericFuncDesc) 
> filter;
> // filter should be of type <, >, <= or >=
> if (getFuncText(funcDesc.getFuncText(), 1) == null) {
>   // unsupported
>   continue;
> }
> 
> I am open to better ways, hence the TODO.
> 
> Jesús Camacho Rodríguez wrote:
> Sorry, I did not express myself properly. Within the if (srcPos == 0) {, 
> shouldn't we return null if function text is not recognized (similar to what 
> we do below that you pointed out)?
> 
> Deepak Jaiswal wrote:
> That would require verifying the function text which is done in switch 
> case anyway.
> Inorder for non-equi join's synthetic joins to work properly, if the 
> switch case cant get a valid inversion text then it is not supported.
> That is why I used "1" to make sure it goes through the switch case. This 
> eliminates duplicating similar logic.

OK, I was getting confused by the semantics of the srcPos parameter (an 
'invert' boolean would have been clearer).
Tbh, I think it is better to create two methods: one internal in 
SyntheticJoinPredicate that would return whether a function is supported or 
not, and a utility method in FunctionRegistry that would return the inverse of 
a given function. Overhead is neglibible and there will be clear different 
semantics.


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Deepak Jaiswal


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Line 182 (original)
> > 
> >
> > Can you bring this back and delete 'if (sourceKeys.size() > 0) {' 
> > below? This is just a style change and indenting so many lines will just 
> > make more difficult following code provenance.
> 
> Deepak Jaiswal wrote:
> The continue is removed so that it reaches the residualFilter logic, 
> otherwise it would skip everything and move on to next target.
> 
> Jesús Camacho Rodríguez wrote:
> You are right, I did not see the extra }. Could the comment '//if 
> (sourceKeys.size() < 1) continue;' below be removed then? No need to leave it 
> there.

Sure. I forgot to remove it.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 330 (patched)
> > 
> >
> > Should we still return null if function text is not recognized?
> 
> Deepak Jaiswal wrote:
> Yes, that helps recognize unsupported functions. For eg,
> 
> ExprNodeGenericFuncDesc funcDesc = (ExprNodeGenericFuncDesc) 
> filter;
> // filter should be of type <, >, <= or >=
> if (getFuncText(funcDesc.getFuncText(), 1) == null) {
>   // unsupported
>   continue;
> }
> 
> I am open to better ways, hence the TODO.
> 
> Jesús Camacho Rodríguez wrote:
> Sorry, I did not express myself properly. Within the if (srcPos == 0) {, 
> shouldn't we return null if function text is not recognized (similar to what 
> we do below that you pointed out)?

That would require verifying the function text which is done in switch case 
anyway.
Inorder for non-equi join's synthetic joins to work properly, if the switch 
case cant get a valid inversion text then it is not supported.
That is why I used "1" to make sure it goes through the switch case. This 
eliminates duplicating similar logic.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Jesús Camacho Rodríguez


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Line 182 (original)
> > 
> >
> > Can you bring this back and delete 'if (sourceKeys.size() > 0) {' 
> > below? This is just a style change and indenting so many lines will just 
> > make more difficult following code provenance.
> 
> Deepak Jaiswal wrote:
> The continue is removed so that it reaches the residualFilter logic, 
> otherwise it would skip everything and move on to next target.

You are right, I did not see the extra }. Could the comment '//if 
(sourceKeys.size() < 1) continue;' below be removed then? No need to leave it 
there.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 328 (patched)
> > 
> >
> > Can we move this method as _invertFunction_ utility method to 
> > _FunctionRegistry.java_?
> > 
> > In addition, instead of relying on the function text, I believe it 
> > would be more robust to have the UDF as the input. In particular, we can 
> > use _funcDesc.getGenericUDF();_ when calling this method, then rely in e.g. 
> > _udf instanceof GenericUDFOPEqualOrGreaterThan_ for the checks.
> 
> Deepak Jaiswal wrote:
> Yes, I can move this.
> The reason I used function text is because I can use switch case and also 
> much faster.
> Otherwise, once extended in future, this could become a giant mess of 
> if...else statements.
> We can discuss this further.

I was thinking about the function UDFs because it is less shaky than text 
matching, but on the other hand, it is not probable that Hive ever changes the 
string representation for these functions and as you said it is simpler, hence 
it should be fine.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 330 (patched)
> > 
> >
> > Should we still return null if function text is not recognized?
> 
> Deepak Jaiswal wrote:
> Yes, that helps recognize unsupported functions. For eg,
> 
> ExprNodeGenericFuncDesc funcDesc = (ExprNodeGenericFuncDesc) 
> filter;
> // filter should be of type <, >, <= or >=
> if (getFuncText(funcDesc.getFuncText(), 1) == null) {
>   // unsupported
>   continue;
> }
> 
> I am open to better ways, hence the TODO.

Sorry, I did not express myself properly. Within the if (srcPos == 0) {, 
shouldn't we return null if function text is not recognized (similar to what we 
do below that you pointed out)?


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   

Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211745
---




ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
Lines 254 (patched)


srcPos and targetPos do not seem to refer to function, but rather the 
inputs being joined. In addition, they do not change between loop iterations. 
Below, they are used to retrieve the child expression from the function, which 
does not seem correct.


- Jesús Camacho Rodríguez


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Deepak Jaiswal


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > Can we add some tests for the new feature?

The reason there is no test yet is because it does nothing end to end. Both DPP 
route and semijoin reduction route dont process the predicate yet.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Line 182 (original)
> > 
> >
> > Can you bring this back and delete 'if (sourceKeys.size() > 0) {' 
> > below? This is just a style change and indenting so many lines will just 
> > make more difficult following code provenance.

The continue is removed so that it reaches the residualFilter logic, otherwise 
it would skip everything and move on to next target.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 284 (patched)
> > 
> >
> > We should add a call to extended version here as we did above for 
> > equality predicates. The only required change seems to be in 
> > _addParentReduceSink_ called from _createDerivatives_, which would receive 
> > the comparison operator from here. All the rest should already work as 
> > expected.
> > 
> > I believe this could be addressed in this JIRA since it is not a lot of 
> > code. However, if it is not addressed, please create follow-up and leave a 
> > TODO.

Will add the extended version. Thanks for bringing this up.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 318 (patched)
> > 
> >
> > return colExprMap.get(rsColName)

:|


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 328 (patched)
> > 
> >
> > Can we move this method as _invertFunction_ utility method to 
> > _FunctionRegistry.java_?
> > 
> > In addition, instead of relying on the function text, I believe it 
> > would be more robust to have the UDF as the input. In particular, we can 
> > use _funcDesc.getGenericUDF();_ when calling this method, then rely in e.g. 
> > _udf instanceof GenericUDFOPEqualOrGreaterThan_ for the checks.

Yes, I can move this.
The reason I used function text is because I can use switch case and also much 
faster.
Otherwise, once extended in future, this could become a giant mess of if...else 
statements.
We can discuss this further.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 330 (patched)
> > 
> >
> > Should we still return null if function text is not recognized?

Yes, that helps recognize unsupported functions. For eg,

ExprNodeGenericFuncDesc funcDesc = (ExprNodeGenericFuncDesc) filter;
// filter should be of type <, >, <= or >=
if (getFuncText(funcDesc.getFuncText(), 1) == null) {
  // unsupported
  continue;
}

I am open to better ways, hence the TODO.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 

Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---



Can we add some tests for the new feature?


ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
Line 182 (original)


Can you bring this back and delete 'if (sourceKeys.size() > 0) {' below? 
This is just a style change and indenting so many lines will just make more 
difficult following code provenance.



ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
Lines 284 (patched)


We should add a call to extended version here as we did above for equality 
predicates. The only required change seems to be in _addParentReduceSink_ 
called from _createDerivatives_, which would receive the comparison operator 
from here. All the rest should already work as expected.

I believe this could be addressed in this JIRA since it is not a lot of 
code. However, if it is not addressed, please create follow-up and leave a TODO.



ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
Lines 318 (patched)


return colExprMap.get(rsColName)



ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
Lines 328 (patched)


Can we move this method as _invertFunction_ utility method to 
_FunctionRegistry.java_?

In addition, instead of relying on the function text, I believe it would be 
more robust to have the UDF as the input. In particular, we can use 
_funcDesc.getGenericUDF();_ when calling this method, then rely in e.g. _udf 
instanceof GenericUDFOPEqualOrGreaterThan_ for the checks.



ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
Lines 330 (patched)


Should we still return null if function text is not recognized?


- Jesús Camacho Rodríguez


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Review Request 69663: HIVE-16976

2019-01-03 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/
---

Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
and Jason Dere.


Bugs: HIVE-16976
https://issues.apache.org/jira/browse/HIVE-16976


Repository: hive-git


Description
---

DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN

The patch supports predicates on non-equi joins and provides an interface for 
storage handler to decide if it can use this optimization.
Work to integrate this with DPP and semijoin will be done in separate JIRA.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
2ebb149354 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
 a1401aac72 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
676dfc9421 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
e97e44796f 
  ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
  ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
de74af6dff 
  ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
  ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
  ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
  ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
  ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 


Diff: https://reviews.apache.org/r/69663/diff/1/


Testing
---


Thanks,

Deepak Jaiswal