Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Deepak Jaiswal


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 284 (patched)
> > 
> >
> > We should add a call to extended version here as we did above for 
> > equality predicates. The only required change seems to be in 
> > _addParentReduceSink_ called from _createDerivatives_, which would receive 
> > the comparison operator from here. All the rest should already work as 
> > expected.
> > 
> > I believe this could be addressed in this JIRA since it is not a lot of 
> > code. However, if it is not addressed, please create follow-up and leave a 
> > TODO.
> 
> Deepak Jaiswal wrote:
> Will add the extended version. Thanks for bringing this up.

The existing logic for extension works for equality. I am planning to do this 
later. HIVE-21098 tracks it.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



[jira] [Created] (HIVE-21098) DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN needs extension

2019-01-07 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-21098:
-

 Summary: DPP: SyntheticJoinPredicate transitivity for < > and 
BETWEEN needs extension
 Key: HIVE-21098
 URL: https://issues.apache.org/jira/browse/HIVE-21098
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


SyntheticJoinPredicates are supported for equality. Both in regular and 
extended format.

Similar extended format is needed for non-equi joins too.

 

See HIVE-16976



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Deepak Jaiswal


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 330 (patched)
> > 
> >
> > Should we still return null if function text is not recognized?
> 
> Deepak Jaiswal wrote:
> Yes, that helps recognize unsupported functions. For eg,
> 
> ExprNodeGenericFuncDesc funcDesc = (ExprNodeGenericFuncDesc) 
> filter;
> // filter should be of type <, >, <= or >=
> if (getFuncText(funcDesc.getFuncText(), 1) == null) {
>   // unsupported
>   continue;
> }
> 
> I am open to better ways, hence the TODO.
> 
> Jesús Camacho Rodríguez wrote:
> Sorry, I did not express myself properly. Within the if (srcPos == 0) {, 
> shouldn't we return null if function text is not recognized (similar to what 
> we do below that you pointed out)?
> 
> Deepak Jaiswal wrote:
> That would require verifying the function text which is done in switch 
> case anyway.
> Inorder for non-equi join's synthetic joins to work properly, if the 
> switch case cant get a valid inversion text then it is not supported.
> That is why I used "1" to make sure it goes through the switch case. This 
> eliminates duplicating similar logic.
> 
> Jesús Camacho Rodríguez wrote:
> OK, I was getting confused by the semantics of the srcPos parameter (an 
> 'invert' boolean would have been clearer).
> Tbh, I think it is better to create two methods: one internal in 
> SyntheticJoinPredicate that would return whether a function is supported or 
> not, and a utility method in FunctionRegistry that would return the inverse 
> of a given function. Overhead is neglibible and there will be clear different 
> semantics.

Having two functions could create a maintenance headache. As mentioned above, 
the function will go to FunctionRegistry. There is already a comment,
return null; // helps identify unsupported functions

I can expand the comment to make things clearer.
Leaving the function as it is keeps things short and sweet and involves much 
less maintenance.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Deepak Jaiswal


> On Jan. 7, 2019, 10:40 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 254 (patched)
> > 
> >
> > srcPos and targetPos do not seem to refer to function, but rather the 
> > inputs being joined. In addition, they do not change between loop 
> > iterations. Below, they are used to retrieve the child expression from the 
> > function, which does not seem correct.
> 
> Jesús Camacho Rodríguez wrote:
> OK, seeing your comment above, I understood better the code here. You may 
> be inverting the source and target, that is why you access the function 
> expression using them. Could you leave a comment explaining it?
> My comment above about the value change for srcPos and targetPos between 
> iterations still seems valid, the check could be done before to skip the the 
> loop in line 242.

Thanks for the tip. Yes, it makes sense to have the check before the loop 
begins as srcPos and targetPos do not change. We can skip the whole logic with 
this condition even before the if condition.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211745
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Jesús Camacho Rodríguez


> On Jan. 7, 2019, 10:40 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 254 (patched)
> > 
> >
> > srcPos and targetPos do not seem to refer to function, but rather the 
> > inputs being joined. In addition, they do not change between loop 
> > iterations. Below, they are used to retrieve the child expression from the 
> > function, which does not seem correct.

OK, seeing your comment above, I understood better the code here. You may be 
inverting the source and target, that is why you access the function expression 
using them. Could you leave a comment explaining it?
My comment above about the value change for srcPos and targetPos between 
iterations still seems valid, the check could be done before to skip the the 
loop in line 242.


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211745
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Jesús Camacho Rodríguez


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 330 (patched)
> > 
> >
> > Should we still return null if function text is not recognized?
> 
> Deepak Jaiswal wrote:
> Yes, that helps recognize unsupported functions. For eg,
> 
> ExprNodeGenericFuncDesc funcDesc = (ExprNodeGenericFuncDesc) 
> filter;
> // filter should be of type <, >, <= or >=
> if (getFuncText(funcDesc.getFuncText(), 1) == null) {
>   // unsupported
>   continue;
> }
> 
> I am open to better ways, hence the TODO.
> 
> Jesús Camacho Rodríguez wrote:
> Sorry, I did not express myself properly. Within the if (srcPos == 0) {, 
> shouldn't we return null if function text is not recognized (similar to what 
> we do below that you pointed out)?
> 
> Deepak Jaiswal wrote:
> That would require verifying the function text which is done in switch 
> case anyway.
> Inorder for non-equi join's synthetic joins to work properly, if the 
> switch case cant get a valid inversion text then it is not supported.
> That is why I used "1" to make sure it goes through the switch case. This 
> eliminates duplicating similar logic.

OK, I was getting confused by the semantics of the srcPos parameter (an 
'invert' boolean would have been clearer).
Tbh, I think it is better to create two methods: one internal in 
SyntheticJoinPredicate that would return whether a function is supported or 
not, and a utility method in FunctionRegistry that would return the inverse of 
a given function. Overhead is neglibible and there will be clear different 
semantics.


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Deepak Jaiswal


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Line 182 (original)
> > 
> >
> > Can you bring this back and delete 'if (sourceKeys.size() > 0) {' 
> > below? This is just a style change and indenting so many lines will just 
> > make more difficult following code provenance.
> 
> Deepak Jaiswal wrote:
> The continue is removed so that it reaches the residualFilter logic, 
> otherwise it would skip everything and move on to next target.
> 
> Jesús Camacho Rodríguez wrote:
> You are right, I did not see the extra }. Could the comment '//if 
> (sourceKeys.size() < 1) continue;' below be removed then? No need to leave it 
> there.

Sure. I forgot to remove it.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 330 (patched)
> > 
> >
> > Should we still return null if function text is not recognized?
> 
> Deepak Jaiswal wrote:
> Yes, that helps recognize unsupported functions. For eg,
> 
> ExprNodeGenericFuncDesc funcDesc = (ExprNodeGenericFuncDesc) 
> filter;
> // filter should be of type <, >, <= or >=
> if (getFuncText(funcDesc.getFuncText(), 1) == null) {
>   // unsupported
>   continue;
> }
> 
> I am open to better ways, hence the TODO.
> 
> Jesús Camacho Rodríguez wrote:
> Sorry, I did not express myself properly. Within the if (srcPos == 0) {, 
> shouldn't we return null if function text is not recognized (similar to what 
> we do below that you pointed out)?

That would require verifying the function text which is done in switch case 
anyway.
Inorder for non-equi join's synthetic joins to work properly, if the switch 
case cant get a valid inversion text then it is not supported.
That is why I used "1" to make sure it goes through the switch case. This 
eliminates duplicating similar logic.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Jesús Camacho Rodríguez


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Line 182 (original)
> > 
> >
> > Can you bring this back and delete 'if (sourceKeys.size() > 0) {' 
> > below? This is just a style change and indenting so many lines will just 
> > make more difficult following code provenance.
> 
> Deepak Jaiswal wrote:
> The continue is removed so that it reaches the residualFilter logic, 
> otherwise it would skip everything and move on to next target.

You are right, I did not see the extra }. Could the comment '//if 
(sourceKeys.size() < 1) continue;' below be removed then? No need to leave it 
there.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 328 (patched)
> > 
> >
> > Can we move this method as _invertFunction_ utility method to 
> > _FunctionRegistry.java_?
> > 
> > In addition, instead of relying on the function text, I believe it 
> > would be more robust to have the UDF as the input. In particular, we can 
> > use _funcDesc.getGenericUDF();_ when calling this method, then rely in e.g. 
> > _udf instanceof GenericUDFOPEqualOrGreaterThan_ for the checks.
> 
> Deepak Jaiswal wrote:
> Yes, I can move this.
> The reason I used function text is because I can use switch case and also 
> much faster.
> Otherwise, once extended in future, this could become a giant mess of 
> if...else statements.
> We can discuss this further.

I was thinking about the function UDFs because it is less shaky than text 
matching, but on the other hand, it is not probable that Hive ever changes the 
string representation for these functions and as you said it is simpler, hence 
it should be fine.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 330 (patched)
> > 
> >
> > Should we still return null if function text is not recognized?
> 
> Deepak Jaiswal wrote:
> Yes, that helps recognize unsupported functions. For eg,
> 
> ExprNodeGenericFuncDesc funcDesc = (ExprNodeGenericFuncDesc) 
> filter;
> // filter should be of type <, >, <= or >=
> if (getFuncText(funcDesc.getFuncText(), 1) == null) {
>   // unsupported
>   continue;
> }
> 
> I am open to better ways, hence the TODO.

Sorry, I did not express myself properly. Within the if (srcPos == 0) {, 
shouldn't we return null if function text is not recognized (similar to what we 
do below that you pointed out)?


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   

Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211745
---




ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
Lines 254 (patched)


srcPos and targetPos do not seem to refer to function, but rather the 
inputs being joined. In addition, they do not change between loop iterations. 
Below, they are used to retrieve the child expression from the function, which 
does not seem correct.


- Jesús Camacho Rodríguez


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Deepak Jaiswal


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > Can we add some tests for the new feature?

The reason there is no test yet is because it does nothing end to end. Both DPP 
route and semijoin reduction route dont process the predicate yet.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Line 182 (original)
> > 
> >
> > Can you bring this back and delete 'if (sourceKeys.size() > 0) {' 
> > below? This is just a style change and indenting so many lines will just 
> > make more difficult following code provenance.

The continue is removed so that it reaches the residualFilter logic, otherwise 
it would skip everything and move on to next target.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 284 (patched)
> > 
> >
> > We should add a call to extended version here as we did above for 
> > equality predicates. The only required change seems to be in 
> > _addParentReduceSink_ called from _createDerivatives_, which would receive 
> > the comparison operator from here. All the rest should already work as 
> > expected.
> > 
> > I believe this could be addressed in this JIRA since it is not a lot of 
> > code. However, if it is not addressed, please create follow-up and leave a 
> > TODO.

Will add the extended version. Thanks for bringing this up.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 318 (patched)
> > 
> >
> > return colExprMap.get(rsColName)

:|


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 328 (patched)
> > 
> >
> > Can we move this method as _invertFunction_ utility method to 
> > _FunctionRegistry.java_?
> > 
> > In addition, instead of relying on the function text, I believe it 
> > would be more robust to have the UDF as the input. In particular, we can 
> > use _funcDesc.getGenericUDF();_ when calling this method, then rely in e.g. 
> > _udf instanceof GenericUDFOPEqualOrGreaterThan_ for the checks.

Yes, I can move this.
The reason I used function text is because I can use switch case and also much 
faster.
Otherwise, once extended in future, this could become a giant mess of if...else 
statements.
We can discuss this further.


> On Jan. 7, 2019, 4:41 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
> > Lines 330 (patched)
> > 
> >
> > Should we still return null if function text is not recognized?

Yes, that helps recognize unsupported functions. For eg,

ExprNodeGenericFuncDesc funcDesc = (ExprNodeGenericFuncDesc) filter;
// filter should be of type <, >, <= or >=
if (getFuncText(funcDesc.getFuncText(), 1) == null) {
  // unsupported
  continue;
}

I am open to better ways, hence the TODO.


- Deepak


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 

Re: Review Request 69664: HIVE-21077 : Database and Catalogs should have creation time

2019-01-07 Thread Adam Holley via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69664/#review211734
---




standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Catalog.java
Lines 137 (patched)


nit: extraneous space.



standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Catalog.java
Lines 601 (patched)


nit: extraneous space.



standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java
Lines 178 (patched)


nit: extraneous space



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 771 (patched)


Need parenthesis around the (System.currentTimeMillis() / 1000).



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 786 (patched)


Need parenthesis around (System.currentTimeMillis() / 1000).



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 1063 (patched)


Need parenthesis around (System.currentTimeMillis() / 1000).



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Line 1289 (original), 1293 (patched)


Need parenthesis around (System.currentTimeMillis() / 1000).


- Adam Holley


On Jan. 7, 2019, 7:16 p.m., Vihang Karajgaonkar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69664/
> ---
> 
> (Updated Jan. 7, 2019, 7:16 p.m.)
> 
> 
> Review request for hive, Karthik Manamcheri, Naveen Gangam, and Peter Vary.
> 
> 
> Bugs: HIVE-21077
> https://issues.apache.org/jira/browse/HIVE-21077
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21077 : Database and Catalogs should have creation time
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Catalog.java
>  3eb4dbd51110dd6e5d04c3bdacde2e5bdba09a7c 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java
>  994797698a379e0b08604d73d2d6728a2fcee4df 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  2953a8f327eabdee42dbc66e0c65f89d17add59a 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  f8b862862de4dde8dce3d0dc5f70eafb67b02d2c 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  dfc5d7b294c1df8d5c6b0e7d676a77ba1181e076 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 7d09a5c296a8ae924d61b200b4cb9135440fd9a0 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  a9398ae1e79404a15894aa42f451df5d18ed3e4c 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
>  58dc6eefcb840d4dd70af7a47811fab1b5e696d9 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  d43c0c1e70cffbebd39b05f89ec396227c58ac77 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/client/builder/DatabaseBuilder.java
>  f3d2182a04ab81417a4ba58d9340721513e8 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MCatalog.java
>  e82cb4322f6e2ac7afeb5efcec7517a68c8b2dee 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MDatabase.java
>  815b39c483b2233660310983d58194fb1ab2d107 
>   standalone-metastore/metastore-server/src/main/resources/package.jdo 
> caaec457194332a99d5cd57bef746e969dd38161 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
>  a3c4196dbff7e53be5317631b314983d16a99020 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
>  bcaebd18accf86846ae44a6498046514575fc069 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
>  5ea1b4450d8258e841bb4af7381ca6fb0ba1a827 
>   
> 

Re: Review Request 69664: HIVE-21077 : Database and Catalogs should have creation time

2019-01-07 Thread Vihang Karajgaonkar via Review Board


> On Jan. 4, 2019, 5:33 p.m., Karthik Manamcheri wrote:
> >
> 
> Karthik Manamcheri wrote:
> 1. While we are at it, can we make sure that the "create time" exists for 
> everything else (if required)? Such as catalog, partitions..
> 2. Out of curiosity.. are there any tests to ensure that the upgrade path 
> works? What if we miss adding this to some sql file? Would we catch it?
> 
> Vihang Karajgaonkar wrote:
> createTime exists for tables and partitions. It doesn't exist for 
> catalog. I will see if it doesn't add too much to this patch to add 
> createTime to catalogs. Otherwise, I will prefer to do it in a separate 
> patch. For upgrade testing the precommit job does not run such tests. It is 
> expected that schema patches are tested either manually or using the ITests 
> (steps are provided in DEV-README). I have tested this patch on mysql 
> manually and I am setting up my machine to give a try to the newly introduced 
> ITests currently. Will update the RB once I complete the testing.
> 
> Karthik Manamcheri wrote:
> Don't you think its better to make the schema changes all together 
> instead of splitting it up. If we think we might need creation time for 
> catalogs, we should add it to this changeset because we can knock off the 
> schema change as well.

It was not too hard to add createTime to catalog. Added that in the latest 
patch.


- Vihang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69664/#review211677
---


On Jan. 7, 2019, 7:16 p.m., Vihang Karajgaonkar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69664/
> ---
> 
> (Updated Jan. 7, 2019, 7:16 p.m.)
> 
> 
> Review request for hive, Karthik Manamcheri, Naveen Gangam, and Peter Vary.
> 
> 
> Bugs: HIVE-21077
> https://issues.apache.org/jira/browse/HIVE-21077
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21077 : Database and Catalogs should have creation time
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Catalog.java
>  3eb4dbd51110dd6e5d04c3bdacde2e5bdba09a7c 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java
>  994797698a379e0b08604d73d2d6728a2fcee4df 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  2953a8f327eabdee42dbc66e0c65f89d17add59a 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  f8b862862de4dde8dce3d0dc5f70eafb67b02d2c 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  dfc5d7b294c1df8d5c6b0e7d676a77ba1181e076 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 7d09a5c296a8ae924d61b200b4cb9135440fd9a0 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  a9398ae1e79404a15894aa42f451df5d18ed3e4c 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
>  58dc6eefcb840d4dd70af7a47811fab1b5e696d9 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  d43c0c1e70cffbebd39b05f89ec396227c58ac77 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/client/builder/DatabaseBuilder.java
>  f3d2182a04ab81417a4ba58d9340721513e8 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MCatalog.java
>  e82cb4322f6e2ac7afeb5efcec7517a68c8b2dee 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MDatabase.java
>  815b39c483b2233660310983d58194fb1ab2d107 
>   standalone-metastore/metastore-server/src/main/resources/package.jdo 
> caaec457194332a99d5cd57bef746e969dd38161 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
>  a3c4196dbff7e53be5317631b314983d16a99020 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
>  bcaebd18accf86846ae44a6498046514575fc069 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
>  5ea1b4450d8258e841bb4af7381ca6fb0ba1a827 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql
>  edde08db9ef7ee01800c7cc3a04c813014abdd18 
>   
> standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql
>  a59c7d7e933d25d8d5af611e5b6aa0c0c19b 
>   
> 

Re: Review Request 69664: HIVE-21077 : Database and Catalogs should have creation time

2019-01-07 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69664/
---

(Updated Jan. 7, 2019, 7:16 p.m.)


Review request for hive, Karthik Manamcheri, Naveen Gangam, and Peter Vary.


Changes
---

added create time to catalogs as well as suggested.


Bugs: HIVE-21077
https://issues.apache.org/jira/browse/HIVE-21077


Repository: hive-git


Description
---

HIVE-21077 : Database and Catalogs should have creation time


Diffs (updated)
-

  
standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Catalog.java
 3eb4dbd51110dd6e5d04c3bdacde2e5bdba09a7c 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java
 994797698a379e0b08604d73d2d6728a2fcee4df 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
 2953a8f327eabdee42dbc66e0c65f89d17add59a 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
 f8b862862de4dde8dce3d0dc5f70eafb67b02d2c 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
 dfc5d7b294c1df8d5c6b0e7d676a77ba1181e076 
  standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
7d09a5c296a8ae924d61b200b4cb9135440fd9a0 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 a9398ae1e79404a15894aa42f451df5d18ed3e4c 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
 58dc6eefcb840d4dd70af7a47811fab1b5e696d9 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 d43c0c1e70cffbebd39b05f89ec396227c58ac77 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/client/builder/DatabaseBuilder.java
 f3d2182a04ab81417a4ba58d9340721513e8 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MCatalog.java
 e82cb4322f6e2ac7afeb5efcec7517a68c8b2dee 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MDatabase.java
 815b39c483b2233660310983d58194fb1ab2d107 
  standalone-metastore/metastore-server/src/main/resources/package.jdo 
caaec457194332a99d5cd57bef746e969dd38161 
  
standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
 a3c4196dbff7e53be5317631b314983d16a99020 
  
standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
 bcaebd18accf86846ae44a6498046514575fc069 
  
standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
 5ea1b4450d8258e841bb4af7381ca6fb0ba1a827 
  
standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql
 edde08db9ef7ee01800c7cc3a04c813014abdd18 
  
standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql
 a59c7d7e933d25d8d5af611e5b6aa0c0c19b 
  
standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql
 701acb00984c61f7511dcc48053890b154575d1f 
  
standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.0.0.oracle.sql
 b1980c5b83f16614845063516495188ebdd8c2a3 
  
standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql
 b9f63313251ab1fa6278b862ed9e07e62b234c04 
  
standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.0.0.postgres.sql
 9040005aa82b7a8cc5c01f257ecd47a7cc97e9b2 
  
standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-3.2.0-to-4.0.0.postgres.sql
 0c36069d071d4b60cc338ba729da5d22e08ca8ca 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestCatalogs.java
 28eb1fadca80dfd3c962e4163120b83f00410c4a 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestDatabases.java
 d323ac6c90ed20f092b4e179fdb1bed8602ecf63 


Diff: https://reviews.apache.org/r/69664/diff/3/

Changes: https://reviews.apache.org/r/69664/diff/2-3/


Testing
---

Ran ITests to check the db install and upgrade scripts are working for mysql, 
postgres, oracle and derby databases. The ITests for mssql is timing out for 
some reason due to container provisioning issues.


Thanks,

Vihang Karajgaonkar



Re: Review Request 69663: HIVE-16976

2019-01-07 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69663/#review211725
---



Can we add some tests for the new feature?


ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
Line 182 (original)


Can you bring this back and delete 'if (sourceKeys.size() > 0) {' below? 
This is just a style change and indenting so many lines will just make more 
difficult following code provenance.



ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
Lines 284 (patched)


We should add a call to extended version here as we did above for equality 
predicates. The only required change seems to be in _addParentReduceSink_ 
called from _createDerivatives_, which would receive the comparison operator 
from here. All the rest should already work as expected.

I believe this could be addressed in this JIRA since it is not a lot of 
code. However, if it is not addressed, please create follow-up and leave a TODO.



ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
Lines 318 (patched)


return colExprMap.get(rsColName)



ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
Lines 328 (patched)


Can we move this method as _invertFunction_ utility method to 
_FunctionRegistry.java_?

In addition, instead of relying on the function text, I believe it would be 
more robust to have the UDF as the input. In particular, we can use 
_funcDesc.getGenericUDF();_ when calling this method, then rely in e.g. _udf 
instanceof GenericUDFOPEqualOrGreaterThan_ for the checks.



ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java
Lines 330 (patched)


Should we still return null if function text is not recognized?


- Jesús Camacho Rodríguez


On Jan. 3, 2019, 8:39 p.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69663/
> ---
> 
> (Updated Jan. 3, 2019, 8:39 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Gopal V, Jesús Camacho Rodríguez, 
> and Jason Dere.
> 
> 
> Bugs: HIVE-16976
> https://issues.apache.org/jira/browse/HIVE-16976
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
> The patch supports predicates on non-equi joins and provides an interface for 
> storage handler to decide if it can use this optimization.
> Work to integrate this with DPP and semijoin will be done in separate JIRA.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java 
> 2ebb149354 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
>  a1401aac72 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java f8c7e18eb1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicListDesc.java 
> 676dfc9421 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
> e97e44796f 
>   ql/src/test/results/clientpositive/llap/cross_prod_1.q.out ac1f4eabd8 
>   ql/src/test/results/clientpositive/llap/groupby_groupingset_bug.q.out 
> de74af6dff 
>   ql/src/test/results/clientpositive/llap/semijoin.q.out 63a270e57d 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 07cc4dbabc 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 29d8bbfb48 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out e830835445 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out d3cc980ca1 
> 
> 
> Diff: https://reviews.apache.org/r/69663/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>



[jira] [Created] (HIVE-21097) Replace SparkConf dependency from HS2 with Hadoop Conf

2019-01-07 Thread Adam Szita (JIRA)
Adam Szita created HIVE-21097:
-

 Summary: Replace SparkConf dependency from HS2 with Hadoop Conf
 Key: HIVE-21097
 URL: https://issues.apache.org/jira/browse/HIVE-21097
 Project: Hive
  Issue Type: Sub-task
Reporter: Adam Szita
Assignee: Adam Szita






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21096) Remove unnecessary Spark dependency from HS2 process

2019-01-07 Thread Adam Szita (JIRA)
Adam Szita created HIVE-21096:
-

 Summary: Remove unnecessary Spark dependency from HS2 process
 Key: HIVE-21096
 URL: https://issues.apache.org/jira/browse/HIVE-21096
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, Spark
Reporter: Adam Szita
Assignee: Adam Szita


When a HiveOnSpark job is kicked off most of the work is done by the 
RemoteDriver, which is a separate process. There a couple of smaller parts of 
code, where HS2 process depends on Spark jars, these for example include 
receiving stats from the driver or putting together a Spark conf object - used 
mostly during communication with RemoteDriver.

We can limit the data types used for such communication so that we don't use 
(and serialize) types that are in Spark codebase, and hence we can refactor our 
code to only use Spark jars in the Remote Driver process.

I think this way would be cleaner from dependencies point of view, and also 
less erroneous when users have to compile the classpath for their HS2 processes.
(E.g. due to a change between Spark 2.2 and 2.4 we had to also include 
spark-unsafe*.jar - though it's an internal change to Spark..)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21095) 'Show create table' should not display a time zone for timestamp with local time zone

2019-01-07 Thread Karen Coppage (JIRA)
Karen Coppage created HIVE-21095:


 Summary: 'Show create table' should not display a time zone for 
timestamp with local time zone
 Key: HIVE-21095
 URL: https://issues.apache.org/jira/browse/HIVE-21095
 Project: Hive
  Issue Type: Improvement
Reporter: Karen Coppage
Assignee: Karen Coppage


SHOW CREATE TABLE shows the time zone that the table was created in (if it 
contains a TIMESTAMPTZ column). This is also misleading, since it might have 
nothing to do with the actual data.
e.g.

{code:java}
hive> set time zone America/Los_Angeles;
hive> create table text_local (ts timestamp with local time zone) stored as 
textfile;
hive> show create table text_local;
CREATE TABLE `text_local`(
  `ts` timestamp with local time zone('America/Los_Angeles'))
{code}

should be:

{code:java}
hive> show create table text_local;
CREATE TABLE `text_local`(
  `ts` timestamp with local time zone)
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Review Request 69683: [HIVE-21071] Improve getInputSummary

2019-01-07 Thread David Mollitor

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69683/
---

Review request for hive.


Repository: hive-git


Description
---

Improve performance of method getInputSummary by changing data structures and 
allowing multiple threads to do calculations.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d0f6451 


Diff: https://reviews.apache.org/r/69683/diff/1/


Testing
---

Unit


Thanks,

David Mollitor



[jira] [Created] (HIVE-21094) Store TIMESTAMP WITH LOCAL TIME ZONE in UTC instead of writer's time zone

2019-01-07 Thread Karen Coppage (JIRA)
Karen Coppage created HIVE-21094:


 Summary: Store TIMESTAMP WITH LOCAL TIME ZONE in UTC instead of 
writer's time zone
 Key: HIVE-21094
 URL: https://issues.apache.org/jira/browse/HIVE-21094
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Reporter: Karen Coppage
Assignee: Karen Coppage


TIMESTAMP WITH LOCAL TIME ZONE (aka TIMESTAMPTZ) is stored in writer's local 
time, and the writer's zone is stored with it. When reading, the timestamp in 
reader local time + reader zone is displayed. This is misleading for the user, 
since it looks like all the data was written in the reader's time zone.

TIMESTAMPTZ should be stored in UTC time and be displayed in reader local time 
(as it was before) but without the reader's time zone.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21093) Push grouping sets in Aggregate operator to JDBC sources from Calcite

2019-01-07 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21093:
--

 Summary: Push grouping sets in Aggregate operator to JDBC sources 
from Calcite
 Key: HIVE-21093
 URL: https://issues.apache.org/jira/browse/HIVE-21093
 Project: Hive
  Issue Type: Improvement
  Components: StorageHandler
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 69633: HIVE-20159 Do Not Print StackTraces to STDERR in ConditionalResolverSkewJoin

2019-01-07 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69633/#review211722
---



Thanks for the fix!
Fix it and ship it!


ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverSkewJoin.java
Line 121 (original), 125 (patched)


Could you please remove the commented out line?


- Peter Vary


On jan. 5, 2019, 5:50 de, MANI M wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69633/
> ---
> 
> (Updated jan. 5, 2019, 5:50 de)
> 
> 
> Review request for hive, Peter Vary and Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-20159
> https://issues.apache.org/jira/browse/HIVE-20159
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Do Not Print StackTraces to STDERR in ConditionalResolverSkewJoin HIVE-20159
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverSkewJoin.java 
> 5d25007d77 
> 
> 
> Diff: https://reviews.apache.org/r/69633/diff/1/
> 
> 
> Testing
> ---
> 
> Patch Uploaded to JIRA HIVE-20159, PRE-COMMIT testing completed.
> https://issues.apache.org/jira/browse/HIVE-20159
> 
> 
> Thanks,
> 
> MANI M
> 
>