Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Jitendra Pandey

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19718/
---

(Updated March 28, 2014, 9:56 p.m.)


Review request for hive and Eric Hanson.


Bugs: HIVE-6752
https://issues.apache.org/jira/browse/HIVE-6752


Repository: hive-git


Description
---

Vectorized Between and IN expressions don't work with decimal, date types.


Diffs (updated)
-

  ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 
  ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java 
2229079 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
96e74a9 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
c2240c0 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java 
5ebab70 
  ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION 
  ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/19718/diff/


Testing
---


Thanks,

Jitendra Pandey



Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19718/#review38958
---



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java
https://reviews.apache.org/r/19718/#comment71328

please add a comment to explain why we use the sum of all the counts here 
to determine the array size.



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java
https://reviews.apache.org/r/19718/#comment71329

Consider for readability/encapsulation having a function to compute offset, 
e.g. 

isNull[decimalOffset(index)] = false;

Please add a comment to explain offset logic.

Does addition of decimal affect any other offsets? I guess not.



ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java
https://reviews.apache.org/r/19718/#comment71330

Timestamp is supposed to be represented as a long (# of nanos since epoch). 
So whey is this using a FilterStringColumnBetween?



ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java
https://reviews.apache.org/r/19718/#comment71331

Again, why string and not long not between operator?


- Eric Hanson


On March 28, 2014, 9:56 p.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19718/
 ---
 
 (Updated March 28, 2014, 9:56 p.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Bugs: HIVE-6752
 https://issues.apache.org/jira/browse/HIVE-6752
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Vectorized Between and IN expressions don't work with decimal, date types.
 
 
 Diffs
 -
 
   ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 
   ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java 
 2229079 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
 96e74a9 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
 c2240c0 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java
  5ebab70 
   ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION 
   ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/19718/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Jitendra Pandey


 On March 28, 2014, 10:42 p.m., Eric Hanson wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java,
   line 62
  https://reviews.apache.org/r/19718/diff/3/?file=539988#file539988line62
 
  please add a comment to explain why we use the sum of all the counts 
  here to determine the array size.

This patch modifies only one line in this file, the other code has been around 
for quiet some time. I agree that we should add some comments to explain these 
offsets, I would like to address them in a separate jira.


 On March 28, 2014, 10:42 p.m., Eric Hanson wrote:
  ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java,
   line 968
  https://reviews.apache.org/r/19718/diff/3/?file=539995#file539995line968
 
  Timestamp is supposed to be represented as a long (# of nanos since 
  epoch). So whey is this using a FilterStringColumnBetween?

The test compares timestamp with string constants, which requires these 
arguments to be cast to a common type. I am using the legacy hive 
FunctionRegistry to get the common type which happens to be string in this 
case. This should be optimized to instead cast strings to timestamps, but that 
would mean a different logic to determine common types for comparisons, which 
we better take up in a separate jira. This is not a regression even though test 
was passing earlier, because earlier for between clause, everything was getting 
cast to boolean (because first argument of Between expressions is a boolean for 
Not) which was incorrect anyway. 


 On March 28, 2014, 10:42 p.m., Eric Hanson wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java,
   line 226
  https://reviews.apache.org/r/19718/diff/3/?file=539988#file539988line226
 
  Consider for readability/encapsulation having a function to compute 
  offset, e.g. 
  
  isNull[decimalOffset(index)] = false;
  
  Please add a comment to explain offset logic.
  
  Does addition of decimal affect any other offsets? I guess not.

  All datatypes use this pattern of offset calculation. The readability 
argument is valid, but it would be better to address it for all datatypes in a 
separate jira.
 Addition of decimal was done in an earlier patch and it doesn't affect other 
offsets, this patch only sets the isNull correctly for non-null values.


 On March 28, 2014, 10:42 p.m., Eric Hanson wrote:
  ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java,
   line 973
  https://reviews.apache.org/r/19718/diff/3/?file=539995#file539995line973
 
  Again, why string and not long not between operator?

See previous one.


- Jitendra


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19718/#review38958
---


On March 28, 2014, 9:56 p.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19718/
 ---
 
 (Updated March 28, 2014, 9:56 p.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Bugs: HIVE-6752
 https://issues.apache.org/jira/browse/HIVE-6752
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Vectorized Between and IN expressions don't work with decimal, date types.
 
 
 Diffs
 -
 
   ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 
   ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java 
 2229079 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
 96e74a9 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
 c2240c0 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java
  5ebab70 
   ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION 
   ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/19718/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-27 Thread Jitendra Pandey

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19718/
---

Review request for hive and Eric Hanson.


Bugs: HIVE-6752
https://issues.apache.org/jira/browse/HIVE-6752


Repository: hive-git


Description
---

Vectorized Between and IN expressions don't work with decimal, date types.


Diffs
-

  ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 
  ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
96e74a9 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
c2240c0 
  ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION 
  ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/19718/diff/


Testing
---


Thanks,

Jitendra Pandey



Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-27 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19718/#review38752
---


Looks good overall. Only minor comments.


ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt
https://reviews.apache.org/r/19718/#comment71027

please remove all trailing whitespace in this file



ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt
https://reviews.apache.org/r/19718/#comment71034

add blank after //



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java
https://reviews.apache.org/r/19718/#comment71038

Couldn't determine common type ...

sounds better



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java
https://reviews.apache.org/r/19718/#comment71053

Change comment. This is not a filter, it is a Boolean-valued expression.



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java
https://reviews.apache.org/r/19718/#comment71052

Remove the comment about This is optimized for lookup of the data type of 
the column. 

because that doesn't apply here since you're using the standard HashSet.

But it is still pretty good :-)



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java
https://reviews.apache.org/r/19718/#comment71057

formatting: j=0 == j = 0




ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java
https://reviews.apache.org/r/19718/#comment71059

add blanks line before comment and space after //



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java
https://reviews.apache.org/r/19718/#comment71062

remove This is optimized



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java
https://reviews.apache.org/r/19718/#comment71061

see formatting comments for DecimalColumnInList


- Eric Hanson


On March 27, 2014, 7:02 a.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19718/
 ---
 
 (Updated March 27, 2014, 7:02 a.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Bugs: HIVE-6752
 https://issues.apache.org/jira/browse/HIVE-6752
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Vectorized Between and IN expressions don't work with decimal, date types.
 
 
 Diffs
 -
 
   ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 
   ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
 96e74a9 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
 c2240c0 
   ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION 
   ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/19718/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-27 Thread Jitendra Pandey

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19718/
---

(Updated March 27, 2014, 11:22 p.m.)


Review request for hive and Eric Hanson.


Changes
---

Addressed comments. Also updated test with 'Order by' clauses, because hadoop-2 
and hadoop-1 produce different order of results if order by is not specified.


Bugs: HIVE-6752
https://issues.apache.org/jira/browse/HIVE-6752


Repository: hive-git


Description
---

Vectorized Between and IN expressions don't work with decimal, date types.


Diffs (updated)
-

  ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 
  ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
96e74a9 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
c2240c0 
  ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION 
  ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/19718/diff/


Testing
---


Thanks,

Jitendra Pandey