Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19718/ --- (Updated March 28, 2014, 9:56 p.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6752 https://issues.apache.org/jira/browse/HIVE-6752 Repository: hive-git Description --- Vectorized Between and IN expressions don't work with decimal, date types. Diffs (updated) - ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java 2229079 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 96e74a9 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java c2240c0 ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java 5ebab70 ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION Diff: https://reviews.apache.org/r/19718/diff/ Testing --- Thanks, Jitendra Pandey
Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19718/#review38958 --- ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java https://reviews.apache.org/r/19718/#comment71328 please add a comment to explain why we use the sum of all the counts here to determine the array size. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java https://reviews.apache.org/r/19718/#comment71329 Consider for readability/encapsulation having a function to compute offset, e.g. isNull[decimalOffset(index)] = false; Please add a comment to explain offset logic. Does addition of decimal affect any other offsets? I guess not. ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java https://reviews.apache.org/r/19718/#comment71330 Timestamp is supposed to be represented as a long (# of nanos since epoch). So whey is this using a FilterStringColumnBetween? ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java https://reviews.apache.org/r/19718/#comment71331 Again, why string and not long not between operator? - Eric Hanson On March 28, 2014, 9:56 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19718/ --- (Updated March 28, 2014, 9:56 p.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6752 https://issues.apache.org/jira/browse/HIVE-6752 Repository: hive-git Description --- Vectorized Between and IN expressions don't work with decimal, date types. Diffs - ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java 2229079 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 96e74a9 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java c2240c0 ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java 5ebab70 ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION Diff: https://reviews.apache.org/r/19718/diff/ Testing --- Thanks, Jitendra Pandey
Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.
On March 28, 2014, 10:42 p.m., Eric Hanson wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java, line 62 https://reviews.apache.org/r/19718/diff/3/?file=539988#file539988line62 please add a comment to explain why we use the sum of all the counts here to determine the array size. This patch modifies only one line in this file, the other code has been around for quiet some time. I agree that we should add some comments to explain these offsets, I would like to address them in a separate jira. On March 28, 2014, 10:42 p.m., Eric Hanson wrote: ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java, line 968 https://reviews.apache.org/r/19718/diff/3/?file=539995#file539995line968 Timestamp is supposed to be represented as a long (# of nanos since epoch). So whey is this using a FilterStringColumnBetween? The test compares timestamp with string constants, which requires these arguments to be cast to a common type. I am using the legacy hive FunctionRegistry to get the common type which happens to be string in this case. This should be optimized to instead cast strings to timestamps, but that would mean a different logic to determine common types for comparisons, which we better take up in a separate jira. This is not a regression even though test was passing earlier, because earlier for between clause, everything was getting cast to boolean (because first argument of Between expressions is a boolean for Not) which was incorrect anyway. On March 28, 2014, 10:42 p.m., Eric Hanson wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java, line 226 https://reviews.apache.org/r/19718/diff/3/?file=539988#file539988line226 Consider for readability/encapsulation having a function to compute offset, e.g. isNull[decimalOffset(index)] = false; Please add a comment to explain offset logic. Does addition of decimal affect any other offsets? I guess not. All datatypes use this pattern of offset calculation. The readability argument is valid, but it would be better to address it for all datatypes in a separate jira. Addition of decimal was done in an earlier patch and it doesn't affect other offsets, this patch only sets the isNull correctly for non-null values. On March 28, 2014, 10:42 p.m., Eric Hanson wrote: ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java, line 973 https://reviews.apache.org/r/19718/diff/3/?file=539995#file539995line973 Again, why string and not long not between operator? See previous one. - Jitendra --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19718/#review38958 --- On March 28, 2014, 9:56 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19718/ --- (Updated March 28, 2014, 9:56 p.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6752 https://issues.apache.org/jira/browse/HIVE-6752 Repository: hive-git Description --- Vectorized Between and IN expressions don't work with decimal, date types. Diffs - ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java 2229079 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 96e74a9 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java c2240c0 ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java 5ebab70 ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION Diff: https://reviews.apache.org/r/19718/diff/ Testing --- Thanks, Jitendra Pandey
Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19718/ --- Review request for hive and Eric Hanson. Bugs: HIVE-6752 https://issues.apache.org/jira/browse/HIVE-6752 Repository: hive-git Description --- Vectorized Between and IN expressions don't work with decimal, date types. Diffs - ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 96e74a9 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java c2240c0 ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION Diff: https://reviews.apache.org/r/19718/diff/ Testing --- Thanks, Jitendra Pandey
Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19718/#review38752 --- Looks good overall. Only minor comments. ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt https://reviews.apache.org/r/19718/#comment71027 please remove all trailing whitespace in this file ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt https://reviews.apache.org/r/19718/#comment71034 add blank after // ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/19718/#comment71038 Couldn't determine common type ... sounds better ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71053 Change comment. This is not a filter, it is a Boolean-valued expression. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71052 Remove the comment about This is optimized for lookup of the data type of the column. because that doesn't apply here since you're using the standard HashSet. But it is still pretty good :-) ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71057 formatting: j=0 == j = 0 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71059 add blanks line before comment and space after // ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71062 remove This is optimized ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71061 see formatting comments for DecimalColumnInList - Eric Hanson On March 27, 2014, 7:02 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19718/ --- (Updated March 27, 2014, 7:02 a.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6752 https://issues.apache.org/jira/browse/HIVE-6752 Repository: hive-git Description --- Vectorized Between and IN expressions don't work with decimal, date types. Diffs - ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 96e74a9 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java c2240c0 ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION Diff: https://reviews.apache.org/r/19718/diff/ Testing --- Thanks, Jitendra Pandey
Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19718/ --- (Updated March 27, 2014, 11:22 p.m.) Review request for hive and Eric Hanson. Changes --- Addressed comments. Also updated test with 'Order by' clauses, because hadoop-2 and hadoop-1 produce different order of results if order by is not specified. Bugs: HIVE-6752 https://issues.apache.org/jira/browse/HIVE-6752 Repository: hive-git Description --- Vectorized Between and IN expressions don't work with decimal, date types. Diffs (updated) - ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 96e74a9 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java c2240c0 ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION Diff: https://reviews.apache.org/r/19718/diff/ Testing --- Thanks, Jitendra Pandey