[ https://issues.apache.org/jira/browse/HIVE-18421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16321352#comment-16321352 ]
Gopal V edited comment on HIVE-18421 at 1/10/18 11:36 PM:
----------------------------------------------------------

New column vectors are generally a bad idea - OOB checks are much easier to apply to Long.

I'm expecting that we'll add something like a

{code}
if (checked) {
  checkBounds(outV, typeRange);
}
{code}

to https://github.com/apache/hive/blob/master/ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticColumn.txt

This allows the initial operation to be SIMD and the checking to be L1 cache efficient (i.e. check & overwrite the same memory).

This is somewhat in line with NullUtil.setNullDataEntries.


was (Author: gopalv):
    New column vectors are generally a bad idea - OOB checks are much easier to apply to Long.

I'm expecting that we'll add something like a

{code}
if (checked) {
  checkBounds(outV, typeRange);
}
{code}

to https://github.com/apache/hive/blob/master/ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticColumn.txt

This allows the initial operation to be SIMD and the checking to be L1 cache efficient (i.e. check & overwrite the same memory).

> Vectorized execution does not handle integer overflows
> ------------------------------------------------------
>
>                 Key: HIVE-18421
>                 URL: https://issues.apache.org/jira/browse/HIVE-18421
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>    Affects Versions: 2.1.1, 2.2.0, 3.0.0, 2.3.2
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>
> In vectorized execution, arithmetic operations which cause integer overflows
> can give wrong results. The issue is reproducible in both ORC and Parquet.
> Simple test case to reproduce this issue
> {noformat}
> set hive.vectorized.execution.enabled=true;
> create table parquettable (t1 tinyint, t2 tinyint) stored as parquet;
> insert into parquettable values (-104, 25), (-112, 24), (54, 9);
> select t1, t2, (t1-t2) as diff from parquettable where (t1-t2) < 50 order by
> diff desc;
> +-------+-----+-------+
> |  t1   | t2  | diff  |
> +-------+-----+-------+
> | -104  | 25  | 127   |
> | -112  | 24  | 120   |
> | 54    | 9   | 45    |
> +-------+-----+-------+
> {noformat}
> When vectorization is turned off the same query produces only one row.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
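
For reference, the two-pass idea from the comment above (do the arithmetic on longs first, then run a separate bounds check over the same output array) might look roughly like the following standalone sketch. It does not use Hive's column vector classes. The class name, `checkBoundsTinyint`, and `checkedDiff` are hypothetical names made up for illustration. The comment does not say what the check should do with an out-of-range value; this sketch assumes it wraps the value into the tinyint range, which is consistent with the report that the non-vectorized query returns only one row.

```java
// Hypothetical sketch, not Hive code: plain long[] arrays stand in for the
// output of a vectorized arithmetic expression.
final class CheckedSubtractSketch {

    // First pass: plain subtraction on longs. A simple loop like this is the
    // kind of code the JIT can auto-vectorize.
    static void subtract(long[] left, long[] right, long[] out, int n) {
        for (int i = 0; i < n; i++) {
            out[i] = left[i] - right[i];
        }
    }

    // Second pass (assumed behaviour): re-read the output that was just
    // written and overwrite any value outside the tinyint range with its
    // wrapped two's-complement value.
    static void checkBoundsTinyint(long[] out, int n) {
        for (int i = 0; i < n; i++) {
            if (out[i] < Byte.MIN_VALUE || out[i] > Byte.MAX_VALUE) {
                out[i] = (byte) out[i];
            }
        }
    }

    // Runs both passes over the same output array.
    static long[] checkedDiff(long[] t1, long[] t2) {
        long[] out = new long[t1.length];
        subtract(t1, t2, out, t1.length);
        checkBoundsTinyint(out, t1.length);
        return out;
    }

    public static void main(String[] args) {
        // Same rows as the test case in the issue description.
        long[] t1 = {-104, -112, 54};
        long[] t2 = {25, 24, 9};
        long[] diff = checkedDiff(t1, t2);
        // Apply the filter (t1-t2) < 50 on the checked values.
        for (int i = 0; i < diff.length; i++) {
            if (diff[i] < 50) {
                System.out.println(t1[i] + " " + t2[i] + " " + diff[i]);
            }
        }
    }
}
```

With the rows from the test case, the checked differences come out as 127, 120 and 45, so the `< 50` filter keeps only the (54, 9) row, matching the one row reported with vectorization turned off.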