[ 
https://issues.apache.org/jira/browse/HIVE-18421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16321352#comment-16321352
 ] 

Gopal V edited comment on HIVE-18421 at 1/10/18 11:36 PM:
----------------------------------------------------------

Adding new column vectors is generally a bad idea - out-of-bounds (OOB) 
checks are much easier to apply to Long.

I'm expecting that we'll add something like a 

{code}
if (checked) {
    checkBounds(outV, typeRange);
}
{code}

to 
https://github.com/apache/hive/blob/master/ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticColumn.txt

This allows the initial operation to be SIMD and the checking to be L1-cache 
efficient (i.e., check and overwrite the same memory).

This is somewhat in line with NullUtil.setNullDataEntries.
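
As a rough illustration of the two-pass idea above, here is a minimal, self-contained Java sketch. It is not Hive code: the class name, the method signatures, and the choice to wrap out-of-range values to the narrower type's two's-complement value are all assumptions made for the example, and the type range is modelled simply as a bit width. The first pass is a branch-free loop over long lanes; the second pass revisits the same output array while it is likely still in cache and overwrites each slot in place.

```java
import java.util.Arrays;

// Hypothetical sketch, not part of Hive: names and semantics are assumed.
class CheckedSubtractSketch {

    // Pass 1: element-wise subtraction on long lanes. No branches in the
    // loop body, so the JIT is free to auto-vectorize it.
    static void subtract(long[] a, long[] b, long[] out, int n) {
        for (int i = 0; i < n; i++) {
            out[i] = a[i] - b[i];
        }
    }

    // Pass 2 (stand-in for checkBounds): rewrite each output slot in place
    // so it fits a signed integer of the given bit width. Assumption: an
    // out-of-range value wraps like a narrowing cast would.
    static void checkBounds(long[] out, int n, int bits) {
        int shift = 64 - bits;
        for (int i = 0; i < n; i++) {
            // Keep the low `bits` bits and sign-extend them back to 64 bits.
            out[i] = (out[i] << shift) >> shift;
        }
    }

    public static void main(String[] args) {
        // Values taken from the test case in the issue description.
        long[] t1 = {-104, -112, 54};
        long[] t2 = {25, 24, 9};
        long[] out = new long[t1.length];

        subtract(t1, t2, out, out.length);
        boolean checked = true; // would come from configuration in practice
        if (checked) {
            checkBounds(out, out.length, 8); // tinyint is 8 bits wide
        }
        System.out.println(Arrays.toString(out)); // prints [127, 120, 45]
    }
}
```

Keeping the check in a separate loop leaves the arithmetic loop free of per-element branches, at the cost of a second traversal of an array that should still be resident in L1.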


> Vectorized execution does not handle integer overflows
> ------------------------------------------------------
>
>                 Key: HIVE-18421
>                 URL: https://issues.apache.org/jira/browse/HIVE-18421
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>    Affects Versions: 2.1.1, 2.2.0, 3.0.0, 2.3.2
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>
> In vectorized execution, arithmetic operations that cause integer overflows 
> can give wrong results. The issue is reproducible in both ORC and Parquet.
> Simple test case to reproduce this issue:
> {noformat}
> set hive.vectorized.execution.enabled=true;
> create table parquettable (t1 tinyint, t2 tinyint) stored as parquet;
> insert into parquettable values (-104, 25), (-112, 24), (54, 9);
> select t1, t2, (t1-t2) as diff from parquettable where (t1-t2) < 50 order by diff desc;
> +-------+-----+-------+
> |  t1   | t2  | diff  |
> +-------+-----+-------+
> | -104  | 25  | 127   |
> | -112  | 24  | 120   |
> | 54    | 9   | 45    |
> +-------+-----+-------+
> {noformat}
> When vectorization is turned off, the same query produces only one row.



