[ 
https://issues.apache.org/jira/browse/HIVE-18421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16321249#comment-16321249
 ] 

Vihang Karajgaonkar commented on HIVE-18421:
--------------------------------------------

bq. One option might be to generate 2 sets of vectorization classes: checked 
and unchecked.

Just to clarify what you mean by 2 sets here: do you mean that vectorcodegen 
would generate one version with the overflow checks and another that is 
unchecked? So, for example, two versions of LongColAddLongColumn would be 
generated, and when we instantiate the vectorExpression we would check some 
global config to make sure we instantiate the correct variation? What would 
the annotations for {{VectorizedExpressions}} look like in that case?
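
To make the question concrete, here is a rough sketch of how I read the "2 sets" idea. All names in it (the interface, both classes, the factory and its boolean flag) are hypothetical and are not Hive's actual API, and I am assuming "checked" means the result is wrapped to the declared column width (tinyint here).

```java
// Sketch only: every class and flag name here is hypothetical, not Hive's API.
class CheckedVariantSketch {

    /** Stand-in for a generated expression such as LongColAddLongColumn. */
    interface LongColAdd {
        void evaluate(long[] left, long[] right, long[] out, int n);
    }

    /** Unchecked variant: plain 64-bit addition, result left as-is. */
    static final class UncheckedAdd implements LongColAdd {
        @Override
        public void evaluate(long[] left, long[] right, long[] out, int n) {
            for (int i = 0; i < n; i++) {
                out[i] = left[i] + right[i];
            }
        }
    }

    /** Checked variant for a tinyint output: wraps each result to 8 bits. */
    static final class CheckedTinyintAdd implements LongColAdd {
        @Override
        public void evaluate(long[] left, long[] right, long[] out, int n) {
            for (int i = 0; i < n; i++) {
                // Narrow to byte, then widen back to long for storage.
                out[i] = (byte) (left[i] + right[i]);
            }
        }
    }

    /** Picks the variant once, at instantiation time, from a global setting. */
    static LongColAdd create(boolean overflowChecksEnabled) {
        return overflowChecksEnabled ? new CheckedTinyintAdd() : new UncheckedAdd();
    }

    public static void main(String[] args) {
        long[] left = {100, 1};
        long[] right = {100, 2};
        long[] out = new long[2];

        create(false).evaluate(left, right, out, 2);
        System.out.println("unchecked: " + out[0] + ", " + out[1]);

        create(true).evaluate(left, right, out, 2);
        System.out.println("checked:   " + out[0] + ", " + out[1]);
    }
}
```

The point of the sketch is only that the choice is made once per expression instance, so the inner loop of the unchecked variant stays unchanged.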

Also, do you think it would be easier to just introduce a ByteColumnVector, 
ShortColumnVector and IntegerColumnVector for tinyint, smallint and int 
respectively?
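
For reference, the arithmetic behind the test case in the description can be reproduced in plain Java. This is only a sketch of the width difference, not of how Hive evaluates the query; the class and method names are made up for illustration. Computing the difference in 64 bits and computing it in 8 bits give different values for the first two rows, which is consistent with the 127 and 120 shown in the output.

```java
// Sketch only: illustrates 64-bit versus 8-bit subtraction on the sample rows.
class TinyintNarrowingSketch {

    /** Difference computed in 64 bits, as a long-backed vector would hold it. */
    static long subtractAsLong(long a, long b) {
        return a - b;
    }

    /** Difference wrapped to 8 bits, as a tinyint result. */
    static byte subtractAsTinyint(byte a, byte b) {
        return (byte) (a - b);
    }

    public static void main(String[] args) {
        byte[] t1 = {-104, -112, 54};
        byte[] t2 = {25, 24, 9};
        for (int i = 0; i < t1.length; i++) {
            long wide = subtractAsLong(t1[i], t2[i]);
            byte narrow = subtractAsTinyint(t1[i], t2[i]);
            System.out.println("t1=" + t1[i] + " t2=" + t2[i]
                + " wide=" + wide + " (<50: " + (wide < 50) + ")"
                + " narrow=" + narrow + " (<50: " + (narrow < 50) + ")");
        }
    }
}
```

With the 64-bit values all three rows satisfy the {{< 50}} filter, while with the 8-bit values only the (54, 9) row does.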

> Vectorized execution does not handle integer overflows
> ------------------------------------------------------
>
>                 Key: HIVE-18421
>                 URL: https://issues.apache.org/jira/browse/HIVE-18421
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>    Affects Versions: 2.1.1, 2.2.0, 3.0.0, 2.3.2
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>
> In vectorized execution, arithmetic operations that cause integer overflows 
> can give wrong results. The issue is reproducible with both ORC and Parquet.
> A simple test case to reproduce this issue:
> {noformat}
> set hive.vectorized.execution.enabled=true;
> create table parquettable (t1 tinyint, t2 tinyint) stored as parquet;
> insert into parquettable values (-104, 25), (-112, 24), (54, 9);
> select t1, t2, (t1-t2) as diff from parquettable where (t1-t2) < 50 order by 
> diff desc;
> +-------+-----+-------+
> |  t1   | t2  | diff  |
> +-------+-----+-------+
> | -104  | 25  | 127   |
> | -112  | 24  | 120   |
> | 54    | 9   | 45    |
> +-------+-----+-------+
> {noformat}
> When vectorization is turned off, the same query produces only one row.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
