[
https://issues.apache.org/jira/browse/HIVE-18421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356213#comment-16356213
]
Vihang Karajgaonkar commented on HIVE-18421:
--------------------------------------------
[~mmccline] would you be able to take a look at this? This patch introduces new
checked vector expressions, which are used when the newly introduced config
{{hive.vectorized.use.checked.expressions}} is set. I added checked expressions
for the arithmetic operators and some others where, based on my analysis,
overflow could produce different results.
> Vectorized execution handles overflows in a different manner than
> non-vectorized execution
> ------------------------------------------------------------------------------------------
>
> Key: HIVE-18421
> URL: https://issues.apache.org/jira/browse/HIVE-18421
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Affects Versions: 2.1.1, 2.2.0, 3.0.0, 2.3.2
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Priority: Major
> Attachments: HIVE-18421.01.patch, HIVE-18421.02.patch,
> HIVE-18421.03.patch, HIVE-18421.04.patch, HIVE-18421.05.patch,
> HIVE-18421.06.patch, HIVE-18421.07.patch
>
>
> In vectorized execution, arithmetic operations that cause integer overflows
> can give wrong results. The issue is reproducible with both ORC and Parquet.
> A simple test case to reproduce this issue:
> {noformat}
> set hive.vectorized.execution.enabled=true;
> create table parquettable (t1 tinyint, t2 tinyint) stored as parquet;
> insert into parquettable values (-104, 25), (-112, 24), (54, 9);
> select t1, t2, (t1-t2) as diff from parquettable where (t1-t2) < 50 order by diff desc;
> +-------+-----+-------+
> |  t1   | t2  | diff  |
> +-------+-----+-------+
> | -104  | 25  | 127   |
> | -112  | 24  | 120   |
> | 54    | 9   | 45    |
> +-------+-----+-------+
> {noformat}
> When vectorization is turned off, the same query produces only one row.
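
The arithmetic behind the test case can be checked outside Hive. The following is a minimal Java sketch, not code from the patch: it assumes that the non-vectorized path narrows the difference to tinyint (a Java byte), while the vectorized path keeps the intermediate value in a 64-bit long. The class and method names are hypothetical and exist only for illustration.

```java
public class TinyintOverflowDemo {
    // Narrow the difference to a byte, as a tinyint result would be.
    // Values outside [-128, 127] wrap around.
    static byte subtractAsTinyint(byte a, byte b) {
        return (byte) (a - b);
    }

    // Assumption: the vectorized path holds the intermediate value in a long,
    // so the difference does not wrap before the filter is evaluated.
    static long subtractAsLong(byte a, byte b) {
        return (long) a - (long) b;
    }

    public static void main(String[] args) {
        // Same rows as the insert statement in the test case.
        byte[][] rows = { {-104, 25}, {-112, 24}, {54, 9} };
        for (byte[] r : rows) {
            byte wrapped = subtractAsTinyint(r[0], r[1]);
            long wide = subtractAsLong(r[0], r[1]);
            System.out.printf(
                "t1=%d t2=%d long diff=%d (diff < 50: %b), tinyint diff=%d (diff < 50: %b)%n",
                r[0], r[1], wide, wide < 50, wrapped, wrapped < 50);
        }
    }
}
```

Under these assumptions, the first two rows pass the filter as long values (-129 and -136) but are displayed as the wrapped tinyint values 127 and 120, which matches the output above. With the wrapped values, only the third row satisfies diff < 50.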