[
https://issues.apache.org/jira/browse/HIVE-18421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356455#comment-16356455
]
Aihua Xu edited comment on HIVE-18421 at 2/8/18 5:50 PM:
---------------------------------------------------------
[~vihangk1] Sorry for the late reply. I left a comment in RB. Basically, I don't
follow why we need both CHECKED and UNCHECKED implementations. It seems we should
only have the CHECKED one if the UNCHECKED one would generate incorrect results.
The user would get incorrect results without notice, right?
Of course, even if we want to support the UNCHECKED implementation, we should
still error out/fail the query if there is an overflow so the user knows to set
the flag to true. BTW: how much of a performance impact does this have, and why?
(I don't exactly follow the previous discussion.)
> Vectorized execution handles overflows in a different manner than
> non-vectorized execution
> ------------------------------------------------------------------------------------------
>
> Key: HIVE-18421
> URL: https://issues.apache.org/jira/browse/HIVE-18421
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Affects Versions: 2.1.1, 2.2.0, 3.0.0, 2.3.2
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Priority: Major
> Attachments: HIVE-18421.01.patch, HIVE-18421.02.patch,
> HIVE-18421.03.patch, HIVE-18421.04.patch, HIVE-18421.05.patch,
> HIVE-18421.06.patch, HIVE-18421.07.patch
>
>
> In vectorized execution, arithmetic operations that cause integer overflows
> can give wrong results. The issue is reproducible with both ORC and Parquet.
> A simple test case to reproduce this issue:
> {noformat}
> set hive.vectorized.execution.enabled=true;
> create table parquettable (t1 tinyint, t2 tinyint) stored as parquet;
> insert into parquettable values (-104, 25), (-112, 24), (54, 9);
> select t1, t2, (t1-t2) as diff from parquettable where (t1-t2) < 50 order by diff desc;
> +-------+-----+-------+
> | t1    | t2  | diff  |
> +-------+-----+-------+
> | -104  | 25  | 127   |
> | -112  | 24  | 120   |
> | 54    | 9   | 45    |
> +-------+-----+-------+
> {noformat}
> When vectorization is turned off the same query produces only one row.
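
The arithmetic behind the two results can be sketched as follows. This is a hedged model, not Hive code: it assumes the vectorized path evaluates the filter on a wider integer and only narrows to tinyint when the value is output, while the non-vectorized path wraps to tinyint before filtering. That assumption is consistent with the rows reported above but is not confirmed in this message. All function names are hypothetical.

```python
def wrap_tinyint(value: int) -> int:
    """Wrap an integer into the signed 8-bit range [-128, 127] (two's complement)."""
    return (value + 128) % 256 - 128


ROWS = [(-104, 25), (-112, 24), (54, 9)]


def filter_wide_then_narrow(rows):
    """Assumed model of the vectorized path: filter on the unwrapped difference,
    then narrow to tinyint only for the projected column."""
    out = [(t1, t2, wrap_tinyint(t1 - t2)) for t1, t2 in rows if (t1 - t2) < 50]
    return sorted(out, key=lambda r: r[2], reverse=True)


def filter_after_narrow(rows):
    """Assumed model of the non-vectorized path: wrap to tinyint first, then
    filter and project the same wrapped value."""
    out = []
    for t1, t2 in rows:
        diff = wrap_tinyint(t1 - t2)
        if diff < 50:
            out.append((t1, t2, diff))
    return sorted(out, key=lambda r: r[2], reverse=True)


if __name__ == "__main__":
    print(filter_wide_then_narrow(ROWS))  # three rows, as in the table above
    print(filter_after_narrow(ROWS))      # one row
```

Under this model, -104 - 25 = -129 passes the `< 50` filter when unwrapped but shows up as 127 once narrowed, which is why a row with diff = 127 appears in a result set filtered on diff < 50. When the value is wrapped before filtering, 127 and 120 both fail the predicate and only the (54, 9, 45) row remains.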
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)