[ 
https://issues.apache.org/jira/browse/IMPALA-12035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17717113#comment-17717113
 ] 

Daniel Becker commented on IMPALA-12035:
----------------------------------------

The problem seems to be this:
 * in the case when an error is returned, the literal (9999999999999999999999) 
can fit into a DECIMAL and therefore is interpreted as such (see 
https://github.com/apache/impala/blob/c8aa5796d93510723342055cc70cf8d00abae754/fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java#L344)
 * this DECIMAL can't fit in a BIGINT, so the cast fails (see 
https://github.com/apache/impala/blob/0a42185d17164af0c855647de25f1bc87f33ee71/be/src/runtime/decimal-value.inline.h#L124)
 * in the second case with the very big number, the literal can't even fit in a 
DECIMAL, so it is interpreted as a DOUBLE instead; this is not completely 
unreasonable as it fits in the range of DOUBLE, although of course there is 
some loss of precision
 * the DOUBLE can be cast to BIGINT, although the value is incorrect and this 
invokes undefined behaviour (see Floating–integral conversions in 
[https://en.cppreference.com/w/cpp/language/implicit_conversion);] the 
conversion is done here: 
[https://github.com/apache/impala/blob/0a42185d17164af0c855647de25f1bc87f33ee71/be/src/exprs/cast-functions-ir.cc#L61]

This can be seen in the explain statements after {{{}set explain_level=2{}}}:

 
{code:java}
Analyzed query: SELECT CAST(CAST(9999999999999999999999 AS DECIMAL(22,0)) AS 
BIGINT)
VS
Analyzed query: SELECT 
CAST(CAST(999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999
 AS DOUBLE) AS BIGINT)
{code}
Possible solutions:
 # Adding checks to integer-floating point conversions in the BE (both 
directions).
 ** Pros: 
 *** Strictly speaking this would be necessary as these conversions may invoke 
undefined behaviour, see the link above. Conversions to and from DecimalValue 
do have checks.
 ** Cons:
 *** This could lead to performance regression because of the added checks.
 *** Existing use cases could break if we return an error, although in those 
cases the results were already incorrect.
 # Adding checks only for the integer-floating point conversion of 
literals/constant expressions in the FE.
 ## Performance impact is negligible.
 ## Undefined behaviour may be invoked in other cases, as before.

 

> impala-shell accepts very big numbers but fails to store them correctly
> -----------------------------------------------------------------------
>
>                 Key: IMPALA-12035
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12035
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Bakai Ádám
>            Assignee: Daniel Becker
>            Priority: Major
>
> I tried  to insert rows very big bigints, and it worked as expected with big 
> integers (error message, no new row stored), but it didn’t work as expected 
> with ridiculously big integers( no error message, stored incorrect value). 
> Here are the commands used:
> {code:java}
> drop TABLE my_first_table2;
> CREATE TABLE my_first_table2
> (
>   id BIGINT,
>   name STRING,
>   PRIMARY KEY(id)
> );
> INSERT INTO my_first_table2 VALUES (cast(9 as BIGINT), "sarah");
> -- this works just fine as expected
> INSERT INTO my_first_table2 VALUES (cast(9999999999999999999999 as BIGINT), 
> "sarah");
> -- ERROR: UDF ERROR: Decimal expression overflowed which is expected since it 
> is over bigint max value (source: 
> https://impala.apache.org/docs/build/plain-html/topics/impala_bigint.html#:~:text=Range%3A%20%2D9223372036854775808%20..,9223372036854775807.
>  )
> INSERT INTO my_first_table2 VALUES 
> (cast(999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999
>  as BIGINT), "sarah");
> -- this succeeds and doesn't throw the same error as the previous command 
> which is concerning
> select * from my_first_table2;
> -- there are two rows in the table, and the id is incorrect in one of them  
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to