[
https://issues.apache.org/jira/browse/FLINK-21125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272027#comment-17272027
]
Sebastian Liu commented on FLINK-21125:
---------------------------------------
Hi [~jark],
Thanks a lot for your quick reply. I also think we should fix this semantic
issue ASAP, and I agree that a configuration option is necessary. Meanwhile, we
should not make a long-running streaming job keep failing with no way to skip
the record when an overflow happens. IMO, we could fix this in the following ways:
* Do not auto-cast the type to long as Presto does, even if we provide a
configuration to let the user opt in: the auto-cast may change the final output
type. E.g., given a SQL statement like "insert into xx_table select sum(y)
from yy_table;", if sum is auto-cast to long, we may also need to change the
related schema of xx_table. Beyond this example, it is still possible to
overflow even when the type is cast to long. I think the core goal should be to
guide users to design reasonable schemas and calculation methods; changing the
aggregate function's accumulator data type on the user's behalf is a little
tricky.
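To illustrate why casting to long only postpones the problem, a minimal sketch in plain Java (class and variable names are illustrative, not Flink code):

{code:java}
public class OverflowDemo {
    public static void main(String[] args) {
        // INT accumulator: wraps around silently, no exception
        int intSum = Integer.MAX_VALUE;
        intSum += 1; // wraps to Integer.MIN_VALUE (-2147483648)
        System.out.println(intSum);

        // Casting to long avoids this particular overflow...
        long longSum = (long) Integer.MAX_VALUE + 1;
        System.out.println(longSum); // 2147483648

        // ...but a long accumulator can still overflow, again silently
        long wrapped = Long.MAX_VALUE;
        wrapped += 1; // wraps to Long.MIN_VALUE
        System.out.println(wrapped);
    }
}
{code}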
* Based on the current implementation, we should catch the overflow exception
and add a configuration, such as a boolean option _strict_sql_mode_enabled_, to
control the behavior. When _strict_sql_mode_enabled_ is set to true, the Flink
runtime will catch and rethrow the overflow exception, which will make the SQL
job fail and restart. When _strict_sql_mode_enabled_ is set to false, the Flink
runtime will catch the overflow exception and just print a warning log, helping
the user skip this error temporarily. Meanwhile, the default value of
_strict_sql_mode_enabled_ should be true, so that users are made aware of the
potential overflow issue before deciding to skip it.
* Finally, when an overflow exception occurs, we should keep the accumulator
at its value before the exception, instead of the overflowed value.
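The strict/non-strict behavior in the two points above could be sketched roughly like this. This is a hypothetical standalone accumulator, not Flink's actual aggregate function API; _strict_sql_mode_enabled_ maps to the constructor flag:

{code:java}
public class SumAccumulator {
    private final boolean strictSqlModeEnabled;
    private int sum = 0;

    public SumAccumulator(boolean strictSqlModeEnabled) {
        this.strictSqlModeEnabled = strictSqlModeEnabled;
    }

    public void accumulate(int value) {
        try {
            // Math.addExact throws ArithmeticException on int overflow
            sum = Math.addExact(sum, value);
        } catch (ArithmeticException e) {
            if (strictSqlModeEnabled) {
                // strict mode: rethrow so the job fails visibly
                throw e;
            }
            // non-strict mode: warn and keep the pre-overflow value
            System.err.println("WARN: sum overflow skipped, keeping " + sum);
        }
    }

    public int getSum() {
        return sum;
    }
}
{code}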
What do you think about this? Looking forward to your feedback.
> Sum or Sum0 overflow quietly
> ----------------------------
>
> Key: FLINK-21125
> URL: https://issues.apache.org/jira/browse/FLINK-21125
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Reporter: Sebastian Liu
> Priority: Major
>
> Overflow is not handled correctly in the built-in sum function of the Blink
> planner.
> For an aggregate calculation such as:
> {code:java}
> CREATE TABLE TestTable (
>   amount INT
> );
> INSERT INTO TestTable VALUES (2147483647);
> INSERT INTO TestTable VALUES (1);
> SELECT sum(amount) FROM TestTable;
> -- The result will be: -2147483648, which is an overflowed value, and no
> -- exception is thrown. {code}
> The overflow occurs quietly and is difficult to detect.
> Compare with the processing semantics of other systems:
> * *mysql*: provides two SQL modes for handling overflow. If strict SQL mode is
> enabled, MySQL rejects the out-of-range value with an error, and the insert
> fails, in accordance with the SQL standard. If no restrictive modes are
> enabled, MySQL clips the value to the appropriate endpoint of the column data
> type range and stores the resulting value instead. FYI:
> [https://dev.mysql.com/doc/refman/8.0/en/out-of-range-and-overflow.html]
> * *presto*: all numeric types are automatically cast to the Long type, and if
> the long is out of range, an exception is thrown to prompt the user.
> IMO, an exception hint is necessary, instead of a quiet overflow.
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)