[
https://issues.apache.org/jira/browse/FLINK-21125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272027#comment-17272027
]
Sebastian Liu commented on FLINK-21125:
---------------------------------------
Hi [~jark],
Thanks a lot for your quick reply. I also think we should fix this semantic
issue ASAP, and I agree that a configuration option is necessary. Meanwhile, we
should not make a long-running streaming job keep failing with no way to skip
the record when an overflow happens. IMO, we could fix this in the following ways:
* Do not auto-cast the type to long as Presto does, even if we provide a
configuration to let the user opt in: the auto-cast may change the final output
type. E.g., given a SQL statement like "insert into xx_table select sum(y)
from yy_table;", if sum is auto-cast to long, we may also need to change the
related schema of xx_table. Beyond this example, it is still possible to
overflow even when the type is cast to long. I think the core goal should be to
guide users to design reasonable schemas and calculation methods; changing the
aggregate function's accumulator data type on the user's behalf is a little
tricky.
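To illustrate why casting to long only postpones the problem, a minimal sketch in plain Java (class and variable names are illustrative, not Flink code):

{code:java}
public class OverflowDemo {
    public static void main(String[] args) {
        // INT accumulator: wraps around silently, no exception
        int intSum = Integer.MAX_VALUE;
        intSum += 1; // wraps to Integer.MIN_VALUE (-2147483648)
        System.out.println(intSum);

        // Casting to long avoids this particular overflow...
        long longSum = (long) Integer.MAX_VALUE + 1;
        System.out.println(longSum); // 2147483648

        // ...but a long accumulator can still overflow, again silently
        long wrapped = Long.MAX_VALUE;
        wrapped += 1; // wraps to Long.MIN_VALUE
        System.out.println(wrapped);
    }
}
{code}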
* Based on the current implementation, we should catch the overflow exception
and add a configuration, such as a boolean option _strict_sql_mode_enabled_, to
control the behavior. When _strict_sql_mode_enabled_ is set to true, the Flink
runtime will catch and rethrow the overflow exception, which will make the SQL
job fail and restart. When _strict_sql_mode_enabled_ is set to false, the Flink
runtime will catch the overflow exception and just print a warning log, helping
the user skip this error temporarily. Meanwhile, the default value of
_strict_sql_mode_enabled_ should be true, so that users are made aware of the
potential overflow issue before deciding to skip it.
* Finally, when an overflow exception occurs, we should keep the accumulator
at its value before the exception, instead of the overflowed value.
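The strict/non-strict behavior in the two points above could be sketched roughly like this. This is a hypothetical standalone accumulator, not Flink's actual aggregate function API; _strict_sql_mode_enabled_ maps to the constructor flag:

{code:java}
public class SumAccumulator {
    private final boolean strictSqlModeEnabled;
    private int sum = 0;

    public SumAccumulator(boolean strictSqlModeEnabled) {
        this.strictSqlModeEnabled = strictSqlModeEnabled;
    }

    public void accumulate(int value) {
        try {
            // Math.addExact throws ArithmeticException on int overflow
            sum = Math.addExact(sum, value);
        } catch (ArithmeticException e) {
            if (strictSqlModeEnabled) {
                // strict mode: rethrow so the job fails visibly
                throw e;
            }
            // non-strict mode: warn and keep the pre-overflow value
            System.err.println("WARN: sum overflow skipped, keeping " + sum);
        }
    }

    public int getSum() {
        return sum;
    }
}
{code}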
What do you think about this? Looking forward to your feedback.
> Sum or Sum0 overflow quietly
> ----------------------------
>
> Key: FLINK-21125
> URL: https://issues.apache.org/jira/browse/FLINK-21125
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Reporter: Sebastian Liu
> Priority: Major
>
> Overflow is not handled correctly in the built-in sum function of the Blink
> planner.
> For an aggregate calculation such as:
> {code:java}
> CREATE TABLE TestTable (
>   amount INT
> );
> INSERT INTO TestTable VALUES (2147483647);
> INSERT INTO TestTable VALUES (1);
> SELECT sum(amount) FROM TestTable;
> -- The result will be: -2147483648, which is an overflowed value, and no
> -- exception is thrown. {code}
> The overflow occurs quietly and is difficult to detect.
> Compare with the processing semantics of other systems:
> * *mysql*: provides two SQL modes for handling overflow. If strict SQL mode is
> enabled, MySQL rejects the out-of-range value with an error, and the insert
> fails, in accordance with the SQL standard. If no restrictive modes are
> enabled, MySQL clips the value to the appropriate endpoint of the column data
> type range and stores the resulting value instead. FYI:
> [https://dev.mysql.com/doc/refman/8.0/en/out-of-range-and-overflow.html]
> * *presto*: all numeric types are automatically cast to the Long type, and if
> the long is out of range, an exception is thrown to prompt the user.
> IMO, an exception hint is necessary, instead of a quiet overflow.
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)