[jira] [Commented] (FLINK-3586) Risk of data overflow while use sum/count to calculate AVG value
[ https://issues.apache.org/jira/browse/FLINK-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300111#comment-15300111 ]

ASF GitHub Bot commented on FLINK-3586:
---------------------------------------

GitHub user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/2024

> Risk of data overflow while use sum/count to calculate AVG value
> ----------------------------------------------------------------
>
>                 Key: FLINK-3586
>                 URL: https://issues.apache.org/jira/browse/FLINK-3586
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API
>            Reporter: Chengxiang Li
>            Assignee: Fabian Hueske
>            Priority: Minor
>
> Currently, we use {{(sum: Long, count: Long)}} to store the partial AVG
> aggregate, which risks overflowing the sum. We should use an unbounded data
> type (such as BigInteger) to store it for the data types that need it.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (FLINK-3586) Risk of data overflow while use sum/count to calculate AVG value
[ https://issues.apache.org/jira/browse/FLINK-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300036#comment-15300036 ]

ASF GitHub Bot commented on FLINK-3586:
---------------------------------------

GitHub user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/2024#issuecomment-221575655

    merging
[jira] [Commented] (FLINK-3586) Risk of data overflow while use sum/count to calculate AVG value
[ https://issues.apache.org/jira/browse/FLINK-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299487#comment-15299487 ]

ASF GitHub Bot commented on FLINK-3586:
---------------------------------------

GitHub user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/2024#issuecomment-221477372

    I would like to merge this later today.
[jira] [Commented] (FLINK-3586) Risk of data overflow while use sum/count to calculate AVG value
[ https://issues.apache.org/jira/browse/FLINK-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296731#comment-15296731 ]

ASF GitHub Bot commented on FLINK-3586:
---------------------------------------

GitHub user fhueske opened a pull request:

    https://github.com/apache/flink/pull/2024

    [FLINK-3586] Fix potential overflow of Long AVG aggregation.

    Fixes a potential overflow of Long `AVG` aggregates in the Table API (the intermediate sum is computed using `BigInteger` instead of `Long`).

    Aggregates are refactored to specify their intermediate types as `TypeInformation` instead of SQL types. Intermediate results are Flink-internal and not exposed to Calcite, so SQL types are not required and would have to be converted into `TypeInformation` in any case.

    Adds unit tests for `MIN`, `MAX`, `COUNT`, `SUM`, and `AVG` aggregates.

    - [X] General
    - [X] Documentation
      - No functionality added
      - Some ScalaDocs extended
    - [X] Tests & Build
      - Unit tests for existing Aggregates added

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fhueske/flink tableLongAvgOverflow

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2024.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2024

commit a887d1d7edb2b1b96652ca5021beec123011e03a
Author: Fabian Hueske
Date:   2016-05-22T14:46:43Z

    [FLINK-3586] Fix potential overflow of Long AVG aggregation.

    - Add unit tests for Aggretates.
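For readers of this thread, the overflow described in the issue and the idea behind the fix (keeping the intermediate sum in a `BigInteger`) can be illustrated with a small standalone sketch. This is not the code from the pull request: the class `LongAvgSketch`, the `Partial` type, and the methods `naiveAvg` and `bigAvg` are hypothetical names made up for illustration, and the sketch uses plain Java rather than Flink's aggregate interfaces.

```java
import java.math.BigInteger;

// Hypothetical illustration only; not Flink's actual aggregate classes.
class LongAvgSketch {

    // Partial aggregate (sum, count) whose sum cannot overflow.
    static final class Partial {
        final BigInteger sum;
        final long count;

        Partial(BigInteger sum, long count) {
            this.sum = sum;
            this.count = count;
        }

        // Fold one input value into the partial aggregate.
        Partial add(long value) {
            return new Partial(sum.add(BigInteger.valueOf(value)), count + 1);
        }

        // Combine two partial aggregates, e.g. from parallel pre-aggregation.
        Partial merge(Partial other) {
            return new Partial(sum.add(other.sum), count + other.count);
        }

        // Final average; it always fits into a long because every input did.
        long average() {
            if (count == 0) {
                throw new ArithmeticException("average of empty input");
            }
            return sum.divide(BigInteger.valueOf(count)).longValueExact();
        }
    }

    // Average with a long intermediate sum: silently wraps around on overflow.
    static long naiveAvg(long[] values) {
        long sum = 0L;
        for (long v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Average with a BigInteger intermediate sum.
    static long bigAvg(long[] values) {
        Partial p = new Partial(BigInteger.ZERO, 0L);
        for (long v : values) {
            p = p.add(v);
        }
        return p.average();
    }

    public static void main(String[] args) {
        long[] values = {Long.MAX_VALUE, Long.MAX_VALUE};
        System.out.println("long sum:       " + naiveAvg(values)); // -1 (wrapped)
        System.out.println("BigInteger sum: " + bigAvg(values));   // 9223372036854775807
    }
}
```

In this sketch the count stays a `long`, on the assumption that only the sum is at practical risk of overflowing; `BigInteger.divide` truncates toward zero, matching `long` division.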
[jira] [Commented] (FLINK-3586) Risk of data overflow while use sum/count to calculate AVG value
[ https://issues.apache.org/jira/browse/FLINK-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295804#comment-15295804 ]

Chengxiang Li commented on FLINK-3586:
--------------------------------------

Not at all, feel free to take over it, Fabian.
[jira] [Commented] (FLINK-3586) Risk of data overflow while use sum/count to calculate AVG value
[ https://issues.apache.org/jira/browse/FLINK-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295591#comment-15295591 ]

Fabian Hueske commented on FLINK-3586:
--------------------------------------

I have a fix for this issue and would like to take it over.
Please let me know if that's not OK with you, [~chengxiang li].

Thanks!

> Risk of data overflow while use sum/count to calculate AVG value
> ----------------------------------------------------------------
>
>                 Key: FLINK-3586
>                 URL: https://issues.apache.org/jira/browse/FLINK-3586
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>            Priority: Minor
>
> Currently, we use {{(sum: Long, count: Long)}} to store the partial AVG
> aggregate, which risks overflowing the sum. We should use an unbounded data
> type (such as BigInteger) to store it for the data types that need it.
[jira] [Commented] (FLINK-3586) Risk of data overflow while use sum/count to calculate AVG value
[ https://issues.apache.org/jira/browse/FLINK-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15190652#comment-15190652 ]

Fabian Hueske commented on FLINK-3586:
--------------------------------------

FLINK-3596 has been resolved, so intermediate types are no longer exposed to Calcite and this issue can now be addressed.

> Risk of data overflow while use sum/count to calculate AVG value
> ----------------------------------------------------------------
>
>                 Key: FLINK-3586
>                 URL: https://issues.apache.org/jira/browse/FLINK-3586
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API
>            Reporter: Chengxiang Li
>            Priority: Minor
>
> Currently, we use {{(sum: Long, count: Long)}} to store the partial AVG
> aggregate, which risks overflowing the sum. We should use an unbounded data
> type (such as BigInteger) to store it for the data types that need it.