[
https://issues.apache.org/jira/browse/SPARK-28610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921471#comment-16921471
]
Marco Gaido commented on SPARK-28610:
-------------------------------------
Hi [~Gengliang.Wang]. That's a different thing: you are doing 3 {{Add}}s and
then a sum of 1 number. To reproduce this, you should create a table with 3
rows with those values and sum them. Thanks.
> Support larger buffer for sum of long
> -------------------------------------
>
> Key: SPARK-28610
> URL: https://issues.apache.org/jira/browse/SPARK-28610
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Marco Gaido
> Priority: Major
>
> The sum of a long field currently uses a buffer of type long.
> When the flag for throwing exceptions on overflow for arithmetic operations
> is turned on, this is a problem when there are intermediate overflows
> that are then resolved by other rows. In such a case, we throw an
> exception even though the result is representable in a long value. An
> example of this issue can be seen by running:
> {code}
> val df = sc.parallelize(Seq(100L, Long.MaxValue, -1000L)).toDF("a")
> df.select(sum($"a")).show()
> {code}
> According to [~cloud_fan]'s suggestion in
> https://github.com/apache/spark/pull/21599, we should introduce a config
> flag that lets users choose a wider datatype for the sum buffer, so that
> the above issue can be fixed.
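
The intermediate overflow described above can be illustrated without Spark. The following is a minimal sketch in plain Scala, not Spark's actual aggregation code: {{Math.addExact}} stands in for overflow-checked addition on a long buffer, and {{BigInt}} is only an assumed stand-in for "a wider datatype" (the issue does not say which type the buffer would use). The names {{values}}, {{longBuffer}}, {{wide}} and {{result}} are hypothetical.

```scala
// Plain Scala sketch; no Spark required. The three values are the ones
// used in the issue description.
val values = Seq(100L, Long.MaxValue, -1000L)

// Long buffer with overflow checking: 100 + Long.MaxValue overflows, so the
// fold throws ArithmeticException before -1000 can bring the sum back in range.
val longBuffer = scala.util.Try(
  values.foldLeft(0L)((acc, v) => Math.addExact(acc, v))
)

// Wider buffer (BigInt here is an assumption, chosen only for illustration):
// the intermediate value is allowed to exceed Long.MaxValue, and the range
// check happens once, on the final result.
val wide = values.foldLeft(BigInt(0))(_ + _)
val result: Option[Long] = if (wide.isValidLong) Some(wide.toLong) else None

println(s"long buffer:  $longBuffer")
println(s"wider buffer: $result")
```

With the long buffer the fold fails on the second element, while the wider buffer yields a final value that fits in a long. This is the case the proposed config flag is meant to cover.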