yjshen opened a new issue, #2455:
URL: https://github.com/apache/arrow-datafusion/issues/2455
**Describe the bug**
A simple sum over an Int8 column panics with "attempt to add with overflow".
**To Reproduce**
```rust
#[tokio::test]
async fn csv_query_array_agg_simple() -> Result<()> {
    let ctx = SessionContext::new();
    register_aggregate_csv(&ctx).await?;
    let sql =
        "select c2, sum(c3) sum_c3";
    let actual = execute_to_batches(&ctx, sql).await;
    let expected = vec![
        "+--------+",
        "| sum_c3 |",
        "+--------+",
        "| TBD    |",
        "+--------+",
    ];
    assert_batches_eq!(expected, &actual);
    Ok(())
}
```
**Expected behavior**
On overflow, the sum should either return null or wrap around at the type boundary instead of panicking.
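The two non-panicking behaviors can be illustrated with plain Rust integer methods. This is only a minimal sketch on `i8` slices, not DataFusion's accumulator code; the function names `wrapping_sum` and `checked_sum` are hypothetical, and `None` stands in for SQL NULL.

```rust
/// Sum that wraps around at the i8 boundary (two's-complement wraparound).
/// Hypothetical helper, not part of DataFusion.
fn wrapping_sum(values: &[i8]) -> i8 {
    values.iter().fold(0i8, |acc, &v| acc.wrapping_add(v))
}

/// Sum that stops at the first overflow and yields None,
/// which stands in for SQL NULL here. Hypothetical helper.
fn checked_sum(values: &[i8]) -> Option<i8> {
    values.iter().try_fold(0i8, |acc, &v| acc.checked_add(v))
}
```

For the input `[100, 100, -50]`, `wrapping_sum` gives `-106` (100 + 100 wraps to -56, then -50 is added), while `checked_sum` gives `None` because the first addition already exceeds `i8::MAX`.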
**Additional context**
Apache Spark provides two versions of sum: one named `Sum` that wraps around
the boundary, and `TrySum` that returns null on overflow.
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala#L159-L255
Also, Spark maps input types to a small set of wider sum result types:
```scala
protected lazy val resultType = child.dataType match {
case DecimalType.Fixed(precision, scale) =>
DecimalType.bounded(precision + 10, scale)
case _: IntegralType => LongType
case it: YearMonthIntervalType => it
case it: DayTimeIntervalType => it
case _ => DoubleType
}
```
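The `IntegralType => LongType` rule above amounts to accumulating in a 64-bit integer regardless of the input width. A rough Rust sketch of that idea for `i8` input follows; `widened_sum` is a hypothetical name, not an existing DataFusion function, and returning `None` on `i64` overflow is an assumption made for this example.

```rust
/// Sum i8 values into an i64 accumulator, analogous to Spark's
/// IntegralType => LongType result type. Hypothetical helper.
/// Returns None only if the i64 accumulator itself overflows
/// (an assumption for this sketch).
fn widened_sum(values: &[i8]) -> Option<i64> {
    values
        .iter()
        .try_fold(0i64, |acc, &v| acc.checked_add(i64::from(v)))
}
```

With this widening, `[100, 100, -50]` sums to `Some(150)`, a value that does not fit in `i8`, so the Int8 case from the report no longer overflows for any realistic row count.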