Github user ajithme commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22401#discussion_r217038517

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala ---

```diff
@@ -36,7 +36,13 @@ abstract class AverageLike(child: Expression) extends DeclarativeAggregate {
   }

   private lazy val sumDataType = child.dataType match {
-    case _ @ DecimalType.Fixed(p, s) => DecimalType.bounded(p + 10, s)
+    /*
+     * In the case of a sum of decimals (assuming another decimal of the same
+     * precision and scale), refer to
+     * org.apache.spark.sql.catalyst.analysis.DecimalPrecision:
+     *   Precision: max(s1, s2) + max(p1 - s1, p2 - s2) + 1
+     *   Scale:     max(s1, s2)
+     */
+    case _ @ DecimalType.Fixed(p, s) => DecimalType.adjustPrecisionScale(s + (p - s) + 1, s)
```

--- End diff --

Well, I tested as per your suggestion with my PR:

```
sql("create table if not exists table1(salary decimal(2,1))")
(1 to 22).foreach(_ => sql("insert into table1 values(9.1)"))
sql("select avg(salary) from table1").show(false)

+-----------+
|avg(salary)|
+-----------+
|9.10000    |
+-----------+
```

which is the expected result, and I don't see an overflow, as the divide will readjust the precision. Can you test with my patch for an overflow specifically in the case of average?
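For reference, the DecimalPrecision rule quoted in the diff comment can be sketched as a small standalone function. This is a hypothetical helper for illustration only, not Spark's actual implementation:

```scala
// Sketch of the result type rule for adding two decimals, as described in
// org.apache.spark.sql.catalyst.analysis.DecimalPrecision:
//   precision = max(s1, s2) + max(p1 - s1, p2 - s2) + 1
//   scale     = max(s1, s2)
// `sumType` is a hypothetical name, not part of Spark's API.
def sumType(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) = {
  val scale = math.max(s1, s2)
  val precision = scale + math.max(p1 - s1, p2 - s2) + 1
  (precision, scale)
}

// When a decimal(p, s) is added to another value of the same type, this
// reduces to precision = s + (p - s) + 1 = p + 1, which is the expression
// used in the patched case clause.
println(sumType(2, 1, 2, 1)) // (3, 1): decimal(2,1) + decimal(2,1) -> decimal(3,1)
```

Note that this only covers the intermediate sum type for a single addition; the aggregate buffer must still accommodate repeated additions, which is the overflow concern raised in this thread.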