kecookier opened a new issue, #8985:
URL: https://github.com/apache/incubator-gluten/issues/8985
### Backend
VL (Velox)
### Bug description
Summation of doubles is not associative. In Velox, when the partial HashAgg flush is triggered, the order in which double values are accumulated changes, so the result is unstable. Vanilla Spark has no such flush, so with ordered input its result is stable. In our production ETL we run queries like `select c1, cast(sum(c2) as bigint) from t1 group by c1`, and the result from Gluten does not match the result from Vanilla Spark.
A simple test shows that the sum of doubles is not associative:
```Java
public static void test3() {
    double a = 24.621, b = 12.14, c = 0.169, d = 6.865, e = 1.879, f = 16.326;
    double sum1 = ((((a + f) + b) + d) + e) + c;
    double sum2 = ((((c + e) + d) + b) + f) + a;
    double sum3 = ((a + c) + (f + e)) + (b + d);
    double sum4 = (a + b) + (c + d) + (e + f);
    System.out.printf("Sum1: %.15f, as long: %d%n", sum1, (long) sum1);
    System.out.printf("Sum2: %.15f, as long: %d%n", sum2, (long) sum2);
    System.out.printf("Sum3: %.15f, as long: %d%n", sum3, (long) sum3);
    System.out.printf("Sum4: %.15f, as long: %d%n", sum4, (long) sum4);
}
/*
Sum1: 62.000000000000000, as long: 62
Sum2: 62.000000000000000, as long: 62
Sum3: 62.000000000000010, as long: 62
Sum4: 61.999999999999990, as long: 61
*/
```
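To see why the flush matters: flushing a partial HashAgg emits the running sum early, and the final aggregation then adds the emitted partials together, which regroups the additions. Below is a hypothetical plain-Java sketch (not Gluten/Velox code; the flush-every-2-rows threshold is just an assumption for illustration) that mimics this regrouping on the same six values:
```Java
// Hypothetical illustration only: simulates a partial aggregation that "flushes"
// its running sum every 2 input rows and a final aggregation that combines the
// flushed partials, versus a single pass over the rows in input order.
public static void simulateFlush() {
    double[] rows = {24.621, 12.14, 0.169, 6.865, 1.879, 16.326};

    // No flush: one accumulator processes every row in input order.
    double noFlush = 0.0;
    for (double v : rows) {
        noFlush += v;
    }

    // With flush: the partial aggregate emits its accumulator every 2 rows,
    // and the final aggregate adds the emitted partials together.
    double withFlush = 0.0;
    double partial = 0.0;
    for (int i = 0; i < rows.length; i++) {
        partial += rows[i];
        if ((i + 1) % 2 == 0) {  // flush threshold reached
            withFlush += partial; // final aggregation combines the partial
            partial = 0.0;
        }
    }
    withFlush += partial; // combine any remaining partial

    // The flushed variant evaluates (a + b) + (c + d) + (e + f), i.e. Sum4 above,
    // which prints 61.999999999999990 and casts to 61, while the single-pass order
    // groups the additions differently and its last bits can differ.
    System.out.printf("no flush:   %.15f, as long: %d%n", noFlush, (long) noFlush);
    System.out.printf("with flush: %.15f, as long: %d%n", withFlush, (long) withFlush);
}
```
The rows are identical in both paths; only the grouping of the additions changes, yet after `cast(... as bigint)` the flushed path can return 61 where the unflushed path returns 62.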
To reproduce the issue with a Gluten test:
```Scala
test("flushable aggregate rule - disable when double sum") {
withSQLConf(
"spark.gluten.sql.columnar.backend.velox.maxPartialAggregationMemory"
-> "100",
"spark.gluten.sql.columnar.backend.velox.resizeBatches.shuffleInput"
-> "false",
"spark.gluten.sql.columnar.maxBatchSize" -> "2"
) {
withTempView("t1") {
import testImplicits._
Seq((24.621d, 1), (12.14d, 1), (0.169d, 1), (6.865d, 1), (1.879d,
1), (16.326d, 1))
.toDF("c1", "c2")
.createOrReplaceTempView("t1")
runQueryAndCompare("select c2, cast(sum(c1) as bigint) from t1 group
by c2") {
df =>
{
assert(
getExecutedPlan(df).count(
plan => {
plan.isInstanceOf[RegularHashAggregateExecTransformer]
}) == 1)
assert(
getExecutedPlan(df).count(
plan => {
plan.isInstanceOf[FlushableHashAggregateExecTransformer]
}) == 1)
}
}
}
}
}
```
### Spark version
Spark-3.5.x
### Spark configurations
_No response_
### System information
_No response_
### Relevant logs
_No response_