sjhddh opened a new issue, #22812: URL: https://github.com/apache/datafusion/issues/22812
### Describe the bug

The Spark-compatible `round()` function gives different results from Apache Spark when the input is a floating-point type (`FloatType`/`DoubleType`) and the value's binary representation is slightly off from its decimal literal.

Spark's `RoundBase` rounds a double as `BigDecimal(d).setScale(scale, HALF_UP)`, where Scala's `BigDecimal(Double)` is equivalent to `java.math.BigDecimal.valueOf(d)`; that is, it parses the *shortest round-trip decimal string* of the double (`Double.toString`). DataFusion's `round_float` instead does naive binary-float arithmetic, `(value * 10^scale).round() / 10^scale`, which rounds the already-imprecise binary value and diverges from Spark at the half-way point.

### To Reproduce

```sql
SELECT round(1.255::double, 2::int);
-- Spark: 1.26
-- DataFusion: 1.25

SELECT round(1.005::double, 2::int);
-- Spark: 1.01
-- DataFusion: 1.0
```

The cause is that `1.255` and `1.005` are stored as binary doubles a hair below the decimal value (`1.2549999999999999...`, `1.00499999999999989...`). Spark sees the shortest decimal string (`"1.255"`, `"1.005"`) and applies HALF_UP, so the tie rounds away from zero. DataFusion multiplies the raw binary value by `100`; the product stays below the half-way point (`125.49999999999999`, `100.49999999999999`), so it rounds down.

### Expected behavior

Match Spark: round via the shortest round-trip decimal representation with HALF_UP (ties away from zero), for both `DoubleType` and `FloatType` (Spark widens float to double first via `f.toDouble`).

### Additional context

The existing doc comment on `round_float` already describes the intended `BigDecimal` / HALF_UP behaviour; the implementation simply doesn't match it. I have a fix and will open a PR referencing this issue.

`datafusion/spark/src/function/math/round.rs`
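
To make the divergence concrete, here is a minimal, std-only Rust sketch that contrasts the two approaches described above. It is not the existing `round_float` code and not the planned fix. The function names `round_naive` and `round_half_up_decimal` are hypothetical, the sketch assumes a non-negative scale, and it relies on Rust's `Display` for `f64` printing the shortest round-trip decimal without an exponent as a stand-in for Java's `Double.toString`.

```rust
/// Naive binary-float rounding, as described in the issue:
/// (value * 10^scale).round() / 10^scale.
fn round_naive(value: f64, scale: i32) -> f64 {
    let factor = 10f64.powi(scale);
    (value * factor).round() / factor
}

/// HALF_UP rounding applied to the shortest round-trip decimal string.
/// Hypothetical helper for illustration; only non-negative scales are handled.
fn round_half_up_decimal(value: f64, scale: u32) -> f64 {
    if !value.is_finite() {
        return value; // NaN and infinities pass through unchanged
    }
    // Work on the magnitude; the sign is restored at the end.
    let s = format!("{}", value.abs());
    let (int_part, frac_part) = match s.split_once('.') {
        Some((i, f)) => (i, f),
        None => (s.as_str(), ""),
    };
    let scale = scale as usize;
    if frac_part.len() <= scale {
        return value; // nothing to drop at this scale
    }
    // Keep the integer digits plus `scale` fractional digits.
    let mut digits: Vec<u8> = int_part
        .bytes()
        .chain(frac_part.bytes().take(scale))
        .map(|b| b - b'0')
        .collect();
    // HALF_UP: round the magnitude up when the first dropped digit is >= 5.
    if frac_part.as_bytes()[scale] >= b'5' {
        let mut i = digits.len();
        loop {
            if i == 0 {
                digits.insert(0, 1); // carry out of the leading digit
                break;
            }
            i -= 1;
            if digits[i] == 9 {
                digits[i] = 0; // carry continues to the left
            } else {
                digits[i] += 1;
                break;
            }
        }
    }
    // Rebuild the decimal string and parse it back to f64.
    let int_len = digits.len() - scale;
    let mut out = String::new();
    for (idx, d) in digits.iter().enumerate() {
        if idx == int_len {
            out.push('.');
        }
        out.push((b'0' + d) as char);
    }
    let rounded: f64 = out.parse().expect("digits form a valid decimal");
    rounded.copysign(value)
}
```

With these definitions, `round_naive(1.255, 2)` returns `1.25`, while `round_half_up_decimal(1.255, 2)` returns `1.26`, matching the Spark result quoted in the reproduction.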
