bersprockets opened a new pull request, #44099:
URL: https://github.com/apache/spark/pull/44099
### What changes were proposed in this pull request?
In various Pandas aggregate functions, replace each comparison or arithmetic
operation between `DoubleType` and `IntegerType` in `evaluateExpression` with
a comparison or arithmetic operation between `DoubleType` and `DoubleType`.
Affected functions are `PandasStddev`, `PandasVariance`, `PandasSkewness`,
`PandasKurtosis`, and `PandasCovar`.
### Why are the changes needed?
These functions fail in interpreted mode. For example, `evaluateExpression`
in `PandasKurtosis` compares a double to an integer:
```
If(n < 4, Literal.create(null, DoubleType) ...
```
This results in a boxed double and a boxed integer being passed to
`SQLOrderingUtil.compareDoubles`, which expects two doubles as arguments. The
Scala runtime tries to unbox the boxed integer as a double, which results in
an error.
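The failure mode can be illustrated outside Spark with a minimal Java sketch. It assumes the interpreted path hands operands around as `Object` and casts them to the boxed type the comparator expects. The class `BoxedCompareSketch` and its methods are hypothetical stand-ins for illustration, not Spark's actual code.

```java
public class BoxedCompareSketch {
    // Hypothetical stand-in for a comparator that expects two doubles.
    static int compareDoubles(double x, double y) {
        return Double.compare(x, y);
    }

    // Interpreted-style path (assumption): operands arrive boxed as Object
    // and are cast to Double before being unboxed.
    static int compareBoxed(Object left, Object right) {
        return compareDoubles((Double) left, (Double) right);
    }

    public static void main(String[] args) {
        // Double vs. Double: both casts succeed.
        System.out.println(compareBoxed(6.0, 4.0));

        // Double vs. Integer: the literal 4 is boxed as an Integer, and an
        // Integer cannot be cast to Double, so the cast throws.
        try {
            compareBoxed(6.0, 4);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```

Making both operands doubles, as this PR does for the affected expressions, avoids the failing cast.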
Reproduction example:
```
# Disable code generation so that expressions are evaluated in interpreted mode.
spark.sql("set spark.sql.codegen.wholeStage=false")
spark.sql("set spark.sql.codegen.factoryMode=NO_CODEGEN")
import numpy as np
import pandas as pd
import pyspark.pandas as ps
pser = pd.Series([1, 2, 3, 7, 9, 8], index=np.random.rand(6), name="a")
psser = ps.from_pandas(pser)
psser.kurt()
```
See Jira (SPARK-46189) for the other reproduction cases.
This works fine in codegen mode because the integer is already unboxed and
the Java runtime will implicitly cast it to a double.
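As a rough illustration of why the primitive path is safe, the Java sketch below passes an `int` where a `double` parameter is expected. The class `PrimitiveWideningSketch` and its methods are hypothetical and are not taken from Spark's generated code.

```java
public class PrimitiveWideningSketch {
    // Hypothetical stand-in for a comparator that expects two doubles.
    static int compareDoubles(double x, double y) {
        return Double.compare(x, y);
    }

    // Both operands are primitives, so the int argument is widened to a
    // double at the call site and no cast between boxed types is involved.
    static boolean lessThan(double n, int limit) {
        return compareDoubles(n, limit) < 0;
    }

    public static void main(String[] args) {
        System.out.println(lessThan(3.0, 4)); // prints "true"
        System.out.println(lessThan(6.0, 4)); // prints "false"
    }
}
```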
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
New unit tests.
### Was this patch authored or co-authored using generative AI tooling?
No.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]