Devin Petersohn created SPARK-55818:
---------------------------------------
Summary: Decimal-float mixed arithmetic should always raise TypeError
Key: SPARK-55818
URL: https://issues.apache.org/jira/browse/SPARK-55818
Project: Spark
Issue Type: Bug
Components: Pandas API on Spark
Affects Versions: 4.1.1
Reporter: Devin Petersohn
When performing arithmetic between a float Series and a decimal.Decimal scalar,
the behavior depends on spark.sql.ansi.enabled, and only one setting matches pandas:
* ANSI OFF: silently computes a result (potentially with incorrect precision)
* ANSI ON: raises TypeError: Multiplication can not be applied to given types
* pandas: raises TypeError: unsupported operand type(s) for *
The ANSI ON behavior matches pandas. To match pandas in both modes, the TypeError
should be raised regardless of the ANSI setting.
Reproduction:
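{code:python}
import decimal
import pandas as pd
import pyspark.pandas as ps
from pyspark.sql import SparkSession

# `spark` is predefined in the pyspark shell; create it when running as a script
spark = SparkSession.builder.getOrCreate()

# ANSI OFF: silently computes a result
spark.conf.set("spark.sql.ansi.enabled", "false")
ps.Series([1.0, 2.0, 3.0]) * decimal.Decimal("1.5") # returns [1.5, 3.0, 4.5]

# ANSI ON: raises
spark.conf.set("spark.sql.ansi.enabled", "true")
ps.Series([1.0, 2.0, 3.0]) * decimal.Decimal("1.5") # raises TypeError

# pandas (reference behavior)
pd.Series([1.0, 2.0, 3.0]) * decimal.Decimal("1.5") # raises TypeError
{code}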
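As a rough illustration of the requested behavior, here is a minimal sketch of a check that does not depend on ANSI mode. The helper name reject_decimal_operand and its parameters are hypothetical and are not part of pyspark.pandas; the sketch uses only the standard library, so it runs without a Spark session. It also shows that plain Python refuses to mix float and decimal.Decimal, which is the behavior pandas surfaces.

```python
import decimal


def reject_decimal_operand(left_is_float: bool, right: object, op_name: str) -> None:
    """Hypothetical guard, not an existing pyspark.pandas function.

    Raises TypeError when a float Series would be combined with a
    decimal.Decimal scalar. It never consults spark.sql.ansi.enabled,
    so the outcome is the same in both modes.
    """
    if left_is_float and isinstance(right, decimal.Decimal):
        raise TypeError(f"{op_name} can not be applied to given types.")


# Plain Python refuses to mix float and Decimal in arithmetic.
try:
    1.0 * decimal.Decimal("1.5")
except TypeError as exc:
    python_error = str(exc)

# The guard raises for a Decimal operand...
try:
    reject_decimal_operand(True, decimal.Decimal("1.5"), "Multiplication")
    guard_raised = False
except TypeError:
    guard_raised = True

# ...and lets an ordinary float operand through without error.
reject_decimal_operand(True, 1.5, "Multiplication")
```

Where such a check would actually live in the Pandas API on Spark code is left open here; the sketch only shows that the decision can be made from the operand types alone.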
This is a child of SPARK-55791.