This is an automated email from the ASF dual-hosted git repository.

xinrong pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new f349192dec3c [SPARK-52331][PS][TESTS] Adjust test for promotion from float32 to float64 during division
f349192dec3c is described below

commit f349192dec3ca03f332780a6caee52a224120dba
Author: Xinrong Meng <xinr...@apache.org>
AuthorDate: Tue May 27 17:31:51 2025 -0700

    [SPARK-52331][PS][TESTS] Adjust test for promotion from float32 to float64 during division

    ### What changes were proposed in this pull request?
    Adjust test for promotion bug from float32 to float64 during division

    ### Why are the changes needed?
    Pass nightly build with ANSI off.
    Part of https://issues.apache.org/jira/browse/SPARK-52169.

    The promotion bug is shown as below:
    ```
    >>> ps.set_option("compute.fail_on_ansi_mode", False)
    >>> spark.conf.set("spark.sql.ansi.enabled", False)
    >>>
    >>> import pandas as pd
    >>> import numpy as np
    >>> pdf = pd.DataFrame(
    ...     {
    ...         "a": [1.0, -1.0, 0.0, np.nan],
    ...         "b": [0.0, 0.0, 0.0, 0.0],
    ...     },
    ...     dtype=np.float32,
    ... )
    >>>
    >>> psdf = ps.from_pandas(pdf)
    >>>
    >>> psdf["a"] / psdf["b"]
    0    inf
    1   -inf
    2    NaN
    3    NaN
    dtype: float64
    >>>
    >>> pdf["a"] / pdf["b"]
    0    inf
    1   -inf
    2    NaN
    3    NaN
    dtype: float32
    ```

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Test changes only.
    ```
    % SPARK_ANSI_SQL_MODE=false ./python/run-tests --python-executables=python3.10 --testnames "pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_divide_by_zero_behavior"
    Running PySpark tests. Output is in /Users/xinrong.meng/spark/python/unit-tests.log
    Will test against the following Python executables: ['python3.10']
    Will test the following Python tests: ['pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_divide_by_zero_behavior']
    python3.10 python_implementation is CPython
    python3.10 version is: Python 3.10.16
    Starting test(python3.10): pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_divide_by_zero_behavior (temp output: /Users/xinrong.meng/spark/python/target/f26cd9b9-f6c3-48ec-86f1-d1d7f6158361/python3.10__pyspark.pandas.tests.computation.test_binary_ops_FrameBinaryOpsTests.test_divide_by_zero_behavior__wrk8yuzn.log)
    Finished test(python3.10): pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_divide_by_zero_behavior (5s)
    Tests passed in 5 seconds
    ```

    ### Was this patch authored or co-authored using generative AI tooling?
    No.

    Closes #51035 from xinrong-meng/etest_promo.

    Authored-by: Xinrong Meng <xinr...@apache.org>
    Signed-off-by: Xinrong Meng <xinr...@apache.org>
---
 .../pandas/tests/computation/test_binary_ops.py    | 32 +++++++++++++++-------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/python/pyspark/pandas/tests/computation/test_binary_ops.py b/python/pyspark/pandas/tests/computation/test_binary_ops.py
index 1708916ee8d2..c2a773e1c2a8 100644
--- a/python/pyspark/pandas/tests/computation/test_binary_ops.py
+++ b/python/pyspark/pandas/tests/computation/test_binary_ops.py
@@ -114,17 +114,29 @@ class FrameBinaryOpsMixin:
     @unittest.skipIf(is_ansi_mode_test, ansi_mode_not_supported_message)
     def test_divide_by_zero_behavior(self):
         # float / float
-        for dtype in [np.float32, np.float64]:
-            pdf = pd.DataFrame(
-                {
-                    "a": [1.0, -1.0, 0.0, np.nan],
-                    "b": [0.0, 0.0, 0.0, 0.0],
-                },
-                dtype=dtype,
-            )
-            psdf = ps.from_pandas(pdf)
+        # np.float32
+        pdf = pd.DataFrame(
+            {
+                "a": [1.0, -1.0, 0.0, np.nan],
+                "b": [0.0, 0.0, 0.0, 0.0],
+            },
+            dtype=np.float32,
+        )
+        psdf = ps.from_pandas(pdf)
+        # TODO(SPARK-52332): Fix promotion from float32 to float64 during division
+        self.assert_eq(psdf["a"] / psdf["b"], (pdf["a"] / pdf["b"]).astype(np.float64))
 
-            self.assert_eq(psdf["a"] / psdf["b"], pdf["a"] / pdf["b"])
+        # np.float64
+        pdf = pd.DataFrame(
+            {
+                "a": [1.0, -1.0, 0.0, np.nan],
+                "b": [0.0, 0.0, 0.0, 0.0],
+            },
+            dtype=np.float64,
+        )
+        psdf = ps.from_pandas(pdf)
+
+        self.assert_eq(psdf["a"] / psdf["b"], pdf["a"] / pdf["b"])
 
         # int / int
         for dtype in [np.int32, np.int64]:
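
For reference, the expected value used by the adjusted float32 assertion can be reproduced with pandas alone. The sketch below is illustrative and not part of the commit: the helper name `divide_by_zero` is hypothetical, and it does not exercise pandas-on-Spark, so it only shows the pandas side (a float32 result) and the cast to float64 that the test now compares against.

```python
import numpy as np
import pandas as pd


def divide_by_zero(dtype):
    """Hypothetical helper: divide column "a" by an all-zero column "b"."""
    pdf = pd.DataFrame(
        {
            "a": [1.0, -1.0, 0.0, np.nan],
            "b": [0.0, 0.0, 0.0, 0.0],
        },
        dtype=dtype,
    )
    return pdf["a"] / pdf["b"]


# pandas keeps the input dtype, matching the session in the commit message.
result32 = divide_by_zero(np.float32)
print(result32.dtype)  # float32

# The adjusted test widens the pandas result before comparing, because
# pandas-on-Spark currently returns float64 for this division (SPARK-52332).
expected = result32.astype(np.float64)
print(expected.dtype)  # float64
```

Only the dtype changes in the cast; the inf, -inf, and NaN values are the same in both results.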