(spark) branch master updated: [SPARK-52331][PS][TESTS] Adjust test for promotion from float32 to float64 during division

xinrong Tue, 27 May 2025 18:10:57 -0700

This is an automated email from the ASF dual-hosted git repository.

xinrong pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new f349192dec3c [SPARK-52331][PS][TESTS] Adjust test for promotion from float32 to float64 during division
f349192dec3c is described below

commit f349192dec3ca03f332780a6caee52a224120dba
Author: Xinrong Meng <xinr...@apache.org>
AuthorDate: Tue May 27 17:31:51 2025 -0700

    [SPARK-52331][PS][TESTS] Adjust test for promotion from float32 to float64 during division
    
    ### What changes were proposed in this pull request?
    Adjust `test_divide_by_zero_behavior` to account for the bug that promotes float32 to float64 during division.
    
    ### Why are the changes needed?
    Make the nightly build pass with ANSI mode off.
    Part of https://issues.apache.org/jira/browse/SPARK-52169.
    
    The promotion bug is shown below: dividing float32 columns returns float64 in pandas API on Spark but float32 in pandas.
    ```
    >>> import pyspark.pandas as ps
    >>> ps.set_option("compute.fail_on_ansi_mode", False)
    >>> spark.conf.set("spark.sql.ansi.enabled", False)
    >>>
    >>> import pandas as pd
    >>> import numpy as np
    >>> pdf = pd.DataFrame(
    ...     {
    ...         "a": [1.0, -1.0, 0.0, np.nan],
    ...         "b": [0.0, 0.0, 0.0, 0.0],
    ...     },
    ...     dtype=np.float32,
    ... )
    >>>
    >>> psdf = ps.from_pandas(pdf)
    >>>
    >>> psdf["a"] / psdf["b"]
    0    inf
    1   -inf
    2    NaN
    3    NaN
    dtype: float64
    >>>
    >>> pdf["a"] / pdf["b"]
    0    inf
    1   -inf
    2    NaN
    3    NaN
    dtype: float32
    ```
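
The pandas half of the reproduction above can be checked without a Spark session. The following is a minimal sketch, assuming only NumPy and pandas are installed; `expected_under_bug` is a hypothetical name for a stand-in that mimics the float64 dtype pandas-on-Spark currently returns, and it does not call Spark at all.

```python
import numpy as np
import pandas as pd

# Same data as the reproduction above, built as float32 columns.
pdf = pd.DataFrame(
    {
        "a": [1.0, -1.0, 0.0, np.nan],
        "b": [0.0, 0.0, 0.0, 0.0],
    },
    dtype=np.float32,
)

# pandas keeps float32 when both operands are float32.
pandas_result = pdf["a"] / pdf["b"]

# Hypothetical stand-in for the promoted result: same values, float64 dtype.
expected_under_bug = pandas_result.astype(np.float64)

# The cast changes only the dtype; inf, -inf and NaN compare equal by value.
pd.testing.assert_series_equal(
    pandas_result, expected_under_bug, check_dtype=False
)

print(pandas_result.dtype, expected_under_bug.dtype)
```

Casting the pandas result to float64 is what lets a dtype-sensitive comparison pass while the promotion is still in place, which is the approach the adjusted test takes.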
    
    ### Does this PR introduce _any_ user-facing change?
    No.
    
    ### How was this patch tested?
    Test changes only.
    
    ```
    % SPARK_ANSI_SQL_MODE=false ./python/run-tests --python-executables=python3.10 --testnames "pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_divide_by_zero_behavior"
    Running PySpark tests. Output is in /Users/xinrong.meng/spark/python/unit-tests.log
    Will test against the following Python executables: ['python3.10']
    Will test the following Python tests: ['pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_divide_by_zero_behavior']
    python3.10 python_implementation is CPython
    python3.10 version is: Python 3.10.16
    Starting test(python3.10): pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_divide_by_zero_behavior (temp output: /Users/xinrong.meng/spark/python/target/f26cd9b9-f6c3-48ec-86f1-d1d7f6158361/python3.10__pyspark.pandas.tests.computation.test_binary_ops_FrameBinaryOpsTests.test_divide_by_zero_behavior__wrk8yuzn.log)
    Finished test(python3.10): pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_divide_by_zero_behavior (5s)
    Tests passed in 5 seconds
    ```
    
    ### Was this patch authored or co-authored using generative AI tooling?
    No.
    
    Closes #51035 from xinrong-meng/etest_promo.
    
    Authored-by: Xinrong Meng <xinr...@apache.org>
    Signed-off-by: Xinrong Meng <xinr...@apache.org>
---
 .../pandas/tests/computation/test_binary_ops.py    | 32 +++++++++++++++-------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/python/pyspark/pandas/tests/computation/test_binary_ops.py b/python/pyspark/pandas/tests/computation/test_binary_ops.py
index 1708916ee8d2..c2a773e1c2a8 100644
--- a/python/pyspark/pandas/tests/computation/test_binary_ops.py
+++ b/python/pyspark/pandas/tests/computation/test_binary_ops.py
@@ -114,17 +114,29 @@ class FrameBinaryOpsMixin:
     @unittest.skipIf(is_ansi_mode_test, ansi_mode_not_supported_message)
     def test_divide_by_zero_behavior(self):
         # float / float
-        for dtype in [np.float32, np.float64]:
-            pdf = pd.DataFrame(
-                {
-                    "a": [1.0, -1.0, 0.0, np.nan],
-                    "b": [0.0, 0.0, 0.0, 0.0],
-                },
-                dtype=dtype,
-            )
-            psdf = ps.from_pandas(pdf)
+        # np.float32
+        pdf = pd.DataFrame(
+            {
+                "a": [1.0, -1.0, 0.0, np.nan],
+                "b": [0.0, 0.0, 0.0, 0.0],
+            },
+            dtype=np.float32,
+        )
+        psdf = ps.from_pandas(pdf)
+        # TODO(SPARK-52332): Fix promotion from float32 to float64 during division
+        self.assert_eq(psdf["a"] / psdf["b"], (pdf["a"] / pdf["b"]).astype(np.float64))
 
-            self.assert_eq(psdf["a"] / psdf["b"], pdf["a"] / pdf["b"])
+        # np.float64
+        pdf = pd.DataFrame(
+            {
+                "a": [1.0, -1.0, 0.0, np.nan],
+                "b": [0.0, 0.0, 0.0, 0.0],
+            },
+            dtype=np.float64,
+        )
+        psdf = ps.from_pandas(pdf)
+
+        self.assert_eq(psdf["a"] / psdf["b"], pdf["a"] / pdf["b"])
 
         # int / int
         for dtype in [np.int32, np.int64]:


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
