Hyukjin Kwon created SPARK-51286:
------------------------------------

             Summary: Fix test_with_none_and_nan to pass with Arrow enabled
                 Key: SPARK-51286
                 URL: https://issues.apache.org/jira/browse/SPARK-51286
             Project: Spark
          Issue Type: Sub-task
          Components: PySpark
    Affects Versions: 4.0.0
            Reporter: Hyukjin Kwon


{code}
======================================================================
FAIL [2.389s]: test_with_none_and_nan (pyspark.sql.tests.connect.test_connect_creation.SparkConnectCreationTests.test_with_none_and_nan)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/testing/pandasutils.py", line 56, in _assert_pandas_equal
    assert_frame_equal(
  File "/usr/local/lib/python3.11/dist-packages/pandas/_testing/asserters.py", line 1279, in assert_frame_equal
    assert_series_equal(
  File "/usr/local/lib/python3.11/dist-packages/pandas/_testing/asserters.py", line 997, in assert_series_equal
    assert_numpy_array_equal(
  File "/usr/local/lib/python3.11/dist-packages/pandas/_testing/asserters.py", line 690, in assert_numpy_array_equal
    _raise(left, right, err_msg)
  File "/usr/local/lib/python3.11/dist-packages/pandas/_testing/asserters.py", line 684, in _raise
    raise_assert_detail(obj, msg, left, right, index_values=index_values)
  File "/usr/local/lib/python3.11/dist-packages/pandas/_testing/asserters.py", line 614, in raise_assert_detail
    raise AssertionError(msg)
AssertionError: DataFrame.iloc[:, 0] (column name="(value <=> NULL)") are different

DataFrame.iloc[:, 0] (column name="(value <=> NULL)") values are different (33.33333 %)
[index]: [0, 1, 2]
[left]:  [False, False, True]
[right]: [True, False, True]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/sql/tests/connect/test_connect_creation.py", line 257, in test_with_none_and_nan
    self.assert_eq(
  File "/__w/spark/spark/python/pyspark/testing/pandasutils.py", line 451, in assert_eq
    _assert_pandas_equal(lobj, robj, checkExact=check_exact)
  File "/__w/spark/spark/python/pyspark/testing/pandasutils.py", line 65, in _assert_pandas_equal
    raise PySparkAssertionError(
pyspark.errors.exceptions.base.PySparkAssertionError: [DIFFERENT_PANDAS_DATAFRAME] DataFrames are not almost equal:
Left:
   (value <=> NULL)  (value <=> NaN)  (value <=> 42.0)
(value <=> NULL)    bool
(value <=> NaN)     bool
(value <=> 42.0)    bool
dtype: object
Right:
   (value <=> NULL)  (value <=> NaN)  (value <=> 42.0)
(value <=> NULL)    bool
(value <=> NaN)     bool
(value <=> 42.0)    bool
dtype: object

----------------------------------------------------------------------
Ran 24 tests in 39.580s
{code}
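
The mismatch in the "(value <=> NULL)" column (left [False, False, True] vs. right [True, False, True]) is the pattern one would expect if one side keeps NaN as a float value while the other turns it into NULL before the comparison. The following is only a pure-Python sketch of that hypothesis, not the actual test or the confirmed root cause. The input values (NaN, 42.0, None) are an assumption chosen to be consistent with the reported output, and null_safe_eq and nan_to_null are hypothetical helpers that merely model Spark's null-safe equality and a NaN-to-NULL conversion.

```python
import math


def null_safe_eq(a, b):
    """Hypothetical model of Spark's null-safe equality (<=>)."""
    # NULL <=> NULL is true; NULL <=> non-NULL is false.
    if a is None or b is None:
        return a is None and b is None
    # Spark SQL treats NaN as equal to NaN.
    if math.isnan(a) and math.isnan(b):
        return True
    return a == b


def nan_to_null(values):
    """Hypothetical conversion step that replaces NaN with NULL (None)."""
    return [None if v is not None and math.isnan(v) else v for v in values]


# Assumed input, chosen to match the shape of the reported failure.
values = [float("nan"), 42.0, None]

# NaN preserved: only the real NULL compares equal to NULL.
kept = [null_safe_eq(v, None) for v in values]

# NaN conflated with NULL: the first row now also compares equal to NULL.
conflated = [null_safe_eq(v, None) for v in nan_to_null(values)]

print(kept)
print(conflated)
```

If this hypothesis holds, the first list corresponds to the "left" values in the assertion and the second to the "right" values.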



