Hyukjin Kwon created SPARK-51286:
------------------------------------
Summary: Fix test_with_none_and_nan to pass with Arrow enabled
Key: SPARK-51286
URL: https://issues.apache.org/jira/browse/SPARK-51286
Project: Spark
Issue Type: Sub-task
Components: PySpark
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon
{code}
======================================================================
FAIL [2.389s]: test_with_none_and_nan (pyspark.sql.tests.connect.test_connect_creation.SparkConnectCreationTests.test_with_none_and_nan)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/testing/pandasutils.py", line 56, in _assert_pandas_equal
    assert_frame_equal(
  File "/usr/local/lib/python3.11/dist-packages/pandas/_testing/asserters.py", line 1279, in assert_frame_equal
    assert_series_equal(
  File "/usr/local/lib/python3.11/dist-packages/pandas/_testing/asserters.py", line 997, in assert_series_equal
    assert_numpy_array_equal(
  File "/usr/local/lib/python3.11/dist-packages/pandas/_testing/asserters.py", line 690, in assert_numpy_array_equal
    _raise(left, right, err_msg)
  File "/usr/local/lib/python3.11/dist-packages/pandas/_testing/asserters.py", line 684, in _raise
    raise_assert_detail(obj, msg, left, right, index_values=index_values)
  File "/usr/local/lib/python3.11/dist-packages/pandas/_testing/asserters.py", line 614, in raise_assert_detail
    raise AssertionError(msg)
AssertionError: DataFrame.iloc[:, 0] (column name="(value <=> NULL)") are different
DataFrame.iloc[:, 0] (column name="(value <=> NULL)") values are different (33.33333 %)
[index]: [0, 1, 2]
[left]: [False, False, True]
[right]: [True, False, True]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/sql/tests/connect/test_connect_creation.py", line 257, in test_with_none_and_nan
    self.assert_eq(
  File "/__w/spark/spark/python/pyspark/testing/pandasutils.py", line 451, in assert_eq
    _assert_pandas_equal(lobj, robj, checkExact=check_exact)
  File "/__w/spark/spark/python/pyspark/testing/pandasutils.py", line 65, in _assert_pandas_equal
    raise PySparkAssertionError(
pyspark.errors.exceptions.base.PySparkAssertionError: [DIFFERENT_PANDAS_DATAFRAME] DataFrames are not almost equal:
Left:
(value <=> NULL) (value <=> NaN) (value <=> 42.0)
(value <=> NULL) bool
(value <=> NaN) bool
(value <=> 42.0) bool
dtype: object
Right:
(value <=> NULL) (value <=> NaN) (value <=> 42.0)
(value <=> NULL) bool
(value <=> NaN) bool
(value <=> 42.0) bool
dtype: object
----------------------------------------------------------------------
Ran 24 tests in 39.580s
{code}
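For readers comparing the [left] and [right] values above, here is a minimal pure-Python sketch of null-safe equality (<=>) over a nullable double column. It is not taken from the test or from Spark's implementation. The helper names null_safe_eq and nan_to_null are hypothetical, the input order [NaN, 42.0, None] is an assumption, and the idea that NaN is being coerced to NULL somewhere on the Arrow path is only a hypothesis that happens to reproduce the reported difference.

```python
import math
from typing import List, Optional


def null_safe_eq(a: Optional[float], b: Optional[float]) -> bool:
    """Hypothetical model of `<=>` for nullable doubles.

    NULL <=> NULL is True, NULL <=> non-NULL is False, and NaN is treated
    as equal to NaN (Spark SQL's documented NaN semantics).
    """
    if a is None or b is None:
        return a is None and b is None
    if math.isnan(a) and math.isnan(b):
        return True
    return a == b


def nan_to_null(values: List[Optional[float]]) -> List[Optional[float]]:
    """Hypothetical lossy conversion that turns NaN into NULL (None)."""
    return [None if v is not None and math.isnan(v) else v for v in values]


# Assumed input column: one NaN, one ordinary value, one NULL.
values = [float("nan"), 42.0, None]

# NaN kept distinct from NULL: only the last row equals NULL.
preserved = [null_safe_eq(v, None) for v in values]

# NaN coerced to NULL before the comparison: the first row flips to True.
coerced = [null_safe_eq(v, None) for v in nan_to_null(values)]

print(preserved)  # [False, False, True]
print(coerced)    # [True, False, True]
```

Under these assumptions, preserved matches the [left] values in the assertion and coerced matches the [right] values, so a check of where NaN and NULL stop being distinguished with Arrow enabled may be a reasonable starting point.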