zhengruifeng opened a new pull request, #44879:
URL: https://github.com/apache/spark/pull/44879
### What changes were proposed in this pull request?
Apply approximate equality in the `ewm` tests.
### Why are the changes needed?
The `ewm` function in Spark is based on the `EWM` expression in Scala, so the
tests do not need to compare the results exactly.
In some environments, tests may fail like this:
```
Traceback (most recent call last):
  File "/home/jenkins/python/pyspark/testing/pandasutils.py", line 91, in _assert_pandas_equal
    assert_frame_equal(
  File "/databricks/python3/lib/python3.10/site-packages/pandas/_testing/asserters.py", line 1308, in assert_frame_equal
    assert_series_equal(
  File "/databricks/python3/lib/python3.10/site-packages/pandas/_testing/asserters.py", line 1018, in assert_series_equal
    assert_numpy_array_equal(
  File "/databricks/python3/lib/python3.10/site-packages/pandas/_testing/asserters.py", line 741, in assert_numpy_array_equal
    _raise(left, right, err_msg)
  File "/databricks/python3/lib/python3.10/site-packages/pandas/_testing/asserters.py", line 735, in _raise
    raise_assert_detail(obj, msg, left, right, index_values=index_values)
  File "/databricks/python3/lib/python3.10/site-packages/pandas/_testing/asserters.py", line 665, in raise_assert_detail
    raise AssertionError(msg)
AssertionError: DataFrame.iloc[:, 1] (column name="b") are different
DataFrame.iloc[:, 1] (column name="b") values are different (25.0 %)
[index]: [0.9781772871933869, 0.6938842103849581, 0.05954110855254491, 0.43191250286369276]
[left]: [4.0, 2.4615384615384617, 2.848920863309352, 1.5441072688779112]
[right]: [4.0, 2.4615384615384617, 2.8489208633093526, 1.5441072688779112]
```
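For illustration, the mismatch in the traceback above can be reproduced with the standard library alone. This is a minimal sketch, not the change made in the PR: the helper names `exactly_equal` and `approx_equal` and the `rel_tol=1e-9` tolerance are assumptions chosen for the example, and the actual test changes are not shown in this message.

```python
import math

# Values copied from the [left] and [right] rows of the traceback.
left = [4.0, 2.4615384615384617, 2.848920863309352, 1.5441072688779112]
right = [4.0, 2.4615384615384617, 2.8489208633093526, 1.5441072688779112]


def exactly_equal(a, b):
    """Element-wise exact comparison (hypothetical helper)."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


def approx_equal(a, b, rel_tol=1e-9):
    """Element-wise comparison within a relative tolerance (hypothetical helper).

    rel_tol=1e-9 is an illustrative choice, not the tolerance used by the PR.
    """
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b)
    )


# Positions where the two result columns are not bit-for-bit identical.
mismatch = [i for i, (x, y) in enumerate(zip(left, right)) if x != y]

print("exact match: ", exactly_equal(left, right))
print("approx match:", approx_equal(left, right))
print("mismatched positions:", mismatch)
```

Only one of the four values differs, and only in its last printed digit, which matches the `25.0 %` reported by the assertion. An exact comparison fails on that element while a tolerance-based comparison accepts it.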
### Does this PR introduce _any_ user-facing change?
No, test-only.
### How was this patch tested?
ci
### Was this patch authored or co-authored using generative AI tooling?
no