[PR] [SPARK-46856][PS][TESTS] Apply approximate equality in ewm tests [spark]

via GitHub Wed, 24 Jan 2024 23:14:29 -0800


zhengruifeng opened a new pull request, #44879:
URL: https://github.com/apache/spark/pull/44879


   ### What changes were proposed in this pull request?
   Apply approximate equality in ewm tests
   
   
   ### Why are the changes needed?
   The `ewm` function in Spark is based on the `EWM` expression in Scala, so the tests do not need to compare its results exactly.
   
   In some environments, tests may fail with errors like:
   ```
   Traceback (most recent call last):
     File "/home/jenkins/python/pyspark/testing/pandasutils.py", line 91, in _assert_pandas_equal
       assert_frame_equal(
     File "/databricks/python3/lib/python3.10/site-packages/pandas/_testing/asserters.py", line 1308, in assert_frame_equal
       assert_series_equal(
     File "/databricks/python3/lib/python3.10/site-packages/pandas/_testing/asserters.py", line 1018, in assert_series_equal
       assert_numpy_array_equal(
     File "/databricks/python3/lib/python3.10/site-packages/pandas/_testing/asserters.py", line 741, in assert_numpy_array_equal
       _raise(left, right, err_msg)
     File "/databricks/python3/lib/python3.10/site-packages/pandas/_testing/asserters.py", line 735, in _raise
       raise_assert_detail(obj, msg, left, right, index_values=index_values)
     File "/databricks/python3/lib/python3.10/site-packages/pandas/_testing/asserters.py", line 665, in raise_assert_detail
       raise AssertionError(msg)
   AssertionError: DataFrame.iloc[:, 1] (column name="b") are different
   DataFrame.iloc[:, 1] (column name="b") values are different (25.0 %)
   [index]: [0.9781772871933869, 0.6938842103849581, 0.05954110855254491, 0.43191250286369276]
   [left]:  [4.0, 2.4615384615384617, 2.848920863309352, 1.5441072688779112]
   [right]: [4.0, 2.4615384615384617, 2.8489208633093526, 1.5441072688779112]
   ```
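
To illustrate why exact comparison is too strict here, the sketch below compares the `left` and `right` values from the traceback above, first exactly and then with a relative tolerance. It uses only the standard library's `math.isclose`. The helper names and the `rel_tol` value are illustrative assumptions, not the mechanism or threshold used in the PR itself.

```python
import math

# Values copied from column "b" in the traceback above.
left = [4.0, 2.4615384615384617, 2.848920863309352, 1.5441072688779112]
right = [4.0, 2.4615384615384617, 2.8489208633093526, 1.5441072688779112]


def exactly_equal(a, b):
    """Element-wise exact comparison (hypothetical helper)."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


def approx_equal(a, b, rel_tol=1e-9):
    """Element-wise comparison within a relative tolerance (hypothetical helper).

    rel_tol=1e-9 is an illustrative choice, not a value taken from the PR.
    """
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b)
    )


exact = exactly_equal(left, right)   # the third element differs in its last digits
approx = approx_equal(left, right)   # the difference is far below the tolerance
print(exact, approx)
```

The third element differs only around the 16th significant digit, which is consistent with floating-point rounding noise rather than a logic error, so a tolerance-based check is likely the more appropriate assertion for this kind of test.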
   
   
   ### Does this PR introduce _any_ user-facing change?
   No, test only.
   
   
   ### How was this patch tested?
   CI
   
   ### Was this patch authored or co-authored using generative AI tooling?
   no


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

