gaogaotiantian commented on code in PR #54146:
URL: https://github.com/apache/spark/pull/54146#discussion_r2766612761


##########
python/pyspark/pandas/tests/data_type_ops/testing_utils.py:
##########
@@ -219,3 +220,6 @@ def check_extension(self, left, right):
         pandas versions. Please refer to https://github.com/pandas-dev/pandas/issues/39410.
         """
         self.assert_eq(left, right)
+
+    def ignore_null(self, col):
+        return LooseVersion(pd.__version__) >= LooseVersion("3.0") and col == "decimal_nan"

Review Comment:
   Okay so for these tests, we are doing operations on a "decimal column" - which is really just `object` in pandas, because pandas does not have a decimal dtype. The psdf output, unfortunately, also has dtype `object`, because `object` + anything is `object`. So there is no way for us to know that we should convert this `None` to `np.nan`.
   
   Then I guess this change is fine - we should ignore the null differences from operations on "decimal" data - which is just `object`.
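   
   A minimal standalone sketch of the dtype behavior described above (a hypothetical illustration, not part of the test file; how pandas represents the missing result may vary across versions):
   
   ```python
   import decimal
   
   import pandas as pd
   
   # pandas has no decimal dtype, so a column of Decimal values is object.
   pser = pd.Series([decimal.Decimal("1.0"), decimal.Decimal("2.0"), None])
   print(pser.dtype)  # object
   
   # object + anything stays object, so the result dtype carries no hint
   # about whether the missing slot should be np.nan or None.
   result = pser + 1
   print(result.dtype)  # object
   ```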


