gaogaotiantian commented on code in PR #54146:
URL: https://github.com/apache/spark/pull/54146#discussion_r2766612761


##########
python/pyspark/pandas/tests/data_type_ops/testing_utils.py:
##########
@@ -219,3 +220,6 @@ def check_extension(self, left, right):
         pandas versions. Please refer to https://github.com/pandas-dev/pandas/issues/39410.
         """
         self.assert_eq(left, right)
+
+    def ignore_null(self, col):
+        return LooseVersion(pd.__version__) >= LooseVersion("3.0") and col == "decimal_nan"

Review Comment:
   Okay so for these tests, we are doing operations on a "decimal column" - which is really just `object` in pandas, because pandas does not have a decimal dtype. The psdf output, unfortunately, also has dtype `object`, because `object` + anything is `object`. So there is no way for us to know that we should convert this `None` to `np.nan`.
   
   Then I guess this change is fine - we should ignore the null differences from operations on "decimal" data - which is just `object`.
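   
   A minimal standalone sketch of the dtype behavior described above (a hypothetical illustration, not part of the test file; how pandas represents the missing result may vary across versions):
   
   ```python
   import decimal
   
   import pandas as pd
   
   # pandas has no decimal dtype, so a column of Decimal values is object.
   pser = pd.Series([decimal.Decimal("1.0"), decimal.Decimal("2.0"), None])
   print(pser.dtype)  # object
   
   # object + anything stays object, so the result dtype carries no hint
   # about whether the missing slot should be np.nan or None.
   result = pser + 1
   print(result.dtype)  # object
   ```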


